How to calculate cosine similarity?

I have two dictionaries, one regular and one nested. I want to calculate the cosine similarity between the regular dictionary and each of the nested dictionaries, and store all the values in a list.

The regular one has the following structure: query_weights = {'word1': float_num, 'word2': float_num}. The nested one has the following structure: corpus_weights = {'id': {'word1': float_num, 'word2': float_num}, 'id': {'word1': float_num, 'word2': float_num}}

So far I have calculated the dot product for each pair (the regular dictionary with each nested one):

 dot_product = [sum([np.dot(corpus_weights[doc][stem(query)], query_weights[stem(query)]) for query in term_dict]) for doc in tf_idf_dict.keys()]

My goal was to do the same for the products of the vector lengths, then zip the two lists and divide.

I tried to calculate the length of the vectors like this:

 vector_lenght = norm(list(query_weights.values())) * sum(norm(idf_scores for idf_scores in data.values() for text, data in corpus_weights.items()))

but it fails with:

Unresolved reference 'data'
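
For reference, here is a minimal sketch of the computation I am after, assuming both dictionaries map terms to floats and that a term missing from a dictionary counts as weight 0. The helper name cosine_similarity and the sample weights are hypothetical, made up for illustration. As far as I can tell, the unresolved reference comes from the order of the for clauses: in a comprehension they are evaluated left to right, so data is used before the clause that binds it.

```python
import math


def cosine_similarity(query_weights, doc_weights):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    # Dot product: only terms present in both dicts contribute.
    dot = sum(
        weight * doc_weights[term]
        for term, weight in query_weights.items()
        if term in doc_weights
    )
    # Euclidean length of each vector, computed separately per document.
    query_norm = math.sqrt(sum(w * w for w in query_weights.values()))
    doc_norm = math.sqrt(sum(w * w for w in doc_weights.values()))
    # Guard against empty or all-zero vectors.
    if query_norm == 0 or doc_norm == 0:
        return 0.0
    return dot / (query_norm * doc_norm)


# Hypothetical sample data in the same shape as the question.
query_weights = {"word1": 1.0, "word2": 2.0}
corpus_weights = {
    "doc1": {"word1": 1.0, "word2": 2.0},
    "doc2": {"word1": 2.0, "word3": 1.0},
}

# One similarity value per nested dictionary, in insertion order.
similarities = [
    cosine_similarity(query_weights, doc_weights)
    for doc_weights in corpus_weights.values()
]
```

Computing the norm inside the per-document function avoids summing norms across the whole corpus, which would give one number for all documents instead of one length per document.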



source https://stackoverflow.com/questions/75684733/how-to-calculate-cosine-similarity
