python - How to create a dictionary of dictionaries with these functions?


I have a dictionary like this:

  dict = {'in': [0.01, -0.07, 0.09, -0.02], 'and': [0.2, 0.3, 0.5, 0.6], 'to': [0.87, 0.98, 0.54, 0.4]}

I want to calculate the cosine similarity for each pair of words. I have a cosine similarity function which takes two vectors. First it should take the values for 'in' and 'and', then for 'in' and 'to', and so on for every other pair.

I want to store the results in a second dictionary, where each word is a key and its value is an inner dictionary of the cosine similarity values with every other word. The output I want looks like this:

  {in: {and: 0.4321, to: 0.218}, and: {in: 0.1245, to: 0.9876}, to: {in: 0.8764, and: 0.123}}

Below is the code that is doing this:

  import math

  def cosine_similarity(vec1, vec2):
      sum11, sum12, sum22 = 0, 0, 0
      for i in range(len(vec1)):
          x = vec1[i]; y = vec2[i]
          sum11 += x * x
          sum22 += y * y
          sum12 += x * y
      return sum12 / math.sqrt(sum11 * sum22)

  def resultInDict(result, name, value, keyC):
      new_dict = {}
      new_dict[keyC] = value
      if name in result:
          result[name] = new_dict
      else:
          result[name] = new_dict

  def extract():
      result = {}
      res = {}
      with open('file.txt') as text:
          for line in text:
              record = line.split()
              key = record[0]
              values = [float(value) for value in record[1:]]
              res[key] = values
      for key, value in res.items():
          temp = 0
          for keyC, valueC in res.items():
              if keyC == key:
                  continue
              temp = cosine_similarity(value, valueC)
              resultInDict(result, key, temp, keyC)
      print(result)

But it gives results like this:

  {' ': {'in': 0.12241083209661485}, 'to': {'in': -0.0654517869126785}, 'to': {'in': -0.5324142931780856}, 'in': {'to': -0.5324142931780856}}

It should be like this:

  {in: {and: 0.4321, to: 0.218}, and: {in: 0.1245, to: 0.9876}, to: {in: 0.8764, and: 0.123}}

I think this happens because in the resultInDict function I define a new dictionary, new_dict, to hold the key-value pairs of the inner dictionary. But each time resultInDict is called, the line new_dict = {} empties it, so only one key-value pair is ever stored for each word.

How can I fix this?

This is not very elegant, but it works:

  import math

  def cosine_similarity(vec1, vec2):
      sum11, sum12, sum22 = 0, 0, 0
      for i in range(len(vec1)):
          x = vec1[i]; y = vec2[i]
          sum11 += x * x
          sum22 += y * y
          sum12 += x * y
      return sum12 / math.sqrt(sum11 * sum22)

  mydict = {"in": [0.01, -0.07, 0.09, -0.02], "and": [0.2, 0.3, 0.5, 0.6], "to": [0.87, 0.98, 0.54, 0.4]}
  mydict_keys = mydict.keys()

  result = {}
  for k1 in mydict_keys:
      # Build the complete inner dictionary for k1 before storing it
      temp_dict = {}
      for k2 in mydict_keys:
          if k1 != k2:
              temp_dict[k2] = cosine_similarity(mydict[k1], mydict[k2])
      result[k1] = temp_dict

In addition, if you have large data structures, consider using scipy or scikit-learn to calculate the cosine similarity more efficiently (the latter is not only fast but also memory-friendly, because you can feed it a scipy.sparse matrix).
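If you would rather keep a helper like the question's resultInDict, one possible fix is to reuse the inner dictionary that already exists for a word instead of replacing it on every call. The sketch below is only an illustration, not the original poster's code: the names result_in_dict, pairwise_similarities and vectors are made up for this example, and the sample vectors are the ones from the question.

```python
import math


def cosine_similarity(vec1, vec2):
    """Cosine similarity of two equal-length numeric sequences."""
    dot = sum(x * y for x, y in zip(vec1, vec2))
    norm1 = math.sqrt(sum(x * x for x in vec1))
    norm2 = math.sqrt(sum(y * y for y in vec2))
    return dot / (norm1 * norm2)


def result_in_dict(result, name, value, key_c):
    # Hypothetical replacement for resultInDict.
    # setdefault returns the inner dict already stored under `name`,
    # or stores and returns a new empty one on the first call.
    result.setdefault(name, {})[key_c] = value


def pairwise_similarities(vectors):
    # Hypothetical driver: compares every word with every other word.
    result = {}
    for key, value in vectors.items():
        for key_c, value_c in vectors.items():
            if key_c == key:
                continue  # skip comparing a word with itself
            result_in_dict(result, key, cosine_similarity(value, value_c), key_c)
    return result


vectors = {
    "in": [0.01, -0.07, 0.09, -0.02],
    "and": [0.2, 0.3, 0.5, 0.6],
    "to": [0.87, 0.98, 0.54, 0.4],
}
similarities = pairwise_similarities(vectors)
```

With this change, each word accumulates one entry per other word, so similarities["in"] ends up with keys "and" and "to" rather than only the last pair processed.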

