
Sort a dictionary of 2D arrays by their first values

I am trying to sort my data files using a dictionary structure. I would like to sort the dictionary by the first element of each value, for example, the entry in the first row and first column of each array.

The issue I am having is that sorted() does not accept lambda item: item[1] as the key, which I believe corresponds to the value in original_dict. This is what I have so far:

from numpy import array

original_dict = {'file1.txt': array([[ 9., 40., 50., 20.],[10., 40., 50., 20.]]),
                 'file2.txt': array([[1., 2., 3., 4.],[2., 2., 3., 4.]]),
                 'file3.txt': array([[0.1, 0.2, 0.3, 0.4],[0.2, 0.2, 0.3, 0.4]])}

d2 = {k: v for k, v in sorted(original_dict.items(), key=lambda item: item[1])}

Returns

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Desired output

d2 = {'file3.txt': array([[0.1, 0.2, 0.3, 0.4],[0.2, 0.2, 0.3, 0.4]]), 
      'file2.txt':array([[1., 2., 3., 4.],[2., 2., 3., 4.]]), 
      'file1.txt': array([[ 9., 40., 50., 20.],[10., 40., 50., 20.]])}

You were nearly there with your line of code.

In the lambda function you extract item[1]. This is the second entry of the (key, value) tuple, so in this case you are extracting only the array. You cannot sort on the whole array, so you can do something like this instead:

d2 = {k: v for k, v in sorted(original_dict.items(), key=lambda item: item[1].flatten()[0])}

Here you flatten the array with the np.ndarray.flatten method and use its first entry as the sort key.
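As a minimal, self-contained sketch of the flattening approach above (assuming NumPy is imported as np and using the arrays from the question), the whole thing can be run like this. It relies on dictionaries preserving insertion order, which holds in Python 3.7 and later:

```python
import numpy as np

original_dict = {
    'file1.txt': np.array([[9., 40., 50., 20.], [10., 40., 50., 20.]]),
    'file2.txt': np.array([[1., 2., 3., 4.], [2., 2., 3., 4.]]),
    'file3.txt': np.array([[0.1, 0.2, 0.3, 0.4], [0.2, 0.2, 0.3, 0.4]]),
}

# Each item is a (key, value) tuple, so item[1] is the array.
# flatten()[0] reduces the array to one scalar that sorted() can compare.
d2 = dict(sorted(original_dict.items(), key=lambda item: item[1].flatten()[0]))

print(list(d2))  # ['file3.txt', 'file2.txt', 'file1.txt']
```

Note that flatten() copies the whole array just to read one element; for large arrays, indexing with item[1][0, 0] gives the same key without the copy.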

You get this error because the key= function needs to return something that can be compared to give a single True or False, such as a number. In your case the key is a whole matrix. Therefore, when sorted() orders the items, it compares two NumPy arrays, which are your sort keys. That fails because NumPy compares arrays element-wise and returns an array of booleans rather than a single value, so Python cannot decide which item comes first.
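The failure can be reproduced without sorted() at all. The sketch below uses two small made-up arrays (a and b are illustrative, not from the question) to show what the comparison returns and where the ValueError comes from:

```python
import numpy as np

# Two illustrative arrays; the values are arbitrary.
a = np.array([[1., 2.], [3., 4.]])
b = np.array([[0., 5.], [3., 1.]])

# Element-wise comparison returns a boolean array, not a single bool.
mask = a < b
print(mask)

# sorted() needs one True/False per comparison. Converting a
# multi-element array to bool raises the ValueError from the question.
try:
    bool(mask)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```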

In your case, you need to think about what exact criteria you want to use. Is it the first value in the first row? Is it the sum of the values in the first row? The total matrix sum?

Here are some examples that can work for you:

# Sort by the matrix total sum
d2 = {k: v for k, v in sorted(original_dict.items(), key=lambda item: item[1].sum())}

# Sort by the first row sum
d2 = {k: v for k, v in sorted(original_dict.items(), key=lambda item: item[1][0].sum())}

# Sort by the first element of the first row
d2 = {k: v for k, v in sorted(original_dict.items(), key=lambda item: item[1][0, 0])}
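With the data from the question, all three criteria happen to give the same order. The sketch below uses a small made-up dictionary (data, and the helper sort_dict_by, are hypothetical names introduced only for illustration) in which each criterion produces a different order, which shows why the choice matters:

```python
import numpy as np

def sort_dict_by(d, criterion):
    """Return a new dict ordered by criterion(array) for each value."""
    return dict(sorted(d.items(), key=lambda item: criterion(item[1])))

# Hypothetical data chosen so that the three criteria disagree.
data = {
    'a': np.array([[5., 0.], [0., 0.]]),
    'b': np.array([[1., 9.], [0., 0.]]),
    'c': np.array([[2., 1.], [20., 0.]]),
}

by_first = list(sort_dict_by(data, lambda arr: arr[0, 0]))    # first element
by_row = list(sort_dict_by(data, lambda arr: arr[0].sum()))   # first row sum
by_total = list(sort_dict_by(data, lambda arr: arr.sum()))    # total sum

print(by_first)  # ['b', 'c', 'a']
print(by_row)    # ['c', 'a', 'b']
print(by_total)  # ['a', 'b', 'c']
```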

You do not need to convert the return value of dictionary.items() to a list: sorted() accepts any iterable, and the lambda receives each (key, value) tuple, which is subscriptable. The error comes from comparing the arrays themselves, so the key has to reduce each array to a single value, for example its first element. The list() call below is optional:

original_dict = {'file1.txt': array([[ 9., 40., 50., 20.],[10., 40., 50., 20.]]),
                 'file2.txt': array([[1., 2., 3., 4.],[2., 2., 3., 4.]]),
                 'file3.txt': array([[0.1, 0.2, 0.3, 0.4],[0.2, 0.2, 0.3, 0.4]])}

d2 = {k: v for k, v in sorted(list(original_dict.items()), key=lambda item: item[1][0, 0])}

