
How to sort 3 or more lists of lists by length of sublists

I have been working on an NLP task involving trigrams, that is, strings of 3 contiguous letters from a text corpus. I have three lists of lists. The first consists of frequently occurring trigrams common to various combinations of two languages. The second consists of their counts in language 1. The third consists of their counts in language 2.

I would like to sort these lists, placing the list with the most trigrams in common at the top.

Let's look at a sample of these lists:

for i, j, k in zip(trigrams, lang1_counts, lang2_counts):
    print(i,j,k)

['er_', 'n_d', '_de', 'in_', 'en_'] [1087, 1213, 2038, 903, 3855] [2996, 969, 2226, 951, 3872]
['in_', '_in', 'er_'] [903, 937, 1087] [1101, 1369, 1080]
['et_', 'de_', '_de', '_en'] [1314, 2359, 2038, 769] [880, 2254, 2881, 787]

As you can see, the trigram lists have lengths 5, 3, and 4 respectively. I want to sort them so that the order is 5, 4, 3. For plotting, the count lists must be reordered in the same way. This is just a small sample; I have many more such lists. The three lists of lists are all the same length.

I have tried these solutions so far, but neither works:

trigrams, lang1_counts, lang2_counts = zip(*sorted(zip(trigrams, lang1_counts, lang2_counts), key=len, reverse=True))

trigrams, lang1_counts, lang2_counts = (list(t) for t in zip(*sorted(zip(trigrams, lang1_counts, lang2_counts), key=len, reverse=True)))

Can anyone see why they don't work and suggest something that would? The given methods do not raise errors; they just have no effect at all.

My references were:

How to sort list of lists according to length of sublists
How to sort two lists (which reference each other) in the exact same way

Try this:

trigrams, lang1_counts, lang2_counts = zip(
    *sorted(zip(trigrams, lang1_counts, lang2_counts), key=lambda x: len(x[0]), reverse=True))

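You should sort by the length of the first element of each zipped tuple (the trigram sublist) rather than by the length of the zipped tuple itself, which in this case is 3 for every tuple. Because all the keys are equal and Python's sort is stable, sorting with key=len leaves the order unchanged.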
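To make the difference concrete, here is a minimal sketch using the three sample rows from the question. The names `rows`, `tuple_lengths`, `sorted_rows` and the `_s` suffixes are only illustrative and are not from the original post. It compares what `key=len` measures with what `key=lambda row: len(row[0])` measures, then unzips the sorted rows back into three lists.

```python
# Sample data copied from the question.
trigrams = [
    ['er_', 'n_d', '_de', 'in_', 'en_'],
    ['in_', '_in', 'er_'],
    ['et_', 'de_', '_de', '_en'],
]
lang1_counts = [
    [1087, 1213, 2038, 903, 3855],
    [903, 937, 1087],
    [1314, 2359, 2038, 769],
]
lang2_counts = [
    [2996, 969, 2226, 951, 3872],
    [1101, 1369, 1080],
    [880, 2254, 2881, 787],
]

# Each row is a tuple: (trigram sublist, lang1 sublist, lang2 sublist).
rows = list(zip(trigrams, lang1_counts, lang2_counts))

# key=len would measure the tuple itself, which always has 3 members,
# so every key is equal and the stable sort keeps the original order.
tuple_lengths = [len(row) for row in rows]

# Measuring row[0] uses the length of the trigram sublist instead.
sorted_rows = sorted(rows, key=lambda row: len(row[0]), reverse=True)

# Unzip into three parallel lists (zip(*...) alone yields tuples).
trigrams_s, lang1_s, lang2_s = (list(col) for col in zip(*sorted_rows))

print(tuple_lengths)                 # [3, 3, 3]
print([len(t) for t in trigrams_s])  # [5, 4, 3]
```

The count lists travel with their trigram list because the sort moves whole rows, so the three results stay aligned for plotting.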
