Counting elements in a list of lists in Python

Question

I hope you can help me with this Python function:

def comparapal(lista):  # lista is a list of lists where each list has 4 elements
    listaPalabras = []
    for item in lista:
        if item[2] in eagles_dict:  # keep the item only if its 3rd element is a key of the dictionary
            listaPalabras.append([item[1], item[2]])  # new list with the 2nd and 3rd elements
    return listaPalabras

The resulting listaPalabras:

[
   ['bien', 'NP00000'],
   ['gracia', 'NCFP000'],
   ['estar', 'VAIP1S0'],
   ['bien', 'RG'],
   ['huevo', 'NCMS000'],
   ['calcio', 'NCMS000'],
   ['leche', 'NCFS000'],
   ['proteina', 'NCFS000'],
   ['francisco', 'NP00000'],
   ['ya', 'RG'],
   ['ser', 'VSIS3S0'],
   ['cosa', 'NCFS000']
]

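My question is: how can I compare the first element of each list so that, if the words are the same, their tags (the second elements) are compared?

Sorry, the function was unclear. It must return a list of lists with 3 elements each: the word, the tag, and the number of occurrences of each word. But to count the words I need to compare each word with the others, and if a word appears two or more times, compare the tags to see whether they differ. If the tags differ, the words are counted separately.

Result -> [['bien', 'NP00000', 1], ['bien', 'RG', 1]] -> two identical words, but they are counted separately because their tags differ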

Answer 1

import collections
inlist = [
   ['bien', 'NP00000'],
   ['gracia', 'NCFP000'],
   ['estar', 'VAIP1S0'],
   ['bien', 'RG'],
   ['huevo', 'NCMS000'],
   ['calcio', 'NCMS000'],
   ['leche', 'NCFS000'],
   ['proteina', 'NCFS000'],
   ['francisco', 'NP00000'],
   ['ya', 'RG'],
   ['ser', 'VSIS3S0'],
   ['cosa', 'NCFS000']
]
[(a,b,v) for (a,b),v in collections.Counter(map(tuple,inlist)).items()]
#=>[('bien', 'NP00000', 1), ('gracia', 'NCFP000', 1), ('estar', 'VAIP1S0', 1), ('bien', 'RG', 1), ('huevo', 'NCMS000', 1), ('calcio', 'NCMS000', 1), ('leche', 'NCFS000', 1), ('proteina', 'NCFS000', 1), ('francisco', 'NP00000', 1), ('ya', 'RG', 1), ('ser', 'VSIS3S0', 1), ('cosa', 'NCFS000', 1)]

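You want to count how many times each (word, tag) pair occurs. The Counter expression does that. The list comprehension formats the result as triples.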
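If a list of lists is needed, as in the question, rather than tuples, the same idea can be wrapped in a small function. This is only a sketch: the name count_pairs is made up for illustration, and the output order assumes Python 3.7+, where Counter keeps pairs in first-seen order.

```python
from collections import Counter

def count_pairs(pairs):
    """Return [word, tag, count] for every distinct (word, tag) pair."""
    # Lists are unhashable, so each pair is converted to a tuple first.
    counts = Counter(map(tuple, pairs))
    return [[word, tag, n] for (word, tag), n in counts.items()]

# Hypothetical sample with one repeated pair.
sample = [['bien', 'NP00000'], ['bien', 'RG'], ['bien', 'RG'], ['ya', 'RG']]
print(count_pairs(sample))
# [['bien', 'NP00000', 1], ['bien', 'RG', 2], ['ya', 'RG', 1]]
```

The same word with different tags ends up in separate rows, because the whole pair is the counting key.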

Answer 2

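What specific output do you need? I don't know exactly what you need to do, but if you want to group the items that belong to the same word, you can turn this structure into a dictionary and work on it later: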

>>> new = {}
>>> for i,j in a: # <-- a = listaPalabras
        if new.get(i) is None:
                new[i] = [j]
        else:
                new[i].append(j)

This gives us:

{'francisco': ['NP00000'], 'ser': ['VSIS3S0'], 'cosa': ['NCFS000'], 'ya': ['RG'], 'bien': ['NP00000', 'RG'], 'estar': ['VAIP1S0'], 'calcio': ['NCMS000'], 'leche': ['NCFS000'], 'huevo': ['NCMS000'], 'gracia': ['NCFP000'], 'proteina': ['NCFS000']}

Then you can do the following:

>>> for i in new:
        if len(new[i]) > 1:
                print("compare {this} and {that}".format(this=new[i][0],that=new[i][1]))

This will print:

compare NP00000 and RG #for key bien

Edit: in the first step you can also use defaultdict, as Marcin suggested in the comments, which looks like this:

>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for i,j in a:
        d[i].append(j) # a missing key starts out as an empty list

EDIT2 (reply to the OP's comment)

result = []
for i in d:
    item = []
    item.append(i)          # the word
    item.extend(d[i])       # all tags seen for that word
    item.append(len(d[i]))  # number of tags collected for the word
    result.append(item)

This gives us:

[['francisco', 'NP00000', 1], ['ser', 'VSIS3S0', 1], ['cosa', 'NCFS000', 1], ['ya', 'RG', 1], ['bien', 'NP00000', 'RG', 2], ['estar', 'VAIP1S0', 1], ['calcio', 'NCMS000', 1], ['leche', 'NCFS000', 1], ['huevo', 'NCMS000', 1], ['gracia', 'NCFP000', 1], ['proteina', 'NCFS000', 1]]
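The rows above put both tags of 'bien' into one entry with a count of 2. If each tag should get its own row and its own count, as the question describes, one possible variation is to keep a Counter per word instead of a list. This is a sketch only: `a` is a shortened, hypothetical version of listaPalabras, and `per_word` and `rows` are illustrative names.

```python
from collections import Counter, defaultdict

# Hypothetical sample of [word, tag] pairs; ['bien', 'RG'] appears twice.
a = [['bien', 'NP00000'], ['estar', 'VAIP1S0'], ['bien', 'RG'], ['bien', 'RG']]

# Map each word to a Counter of its tags.
per_word = defaultdict(Counter)
for word, tag in a:
    per_word[word][tag] += 1

# Flatten into [word, tag, count] rows, one row per distinct tag.
rows = [[word, tag, n]
        for word, tags in per_word.items()
        for tag, n in tags.items()]
print(rows)
# [['bien', 'NP00000', 1], ['bien', 'RG', 2], ['estar', 'VAIP1S0', 1]]
```

Grouping by word first keeps the option of checking `len(per_word[word]) > 1` to find words that occur with more than one tag.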

Answer 3

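A purely list-based solution is certainly possible, but it requires an additional loop. If efficiency matters, it would be better to replace listaPalabras with a dict.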

def comparapal(lista):
  listaPalabras=[]
  for item in lista:
     if item[2] in eagles_dict:
        listaPalabras.append([item[1],item[2]])

  # After sorting, equal words (and equal pairs) sit next to each other.
  last_tt = [None, None]
  for tt in sorted(listaPalabras):
    if tt == last_tt:
      print("Observed %s twice" % (tt,))
    elif tt[0] == last_tt[0]:
      print("Observed %s and %s" % (tt, last_tt))
    last_tt = tt

This will give you: Observed ['bien', 'RG'] and ['bien', 'NP00000']

If this does not serve your purpose, please clarify your question.
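The sort-then-scan idea can also produce the counts directly. The sketch below swaps the manual `last_tt` bookkeeping for itertools.groupby, which groups equal neighbours in a sorted sequence; the function name count_sorted and the sample data are illustrative, not part of the original code.

```python
from itertools import groupby

def count_sorted(pairs):
    """Return [word, tag, count] rows using only lists and one sort."""
    rows = []
    # groupby only merges adjacent equal items, hence the sort.
    for pair, group in groupby(sorted(pairs)):
        rows.append(pair + [sum(1 for _ in group)])
    return rows

# Hypothetical sample with one repeated pair.
sample = [['bien', 'RG'], ['ya', 'RG'], ['bien', 'NP00000'], ['bien', 'RG']]
print(count_sorted(sample))
# [['bien', 'NP00000', 1], ['bien', 'RG', 2], ['ya', 'RG', 1]]
```

The rows come out in sorted order rather than input order, and sorting costs O(n log n), which is the efficiency trade-off compared with a dict-based count.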


Answer 1
2, accepted, 2013-08-09 20:21:00

Answer 2
1, 2013-08-09 19:43:19

Answer 3
0, 2013-08-09 19:54:25
