
How to merge two tuples in a list if the first tuple elements match?

I have two lists of tuples of the following form:

playerinfo = [(ansonca01,4,1871,1,RC1),(forceda01,44,1871,1,WS3),(mathebo01,68,1871,1,FW1)]

idmatch = [(ansonca01,Anson,Cap,05/06/1871),(aaroh101,Aaron,Hank,04/13/1954),(aarot101,Aaron,Tommie,04/10/1962)]

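What I want to know is how to iterate over both lists and, whenever the first element of a tuple in "playerinfo" matches the first element of a tuple in "idmatch", merge the matching tuples into a new list of tuples of the form: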

merged_data = [(ansonca01,4,1871,1,RC1, Anson,Cap,05/06/1871),(...),(...), etc.] 

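The new list of tuples would match each ID with the correct player's first and last name.

Background: I am trying to merge two CSV files of baseball statistics. One file holds all the relevant statistics but has no player names, only a reference ID (e.g. "ansoc101"); the second file has the reference ID in one column and the corresponding player's first and last names in another.

The CSV files are too large to do this by hand (about 20,000 players), so I am trying to automate the process.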

Use a list comprehension to iterate over your lists:

[x + y[1:] for x in list1 for y in list2 if x[0] == y[0]]

I tried it on these lists:

list1 = [("this", 1, 2, 3), ("that", 1, 2, 3), ("other", 1, 2, 3)]
list2 = [("this", 5, 6, 7), ("that", 10, 11, 12), ("notother", 1, 2, 3)]

and got:

[('this', 1, 2, 3, 5, 6, 7), ('that', 1, 2, 3, 10, 11, 12)]

Is that what you wanted?
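Applied to the data from the question, the same comprehension might look like the sketch below. The tuples are written as valid Python literals, which is an assumption about how the CSV rows were loaded. Note that the nested loops compare every tuple in one list with every tuple in the other, so the work grows with the product of the two list lengths.

```python
# Data from the question, written as valid Python literals (an assumption
# about how the rows were loaded from the CSV files).
playerinfo = [("ansonca01", 4, 1871, 1, "RC1"),
              ("forceda01", 44, 1871, 1, "WS3"),
              ("mathebo01", 68, 1871, 1, "FW1")]
idmatch = [("ansonca01", "Anson", "Cap", "05/06/1871"),
           ("aaroh101", "Aaron", "Hank", "04/13/1954"),
           ("aarot101", "Aaron", "Tommie", "04/10/1962")]

# Pair every player row with every id row; keep the pairs whose first
# elements match, dropping the duplicate ID from the second tuple.
merged_data = [p + m[1:] for p in playerinfo for m in idmatch if p[0] == m[0]]

print(merged_data)
# [('ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871')]
```

Only "ansonca01" appears in both lists, so only that row is produced.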

You can first build a dictionary to enable fast look-ups by ID, then use a list comprehension to merge the data from the two lists efficiently:

import operator

playerinfo = [('ansonca01', 4, 1871, 1, 'RC1'),
              ('forceda01', 44, 1871, 1, 'WS3'),
              ('mathebo01', 68, 1871, 1, 'FW1')]

idmatch = [('ansonca01', 'Anson', 'Cap', '05/06/1871'),
           ('aaroh101', 'Aaron', 'Hank', '04/13/1954'),
           ('aarot101', 'Aaron', 'Tommie', '04/10/1962')]

get_id = operator.itemgetter(0)  # Extracts the ID field (first element) of a record.

idinfo = {get_id(rec): rec[1:] for rec in idmatch}  # Dict for fast look-ups.

merged = [info + idinfo[get_id(info)] for info in playerinfo if get_id(info) in idinfo]

print(merged) # -> [('ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871')]
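Since the question mentions that the data comes from two CSV files, the same dictionary look-up could be applied directly while reading them. The following is only a sketch: the function name `merge_csv`, the sample CSV text, and the assumption that the ID is the first column of both files are illustrative, not taken from the original files.

```python
import csv
import io

# Hypothetical CSV contents standing in for the two real files.
stats_csv = "ansonca01,4,1871,1,RC1\nforceda01,44,1871,1,WS3\n"
names_csv = "ansonca01,Anson,Cap,05/06/1871\naaroh101,Aaron,Hank,04/13/1954\n"


def merge_csv(stats_file, names_file):
    """Yield each stats row extended with the name columns for its ID.

    Assumes the ID is the first column in both files.
    """
    # Build the ID -> name-columns look-up once, skipping blank lines.
    names = {row[0]: row[1:] for row in csv.reader(names_file) if row}
    for row in csv.reader(stats_file):
        if row and row[0] in names:
            yield row + names[row[0]]


merged_rows = list(merge_csv(io.StringIO(stats_csv), io.StringIO(names_csv)))
print(merged_rows)
# [['ansonca01', '4', '1871', '1', 'RC1', 'Anson', 'Cap', '05/06/1871']]
```

With real files, the two `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`. Note that `csv.reader` returns every field as a string, so numeric columns would need converting separately.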

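Dictionary

  1. Iterate over the playerinfo list and build a dictionary whose key is the first item of each tuple and whose value is a list of all the tuple's items.
  2. Print the result of the first step.
  3. Iterate over the idmatch list and check whether the first item of each tuple is a key in the result dictionary. If it is, extend that key's value with the new values using the list extend method.
  4. Print the result of the third step.
  5. Build the output format from the resulting dictionary.

Demo: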

import pprint

playerinfo = [("ansonca01", 4, 1871, 1, "RC1"),
              ("forceda01", 44, 1871, 1, "WS3"),
              ("mathebo01", 68, 1871, 1, "FW1")]

idmatch = [("ansonca01", "Anson", "Cap", "05/06/1871"),
           ("aaroh101", "Aaron", "Hank", "04/13/1954"),
           ("aarot101", "Aaron", "Tommie", "04/10/1962")]

result = {}
for rec in playerinfo:
    result[rec[0]] = list(rec)  # ID -> mutable list of all fields.

print("Debug Result1:")
pprint.pprint(result)

for rec in idmatch:
    if rec[0] in result:
        result[rec[0]].extend(rec[1:])  # Append the name fields.

print("\nDebug Result2:")
pprint.pprint(result)

final_rs = [tuple(fields) for fields in result.values()]

print("\nFinal result:")

pprint.pprint(final_rs)

Output:

infogrid@infogrid-vivek:~/workspace/vtestproject$ python task4.py 
Debug Result1:
{'ansonca01': ['ansonca01', 4, 1871, 1, 'RC1'],
 'forceda01': ['forceda01', 44, 1871, 1, 'WS3'],
 'mathebo01': ['mathebo01', 68, 1871, 1, 'FW1']}

Debug Result2:
{'ansonca01': ['ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871'],
 'forceda01': ['forceda01', 44, 1871, 1, 'WS3'],
 'mathebo01': ['mathebo01', 68, 1871, 1, 'FW1']}

Final result:
[('ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871'),
 ('forceda01', 44, 1871, 1, 'WS3'),
 ('mathebo01', 68, 1871, 1, 'FW1')]
infogrid@infogrid-vivek:~/workspace/vtestproject$ 
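One thing to watch with this approach: because the dictionary is keyed on the player ID from playerinfo, a statistics file with several rows for the same player would keep only the last row per ID. A possible variant, sketched below, keys the dictionary on idmatch instead, so every playerinfo row survives and unmatched rows pass through unchanged, as in the output above. The function name `merge_keep_all` and the second "ansonca01" row are made up for illustration.

```python
def merge_keep_all(playerinfo, idmatch):
    """Merge name fields into every player row; unmatched rows pass through."""
    names = {rec[0]: rec[1:] for rec in idmatch}
    # dict.get falls back to an empty tuple when the ID has no name record.
    return [row + names.get(row[0], ()) for row in playerinfo]


playerinfo = [("ansonca01", 4, 1871, 1, "RC1"),
              ("ansonca01", 9, 1872, 1, "PH1"),  # hypothetical second row, same ID
              ("forceda01", 44, 1871, 1, "WS3")]
idmatch = [("ansonca01", "Anson", "Cap", "05/06/1871")]

for row in merge_keep_all(playerinfo, idmatch):
    print(row)
# ('ansonca01', 4, 1871, 1, 'RC1', 'Anson', 'Cap', '05/06/1871')
# ('ansonca01', 9, 1872, 1, 'PH1', 'Anson', 'Cap', '05/06/1871')
# ('forceda01', 44, 1871, 1, 'WS3')
```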
