Python list of dicts to a pandas DataFrame

Question

I have a list of dictionary objects ({key: value}), as shown below:

recd = [{'Type': 'status'}, {'Origin': 'I just earned the Rookie badge on #Yelp!'}, 
         {'Text': 'I just earned the Rookie badge on'}, {'URL': ''}, 
         {'ID': '95314179338158080'}, {'Time': 'Sun Jul 24 21:07:25 CDT 2011'},
         {'RetCount': '0'}, {'Favorite': 'false'},
         {'MentionedEntities': ''}, {'Hashtags': 'Yelp'}]

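I have tried several ways of moving this into a pandas DataFrame object, where the keys are the column names and the values are the record values.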

s = pd.Series(data=recd)  ## try #1  
tweets = tweets.append(s, ignore_index=True)  

tweets = tweets.append(recd, ignore_index=True)  #try #2  

tweets.from_items(recd)  #try #3  

mylist = [item.split(',') for item in recd] #try #4 (stack overflow)  
tdf = pd.DataFrame(mylist)  

tweets.from_records(recd)  #try #5

tweets.concat(recd, axis=1, etc)  # tries 6-20

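Of course, none of these work. At this point I have tried the obvious approaches and used all the various parameters (columns=, ignore_index, etc.), and I am missing the obvious. I usually work with structured data dumps, so this is new to me. I suspect I am not formatting the data correctly, but the solution eludes me.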

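Background: I am building each recd object one at a time, as a complete record, from a large parsed data file in a non-standard format, and then trying to convert it into a pandas DataFrame, from which I can save it in any number of usable formats. The process also eliminates a number of data errors. The code that does this is: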

 k = line.split(":",1)  
 key = str(k[0].strip())  
 val = str(k[1].strip())  
 if key in TweetFields:  
     d = {key : val}   # also tried d = [key:val]
     recd.append(d)

Thanks for any suggestions.

Answer 1

You can use a dict comprehension to merge the list of dicts into a single dict, then pass that dict to pd.DataFrame:

In [105]: pd.DataFrame({key: [val] for dct in recd for key, val in dct.items()})
Out[105]: 
  Favorite Hashtags                 ID MentionedEntities  \
0    false     Yelp  95314179338158080                     

                                     Origin RetCount  \
0  I just earned the Rookie badge on #Yelp!        0   

                                Text                          Time    Type URL  
0  I just earned the Rookie badge on  Sun Jul 24 21:07:25 CDT 2011  status

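While this solves the problem of converting a list of dicts into a single-row DataFrame, it would be preferable to avoid the list of dicts altogether, because building a new DataFrame for each row is inefficient.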

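If you explain what the raw data looks like (with more than one row of data) and what the final DataFrame should look like, you may get a more useful answer.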
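
As a rough illustration of avoiding the per-row DataFrame, here is a minimal sketch that reuses the split-on-first-colon parsing from the question but collects each record into one dict and builds the DataFrame only once. The raw file layout is not shown in the question, so the blank line used as a record separator, the sample lines, the `parse_records` helper, and the contents of `TweetFields` are all assumptions made for this example.

```python
import pandas as pd

# Hypothetical subset of the field names the question keeps in TweetFields.
TweetFields = {'Type', 'Text', 'ID', 'Time', 'Hashtags'}


def parse_records(lines):
    """Return one dict per record; a blank line is ASSUMED to end a record."""
    rows, current = [], {}
    for line in lines:
        if not line.strip():
            if current:               # close the record collected so far
                rows.append(current)
                current = {}
            continue
        if ':' not in line:           # skip lines that are not "key: value"
            continue
        # Split on the first colon only, so values such as timestamps survive.
        key, val = (part.strip() for part in line.split(':', 1))
        if key in TweetFields:
            current[key] = val
    if current:                       # the last record may lack a trailing blank line
        rows.append(current)
    return rows


# Made-up sample input in the assumed layout.
sample = [
    "Type: status",
    "Text: I just earned the Rookie badge on",
    "ID: 95314179338158080",
    "Time: Sun Jul 24 21:07:25 CDT 2011",
    "Source: web",                    # not in TweetFields, so it is dropped
    "",
    "Type: status",
    "Text: One more on the road",
    "Hashtags: Yelp",
]

rows = parse_records(sample)
tweets = pd.DataFrame(rows)           # built once, after all records are parsed
print(tweets)
```

Fields missing from a record come out as missing values in the resulting frame, so records do not all need the same keys.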

Answer 2

If you only want to convert a single list of dicts:

# `d` is used as the loop variable to avoid shadowing the built-in `dict`
temp_df = pd.DataFrame([{key: value for d in recd for key, value in d.items()}])
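
To see why the merge step matters, the following small check compares passing the list of single-key dicts straight to pd.DataFrame with merging it first. It uses a shortened three-field version of `recd` purely for illustration.

```python
import pandas as pd

# Shortened version of the question's recd: one single-key dict per field.
recd = [{'Type': 'status'}, {'ID': '95314179338158080'}, {'Hashtags': 'Yelp'}]

# Passed directly, every dict becomes its own row, so most cells are missing.
sparse = pd.DataFrame(recd)

# Merged into one dict first, the same data becomes a single complete row.
merged = {key: value for d in recd for key, value in d.items()}
one_row = pd.DataFrame([merged])

print(sparse.shape)    # (3, 3)
print(one_row.shape)   # (1, 3)
```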

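However, if you intend to use this kind of construct to create a DataFrame with many rows, you should combine all the {key: value} pairs of each record into one dict and append those dicts to a list: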

recd = [{'Type': 'status', 'Origin': 'I just earned the Rookie badge on #Yelp!', 
     'Text': 'I just earned the Rookie badge on', 'URL': '', 
     'ID': '95314179338158080', 'Time': 'Sun Jul 24 21:07:25 CDT 2011',
     'RetCount': '0', 'Favorite': 'false',
     'MentionedEntities': '', 'Hashtags': 'Yelp'}]

recd.append({'Type': 'status', 'Origin': 'BLAH BLAH', 
     'Text': 'One more on the road', 'URL': '', 
     'ID': 'NA', 'Time': 'NA',
     'RetCount': 'NA', 'Favorite': 'false',
     'MentionedEntities': '', 'Hashtags': 'Yelp'})

temp_df = pd.DataFrame(recd)
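
If the records have already been collected in the question's original shape (a list of single-key dicts per record), one possible way to get to the layout above is to collapse each record before building the frame. This is only a sketch: `merge_record` and `raw_records` are hypothetical names, and the sample values are made up.

```python
import pandas as pd


def merge_record(pairs):
    """Collapse a list of single-key dicts into one dict (later keys win)."""
    merged = {}
    for d in pairs:
        merged.update(d)
    return merged


# Two made-up records, each still in the list-of-single-key-dicts shape.
raw_records = [
    [{'Type': 'status'}, {'ID': '95314179338158080'}, {'Hashtags': 'Yelp'}],
    [{'Type': 'status'}, {'ID': 'NA'}],
]

rows = [merge_record(pairs) for pairs in raw_records]
df = pd.DataFrame(rows)   # one row per record, one column per distinct key
print(df)
```

The second record has no 'Hashtags' entry, so that cell is a missing value rather than an error.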

2 answers

Answer 1
0 2015-10-10 21:27:57

Answer 2
0 2015-10-11 09:33:47
