Create a Pandas DataFrame from a list of dictionaries? Each dictionary as a row in the DataFrame?

Question

I have been through several posts, but I am unable to work out how to use each dictionary within a list of dictionaries to create a row in a pandas DataFrame. Specifically, I have two issues that my limited experience with dictionaries cannot work around.

So far I have separated each key and value into two columns; however, what I am looking for is to create a row for each dictionary and use the keys as the column names.
Only the first key in each dictionary is unique, so I would like either to drop it completely or to use that key as a value to populate a column named "id".

Example list of dictionaries (>500k in total):

pep_list = [
    {'HV404': 'WVLSQVQLQESGPGLVKPSGTLSLTCAVSGGSISSSNWWSWVR',
     'gene': 'HV404',
     'aa_comp': {'W': 4, 'V': 5, 'L': 5, 'S': 10, 'Q': 3,
                 'E': 1, 'G': 5, 'P': 2, 'K': 1, 'T': 2,
                 'C': 1, 'A': 1, 'I': 1, 'N': 1, 'R': 1},
     'peptide': ['WVLSQVQLQESGPGLVKPSGTLSLTCAVSGGSISSSNWWSWVR'],
     'Length': 43,
     'z': 3,
     'Mass': 4557,
     'm/z': 1519.0},
    {'A0A0G2JNQ3': 'ISGNTSR',
     'gene': 'A0A0G2JNQ3',
     'aa_comp': {'I': 1, 'S': 2, 'G': 1, 'N': 1, 'T': 1, 'R': 1},
     'peptide': ['ISGNTSR'],
     'Length': 7,
     'z': 2,
     'Mass': 715,
     'm/z': 357.5},
    # etc.
]

Expected output:

Dataframe = pd.DataFrame({values from dictionaries}, columns=["id", "gene", "aa_comp", "peptide", "length", "z", "mass", "m/z"])

id	columns of keys
dictionary 1	values in separate columns
dictionary 2	values in separate columns

Thank you for any insight!

Answer 1

Whatever these things are

{'HV404': 'WVLSQVQLQESGPGLVKPSGTLSLTCAVSGGSISSSNWWSWVR',}
{'A0A0G2JNQ3': 'ISGNTSR',}

are messing it up, and they do not appear to be needed, because the same information is repeated under 'gene' and 'peptide'.
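To see why those keys get in the way, here is a minimal sketch that passes such a list straight to pd.DataFrame. The two records are shortened stand-ins for the question's data (the sequences and the set of fields are abbreviated for illustration). Because every distinct key becomes a column, each unique ID key adds a column that is empty in every row but its own.

```python
import pandas as pd

# Shortened stand-ins for two entries of pep_list (values abbreviated).
pep_list = [
    {"HV404": "WVLSQ", "gene": "HV404", "Length": 5, "z": 3},
    {"A0A0G2JNQ3": "ISGNTSR", "gene": "A0A0G2JNQ3", "Length": 7, "z": 2},
]

# Every distinct key across the dictionaries becomes a column.
df = pd.DataFrame(pep_list)

# The per-record ID keys show up as extra, mostly-NaN columns.
print(df.columns.tolist())
print(df.isna().sum())
```

With more than 500k records, this would add one nearly empty column per unique ID, which is why removing or renaming that key first matters.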

If you want to remove a key that is not shared by all the dictionaries, you can do something like this:

import pandas as pd
# Keys shared by the first two dictionaries; the per-record ID keys drop out
key_intersect = set(pep_list[0].keys()).intersection(pep_list[1].keys())
new_list_of_dictionaries = [{key: value for key, value in d.items() if key in key_intersect} for d in pep_list]
df = pd.DataFrame(new_list_of_dictionaries)

This is fairly compact code, but you could unroll it into explicit loops if needed. Beware of blindly removing the first element: unless the dictionary preserves insertion order (plain dicts are only guaranteed to do so from Python 3.7 onward; before that you need an OrderedDict), the first element is not guaranteed to be the same one.
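As a rough sketch of the unrolled version, the loop below also fills an "id" column instead of simply discarding the unique key. It identifies that key by name rather than by position, relying on an assumption drawn from the sample data: the unique key is the same string as the record's 'gene' value. The records are shortened stand-ins, and the names rows, row and unique_key are illustrative only.

```python
import pandas as pd

# Shortened stand-ins for two entries of pep_list (values abbreviated).
pep_list = [
    {"HV404": "WVLSQ", "gene": "HV404", "peptide": ["WVLSQ"], "Length": 5},
    {"A0A0G2JNQ3": "ISGNTSR", "gene": "A0A0G2JNQ3",
     "peptide": ["ISGNTSR"], "Length": 7},
]

rows = []
for record in pep_list:
    # Assumption: the unique key equals the value stored under 'gene'.
    unique_key = record["gene"]
    row = {"id": unique_key}
    for key, value in record.items():
        if key != unique_key:  # skip the per-record ID key itself
            row[key] = value
    rows.append(row)

# One row per dictionary, with the shared keys as columns.
df = pd.DataFrame(rows)
print(df)
```

If that assumption does not hold for some records, the unique key would stay in as its own column, so it is worth checking the resulting column list on the full data.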

Answer 2

You can try this:

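df = pd.DataFrame.from_dict(dict(enumerate(pep_list)), orient='index').reset_index()

from_dict expects a dictionary rather than a list, so the list is first wrapped with dict(enumerate(...)), which keys each record by its position. With orient='index', each outer key becomes a row label and the keys of the inner dictionaries become the columns; reset_index then moves the row labels into an ordinary column, although it may not be needed in your case.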

After that, you can select just the columns you want.
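A minimal sketch of that last step, assuming the DataFrame has already been built: either index with a list of column names or drop the ones you do not need. The records reuse a few fields from the question's data, and the particular choice in wanted is hypothetical.

```python
import pandas as pd

# A few fields taken from the question's two sample records.
records = [
    {"gene": "HV404", "Length": 43, "z": 3, "Mass": 4557, "m/z": 1519.0},
    {"gene": "A0A0G2JNQ3", "Length": 7, "z": 2, "Mass": 715, "m/z": 357.5},
]
df = pd.DataFrame(records)

# Keep only the columns of interest, in the order listed.
wanted = ["gene", "Length", "m/z"]  # hypothetical selection
subset = df[wanted]

# Alternatively, name the columns to discard.
trimmed = df.drop(columns=["z", "Mass"])
print(subset)
```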

Answer 1: 1, accepted, 2021-03-23 16:18:13

Answer 2: 0, 2021-03-23 15:46:14
