Create Pandas DataFrame from list of dictionaries with missing values for some keys

Hi everyone.

Below is the code I'm using to parse a text file:

import pandas as pd

tags = ['129','30','32','851','9730','9882'] 
rows = []

file = open('D:\\python\\redi_fix\\redi_august.txt','r') 
content = file.readlines() 
for line in content:
    for message in line.split('\t'):
        try:
            row_dict = {}
            tag,val = message.split('=')        
            if tag in tags:
                row_dict[tag]=val
                rows.append(row_dict)
        except:
            pass

Creating a pandas dataframe from rows yields the following result:

129     30      32      851     9730    9882
r170557 NaN     NaN     NaN     NaN     NaN
NaN     ARCA    NaN     NaN     NaN     NaN
NaN     NaN     100     NaN     NaN     NaN
r170557 NaN     NaN     NaN     NaN     NaN
NaN     ARCA    NaN     NaN     NaN     NaN
NaN     NaN     300     NaN     NaN     NaN

It looks like every value for a key ends up on a different row. The result I'm struggling to achieve is to have all values for a record on the same row, as in the example below:

129     30      32      851     9730    9882
r170557 ARCA    100     NaN     NaN     NaN
r170557 ARCA    300     NaN     NaN     NaN

If you want to "collapse" your NaNs, you can perform a groupby + agg on first / last:

df.groupby(df['129'].notnull().cumsum(), as_index=False).agg('first')

       129    30     32  851  9730  9882
0  r170557  ARCA  100.0  NaN   NaN   NaN
1  r170557  ARCA  300.0  NaN   NaN   NaN
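The grouping key is the part of this answer that is easiest to miss, so here is a minimal self-contained sketch of it. It assumes, as the sample data suggests, that every record starts with tag 129. The rows list is rebuilt by hand from the question's output, and the name record_id is only illustrative.

```python
import pandas as pd

tags = ["129", "30", "32", "851", "9730", "9882"]

# Rebuild the sparse frame from the question: one tag per row.
rows = [
    {"129": "r170557"},
    {"30": "ARCA"},
    {"32": "100"},
    {"129": "r170557"},
    {"30": "ARCA"},
    {"32": "300"},
]
df = pd.DataFrame(rows, columns=tags)

# Assumption: a new record begins wherever tag 129 is present, so the
# running count of non-null 129 values labels each row with its record.
record_id = df["129"].notnull().cumsum().rename("record")

# first() keeps the first non-null value of each column within a record.
collapsed = df.groupby(record_id).first().reset_index(drop=True)
print(collapsed)
```

If some records do not begin with tag 129, the cumulative count will merge them into the preceding record, so the key column has to be chosen to match the data.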

Starting from your result dataframe, sort each column so that the NaN values sink to the bottom, then use dropna to drop the rows that are entirely NaN:

result.apply(lambda x : sorted(x,key=pd.isnull)).dropna(thresh=1)
Out[1171]: 
       129    30     32  851  9730  9882
0  r170557  ARCA  100.0  NaN   NaN   NaN
1  r170557  ARCA  300.0  NaN   NaN   NaN
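The approaches above repair the dataframe after it has been built. Another option may be to avoid the sparse rows in the first place: in the question's loop a new row_dict is created and appended for every message, so each tag lands on its own row. The sketch below builds one dictionary per line instead. It assumes that each line of the file holds exactly one record, which the question does not confirm. The helper parse_line and the sample lines (including the extra tag 55) are hypothetical stand-ins for the real file.

```python
import pandas as pd

TAGS = ["129", "30", "32", "851", "9730", "9882"]


def parse_line(line, tags=TAGS):
    """Collect the wanted tag=value pairs of one line into a single dict."""
    row = {}
    # Strip the trailing newline so it does not end up in the last value.
    for message in line.rstrip("\n").split("\t"):
        tag, sep, val = message.partition("=")
        # sep is empty when the message contains no "=", so it is skipped.
        if sep and tag in tags:
            row[tag] = val
    return row


# Hypothetical sample lines; with the real file, iterate over the open
# file object instead of this list.
sample = [
    "129=r170557\t30=ARCA\t32=100\t55=XYZ\n",
    "129=r170557\t30=ARCA\t32=300\n",
]

# Assumption: one line is one record. Lines without any wanted tag are dropped.
rows = [row for row in map(parse_line, sample) if row]

# Passing columns=TAGS keeps the tags that never occur as all-NaN columns.
df = pd.DataFrame(rows, columns=TAGS)
print(df)
```

Using partition rather than split('=') also means a value that itself contains "=" is kept whole instead of being silently discarded by the bare except.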
