Creating a Pandas DataFrame from a list of dictionaries where some keys have missing values

Question

Hi everyone.

Below is the code I'm using to parse a text file:

import pandas as pd

tags = ['129', '30', '32', '851', '9730', '9882']
rows = []

with open('D:\\python\\redi_fix\\redi_august.txt', 'r') as file:
    content = file.readlines()

for line in content:
    # Each line holds tab-separated tag=value messages
    for message in line.split('\t'):
        try:
            row_dict = {}
            tag, val = message.split('=')
            if tag in tags:
                row_dict[tag] = val
                rows.append(row_dict)
        except ValueError:
            # Skip messages that are not a single tag=value pair
            pass

Creating a pandas DataFrame from rows yields the following result:

129     30      32      851     9730    9882
r170557 NaN     NaN     NaN     NaN     NaN
NaN     ARCA    NaN     NaN     NaN     NaN
NaN     NaN     100     NaN     NaN     NaN
r170557 NaN     NaN     NaN     NaN     NaN
NaN     ARCA    NaN     NaN     NaN     NaN
NaN     NaN     300     NaN     NaN     NaN

It looks like every value for a key ends up on a different row. What I'm trying to achieve is to have all values for a record on the same row, for example:

129     30      32      851     9730    9882
r170557 ARCA    100     NaN     NaN     NaN
r170557 ARCA    300     NaN     NaN     NaN
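
The layout above corresponds to one dictionary per record rather than one per tag=value pair. The following is only a minimal sketch of that idea, assuming each line of the file is one complete record (the question does not confirm this). The helper name parse_lines and the sample lines are hypothetical stand-ins for the real file.

```python
import pandas as pd

TAGS = ['129', '30', '32', '851', '9730', '9882']

def parse_lines(lines, tags=TAGS):
    """Build one dict per line (assumption: one line == one record)."""
    rows = []
    for line in lines:
        row = {}
        for message in line.rstrip('\n').split('\t'):
            parts = message.split('=')
            # Skip anything that is not a single tag=value pair
            if len(parts) != 2:
                continue
            tag, val = parts
            if tag in tags:
                row[tag] = val
        # Append once per line, after all of its messages were read
        if row:
            rows.append(row)
    return rows

# Hypothetical sample lines standing in for the real file contents
sample = [
    '129=r170557\t30=ARCA\t32=100\t55=XYZ\n',
    '129=r170557\t30=ARCA\t32=300\n',
]

# Passing columns=TAGS keeps tags that never occur as all-NaN columns
df = pd.DataFrame(parse_lines(sample), columns=TAGS)
print(df)
```

If a record actually spans several lines, this sketch does not apply as written and the rows would have to be combined after the DataFrame is built.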

Answer 1

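If you want to "collapse" your NaNs, you can perform a groupby + agg with first / last. The grouping key df['129'].notnull().cumsum() starts a new group each time tag 129 has a value, and first takes the first non-null value of every column within each group: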

df.groupby(df['129'].notnull().cumsum(), as_index=False).agg('first')

       129    30     32  851  9730  9882
0  r170557  ARCA  100.0  NaN   NaN   NaN
1  r170557  ARCA  300.0  NaN   NaN   NaN
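
To try this out without the original file, here is a self-contained sketch that rebuilds the staggered frame from the question by hand and collapses it. It is a slight variation on the one-liner above: the grouping key is renamed to record (a name chosen here for illustration) so it cannot clash with the 129 column, and the group index is dropped with reset_index. It relies on tag 129 being the first value of every record.

```python
import pandas as pd

cols = ['129', '30', '32', '851', '9730', '9882']

# Rebuild the staggered frame from the question: one value per row
rows = [
    {'129': 'r170557'}, {'30': 'ARCA'}, {'32': 100},
    {'129': 'r170557'}, {'30': 'ARCA'}, {'32': 300},
]
df = pd.DataFrame(rows, columns=cols)

# Running count of non-null 129 values: 1, 1, 1, 2, 2, 2
group_id = df['129'].notnull().cumsum().rename('record')

# first() keeps the first non-null value of each column per group
collapsed = df.groupby(group_id).first().reset_index(drop=True)
print(collapsed)
```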

Answer 2

Using your result DataFrame, sort the values in each column so that the non-null ones come first, then use dropna to drop the rows that are left without any value:

result.apply(lambda x : sorted(x,key=pd.isnull)).dropna(thresh=1)
Out[1171]: 
       129    30     32  851  9730  9882
0  r170557  ARCA  100.0  NaN   NaN   NaN
1  r170557  ARCA  300.0  NaN   NaN   NaN
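
The same idea can be written out step by step. This is a sketch of a variant, not the exact call above: it builds the new frame with a dict comprehension instead of DataFrame.apply, and uses dropna(how='all') to drop the empty rows. The input frame is rebuilt by hand from the question's output.

```python
import pandas as pd

cols = ['129', '30', '32', '851', '9730', '9882']

# Rebuild the staggered frame from the question: one value per row
rows = [
    {'129': 'r170557'}, {'30': 'ARCA'}, {'32': 100},
    {'129': 'r170557'}, {'30': 'ARCA'}, {'32': 300},
]
result = pd.DataFrame(rows, columns=cols)

# sorted() is stable and False < True, so non-null values move to the
# top of each column while keeping their original relative order
packed = pd.DataFrame(
    {c: sorted(result[c], key=pd.isnull) for c in result.columns}
)

# Rows that hold no value in any column are dropped
packed = packed.dropna(how='all')
print(packed)
```

Note that values are matched up purely by their position within each column. If one record is missing a tag that other records have, later values in that column move up a row and end up next to the wrong record.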

2 answers

Answer 1: 4, 2017-11-17 21:13:01

Answer 2: 4, accepted, 2017-11-17 21:13:04
