How to add new rows to a Pandas DataFrame with varying numbers of columns?

I want to add new rows to a Pandas DataFrame regardless of the order and the number of columns in each new row.

As I add new rows, I want my DataFrame to look like the one below. Every row can have a different number of columns.

---- | 1    | 2    | 3    | 4 
row1 | data | data | 
row2 | data | data | data 
row3 | data | 
row4 | data | data | data | data 

Building a pandas DataFrame one row at a time is typically very slow. One solution is to first gather the data in a dictionary and then turn it into a DataFrame for further processing:

import pandas as pd

# Each key becomes a row label; the lists may have different lengths.
d = {
    'att1': ['a', 'b'],
    'att2': ['c', 'd', 'e'],
    'att3': ['f'],
    'att4': ['g', 'h', 'i', 'j'],
}
# orient='index' turns the keys into rows and pads shorter lists with None.
df = pd.DataFrame.from_dict(d, orient='index')

This results in df containing:

        0    1    2    3
att1    a    b    None None
att2    c    d    e    None
att3    f    None None None
att4    g    h    i    j
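When the rows arrive as records with named fields rather than as positional lists, the same "collect first, build once" idea can be sketched with a list of dictionaries. The row contents and the labels row1 to row4 below are made up for illustration, loosely following the table in the question; this is one possible approach, not the only one.

```python
import pandas as pd

# Hypothetical rows: each dict may hold a different set of keys, in any order.
rows = [
    {1: "a", 2: "b"},
    {2: "d", 1: "c", 3: "e"},
    {1: "f"},
    {1: "g", 2: "h", 3: "i", 4: "j"},
]

# Build the DataFrame in one call; keys missing from a row become NaN.
df_rows = pd.DataFrame(rows, index=["row1", "row2", "row3", "row4"])
print(df_rows)
```

Because the values are matched by key, the order of the keys inside each dictionary does not matter.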

Or, more in line with typical pandas formats, store the data in one long Series in which 'att1' is the index label for the values 'a' and 'b', and so on:

series = df.stack().reset_index(level=1, drop=True)

which allows for easy selection of various attributes:

series.loc[['att1', 'att3']]

returning:

att1    a
att1    b
att3    f
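As an alternative sketch (not part of the original answer), the same long Series can be built directly from the dictionary with Series.explode, which skips the intermediate padded DataFrame and so does not depend on stack() discarding the None padding. The name long_series is chosen here for illustration.

```python
import pandas as pd

d = {
    'att1': ['a', 'b'],
    'att2': ['c', 'd', 'e'],
    'att3': ['f'],
    'att4': ['g', 'h', 'i', 'j'],
}

# pd.Series(d) holds one list per key; explode() gives each list element
# its own row and repeats the key as the index label.
long_series = pd.Series(d).explode()

# Select every value stored under 'att1' and 'att3'.
print(long_series.loc[['att1', 'att3']])
```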

In pandas you can concatenate new rows with an existing DataFrame (even if the new row has a different number of columns), as shown below.

import pandas as pd

df = pd.DataFrame([list(range(5))])
new_row = pd.DataFrame([list(range(4))])
result = pd.concat([df, new_row], ignore_index=True, axis=0)
print(result)

In the snippet above, pd.concat stacks the two DataFrames row-wise. Columns are aligned by label, and any column missing from one of the frames is filled with NaN, so the frames do not need to have the same number of columns. The argument ignore_index=True discards the original row labels and renumbers the rows of the result as 0, 1, 2, and so on.
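To make the column alignment and the effect of ignore_index more visible, here is a small sketch with named columns. The column names a to d and their values are invented for this example.

```python
import pandas as pd

df = pd.DataFrame([{"a": 1, "b": 2, "c": 3}])
# The new row has a different column order, lacks "b", and adds "d".
new_row = pd.DataFrame([{"c": 30, "a": 10, "d": 40}])

# Columns are matched by label; cells with no value are filled with NaN.
result = pd.concat([df, new_row], ignore_index=True)
print(result)

# Without ignore_index, each frame keeps its own row label (0 in both cases).
duplicated = pd.concat([df, new_row])
print(duplicated.index.tolist())
```

With ignore_index=True the rows are labelled 0 and 1; without it both rows keep the label 0, which can make later label-based selection ambiguous.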
