Python Pandas Adding to Dataframe

I have been working on a little pandas project. What I am trying, and failing, to do is make a simple data quality report. I have a DataFrame with columns that look like this:

columns = ['Feature','count', 'Miss.%', 'Card.', 'Min', '1st Qrt.', 'Mean', 'Median', '3rd Qrt', 'Max', 'Std Div']
df2 = pd.DataFrame(index=cont_index, columns=columns)
df2.to_csv('/REPORT.csv')

I have then run through all the different columns and carried out calculations on each one. This all works and runs fine; the result is a row for each column, containing one value for each of the column headings.

Example, list = ['Income',300,0.0,21,0.0,0.0,2,222, 0.0, 33.98,9,999, 20]

I am obtaining these values by looping through the different column names and then carrying out the functions for each heading.

What I am having an issue with is adding these values to the DataFrame. I simply want to take each row as it is made and insert it, one by one, into the DataFrame. Whenever I try, the resulting DataFrame isn't correct: the values don't line up with the columns and sometimes aren't in the right position.

How do I do this?

There are at least two ways to do this:

1. using concat

df1 = pd.DataFrame(...)
df2 = df1.groupby(columns).agg({ column : function, ... }).reset_index()
combined = pd.concat([df1, df2])

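2. using append (DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this only works on older versions)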

df1 = pd.DataFrame(...)
df2 = df1.groupby(columns).agg({ column : function, ... }).reset_index()
combined = df1.append(df2)

Here, agg is used to generate the statistics for each group, where columns is a list of columns used to group values. Of course, you may generate the two dataframes any way you like.
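As a rough sketch of how the concat approach might apply to the report in the question: the input data and the helper name `summarize` below are hypothetical (the question does not show the real dataset), and only a subset of the report columns is computed. Each column is summarized into a one-row DataFrame keyed by column name, so values cannot drift out of position, and the rows are combined once at the end.

```python
import pandas as pd

# Hypothetical input data; the real dataset is not shown in the question.
data = pd.DataFrame({
    "Income": [10.0, 20.0, 30.0, 40.0],
    "Age": [25.0, 35.0, 45.0, None],
})

def summarize(name, series):
    """Hypothetical helper: one-row DataFrame of statistics for one column."""
    return pd.DataFrame([{
        "Feature": name,
        "count": series.count(),               # non-missing values
        "Miss.%": series.isna().mean() * 100,  # percentage of missing values
        "Card.": series.nunique(),             # number of distinct values
        "Min": series.min(),
        "Mean": series.mean(),
        "Max": series.max(),
    }])

# Build one row per column, then concatenate them in a single call.
rows = [summarize(name, data[name]) for name in data.columns]
report = pd.concat(rows, ignore_index=True)
print(report)
```

Building each row from a dict of column name to value, rather than a positional list, is a design choice here: pandas aligns on the keys, so the order in which the statistics are computed does not matter.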

df2.loc['new_row'] = list

This assumes 'new_row' is not already in cont_index.
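A minimal sketch of the `.loc` assignment above, assuming a shortened column list and made-up index labels and values (none of these come from the question's real data). The row needs exactly one value per column, in column order. The example list in the question, as written, has 13 comma-separated values against 11 columns, which may be part of why the values did not line up.

```python
import pandas as pd

# Shortened column list and hypothetical index labels, for illustration only.
columns = ["Feature", "count", "Miss.%", "Card.", "Min", "Max"]
cont_index = ["Income", "Age"]
df2 = pd.DataFrame(index=cont_index, columns=columns)

# Use a name other than `list` so the built-in is not shadowed.
row = ["Income", 300, 0.0, 21, 0.0, 999.0]

# Guard against a length mismatch before assigning.
assert len(row) == len(columns)

# A label already in the index fills that existing row.
df2.loc["Income"] = row

# A label not in the index adds a new row at the end.
df2.loc["new_row"] = ["Extra", 5, 0.0, 2, 1.0, 2.0]

print(df2)
```

Assigning through `.loc` one row at a time is fine for a handful of rows; for many rows, collecting them first and building the DataFrame once is usually faster.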
