How To Create a "Total" Row for One Column in a Pandas Dataframe

So I've created a DF from file names I've pulled using the os module.

The file names include dollar amounts, and I would like to create a row that totals just the amounts in that column of the DF (index 3).

However, when I follow this code structure:

File_Name.loc['Total'] = File_Name.sum()

I get this:

                                                 Invoice  ...                                             Amount
30                                                  6515  ...                                             401.01
Total  0822OH082522KTR1987000084201987000084481987000...  ...  478.88550.0030.1032.3912.0432.521020.4729.1442...

I would love for it to look like this:

         Invoice         Vendor   Amount
30          6515        Expense   401.01
Total                          198556.79

Any help would be much appreciated!

The long number you get in Amount is probably the result of string concatenation:

'478.88' + '550.00' + '30.10' + '32.39'

outputs

478.88550.0030.1032.39

So, the first step will be to cast the column Amount to floats. Note that astype returns a new Series rather than modifying the column in place, so assign the result back: File_Name['Amount'] = File_Name['Amount'].astype('float').

You can then add the sum of Amount and get the visual effect you are looking for with
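To make the difference concrete, here is a minimal sketch. The sample rows are made up for illustration (the real frame is built from file names), and the Amount column is forced to object dtype on the assumption that this is how the strings were stored:

```python
import pandas as pd

# Hypothetical sample data standing in for the frame built from file names.
File_Name = pd.DataFrame({
    "Invoice": ["6513", "6514", "6515"],
    "Vendor": ["Expense", "Expense", "Expense"],
    # Amounts stored as strings (object dtype), as in the question.
    "Amount": pd.Series(["478.88", "550.00", "401.01"], dtype=object),
})

# Summing an object column of strings joins them end to end.
concatenated = File_Name["Amount"].sum()

# astype returns a new Series, so the result is assigned back to the column.
File_Name["Amount"] = File_Name["Amount"].astype("float")

# Now the sum is numeric.
total = File_Name["Amount"].sum()
```

If some file names might not contain a clean number, pd.to_numeric(File_Name["Amount"], errors="coerce") is an alternative that turns unparseable values into NaN instead of raising.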

df.loc['Total', 'Amount'] = df['Amount'].sum()
df.loc['Total'] = df.loc['Total'].fillna('')
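Here is the same pair of statements run end to end on a small frame. The data and index values are invented for the example, and Amount is assumed to be numeric already:

```python
import pandas as pd

# Hypothetical two-row frame with Amount already cast to float.
df = pd.DataFrame(
    {
        "Invoice": ["6514", "6515"],
        "Vendor": ["Expense", "Expense"],
        "Amount": [550.00, 401.01],
    },
    index=[29, 30],
)

# Assigning to a label that does not exist yet appends a new row;
# the columns that were not assigned are filled with NaN.
df.loc["Total", "Amount"] = df["Amount"].sum()

# Replace the NaN placeholders in the new row with empty strings for display.
df.loc["Total"] = df.loc["Total"].fillna("")

print(df)
```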

Nevertheless, I would strongly recommend not using pandas as if it were Excel. While the Excel style can be convenient when working in such a heavy interface, it's problematic from a programmatic point of view: you'll now have an extra data point with a large value in Amount and empty strings in every other column, which any later computation on the frame would have to filter out.
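One way to follow that advice, sketched here with invented data, is to keep the total as a separate value and only build a combined frame when something needs to be printed. The names report and total_row are just illustrative:

```python
import pandas as pd

# Hypothetical data; Amount is assumed to be numeric already.
df = pd.DataFrame(
    {
        "Invoice": ["6514", "6515"],
        "Vendor": ["Expense", "Expense"],
        "Amount": [550.00, 401.01],
    },
    index=[29, 30],
)

# Keep the total outside the data.
total = df["Amount"].sum()

# Build a display-only copy with the total row appended.
total_row = pd.DataFrame(
    {"Invoice": [""], "Vendor": [""], "Amount": [total]},
    index=["Total"],
)
report = pd.concat([df, total_row])

# The original frame is untouched, so statistics on it stay meaningful.
mean_amount = df["Amount"].mean()
print(report)
```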

Pandas v1.5.0 added a new feature in Styler for doing just this. The Styler is used for display of data, whereas a DataFrame is essentially an efficient memory map of data. Therefore the ability to combine and structure different DataFrames for display purposes is useful. The Styler also allows formatting to be configured differently for different tables. E.g. a column might have integer values, but its arithmetic mean is usually a float with multiple decimals.

See the docs for Styler.concat, which discuss this use case: https://pandas.pydata.org/docs/dev/reference/api/pandas.io.formats.style.Styler.concat.html
