How To Create a "Total" Row for One Column in a Pandas Dataframe
So I've created a DataFrame from file names I've pulled using the os module.
The file names include dollar amounts, and I would like to create a row that totals just the amounts in that column of the DataFrame (index 3).
However, when I follow this code structure:
File_Name.loc['Total'] = File_Name.sum()
I get this:
Invoice ... Amount
30 6515 ... 401.01
Total 0822OH082522KTR1987000084201987000084481987000... ... 478.88550.0030.1032.3912.0432.521020.4729.1442...
I would love for it to look like this:
Invoice Vendor Amount
30 6515 Expense 401.01
Total 198556.79
Any help would be much appreciated!
The long number you get in Amount is probably the result of string concatenation:
'478.88' + '550.00' + '30.10' + '32.39'
outputs
478.88550.0030.1032.39
So, the first step is to cast the Amount column to floats. Because astype returns a new Series rather than modifying the column in place, assign the result back: File_Name['Amount'] = File_Name['Amount'].astype('float').
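As a minimal sketch of the difference, the snippet below sums the column before and after the cast. The sample rows and the explicit object dtype are assumptions standing in for the values parsed from the real file names.

```python
import pandas as pd

# Hypothetical stand-in for the File_Name DataFrame from the question.
# dtype=object mimics amounts that were parsed from file names as strings.
File_Name = pd.DataFrame({
    "Invoice": ["0822OH", "6515"],
    "Vendor": ["Expense", "Expense"],
    "Amount": pd.Series(["478.88", "401.01"], dtype=object),
})

# Summing a column of strings joins them end to end.
concatenated = File_Name["Amount"].sum()

# astype returns a new Series, so assign it back to the column.
File_Name["Amount"] = File_Name["Amount"].astype("float")

# Now the sum is arithmetic.
total = File_Name["Amount"].sum()

print(repr(concatenated))
print(total)
```

If some amounts contain characters such as "$" or ",", the cast will raise a ValueError, so those would need to be stripped first.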
You can then add the sum of Amount and get the visual effect you are looking for with the following (df here stands for your File_Name DataFrame):
df.loc['Total', 'Amount'] = df['Amount'].sum()
df.loc['Total'] = df.loc['Total'].fillna('')
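Putting the cast and the two lines above together, an end-to-end version might look like the sketch below. The sample data is hypothetical; only the last three statements come from the answer.

```python
import pandas as pd

# Hypothetical sample data; in the question this comes from parsed file names.
df = pd.DataFrame({
    "Invoice": ["0822OH", "6515"],
    "Vendor": ["Expense", "Expense"],
    "Amount": ["478.88", "401.01"],
})

# Cast the string amounts to floats so that sum() adds instead of concatenating.
df["Amount"] = df["Amount"].astype("float")

# Create a new row labelled 'Total' holding only the Amount sum;
# the other columns in that row start out as NaN.
df.loc["Total", "Amount"] = df["Amount"].sum()

# Replace the NaN cells in the Total row with empty strings for display.
df.loc["Total"] = df.loc["Total"].fillna("")

print(df)
```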
Nevertheless, I would strongly recommend not using pandas as if it were Excel. While an Excel-style layout can be convenient when working in such a heavy interface, it's problematic from a programmatic point of view: you'll now have an extra data point with a large value in Amount and empty strings in the other columns.
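To illustrate the concern, here is a small sketch with made-up amounts showing how a Total row stored inside the data skews a later aggregate:

```python
import pandas as pd

# Hypothetical amounts, already numeric.
df = pd.DataFrame({"Amount": [478.88, 401.01]})
mean_before = df["Amount"].mean()

# Append the total as if it were just another record.
with_total = df.copy()
with_total.loc["Total", "Amount"] = with_total["Amount"].sum()

# The total is now counted as an invoice, which inflates the mean.
mean_after = with_total["Amount"].mean()

print(mean_before, mean_after)
```

Keeping the total in a separate variable, and only attaching it when rendering, avoids this.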
Pandas just released (v1.5.0) a new feature in Styler for doing just this. The Styler is used for displaying data, whereas a DataFrame is essentially an efficient memory map of the data. The ability to combine and structure different DataFrames for display purposes is therefore useful. The Styler also lets you configure output formatting differently for each table. E.g., a column might hold integer values, but its arithmetic mean is usually a float with multiple decimals.
See the docs for Styler.concat, which discuss this use case: https://pandas.pydata.org/docs/dev/reference/api/pandas.io.formats.style.Styler.concat.html