Pandas Pivot_Table Grouping Values

Question

I have a large data set in a CSV file (Dataset). I want to create a pd.pivot_table to summarize the data by zip code; however, my data has multiple rows that share the same zip code.

import pandas as pd

df = pd.read_csv('15zpallagi.csv')
df['A00100'] = df['A00100'].map('{:,.2f}'.format)
df.pivot_table(values='A00100', index='zipcode', aggfunc='sum')

When I run the code above to create a pivot_table, the value column includes several values per zip code, as if multiple values were being stacked together in place of a single sum.

However, if I run the following code, I get the same values, but in an understandable format.

df.pivot_table(values='A00100', index='zipcode',columns='agi_stub', aggfunc='sum')

How can I create a pivot table that just sums the column A00100 and gives me a total by zip code?

Answer 1

You are likely seeing these inconsistencies because this line df['A00100'] = df['A00100'].map('{:,.2f}'.format) is converting your A00100 column to a string type instead of a float. Once the values are strings, adding them joins the text together rather than adding the numbers.
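To see the effect in isolation, here is a minimal sketch on a small made-up frame (the zip codes and amounts are hypothetical, not taken from your file). It only shows what the formatting step does to the column and what adding two formatted values produces.

```python
import pandas as pd

# Hypothetical toy data; the real CSV has many more rows and columns.
df = pd.DataFrame({
    "zipcode": [10001, 10001, 10002],
    "A00100": [1500.0, 2500.5, 300.25],
})

# The same formatting step as in the question
formatted = df["A00100"].map("{:,.2f}".format)

first_value = formatted.iloc[0]   # now a Python str, not a float
still_numeric = pd.api.types.is_numeric_dtype(formatted)

# Adding two formatted values joins the text instead of adding numbers
joined = formatted.iloc[0] + formatted.iloc[1]

print(type(first_value).__name__, still_numeric, joined)
```

The joined text is the kind of "stacked" value that shows up in the pivot table when the column holds strings.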

Comment out that second line and try again to see if that fixes the issue.

If you need to format the number to only show 2 decimals, do that after all of your transformations.
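A rough sketch of that ordering, again on made-up data (the values and the name display_totals are assumptions for illustration): aggregate while the column is still numeric, then format a copy for display as the last step.

```python
import pandas as pd

# Hypothetical toy data standing in for the real CSV
df = pd.DataFrame({
    "zipcode": [10001, 10001, 10002],
    "agi_stub": [1, 2, 1],
    "A00100": [1500.0, 2500.5, 300.25],
})

# Aggregate first, while A00100 is still a float column
totals = df.pivot_table(values="A00100", index="zipcode", aggfunc="sum")

# Format only at the end, and only for display
display_totals = totals["A00100"].map("{:,.2f}".format)

print(display_totals)
```

Keeping totals numeric means you can still sort or do further arithmetic on it; display_totals is text and is only meant for printing or export.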

If you are rounding for some other reason (significant figures, etc.), use the DataFrame.round function instead of string formatting.
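For example, something along these lines (toy numbers, not your data) rounds the summed values while keeping them as floats:

```python
import pandas as pd

# Hypothetical toy data with more than two decimal places
df = pd.DataFrame({
    "zipcode": [10001, 10001, 10002],
    "A00100": [1500.123, 2500.456, 300.789],
})

totals = df.pivot_table(values="A00100", index="zipcode", aggfunc="sum")

# DataFrame.round returns a new frame; the column stays numeric
rounded = totals.round(2)

print(rounded)
print(rounded["A00100"].dtype)
```

Because the result is still numeric, it can be summed or compared again later, which is not the case after string formatting.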

Solution 1
2 2018-08-14 04:21:06
