Python Pandas - Detect columns and number format

Question

I'm using Pandas to manage CSV files. Unfortunately, I have columns with numbers that use a comma as the decimal separator:

E.g. 50,12

When I use the convert_dtypes() function, these columns are converted to string instead of a numeric type, so sorting doesn't work properly.

Is there a way to specify the "number format" of the dataset so that every number is read as NNNN,DD instead of NNNN.DD?

EXAMPLE:

| Gross Amount | Item Number|
-----------------------------
|52,50         |   1       |
|498,00        |   2       |
|10,01         |   3       |
|1,74          |   4       |
|518,04        |   5       |
|2,10          |   6       |

Autodetection returns this:

Gross Amount     string
Item Number       Int64

So when I sort by "Gross Amount", it sorts strings rather than numbers, so, for example, "10,01" is printed before "2,10".

Answer 1

Assuming the DataFrame is called df, you can use the code below.

Code:

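import pandas as pd

for column in df.columns:
    # Only text columns can hold comma-decimal numbers.
    if pd.api.types.is_string_dtype(df[column]):
        try:
            # Swap the decimal comma for a dot, then parse as float.
            df[column] = df[column].str.replace(',', '.', regex=False).astype('float')
        except (ValueError, AttributeError):
            # Not a numeric column; leave it unchanged.
            pass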
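
If the data comes from a CSV file, another option may be to declare the decimal separator at load time, so the column never becomes a string in the first place. The sketch below is not part of the original answer: it assumes the file uses a semicolon as the field delimiter (so it does not clash with the decimal comma), and the inline csv_text sample is a hypothetical stand-in for the real file. It relies on the decimal parameter of pd.read_csv.

```python
import io

import pandas as pd

# Hypothetical CSV content mirroring the question's example; a semicolon
# field delimiter is assumed so it does not clash with the decimal comma.
csv_text = """Gross Amount;Item Number
52,50;1
498,00;2
10,01;3
1,74;4
518,04;5
2,10;6
"""

# decimal="," tells the parser to treat the comma as the decimal separator.
df = pd.read_csv(io.StringIO(csv_text), sep=";", decimal=",")

print(df.dtypes)
print(df.sort_values("Gross Amount"))
```

With the column parsed as a float, sort_values("Gross Amount") orders the rows numerically, so 2,10 comes before 10,01.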

Solution 1
0 2021-11-12 11:55:41
