Pandas: create a new column based on an existing one, returning the existing value if the condition does not match

Question

I have a dataset that contains a column with categorical values. I need to standardize the column because some values are coded incorrectly. For example, '1.0' and '3.0' should be '01' and '03', respectively. When a value is already correct, however, I just need to return the value from the column I'm cleaning. I'd like to put the cleaned data in a new column.

I am relatively new to Python and Pandas; I usually work in R. I've tried various techniques I found on Stack Overflow, but I keep running into an issue when attempting to return the values from the original column when they are already correct.

Any assistance would be much appreciated! Here's some sample data:

import pandas as pd
d = {'col1':['01','03','1.0','10.0','7.0','3.0']}
df = pd.DataFrame(data=d)

This returns ...

And I'm hoping to get ...

    col1    col2  
0   01      01
1   03      03
2   1.0     01
3   10.0    10
4   7.0     07
5   3.0     03

Answer 1

You can convert the column to float, then to int, and finally add leading zeros. Either of the following works:

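# Option 1: format each integer right-aligned in a width of 2, padded with '0'
df['col2'] = (df['col1']
              .astype(float).astype(int)
              .apply('{:0>2}'.format))

# Option 2: convert back to string and left-pad with zeros to width 2
df['col3'] = (df['col1']
              .astype(float).astype(int).astype(str)
              .str.zfill(2))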

print(df)

   col1 col2 col3
0    01   01   01
1    03   03   03
2   1.0   01   01
3  10.0   10   10
4   7.0   07   07
5   3.0   03   03
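
Note that the conversion above is applied to every row, including the ones that are already correct. If you would rather leave correct values untouched, as the question describes, here is a minimal sketch. It assumes that a value needs fixing exactly when it contains a decimal point; the names `needs_fix` and `fixed` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['01', '03', '1.0', '10.0', '7.0', '3.0']})

# Assumption: a value is miscoded exactly when it contains a decimal point
needs_fix = df['col1'].str.contains('.', regex=False)

# Convert only the miscoded rows: '1.0' -> 1.0 -> 1 -> '1' -> '01'
fixed = (df.loc[needs_fix, 'col1']
         .astype(float).astype(int).astype(str)
         .str.zfill(2))

# Keep the original value where no fix is needed, otherwise take the fixed one
df['col2'] = df['col1'].where(~needs_fix, fixed)

print(df)
```

`Series.where` keeps the original value wherever the condition is True and takes the replacement elsewhere, which is roughly what `ifelse` does in R. If "incorrect" means something else in your data, only the `needs_fix` line should need to change.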

Answer 2

This approach uses the Styler's format method, which lets you set a display format for each column individually.

Code:

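# Copy the original column, then convert every column to float
df['col2'] = df['col1']
df = df.astype(float)
# Styler.format only changes how values are displayed; df is now a Styler, not a DataFrame
df = df.style.format({'col1': "{:.1f}", 'col2': "{:,.0f}"})
df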

Output:

    col1    col2
 0  1.0      1
 1  3.0      3
 2  1.0      1
 3  10.0    10
 4  7.0      7
 5  3.0      3
