Remove leading zeroes in pandas

For example, I have a data frame like this:

import pandas as pd
nums = {'amount': ['0324','S123','0010', None, '0030', 'SA40', 'SA24']}
df = pd.DataFrame(nums)

I need to remove all leading zeroes and replace None values with zeros, so that, for example, '0324' becomes '324', '0010' becomes '10', and None becomes '0'.

I did it with loops, but for large frames that is not fast enough. I'd like to rewrite it using vectorized operations.

You can try str.replace:

df['amount'].str.replace(r'^(0+)', '', regex=True).fillna('0')
0     324
1    S123
2      10
3       0
4      30
5    SA40
6    SA24
Name: amount, dtype: object

Or, with str.lstrip:

df['amount'] = df['amount'].str.lstrip('0').fillna(value='0')

There is already a nice answer from @Epsi95, but you can also try a regex with a character set:

>>> df['amount'].str.replace(r'^[0]*', '', regex=True).fillna('0')
0     324
1    S123
2      10
3       0
4      30
5    SA40
6    SA24

Explanation:

^[0]*

^ asserts the position at the start of the string
[0] matches a single character from the set in brackets, which here contains only 0
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
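
To see what the pattern does outside pandas, here is a minimal sketch using Python's re module. The sample values come from the question; the variable names are illustrative only. It suggests that ^[0]* behaves like the simpler ^0* for this task.

```python
import re

values = ['0324', 'S123', '0010']

# The leading run of zeros that ^[0]* matches in each value (may be empty).
matched = [re.match(r'^[0]*', v).group() for v in values]
print(matched)   # ['0', '', '00']

# A one-character set [0] is equivalent to a plain 0, so both patterns agree.
with_set = [re.sub(r'^[0]*', '', v) for v in values]
without_set = [re.sub(r'^0*', '', v) for v in values]
print(with_set)  # ['324', 'S123', '10']
```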

Step by step:

Remove all leading zeros:

Use str.lstrip, which returns a copy of the string with leading characters removed (based on the string argument passed).

Here:

df['amount'] = df['amount'].str.lstrip('0')

For more, see https://www.programiz.com/python-programming/methods/string/lstrip
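
One detail worth checking before relying on lstrip: it strips every leading occurrence of the given character, so a value made up only of zeros ends up empty. A small plain-Python sketch (the sample list is an assumption, extending the question's data with '0' and '000'):

```python
# str.lstrip('0') removes all leading '0' characters, however many there are.
samples = ['0324', 'S123', '0010', '0', '000']
stripped = [s.lstrip('0') for s in samples]
print(stripped)  # ['324', 'S123', '10', '', '']
```

The pandas accessor df['amount'].str.lstrip('0') applies the same operation element-wise, so all-zero entries would become empty strings there as well.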

Replace None with zeros:

Use fillna, which fills other missing values such as NaN as well, not only None.

Here:

df['amount'] = df['amount'].fillna(value='0')

For more, see https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.fillna.html
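
As a rough illustration of why fillna can simply be chained after the string method, the sketch below uses a hypothetical Series containing both None and NaN. The .str methods leave missing entries missing, and fillna then replaces them in one pass.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with two kinds of missing value.
s = pd.Series(['0324', None, np.nan, 'SA40'])

# String methods skip missing entries, so they are still missing afterwards.
stripped = s.str.lstrip('0')
n_missing = int(stripped.isna().sum())
print(n_missing)  # 2

# fillna treats None and NaN alike.
filled = stripped.fillna('0')
print(filled.tolist())  # ['324', '0', '0', 'SA40']
```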

Result in one line:

df['amount'] = df['amount'].str.lstrip('0').fillna(value='0')

If you need to ensure that a single 0, or the last 0 of an all-zero string, is not removed, you can use:

df['amount'] = df['amount'].str.replace(r'^(0+)(?!$)', '', regex=True).fillna('0')

The negative lookahead (?!$) ensures that the matched substring (the leading zeroes) does not include the last 0, which effectively keeps the last 0.
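
The effect of the lookahead can be checked in isolation with Python's re module. This is only a sketch: the sample values and the names pattern and results are illustrative, not part of the answer.

```python
import re

# ^(0+) grabs the leading zeros; (?!$) rejects a match that would reach the
# end of the string, so the engine backtracks and leaves one character behind.
pattern = re.compile(r'^(0+)(?!$)')

results = {v: pattern.sub('', v) for v in ['0324', '0010', '0', '000', 'S123']}
print(results)
# {'0324': '324', '0010': '10', '0': '0', '000': '0', 'S123': 'S123'}
```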

Demo

Input Data

nums = {'amount': ['0324','S123','0010', None, '0030', 'SA40', 'SA24', '0', '000']}
df = pd.DataFrame(nums)

  amount
0   0324
1   S123
2   0010
3   None
4   0030
5   SA40
6   SA24
7      0           <==   Added a single 0 here
8    000           <==   Added a sequence of all 0's here

Output

print(df)

  amount
0    324
1   S123
2     10
3      0
4     30
5   SA40
6   SA24
7      0           <==  Single 0 is not removed  
8      0           <==  Last 0 is kept
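
For convenience, the demo can be reproduced end to end with a sketch like the one below. Passing dtype=object is an assumption added here so that the pattern is handled by Python's re engine; the rest follows the input data and the one-liner shown above.

```python
import pandas as pd

nums = {'amount': ['0324', 'S123', '0010', None, '0030', 'SA40', 'SA24', '0', '000']}
# dtype=object is an assumption of this sketch, not part of the original demo.
df = pd.DataFrame(nums, dtype=object)

# Strip leading zeroes but never the final character, then fill missing values.
df['amount'] = df['amount'].str.replace(r'^(0+)(?!$)', '', regex=True).fillna('0')

result = df['amount'].tolist()
print(result)
# ['324', 'S123', '10', '0', '30', 'SA40', 'SA24', '0', '0']
```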
