
Deleting zeros from string column in pandas dataframe

I have a column in my dataframe where the values look something like this:

col1:
    00000000000012VG
    00000000000014SG
    00000000000014VG
    00000000000010SG
    20000000000933LG
    20000000000951LG
    20000000000957LG
    20000000000963LG
    20000000000909LG
    20000000000992LG

I want to delete all zeros:

a) that are in front of other numbers and letters (for example, in the case of 00000000000010SG I want to delete the part 000000000000 and keep 10SG).

b) In cases like 20000000000992LG I want to delete the part 0000000000 and join 2 with 992LG.

str.strip('0') solves only part a), as I checked.

But what is the right solution for both cases?
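For reference, here is a minimal sketch of the str.strip('0') check mentioned above, run on two of the sample values (the variable names are only illustrative). It shows why stripping handles case a) but leaves case b) unchanged: strip only touches characters at the ends of each string.

```python
import pandas as pd

# Two sample values from the question: one per case.
s = pd.Series(["00000000000010SG", "20000000000992LG"], name="col1")

# strip("0") removes zeros only at the start and end of each string.
stripped = s.str.strip("0")
print(stripped.tolist())
```

The first value becomes 10SG, while the second keeps its inner zeros because it starts with 2.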

I would recommend something similar to Ed's answer, but using regex to ensure that not all 0s are replaced, and to eliminate the need to hardcode the number of 0s.

In [2426]: df.col1.str.replace(r'[0]{2,}', '', n=1, regex=True)
Out[2426]: 
0      12VG
1      14SG
2      14VG
3      10SG
4    2933LG
5    2951LG
6    2957LG
7    2963LG
8    2909LG
9    2992LG
Name: col1, dtype: object

Only the first run of two or more 0s in each value is replaced.

Thanks to @jezrael for pointing out a small bug in my answer.
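A self-contained sketch of the same idea, in case it helps to run it end to end. It assumes a recent pandas version, where str.replace treats the pattern as a literal string unless regex=True is passed, so both n and regex are given as keywords here. The frame below is rebuilt from a few of the sample values in the question.

```python
import pandas as pd

# A subset of the sample values from the question.
df = pd.DataFrame({"col1": [
    "00000000000012VG",
    "00000000000010SG",
    "20000000000933LG",
    "20000000000909LG",
]})

# Remove only the first run of two or more zeros in each value.
# n=1 limits the substitution to one match; single zeros (as in 10SG or 909LG) are kept.
cleaned = df["col1"].str.replace(r"0{2,}", "", n=1, regex=True)
print(cleaned.tolist())
```

Because the pattern requires at least two consecutive zeros, isolated zeros inside the remaining code are not touched.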

You can just do:

In[9]:
df['col1'] = df['col1'].str.replace('000000000000','')
df['col1'] = df['col1'].str.replace('0000000000','')
df

Out[9]: 
         col1
0        12VG
1        14SG
2        14VG
3        10SG
4      2933LG
5      2951LG
6      2957LG
7      2963LG
8      2909LG
9      2992LG

This replaces a fixed number of 0s with an empty string. It isn't dynamic, but for your given dataset it is the simplest thing to do unless you can explain the pattern better.
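To illustrate what "not dynamic" means here, the sketch below compares the fixed-length replacement with a regex-based one on a hypothetical value that has only five leading zeros. That second value is an assumption made for illustration and does not appear in the original data.

```python
import pandas as pd

# First value is from the question; the second is a made-up value with 5 leading zeros.
s = pd.Series(["00000000000012VG", "0000012VG"])

# Fixed-length replacement: only a run of exactly twelve zeros is matched.
fixed = s.str.replace("000000000000", "", regex=False)

# Regex replacement: the first run of two or more zeros, whatever its length.
dynamic = s.str.replace(r"0{2,}", "", n=1, regex=True)

print(fixed.tolist())
print(dynamic.tolist())
```

The fixed-length version leaves the shorter value untouched, while the regex version reduces both to 12VG. For data that always follows the two layouts in the question, the fixed-length version is enough.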
