Python: Slicing a string column in a Pandas DataFrame

Question

I am trying to modify a string column. I want to strip the unwanted digits from the column and save the modified values as a new column.

Here is an example in SQL:

Using cast(substring(EMP_NM, 0, CHARINDEX(' ', EMP_NM)) as int), I get the following result.

How can I do this in Python, so that only the "4253332" part of the number ends up in the new column?

df['EMP_NM'] = df['EMP_NM'].str.slice(0, 9)

This does not give the result I want, because some values in the column look like this:

009201135 0000000000 0000000000 0000000000 0000000000

0006892203 0000000000 0000000000 0000000000 0000000000

Any help would be greatly appreciated.

Answer 1

Try this:


df['EMP_NM'] = df['EMP_NM'].astype(str).str[0:7]

If this field is read into Python as an integer, the leading zeros are dropped automatically, so you can simply keep the first 7 characters.
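To see how much this depends on the integer assumption, here is a minimal sketch that applies the same slice to the two sample values from the question, kept as strings. The `as_int` series is a hypothetical example of a value that was already parsed as an integer; it is not taken from the question's data.

```python
import pandas as pd

# Sample values copied from the question, kept as strings.
df = pd.DataFrame({
    "EMP_NM": [
        "009201135 0000000000 0000000000 0000000000 0000000000",
        "0006892203 0000000000 0000000000 0000000000 0000000000",
    ]
})

# On strings, a fixed-width slice keeps the leading zeros
# and cuts the number short.
sliced = df["EMP_NM"].astype(str).str[0:7]
print(sliced.tolist())  # ['0092011', '0006892']

# Hypothetical case: the value was already read as an integer,
# so there are no leading zeros left to worry about.
as_int = pd.Series([4253332])
print(as_int.astype(str).str[0:7].tolist())  # ['4253332']
```

So the slice seems to work only when the column really is numeric and every value has exactly 7 digits.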

Answer 2

How about this:

df['EMP_NM'] = df['EMP_NM'].str.replace('0','')
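As a quick check of what this replacement does, the sketch below runs it on one sample value from the question. It passes `regex=False` explicitly so that `'0'` is treated as a literal character. Note that every zero is removed, including zeros inside the number, and the separating spaces stay in place.

```python
import pandas as pd

# One sample value copied from the question.
s = pd.Series(["009201135 0000000000 0000000000 0000000000 0000000000"])

# Remove every literal '0' character from each string.
replaced = s.str.replace("0", "", regex=False)

# The zero inside 9201135 is gone too, and four spaces remain.
print(repr(replaced[0]))  # '921135    '
```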

Alternatively, if values such as 009201135 0000000000 32331 0000000000 0000000000 are possible, meaning that other parts of the code can also be nonzero, this should work better:

df['EMP_fNM'] = df['EMP_NM'].str.split().str[0].str.lstrip('0')
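For reference, here is a minimal sketch of the split-based approach applied row by row, with a second variant that mirrors the SQL cast to int from the question. The column names `EMP_ID_STR` and `EMP_ID_INT` are hypothetical, chosen only for this illustration, and the code assumes every value starts with a run of digits followed by whitespace.

```python
import pandas as pd

# Sample values taken from the question and from this answer.
df = pd.DataFrame({
    "EMP_NM": [
        "009201135 0000000000 32331 0000000000 0000000000",
        "0006892203 0000000000 0000000000 0000000000 0000000000",
    ]
})

# Take the text before the first whitespace, for every row.
first_token = df["EMP_NM"].str.split().str[0]

# Variant A: keep a string and drop only the leading zeros.
df["EMP_ID_STR"] = first_token.str.lstrip("0")

# Variant B: convert to an integer, similar to cast(... as int) in SQL.
df["EMP_ID_INT"] = first_token.astype("int64")

print(df[["EMP_ID_STR", "EMP_ID_INT"]])
```

Variant A leaves an empty string when the first token is all zeros, while variant B yields 0 in that case, so the choice depends on how such rows should be treated.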