
Replace multiple characters in a pandas DataFrame

Trying to remove one character

I scraped this data from the web and want to remove all the non-digit characters in the second column so that I can do maths on it.

Is there another way to remove all the brackets and commas in one line?

You can strip the parentheses and commas using str.replace with the character class [(),]. Then use to_numeric() later, when you want to work with this text column as numeric data:

df['pop'] = df['pop'].str.replace('[(),]+', '', regex=True)
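
For completeness, here is a minimal sketch of both steps together. The sample values are hypothetical, since the scraped table is not shown in the question, and errors="coerce" is an optional choice that turns anything unparseable into NaN rather than raising:

```python
import pandas as pd

# Hypothetical sample data; the real scraped table is not shown in the question.
df = pd.DataFrame({"city": ["A", "B"], "pop": ["(1,234,567)", "(89,012)"]})

# Remove every parenthesis and comma in one pass.
cleaned = df["pop"].str.replace(r"[(),]+", "", regex=True)

# Convert the cleaned text to numbers; unparseable values would become NaN.
df["pop"] = pd.to_numeric(cleaned, errors="coerce")

print(df["pop"].sum())
```

After this, the column has a numeric dtype, so arithmetic and aggregations work on it directly.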

I would also suggest this approach, which creates a new column for each element:

# Split on non-digit characters once, then pick each piece by position.
parts = df['pop'].str.split(r'\D')
df['pop1'] = pd.to_numeric(parts.str.get(1))
df['pop2'] = pd.to_numeric(parts.str.get(2))
df['pop3'] = pd.to_numeric(parts.str.get(3))

This assumes that every value in "pop" always has the same number of elements. With the same technique you can also build a list of numbers for each row of the pop column. It depends on how you want to work with the data. For example:

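pop4 = []
for text in df['pop']:
    # Keep characters that can appear in a number; turn everything else into a space.
    newstr = ''.join((ch if ch in '0123456789.-e' else ' ') for ch in text)
    # Each whitespace-separated token becomes one float.
    list_of_numbers = [float(x) for x in newstr.split()]
    pop4.append(list_of_numbers)

df['pop4'] = pop4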


Of course, you can convert to int or float, whichever you need.
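
If integers are what you need, one possible alternative to the loop (a swapped-in technique, not the code above) is str.findall, which collects every run of digits per row. This is only a sketch with hypothetical sample values, and it ignores signs and decimal points:

```python
import pandas as pd

# Hypothetical sample data with several numbers per cell.
df = pd.DataFrame({"pop": ["(12, 34, 56)", "(7, 8, 9)"]})

# str.findall returns, for each row, the list of all digit runs as strings;
# the lambda then converts each string to int.
df["pop_ints"] = df["pop"].str.findall(r"\d+").apply(lambda xs: [int(x) for x in xs])

print(df["pop_ints"].tolist())
```

Unlike the positional split, this does not require every row to contain the same number of elements.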
