Pandas remove all of a string in a column after a character

I have a data set with over 500 rows where one of the columns has values like this:

df:

         column1

 0    a{'...'}  
 1    b{'...'}
 2    c{'...'}  
 3    d{'...'}  

I want to remove everything within and including the {}.

I have been looking at this question, "Pandas delete parts of string after specified character inside a dataframe", and tried the solutions there, but I keep getting errors (and I am aware that StringIO is now io.StringIO).

I've tried:

df.column1 = df.column1.str.split('{')[0]

but I get the error message KeyError: 0, and I don't really understand what that means.

I've also tried:

df.column1 = df.column1.str.split(pat='{')

But this only seems to delete the '{', so I'm left with:

      column1

 0    a'...'}   
 1    b'...'}
 2    c'...'}   
 3    d'...'}   

Also, I'm not sure if it's important, but the column is an object type. Can anyone tell me what I'm doing wrong and how to fix the issue?

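You can use str.replace with a regular expression (pass regex=True explicitly, since recent pandas versions treat the pattern as a literal string by default):

df['column1'].str.replace(r"\{.*\}", "", regex=True)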
Out[385]: 
0    a
1    b
2    c
3    d
Name: column1, dtype: object
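
As a minimal, self-contained sketch of the str.replace approach above: the frame below rebuilds the question's sample data, and the `sample` string with two brace groups is a hypothetical value added only to show how the greedy pattern behaves compared with two alternatives.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({"column1": ["a{'...'}", "b{'...'}", "c{'...'}", "d{'...'}"]})

# regex=True is passed explicitly so the pattern is treated as a regular expression.
cleaned = df["column1"].str.replace(r"\{.*\}", "", regex=True)
print(cleaned.tolist())

# Hypothetical value with two brace groups, to compare pattern variants.
sample = pd.Series(["x{1}y{2}z"])
greedy = sample.str.replace(r"\{.*\}", "", regex=True)     # first '{' through last '}'
lazy = sample.str.replace(r"\{.*?\}", "", regex=True)      # each {...} group separately
after_first = sample.str.replace(r"\{.*", "", regex=True)  # everything from the first '{'
print(greedy[0], lazy[0], after_first[0])
```

If the goal is strictly "everything after a character", the last pattern is probably the closest match, since it does not require a closing brace.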

You can also use pandas.DataFrame.replace and pass a dictionary that specifies what to do for various columns.

Using @Wen's regex pattern:

df.replace(dict(column1={r'\{.*\}': ''}), regex=True)

  column1
0       a
1       b
2       c
3       d
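
A short sketch of the dictionary form, assuming a second column (the hypothetical `column2`, not part of the question) to show that only the column named in the dictionary is touched. Note that DataFrame.replace returns a new frame here rather than modifying `df`.

```python
import pandas as pd

df = pd.DataFrame({
    "column1": ["a{'...'}", "b{'...'}"],
    "column2": ["x{keep}", "y{keep}"],  # hypothetical extra column
})

# Outer key: column name. Inner dict: regex pattern -> replacement.
result = df.replace({"column1": {r"\{.*\}": ""}}, regex=True)

print(result["column1"].tolist())
print(result["column2"].tolist())
```

To keep the change, assign the result back, for example `df = df.replace(...)`.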

In the spirit of @pault, you can also use pandas.Series.str.extract:

df.column1.str.extract(r'([^{]+)', expand=False)

  column1
0       a
1       b
2       c
3       d
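
A minimal sketch of the extract approach, assuming a small Series that also contains a value without any brace (the hypothetical "plain") to show that such values pass through unchanged. With a single capture group and expand=False, the result is a Series rather than a DataFrame.

```python
import pandas as pd

s = pd.Series(["a{'...'}", "b{'...'}", "plain"], name="column1")

# Capture the first run of characters that are not '{'.
prefix = s.str.extract(r"([^{]+)", expand=False)

print(type(prefix).__name__)
print(prefix.tolist())
```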

A little late (@Wen's solution is great), but you can use pandas.Series.str.split() as in your original attempt. You were close: you just need to set expand=True.

df["column1"] = df["column1"].str.split("{", expand=True)[0]
#  column1
#0       a
#1       b
#2       c
#3       d
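
The following sketch tries to show why `[0]` behaves differently with and without expand=True. It assumes a frame whose index does not contain the label 0 (the hypothetical labels 10 and 11), which is one way the KeyError: 0 from the question can arise. Without expand=True, str.split returns a Series of lists, so `[0]` looks up the row labelled 0. With expand=True it returns a DataFrame, so `[0]` selects the first column of pieces.

```python
import pandas as pd

# Hypothetical index without the label 0.
df = pd.DataFrame({"column1": ["a{'...'}", "b{'...'}"]}, index=[10, 11])

pieces = df["column1"].str.split("{")  # Series of lists, one list per row

try:
    pieces[0]                          # label lookup on the index, not "first piece"
    key_error_raised = False
except KeyError:
    key_error_raised = True

# Two ways to take the first piece of every row:
first_via_str = pieces.str[0]                                   # element-wise indexing
first_via_expand = df["column1"].str.split("{", expand=True)[0]  # column 0 of a DataFrame

print(key_error_raised)
print(first_via_str.tolist(), first_via_expand.tolist())
```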

Using .apply:

import pandas as pd

df = pd.DataFrame({"a":["a{'...'}", "b{'...'}"]})
df["a"] = df["a"].apply(lambda x: x.split('{')[0])
print(df)
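
One caveat with the .apply version, sketched below under the assumption that the column may contain missing values (the question does not say whether it does): the plain lambda calls .split on every element and fails on a missing value, whereas a guarded lambda or the .str accessor leaves missing values in place.

```python
import pandas as pd

# Hypothetical data with one missing value.
s = pd.Series(["a{'...'}", None])

try:
    s.apply(lambda x: x.split("{")[0])
    plain_apply_failed = False
except AttributeError:
    plain_apply_failed = True  # the missing value has no .split method

# Guarded lambda: only split actual strings.
guarded = s.apply(lambda x: x.split("{")[0] if isinstance(x, str) else x)

# The .str accessor skips missing values on its own.
via_str = s.str.split("{").str[0]

print(plain_apply_failed)
print(guarded[0], via_str[0])
```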
