How to efficiently remove a character if it is present at the beginning or end of a string in Pandas?
The idea is to remove a full stop, comma, or quotation mark if it appears at the beginning or end of a string in Pandas.
Given a df as below:
data = {'Name': ['"Tom hola.', '"nick"', 'krish here .','oh my *']}
The expected output is:
Tom hola
nick
krish here
oh my
I tried the following code, but it did not work as intended:
import pandas as pd
df = pd.DataFrame(data)
df['Name'] = df['Name'].str[-1:].replace({"\. ": "Na"},regex=True)
May I know how this objective can be achieved?
Also, can the approach be extended so that it applies across different columns?
You can use pd.Series.str.replace if you want to replace values in a single column; otherwise use df.replace.
# Using `pd.Series.str.replace`
df['Name'] = df['Name'].str.replace(r'\.$', '', regex=True)  # regex=True is required in pandas >= 2.0
df
Name
0 Tom hola
1 secondx //
2 nick
3 krish here
# Using `df.replace`
df.replace(r'\.$', '', regex=True)
Name
0 Tom hola
1 secondx //
2 nick
3 krish here
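As a minimal sketch of the two calls above, assuming the same sample frame that the printed output comes from: the pattern `\.$` removes only a literal full stop at the very end of the string, so other trailing characters (including a space left in front of the removed full stop) are untouched. The variable names `by_series` and `by_frame` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Tom hola.', 'secondx //', 'nick', 'krish here .']})

# Series-level: regex=True is passed explicitly, since pandas >= 2.0
# treats the pattern as a literal string by default.
by_series = df['Name'].str.replace(r'\.$', '', regex=True)

# Frame-level: the same pattern applied to every column of the frame.
by_frame = df.replace(r'\.$', '', regex=True)

print(by_series.tolist())
print(by_frame['Name'].tolist())
```

Note that 'krish here .' becomes 'krish here ' with a trailing space, because only the full stop itself is matched; the printed frame hides that space.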
regex101
EDIT:
You can use pd.Series.str.strip to strip ", . and * from both ends of each string.
df['Name'].str.strip('".* ')  # strip takes a set of characters, not a regex; the space removes leftover whitespace
0 Tom hola
1 nick
2 krish here
3 oh my
Name: Name, dtype: object
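To address the follow-up about several columns, here is one possible sketch that applies the same strip to a chosen list of text columns. The `City` and `Age` columns, and the names `chars` and `text_cols`, are assumptions added purely for illustration; they are not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['"Tom hola.', '"nick"', 'krish here .', 'oh my *'],
    'City': ['"Paris"', 'Rome.', '* Oslo', 'Lima'],  # hypothetical second text column
    'Age': [31, 25, 40, 19],                          # hypothetical numeric column, left alone
})

# Characters to remove from both ends; str.strip treats this as a set, not a regex.
chars = '".* '
text_cols = ['Name', 'City']

# Apply the strip column by column, only to the selected text columns.
df[text_cols] = df[text_cols].apply(lambda col: col.str.strip(chars))

print(df)
```

Listing the text columns explicitly keeps numeric columns out of the `.str` accessor, which would otherwise raise an error on non-string data.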
# OR
df.Name.str.replace(r'^\W+|(.*?)\W+$', r'\1', regex=True)  # Replaces only values in `Name`
# df.replace(r'^\W+|(.*?)\W+$', r'\1', regex=True)  # Replaces across the whole df
Use (\W)*$ if you want to match all special characters at the end of the string.
df = pd.DataFrame({'Name': ['Tom hola.', 'secondx //', 'nick', 'krish here .']})
df['Name'] = df['Name'].replace({r'(\W)*$': ""}, regex=True)
Output:
Name
0 Tom hola
1 secondx
2 nick
3 krish here
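The pattern above only trims the end of the string. A possible variant for the data in the question, where unwanted characters also appear at the start, is to anchor `\W+` at both ends. This is a sketch under the assumption that every non-word character (anything other than letters, digits, and underscore) at either end should go; the name `trimmed` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['"Tom hola.', '"nick"', 'krish here .', 'oh my *']})

# ^\W+ matches a run of non-word characters at the start,
# \W+$ matches a run at the end; both are replaced with ''.
trimmed = df.replace(r'^\W+|\W+$', '', regex=True)

print(trimmed['Name'].tolist())
```

Because `df.replace` is called on the whole frame, the same pattern is applied to every string column, while characters in the middle of a value are left as they are.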
You can use https://regex101.com to test and better understand what your regex is doing.