
Why does replacing a substring not work in a Pandas dataframe?

I am trying to remove the symbols " and - from the start and end of every string in the dataframe:

dtnew.applymap(lambda x: x.replace('^-', ''))
dtnew.applymap(lambda x: x.replace('^"', ''))

But the output dataframe still contains these symbols.

If performance is not a concern, you can iterate over the columns and rows and use a plain str.replace (see below). I would only do this when the dataframe is small. Note that this removes the characters everywhere in each string, not just at the start and end.

for column in df.columns:
    for i in df.index:
        # .loc avoids chained assignment, which may write to a temporary copy
        df.loc[i, column] = df.loc[i, column].replace('-', '').replace('"', '')
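As for why the attempt in the question leaves the symbols in place: Python's built-in str.replace matches its first argument literally, so '^-' looks for a caret followed by a hyphen rather than a hyphen at the start of the string. In addition, applymap returns a new DataFrame and does not modify dtnew. A minimal sketch of both points follows, using the standard-library re module. The name dtnew comes from the question, but the sample data is made up, since the question does not show the real frame.

```python
import re
import pandas as pd

# Hypothetical sample data; the real contents of dtnew are not shown in the question.
dtnew = pd.DataFrame({"a": ['-abc', '"def'], "b": ['ghi-', 'jkl"']})

# Built-in str.replace is literal: '^-' never occurs in the data, so nothing changes.
literal = dtnew.apply(lambda col: col.map(lambda x: x.replace('^-', '')))

# re.sub understands the ^ and $ anchors; the result must be assigned to be kept.
pattern = re.compile(r'^[-"]|[-"]$')
cleaned = dtnew.apply(lambda col: col.map(lambda x: pattern.sub('', x)))

print(cleaned)
```

Here cleaned holds the stripped values, while dtnew itself is unchanged until you reassign it.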

Assuming this example and that you only want to replace the leading character(s):

import pandas as pd
df = pd.DataFrame([['- abc', 'def -'], ['" ghi-', '--jkl']])

        0      1
0   - abc  def -
1  " ghi-  --jkl

Use str.lstrip:

df2 = df.apply(lambda c: c.str.lstrip('- "'))

output:

      0      1
0   abc  def -
1  ghi-    jkl

# as list: [['abc', 'def -'], ['ghi-', 'jkl']]
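Note that the argument to str.lstrip is treated as a set of characters, not as a prefix: leading characters are removed for as long as they belong to the set, in any order, and stripping stops at the first character outside it. A small sketch of that behavior, where the Series s is an illustrative assumption rather than data from the question:

```python
import pandas as pd

# Illustrative values; the last one has no leading character from the set.
s = pd.Series(['- abc', '" ghi-', '--jkl', 'a-b'])

# Each of '-', ' ' and '"' is stripped from the left, in any combination.
stripped = s.str.lstrip('- "')

print(stripped.tolist())
```

Characters in the middle or at the end of a string, such as the hyphen in 'a-b' or the trailing one in 'ghi-', are left alone.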

For only the first character, use str.replace:

df2 = df.apply(lambda c: c.str.replace('^[- "]', '', regex=True))

output:

       0      1
0    abc  def -
1   ghi-   -jkl

# as list: [[' abc', 'def -'], [' ghi-', '-jkl']]

Generalization:

  • to strip both start and end, use str.strip

  • to remove all characters (anywhere): df.apply(lambda c: c.str.replace('[- "]', '', regex=True))

  • to remove the first and/or last character when it matches (at most one from each end): df.apply(lambda c: c.str.replace('(^[- "]|[- "]$)', '', regex=True))
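The three variants above can be compared side by side on the example dataframe. This is only a sketch: the variable names both_ends, everywhere and first_or_last are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame([['- abc', 'def -'], ['" ghi-', '--jkl']])

# Strip any run of the listed characters from both ends.
both_ends = df.apply(lambda c: c.str.strip('- "'))

# Remove the listed characters wherever they occur.
everywhere = df.apply(lambda c: c.str.replace('[- "]', '', regex=True))

# Remove at most one matching character at the start and one at the end.
first_or_last = df.apply(
    lambda c: c.str.replace('(^[- "]|[- "]$)', '', regex=True)
)

print(both_ends.values.tolist())
print(everywhere.values.tolist())
print(first_or_last.values.tolist())
```

On this particular data the first two variants give the same result, because the unwanted characters only appear near the ends. The third leaves the inner space and the second hyphen of '--jkl' in place.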
