
Removing square brackets from Dataframe

I have the following dataset in a DataFrame, and I need to remove the square brackets from the data. How can I do this?

   From             TO
   [wrestle]        engage in a wrestling match
   [write]          communicate or express by writing
   [write]          publish
   [spell]          write
   [compose]        write music

Expected output is:

   From             TO
   wrestle          engage in a wrestling match
   write            communicate or express by writing
   write            publish
   spell            write

Use str.strip if the values are strings:

print (type(df.loc[0, 'From']))
<class 'str'>

df['From'] = df['From'].str.strip('[]')

... and if the values are lists, convert them to strings with str.join:

print (type(df.loc[0, 'From']))
<class 'list'>

df['From'] = df['From'].str.join(', ')

Thanks to @juanpa.arrivillaga for the suggestion: if every value is a one-item list, select the first element instead:

df['From'] = df['From'].str[0]

You can check whether that is the case with:

print (type(df.loc[0, 'From']))
<class 'list'>

print (df['From'].str.len().eq(1).all())
True

print (df)
      From                                 TO
0  wrestle        engage in a wrestling match
1    write  communicate or express by writing
2    write                            publish
3    spell                              write
4  compose                        write music
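
If the column mixes strings and lists, the two cases above can be handled in one pass. This is only a minimal sketch: the helper name unbracket and the sample rows are made up for illustration and are not part of the original data.

```python
import pandas as pd

def unbracket(value):
    """Hypothetical helper: drop square brackets from one cell."""
    if isinstance(value, list):
        # A real Python list: join its items into one string.
        return ", ".join(map(str, value))
    if isinstance(value, str):
        # A string that merely looks like a list: strip the bracket characters.
        return value.strip("[]")
    # Anything else (NaN, numbers) is returned unchanged.
    return value

# Illustrative data mixing both cases.
df = pd.DataFrame({
    "From": [["wrestle"], "[write]", ["spell", "compose"]],
    "TO": ["engage in a wrestling match", "publish", "write"],
})

df["From"] = df["From"].map(unbracket)
print(df)
```

Checking the type per cell is slower than the vectorized .str methods, so it is probably only worth it when the column really is mixed.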

Suppose you have this dataframe:

import pandas as pd

df = pd.DataFrame({'Region':['New York','Los Angeles','Chicago'], 'State': ['NY [new york]', '[California]', 'IL']})

Which will be like this:

        Region          State
0     New York  NY [new york]
1  Los Angeles   [California]
2      Chicago             IL

To just remove the square brackets you need the following lines:

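# regex=True is required in pandas 2.0+, where str.replace treats the pattern as a literal string by default
df['State'] = df['State'].str.replace(r"\[", "", regex=True)
df['State'] = df['State'].str.replace(r"\]", "", regex=True)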

The result:

        Region        State
0     New York  NY new york
1  Los Angeles   California
2      Chicago           IL
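
The same result can probably be had with a single call by putting both brackets in one regex character class. A small sketch using the same sample frame; regex=True is passed explicitly because the default changed to literal matching in pandas 2.0.

```python
import pandas as pd

df = pd.DataFrame({
    "Region": ["New York", "Los Angeles", "Chicago"],
    "State": ["NY [new york]", "[California]", "IL"],
})

# The character class [\[\]] matches either an opening or a closing bracket.
df["State"] = df["State"].str.replace(r"[\[\]]", "", regex=True)
print(df)
```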

If you want to remove the square brackets together with everything between them:

df['State'] = df['State'].str.replace(r" \[.*\]", "", regex=True)
df['State'] = df['State'].str.replace(r"\[.*\]", "", regex=True)

The first line removes a bracketed group together with the space in front of it; the second line removes any bracketed group that has no leading space. Run them in this order: if the plain pattern runs first, "NY [new york]" becomes "NY " with a trailing space. As above, regex=True is required in pandas 2.0+.

By applying these two lines on the original df:

        Region State
0     New York    NY
1  Los Angeles      
2      Chicago    IL
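
One caveat: the greedy .* in the patterns above matches from the first [ to the last ] in a value, so text between two separate bracketed groups would be removed as well. A hedged sketch of a narrower pattern follows; the sample values here are invented to show the difference.

```python
import pandas as pd

# Illustrative values; the first one contains two bracketed groups.
s = pd.Series(["NY [new york] USA [us]", "[California]", "IL"])

# \s*      optional whitespace before the group
# \[       opening bracket
# [^\]]*   any run of characters that are not a closing bracket
# \]       closing bracket
cleaned = s.str.replace(r"\s*\[[^\]]*\]", "", regex=True).str.strip()
print(cleaned.tolist())
```

Because [^\]]* cannot cross a closing bracket, each group is removed on its own and "USA" survives in the first value.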


 