Create a new DataFrame from an existing DataFrame in pandas - python

What is the most efficient pandas command to create a new DataFrame from an existing DataFrame that has only one column, named val, applying the following transformation?

Input:

1_2_3
1_2_3_4
1_2_3_4_5

Output:

2
2_3
2_3_4

Remove everything up to and including the first underscore, and everything from the last underscore to the end of the string.
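For reference, the input can be reproduced as follows. This is a minimal sketch: the variable name df is an assumption, while the column name val comes from the question.

```python
import pandas as pd

# Sample input from the question: a single column named "val".
df = pd.DataFrame({"val": ["1_2_3", "1_2_3_4", "1_2_3_4_5"]})
print(df)
```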

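You can use str.replace with a regex that matches the characters up to and including the first _ and from the last _ to the end of the string, replacing the whole match with the captured middle part. Pass regex=True explicitly: since pandas 2.0, str.replace treats the pattern as a literal string by default.

df['val'] = df['val'].str.replace(r'^[^_]*_(.*)_[^_]*$', r'\1', regex=True)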

Output:

     val
0      2
1    2_3
2  2_3_4
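To see what each part of the pattern matches outside pandas, here is a small sketch using Python's re module on plain strings. The helper name strip_ends is hypothetical and only serves as an illustration.

```python
import re

# ^[^_]*_   : everything up to and including the first underscore
# (.*)      : the part to keep, between the first and the last underscore
# _[^_]*$   : the last underscore and everything after it
PATTERN = re.compile(r'^[^_]*_(.*)_[^_]*$')

def strip_ends(s):
    # Replace the whole match with the captured middle part.
    # Strings with fewer than two underscores do not match and come back unchanged.
    return PATTERN.sub(r'\1', s)

for s in ["1_2_3", "1_2_3_4", "1_2_3_4_5", "1_2"]:
    print(s, "->", strip_ends(s))
```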

If you want that single column in a new DataFrame, you can convert the result using to_frame:

df2 = df['val'].str.replace(r'^[^_]*_(.*)_[^_]*$', r'\1', regex=True).to_frame()

Another way with str slicing after split:

df['val'].str.split("_").str[1:-1].str.join("_")

0        2
1      2_3
2    2_3_4
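Per element, this chain amounts to splitting, dropping the first and last pieces, and joining the rest. A plain-Python sketch (the helper name middle_fields is hypothetical) makes the behaviour on short inputs visible: a string with fewer than two underscores yields an empty string instead of being left as it is.

```python
def middle_fields(s, sep="_"):
    # Split into fields, drop the first and the last one, and join the rest.
    parts = s.split(sep)
    return sep.join(parts[1:-1])

for s in ["1_2_3", "1_2_3_4", "1_2", "abc"]:
    print(repr(s), "->", repr(middle_fields(s)))
```

Whether an empty string is the right result for such inputs depends on the data, so it may be worth checking for them before applying the transformation.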

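Split the string on the characters that lie between r1 at the start of the string and r2 at the end of the string, where r1 = digit_ and r2 = _digit. Because the pattern contains a capturing group, the captured text is kept in the split result, and it is element 1 of the resulting list:

df['val'].str.split(r'(?<=^\d_)(.*?)(?=_\d+$)').str[1]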
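The following sketch shows what this split produces on plain strings with Python's re module. Note the assumptions built into the pattern: the leading field must be a single digit and the trailing field must consist of digits, so other inputs are not split at all.

```python
import re

# Lookbehind: one digit and an underscore at the very start of the string.
# Lookahead: an underscore followed by digits at the very end of the string.
pattern = re.compile(r'(?<=^\d_)(.*?)(?=_\d+$)')

parts = pattern.split("1_2_3_4")
print(parts)        # ['1_', '2_3', '_4']
print(parts[1])     # 2_3

# A two-digit leading field does not satisfy the lookbehind, so nothing is split.
unsplit = pattern.split("10_2_3")
print(unsplit)      # ['10_2_3']
```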

You can find the first and the last _ using str.find and str.rfind and then you can get the substring from it.

df['val'] = [x[x.find('_')+1:x.rfind('_')] for x in df['val']]

Output:

     val
0      2
1    2_3
2  2_3_4
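One caveat with this slicing: when a string contains no underscore, find and rfind both return -1, so the slice becomes x[0:-1] and silently drops the last character. A guarded sketch is shown below; the helper name between_underscores is hypothetical, and returning the input unchanged for such strings is an assumption rather than part of the original answer.

```python
def between_underscores(s):
    first = s.find("_")
    last = s.rfind("_")
    if first == -1 or first == last:
        # Fewer than two underscores: there is no middle part.
        # Returning the input unchanged is an assumption; adjust as needed.
        return s
    return s[first + 1:last]

for s in ["1_2_3", "1_2_3_4_5", "1_2", "abc"]:
    print(repr(s), "->", repr(between_underscores(s)))
```

It can be used in the same list comprehension, for example [between_underscores(x) for x in df['val']].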

You can do it using the replace method:

df['val'] = df['val'].str.replace(r'^1_', '', regex=True).str.replace(r'_\d$', '', regex=True)

I'm passing two regexes. The first finds the substring 1_ at the start of the string and replaces it with an empty string. The second finds an underscore followed by a digit at the end of the string (that's what the '$' means) and replaces it with an empty string.
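The two patterns in this answer match a literal 1_ prefix and a single trailing digit. If the first and last fields can be arbitrary text, a more general sketch could look like the following. This is an assumption about the intended behaviour, not the answerer's code.

```python
import pandas as pd

df = pd.DataFrame({"val": ["1_2_3", "1_2_3_4", "10_2_3_40"]})

result = (
    df["val"]
    .str.replace(r"^[^_]*_", "", regex=True)   # drop the first field and its underscore
    .str.replace(r"_[^_]*$", "", regex=True)   # drop the last underscore and the field after it
)
print(result.tolist())   # ['2', '2_3', '2_3']
```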

Regex-related questions are always fun.

I'll throw one more into the mix. Here's str.extract:

df['new_val'] = df['val'].str.extract('_(.+)_')

Output:

         val  new_val
0      1_2_3        2
1    1_2_3_4      2_3
2  1_2_3_4_5    2_3_4
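By default str.extract returns a DataFrame. As a small sketch, passing expand=False with a single capture group returns a Series instead, and rows that do not match the pattern become NaN. The extra row "1_2" is added here only to illustrate the non-matching case.

```python
import pandas as pd

df = pd.DataFrame({"val": ["1_2_3", "1_2_3_4", "1_2_3_4_5", "1_2"]})

# expand=False returns a Series when the pattern has a single capture group.
new_val = df["val"].str.extract(r"_(.+)_", expand=False)
print(new_val.tolist())
```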
