create new df from existing df in pandas - python

Question

What is an efficient pandas command to create a new data frame from an existing data frame that has only one column, named val, applying the following transformation?

Input:

1_2_3
1_2_3_4
1_2_3_4_5

Output:

2
2_3
2_3_4

Remove everything up to and including the first underscore, and also remove everything from the last underscore (inclusive) to the end of the string.
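To try the answers on the same data, the input from the question can be built as a small frame. This is only a sketch; the names df and expected are illustrative and not part of the original post.

```python
import pandas as pd

# Sample frame matching the question: a single string column named "val".
df = pd.DataFrame({"val": ["1_2_3", "1_2_3_4", "1_2_3_4_5"]})

# The desired result for each row, copied from the question.
expected = ["2", "2_3", "2_3_4"]
```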

Answer 1

You can use str.replace with a regex that matches characters up to and including the first _ and from the last _ to the end of string, replacing both those parts with nothing:

df['val'] = df['val'].str.replace(r'^[^_]*_(.*)_[^_]*$', r'\1', regex=True)

If you want that single column in a new dataframe, you can convert it to one using to_frame:

df2 = df['val'].str.replace(r'^[^_]*_(.*)_[^_]*$', r'\1', regex=True).to_frame()
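A self-contained sketch of this approach, with the pattern broken down in comments. The sample frame is rebuilt here from the question's input, and regex=True is passed explicitly because pandas 2.x treats the pattern as a literal string by default.

```python
import pandas as pd

df = pd.DataFrame({"val": ["1_2_3", "1_2_3_4", "1_2_3_4_5"]})

# ^[^_]*_   everything up to and including the first underscore
# (.*)      the middle part to keep (greedy, so it runs up to the last underscore)
# _[^_]*$   the last underscore and everything after it
pattern = r"^[^_]*_(.*)_[^_]*$"

# Replace the whole match with the captured middle part.
df2 = df["val"].str.replace(pattern, r"\1", regex=True).to_frame()

print(df2["val"].tolist())  # ['2', '2_3', '2_3_4']
```

A value with fewer than two underscores, such as 1_2, does not match the pattern and is left unchanged.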

Answer 2

Another way is to split on the underscore, slice off the first and last pieces, and join the rest back together:

df['val'].str.split("_").str[1:-1].str.join("_")

0        2
1      2_3
2    2_3_4
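The chain can be read step by step as in the sketch below. The intermediate names parts and middle are illustrative only, and the frame is rebuilt from the question's input.

```python
import pandas as pd

df = pd.DataFrame({"val": ["1_2_3", "1_2_3_4", "1_2_3_4_5"]})

parts = df["val"].str.split("_")   # each cell becomes a list, e.g. ['1', '2', '3']
middle = parts.str[1:-1]           # drop the first and last element of each list
df2 = middle.str.join("_").to_frame()  # glue the remaining pieces back together

print(df2["val"].tolist())  # ['2', '2_3', '2_3_4']
```

Note that a value with fewer than two underscores, such as 1_2, ends up as an empty string with this approach, because nothing is left after the slice.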

Answer 3

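Split the string on the characters that sit between r1 at the start of the string and r2 at the end of the string,

where r1 is a digit followed by _ and r2 is _ followed by one or more digits. Because the pattern contains a capture group, the captured middle part is kept in the split result at index 1:

df['val'].str.split(r'(?<=^\d_)(.*?)(?=_\d+$)').str[1]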
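Why index 1 holds the answer is easier to see with the standard library's re.split on a single string, which also keeps captured groups in its result. This is a sketch for illustration only; it applies the same pattern through Series.map rather than str.split, and the names pieces and middle are made up here.

```python
import re
import pandas as pd

df = pd.DataFrame({"val": ["1_2_3", "1_2_3_4", "1_2_3_4_5"]})

# Lookbehind: one digit and "_" at the start; lookahead: "_" and digits at the end.
pattern = r"(?<=^\d_)(.*?)(?=_\d+$)"

# The split result is [prefix, captured middle, suffix].
pieces = re.split(pattern, "1_2_3_4")
print(pieces)  # ['1_', '2_3', '_4']

# Take element 1 (the captured middle) for every row.
middle = df["val"].map(lambda s: re.split(pattern, s)[1])
print(middle.tolist())  # ['2', '2_3', '2_3_4']
```

The lookbehind only accepts a single digit before the first underscore, so a value such as 10_2_3 is not matched by this pattern.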

Answer 4

You can find the first and the last _ using str.find and str.rfind and then you can get the substring from it.

df['val'] = [x[x.find('_')+1:x.rfind('_')] for x in df['val']]
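The slice relies on every value containing at least two underscores. The sketch below wraps the same find/rfind logic in a helper with a guard; the helper name strip_ends and the choice to return such values unchanged are assumptions, not part of the original answer.

```python
import pandas as pd

def strip_ends(s: str) -> str:
    """Return the text between the first and the last underscore."""
    first = s.find("_")    # index of the first underscore, or -1
    last = s.rfind("_")    # index of the last underscore, or -1
    # Assumption: with fewer than two underscores, leave the value as it is.
    if first == -1 or first == last:
        return s
    return s[first + 1:last]

df = pd.DataFrame({"val": ["1_2_3", "1_2_3_4", "1_2_3_4_5"]})
df["val"] = [strip_ends(x) for x in df["val"]]

print(df["val"].tolist())  # ['2', '2_3', '2_3_4']
```

Without the guard, a value with no underscore would silently lose its last character, because find and rfind both return -1 and the slice becomes x[0:-1].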

Answer 5

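You can do it by chaining two calls to str.replace:

df['val'] = df['val'].str.replace(r'^1_', '', regex=True).str.replace(r'_\d$', '', regex=True)

I'm passing two regexes. The first matches the substring 1_ at the start of the string (that's what the '^' means) and replaces it with an empty string. The second matches an underscore followed by a single digit at the end of the string (that's what the '$' means) and also replaces it with an empty string. Note that the first pattern is tied to the sample data, where every value starts with 1_.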
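If the values do not all start with 1_ or end in a single digit, the two patterns can be loosened to match any run of non-underscore characters. This is a sketch of that variation, not the answer's original patterns; the sample values in the second Series are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"val": ["1_2_3", "1_2_3_4", "1_2_3_4_5"]})

# First pass: drop everything up to and including the first underscore.
# Second pass: drop the last underscore and everything after it.
result = (
    df["val"]
    .str.replace(r"^[^_]*_", "", regex=True)
    .str.replace(r"_[^_]*$", "", regex=True)
)
print(result.tolist())  # ['2', '2_3', '2_3_4']

# The same chain on values with multi-character tokens (hypothetical data).
other = pd.Series(["10_20_30", "ab_cd_ef_gh"])
other_result = (
    other.str.replace(r"^[^_]*_", "", regex=True)
    .str.replace(r"_[^_]*$", "", regex=True)
)
print(other_result.tolist())  # ['20', 'cd_ef']
```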

Answer 6

Regex-related questions are always fun.

I'll throw one more into the mix. Here's str.extract, where the greedy group captures everything between the first and the last underscore:

df['new_val'] = df['val'].str.extract(r'_(.+)_')

Output:

         val  new_val
0      1_2_3        2
1    1_2_3_4      2_3
2  1_2_3_4_5    2_3_4
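A small sketch of how str.extract behaves on a value that does not match. Passing expand=False returns a Series instead of a one-column DataFrame; the extra value 1_2 is made up here to show the non-matching case.

```python
import pandas as pd

s = pd.Series(["1_2_3", "1_2_3_4", "1_2"], name="val")

# Greedy (.+) spans from just after the first underscore to just before the last.
out = s.str.extract(r"_(.+)_", expand=False)

print(out.iloc[0], out.iloc[1])  # 2 2_3
print(pd.isna(out.iloc[2]))      # True: "1_2" has only one underscore, so no match

# A new one-column frame, if that is what is needed.
df2 = out.to_frame(name="val")
```

Unlike the str.replace approach, which leaves non-matching values untouched, str.extract yields a missing value for them.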

6 answers

solution1
3 2021-02-12 03:32:54

solution2
1 2021-02-12 03:40:01

solution3
1 2021-02-12 03:41:30

solution4
1 2021-02-12 03:43:04

solution5
1 2021-02-12 03:44:48

solution6
1 2021-02-12 04:01:06
