Replace all occurrences but first of a repeating string in a column using pandas

Question

I have a column in a pandas DataFrame in which the same string sometimes appears more than once:

col1	col2
1	hello
2	bye
3	hello
4	morning
5	night
6	hello

What I would like to do is change every occurrence of "hello" except the first into "hello again", so the first occurrence of "hello" stays the same.

col1	col2
1	hello
2	bye
3	hello again
4	morning
5	night
6	hello again

Answer 1

You can use

df['col2'] = df['col2'].str.split(expand=True)[0]

split() splits on whitespace by default, and expand=True returns the pieces as separate DataFrame columns instead of one list per row; [0] then selects the first piece of each value.
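
To make the behaviour of that expression concrete, here is a minimal sketch on a made-up Series (the name s and its values are assumptions for illustration, not data from the question). It shows that taking column 0 of the expanded split keeps only the first word of each value, so it maps "hello again" back to "hello" rather than appending " again" to later occurrences.

```python
import pandas as pd

# Hypothetical sample values, chosen only to illustrate the split
s = pd.Series(["hello", "hello again", "bye"])

# expand=True returns a DataFrame with one column per whitespace-separated piece
pieces = s.str.split(expand=True)

# Column 0 holds the first word of each value
first_words = pieces[0]
print(first_words.tolist())
```

If the goal is the transformation described in the question, this step on its own does not seem sufficient, since it never distinguishes the first "hello" from later ones.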

Answer 2

You can find the indices of the rows containing "hello" and then modify all but the first occurrence using pandas.DataFrame.loc:

In [1]: import pandas as pd
In [2]: df = pd.DataFrame(data={'col1': [1, 2, 3, 4, 5, 6],
   ...:                         'col2': ['hello', 'bye', 'hello', 'morning', 'night', 'hello']})
In [3]: df
Out[3]: 
   col1     col2
0     1    hello
1     2      bye
2     3    hello
3     4  morning
4     5    night
5     6    hello
In [4]: hello_indices = df.index[df['col2'] == 'hello']
In [5]: hello_indices
Out[5]: Int64Index([0, 2, 5], dtype='int64')
In [6]: df.loc[hello_indices[1:],'col2'] = 'hello again'
In [7]: df
Out[7]: 
   col1         col2
0     1        hello
1     2          bye
2     3  hello again
3     4      morning
4     5        night
5     6  hello again
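
As a possible alternative to collecting index labels, the same result can be sketched with a boolean mask. This is not the answer's own method; it assumes Series.duplicated() with its default keep='first', which flags every repeat of a value but not its first appearance. The variable name repeat_mask is made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5, 6],
    "col2": ["hello", "bye", "hello", "morning", "night", "hello"],
})

# True for rows equal to "hello" that have already appeared earlier in the column
repeat_mask = df["col2"].eq("hello") & df["col2"].duplicated()

# Overwrite only the repeated occurrences; the first "hello" is left untouched
df.loc[repeat_mask, "col2"] = "hello again"
print(df)
```

One reason to consider the mask form: it selects rows by position in the column rather than by index label, so it should behave the same way even if the DataFrame's index contains duplicate labels.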

2 answers

solution1
-1 2022-01-20 12:25:44

solution2
-1 ACCEPTED 2022-01-20 12:34:30
