
pandas: increment based on a condition in another column

I have a dataframe with only one column, like the following (a minimal example):

import pandas as pd

dataframe = pd.DataFrame({'text': ['##weather', 'how is today?', 'we go out', '##rain',
                                   'my day is rainy', 'I am not feeling well', 'rainy blues',
                                   '##flower', 'the blue flower', 'she likes red',
                                   'this flower is nice']})

I would like to add a second column called 'id' whose value increments every time a row contains '##'. My desired output would be:

                     text   id
0               ##weather  100
1           how is today?  100
2               we go out  100
3                  ##rain  101
4         my day is rainy  101
5   I am not feeling well  101
6             rainy blues  101
7                ##flower  102
8         the blue flower  102
9           she likes red  102
10    this flower is nice  102

So far I have tried the following, which does not return the output I want:

dataframe['id']= 100
dataframe.loc[dataframe['text'].str.contains('## intent:'), 'id'] += 1

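You can try groupby with ngroup. The boolean mask from str.contains('##') is True on each marker row, so its cumulative sum goes up by one at every marker and stays constant on the rows that follow it. Grouping on that running count and numbering the groups gives the id:

# Running count of marker rows: goes up by 1 at each row containing '##'
m = dataframe['text'].str.contains('##').cumsum()

# ngroup() numbers the groups from 0, so add 100 to start the ids at 100
dataframe['id'] = dataframe.groupby(m).ngroup() + 100
print(dataframe)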

                     text   id
0               ##weather  100
1           how is today?  100
2               we go out  100
3                  ##rain  101
4         my day is rainy  101
5   I am not feeling well  101
6             rainy blues  101
7                ##flower  102
8         the blue flower  102
9           she likes red  102
10    this flower is nice  102
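
As a possible simplification, the cumulative sum already counts the marker rows 1, 2, 3, ..., so the groupby step can likely be skipped by offsetting the running count directly. The sketch below is only an illustration under two assumptions: the first row is a marker row (otherwise the rows before the first marker get START_ID - 1), and markers always sit at the start of the string, which is why str.startswith is used in place of str.contains. The names df, START_ID and is_marker are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'text': ['##weather', 'how is today?', 'we go out', '##rain',
                            'my day is rainy', 'I am not feeling well', 'rainy blues',
                            '##flower', 'the blue flower', 'she likes red',
                            'this flower is nice']})

START_ID = 100  # hypothetical constant: id assigned to the first block

# True on rows that begin with the '##' marker (assumes markers are prefixes)
is_marker = df['text'].str.startswith('##')

# cumsum() counts markers seen so far (1, 2, 3, ...); shift so the first is START_ID
df['id'] = is_marker.cumsum() + START_ID - 1
print(df)
```

If '##' can appear in the middle of an ordinary row, startswith and contains will disagree, so pick whichever matches the data.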

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address.