Changing the values in every second row - pandas

I'm aiming to change the value of every second row in a pandas df. Using the code below, I'm hoping to change every second row in Group. Each set of two rows in Group will always be duplicated. Which value is duplicated is random, though, so I won't be changing every second row to one specific value. How a row gets changed depends on the value in the previous row.

Specifically, each pair of rows will contain either the unique value from GR1 or the unique value from GR2. I'm hoping to change the second value in Group to whichever of the two the first value isn't. So using the data below, the first row of a pair will be either A or B, and the subsequent row should be the opposite value.

Note: There will only ever be two duplicated rows for each period of Time. Also, the unique values within GR1 and GR2 will differ depending on the dataset, so I'm hoping to account for this.

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Time' : [1,1,2,2,3,3,4,4],
    'GR1' : ['A','A','A','A','A','A','A','A'],
    'GR2' : ['B','B','B','B','B','B','B','B'],
    'Group' : ['A','A','B','B','B','B','A','A'],
   })

GR1 = df['GR1'].unique()
GR2 = df['GR2'].unique() 

groups = [y for x in [GR1, GR2] for y in x] 

df['Group'] = np.where(df.index % 2, groups[0], groups[1])

df:

   Time GR1 GR2 Group
0     1   A   B     A 
1     1   A   B     A # first row is from GR1 so this row is GR2
2     2   A   B     B 
3     2   A   B     B # first row is from GR2 so this row is GR1
4     3   A   B     B 
5     3   A   B     B # first row is from GR2 so this row is GR1
6     4   A   B     A 
7     4   A   B     A # first row is from GR1 so this row is GR2

out:

   Time GR1 GR2 Group
0     1   A   B     B
1     1   A   B     A
2     2   A   B     B
3     2   A   B     A
4     3   A   B     B
5     3   A   B     A
6     4   A   B     B
7     4   A   B     A

intended output:

   Time GR1 GR2 Group
0     1   A   B     A
1     1   A   B     B
2     2   A   B     B
3     2   A   B     A
4     3   A   B     B
5     3   A   B     A
6     4   A   B     A
7     4   A   B     B

The idea is to take every second row of the last three columns (GR1, GR2 and Group), do a comparison based on your logic, and write the outcome of that logic back into the dataframe.

DT = df.copy()

DT.iloc[1::2, -1] = np.nan 

# the second rows will be filled with the values from the previous row
DT = DT.ffill()

# every second row of GR1, GR2 and the forward-filled Group
gr1 = DT.iloc[1::2, 1]
gr2 = DT.iloc[1::2, 2]
check = DT.iloc[1::2, -1]

# True where the first row of the pair came from GR1 / from GR2
bool1 = gr1 == check
bool2 = gr2 == check

# where the first row is GR1 pick GR2, and where it is GR2 pick GR1
condlist = [bool1, bool2]
choicelist = [gr2, gr1]

DT.iloc[1::2, -1] = np.select(condlist, choicelist)

In [268]: DT
Out[268]: 
   Time GR1 GR2 Group
0     1   A   B     A
1     1   A   B     B
2     2   A   B     B
3     2   A   B     A
4     3   A   B     B
5     3   A   B     A
6     4   A   B     A
7     4   A   B     B
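
The same logic can also be sketched without relying on row positions. This is only an alternative under a few assumptions, not part of the answer above: it assumes each Time period has exactly two rows (as stated in the question) and that the first value in each pair always equals either GR1 or GR2. The names pos, first, opposite and result are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Time': [1, 1, 2, 2, 3, 3, 4, 4],
    'GR1': ['A'] * 8,
    'GR2': ['B'] * 8,
    'Group': ['A', 'A', 'B', 'B', 'B', 'B', 'A', 'A'],
})

# Position of each row within its Time period: 0 for the first row, 1 for the second.
pos = df.groupby('Time').cumcount()

# Group value of the first row in each Time period, broadcast to both rows.
first = df.groupby('Time')['Group'].transform('first')

# The "opposite" value: GR2 where the first row holds GR1, otherwise GR1.
# Assumption: the first value is always one of the two; anything else falls back to GR1.
opposite = np.where(first == df['GR1'], df['GR2'], df['GR1'])

# Keep the first row of each pair, replace the second with the opposite value.
result = df.copy()
result['Group'] = np.where(pos == 1, opposite, df['Group'])
print(result)
```

Because the comparison is made row by row against the GR1 and GR2 columns, nothing is hard-coded to A or B, so it should carry over to datasets where those columns hold other values.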
