
Change values in each cell based on a comparison with another cell using pandas

I want to compare the value in column 0 of each row with the values in all the other columns of that row, and change the values in those columns accordingly: 1 where they match, 0 where they don't. I have 4329 rows x 197 columns.

From this:

  0  1  2  3
0 G  G  G  T
1 A  A  G  A
2 C  C  C  C
3 T  A  T  G

To this:

  0  1  2  3
0 G  1  1  0
1 A  1  0  1
2 C  1  1  1
3 T  0  1  0

I've tried a nested for loop, which does not work and is slow.

for index, row in df.iterrows():        
    for name, value in row.iteritems():
        if name == 0:
            c = value
            continue
        if value == c:
            value = 1
        else:
            value = 0

I haven't been able to piece together a way to use apply or applymap for the problem.
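For reference, the loop above leaves df unchanged because `value = 1` only rebinds the local loop variable; nothing is ever written back to the frame. (Also, `iteritems` was removed in pandas 2.0; `items` is the replacement.) Here is a minimal sketch of a loop that does write the result, assuming the sample data from the question. The names `result` and `ref` are made up for illustration, and `dtype=object` is an assumption so that integers can be stored next to the strings. It is still slow on a large frame, so treat it as a fix for the "does not work" part only.

```python
import pandas as pd

# Sample data from the question; dtype=object is an assumption so the
# columns can hold both strings and integers.
df = pd.DataFrame([['G', 'G', 'G', 'T'],
                   ['A', 'A', 'G', 'A'],
                   ['C', 'C', 'C', 'C'],
                   ['T', 'A', 'T', 'G']], dtype=object)

# Write into a copy rather than into the frame being iterated over.
result = df.copy()

for index, row in df.iterrows():
    ref = row[0]                       # reference value from column 0
    for name, value in row.items():    # items() replaces iteritems()
        if name == 0:
            continue                   # leave column 0 untouched
        # .at writes a single cell back into the frame
        result.at[index, name] = 1 if value == ref else 0

print(result)
```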

Here's an approach with iloc and eq:

df.iloc[:,1:] = df.iloc[:,1:].eq(df.iloc[:,0], axis=0).astype(int)

Output:

   0  1  2  3
0  G  1  1  0
1  A  1  0  1
2  C  1  1  1
3  T  0  1  0
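
To make the one-liner above easier to try out, here is a self-contained sketch of the same eq comparison on the sample data. It builds a new frame instead of assigning back through iloc, which is my own variation and not part of the answer; the names `ref`, `matches` and `out` are hypothetical. Passing `axis=0` tells eq to line up column 0 with the row index, so each row is compared against its own reference value.

```python
import pandas as pd

# Sample data from the question (dtype=object is an assumption).
df = pd.DataFrame([['G', 'G', 'G', 'T'],
                   ['A', 'A', 'G', 'A'],
                   ['C', 'C', 'C', 'C'],
                   ['T', 'A', 'T', 'G']], dtype=object)

ref = df.iloc[:, 0]                        # reference column
# Boolean frame: True where a cell equals the reference value of its row.
matches = df.iloc[:, 1:].eq(ref, axis=0)
# Put the reference column back in front of the 0/1 flags.
out = pd.concat([ref, matches.astype(int)], axis=1)

print(out)
```

Because the comparison is vectorized over whole columns, this should scale much better than a Python-level loop on a 4329 x 197 frame.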
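
import pandas

df = pandas.DataFrame([['G', 'G', 'G', 'T'],
                       ['A', 'A', 'G', 'A'],
                       ['C', 'C', 'C', 'C'],
                       ['T', 'A', 'T', 'G']])

# Compare every column with column 0, keep columns 1-3 as 0/1 flags,
# then put the original column 0 back in front.
df2 = df[[0]].join(df.apply(lambda c: df[0] == c)[[1, 2, 3]].astype(int))
print(df2)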

I guess that works, though there's probably a better way.

You could also do something like:

df.apply(lambda c: (df[0] == c).astype(int) if c.name > 0 else c)

