I want to compare the value in column 0 with the values in all the other columns of the same row and change the values of those columns accordingly. I have 4329 rows x 197 columns.
From this:
   0  1  2  3
0  G  G  G  T
1  A  A  G  A
2  C  C  C  C
3  T  A  T  G
To this:
   0  1  2  3
0  G  1  1  0
1  A  1  0  1
2  C  1  1  1
3  T  0  1  0
I've tried a nested for loop, which does not work and is slow.
for index, row in df.iterrows():
    for name, value in row.iteritems():
        if name == 0:
            c = value
            continue
        if value == c:
            value = 1
        else:
            value = 0
I haven't been able to piece together a way to use apply or applymap for the problem.
Here's an approach with iloc and eq:
df.iloc[:,1:] = df.iloc[:,1:].eq(df.iloc[:,0], axis=0).astype(int)
Output:
   0  1  2  3
0  G  1  1  0
1  A  1  0  1
2  C  1  1  1
3  T  0  1  0
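For reference, here is a self-contained sketch of the same idea on the sample data from the question. Passing `axis=0` to `eq` aligns the column-0 Series with the row index, so each row is compared against its own first value. The names `first`, `rest`, `flags` and `result` are only illustrative, and building a new frame with `pd.concat` is a variation on the in-place assignment above, not a requirement.

```python
import pandas as pd

# Sample data from the question (default integer row and column labels).
df = pd.DataFrame([['G', 'G', 'G', 'T'],
                   ['A', 'A', 'G', 'A'],
                   ['C', 'C', 'C', 'C'],
                   ['T', 'A', 'T', 'G']])

first = df.iloc[:, 0]    # reference column
rest = df.iloc[:, 1:]    # every other column

# Row-wise comparison against column 0; True/False is cast to 1/0.
flags = rest.eq(first, axis=0).astype(int)

# Put the untouched reference column back in front of the flags.
result = pd.concat([first, flags], axis=1)
print(result)
```

Building `result` as a new frame leaves `df` untouched and gives the flag columns an integer dtype instead of storing the numbers in the original object columns.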
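import pandas

df = pandas.DataFrame([['G', 'G', 'G', 'T'],
                       ['A', 'A', 'G', 'A'],
                       ['C', 'C', 'C', 'C'],
                       ['T', 'A', 'T', 'G']])
# Flag where each column equals column 0, then join columns 1-3 back onto column 0.
df2 = df[[0]].join(df.apply(lambda c: df[0] == c)[[1, 2, 3]].astype(int))
print(df2)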
I guess ... there's probably a better way though.
You could also do something like:
df.apply(lambda c:(df[0]==c).astype(int) if c.name > 0 else c)
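Note that `apply` returns a new DataFrame rather than modifying `df`, so the result has to be assigned to something. A minimal sketch, assuming the default integer column labels from the question (the `c.name > 0` test relies on them); `out` is just an illustrative name:

```python
import pandas as pd

df = pd.DataFrame([['G', 'G', 'G', 'T'],
                   ['A', 'A', 'G', 'A'],
                   ['C', 'C', 'C', 'C'],
                   ['T', 'A', 'T', 'G']])

# Each column is passed to the lambda as a Series whose .name is its label.
# Column 0 is returned unchanged; the others become 1/0 match flags.
out = df.apply(lambda c: (df[0] == c).astype(int) if c.name > 0 else c)
print(out)
```

With string column labels, the `c.name > 0` comparison would need to be replaced by a check against the actual name of the reference column.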