I have a table with more than 2000 entries represented in the following way:
Rows:
Id  doggo  floofer  pupper  puppo
3   None   floofer  None    None
5   None   None     pupper  None
6   doggo  None     None    None
I want to add a new column called doggo_stage that stores, for each row, the value of whichever of these 4 columns is not None:
Example:
floofer has the entry floofer: doggo_stage -> floofer
pupper has the entry pupper: doggo_stage -> pupper
Rows:
Id  doggo  floofer  pupper  puppo  doggo_stage
3   None   floofer  None    None   floofer
5   None   None     pupper  None   pupper
6   doggo  None     None    None   doggo
Should I use lambda functions or loops for this?
EDIT: I should have added this before, but I have 24 columns in total; the four shown above are the ones that need to be used for the new column. Basically, I am having trouble selecting these specific columns in the for loop.
dogo_stage = []
for column in df[['doggo','floofer','pupper','puppo']]:
    for i in range(len(df)):
        if df[column][i] is not None:
            dogo_stage.append(df[column][i])
df['dogo_stage'] = dogo_stage
IIUC, you can use max with axis=1:
df = df.set_index('Id')
df['dogo_stage'] = df.max(axis=1)
df.reset_index()
Output:
Id doggo floofer pupper puppo dogo_stage
0 3 None floofer None None floofer
1 5 None None pupper None pupper
2 6 doggo None None None doggo
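Since the question mentions 24 columns in total, a row-wise reduction over the whole frame could pick up values from unrelated columns. One possible way around that, sketched here under assumptions, is to select only the four stage columns and take the first non-missing value in each row. The sample frame, the extra `name` column, and the names `stage_cols` and `first_stage` are hypothetical, chosen for illustration; the sketch also assumes missing entries are either real `None`/NaN values or the literal string "None".

```python
import pandas as pd

# Sample data mirroring the question; "name" is a hypothetical extra column
# standing in for the other columns of the real table.
df = pd.DataFrame({
    "Id": [3, 5, 6],
    "name": ["a", "b", "c"],
    "doggo": [None, None, "doggo"],
    "floofer": ["floofer", None, None],
    "pupper": [None, "pupper", None],
    "puppo": [None, None, None],
})

# Only these columns feed the new column.
stage_cols = ["doggo", "floofer", "pupper", "puppo"]


def first_stage(row):
    """Return the first entry in the row that is not missing, else None."""
    for value in row:
        # Assumption: a missing entry is None, NaN, or the string "None".
        if value is not None and not pd.isna(value) and value != "None":
            return value
    return None


# Apply the helper row by row, restricted to the stage columns.
df["doggo_stage"] = df[stage_cols].apply(first_stage, axis=1)
print(df[["Id", "doggo_stage"]])
```

If a row has more than one stage filled in, this sketch keeps the first one in the order given by `stage_cols`; rows with no stage at all get None.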