df
shape square
shape circle
animal NaN
NaN dog
NaN cat
NaN fish
color red
color blue
desired_df
shape square
shape circle
animal dog
animal cat
animal fish
color red
color blue
I have a df that contains information that needs to be normalized.
I have noticed a pattern that indicates how to join the columns and normalize the data.
If Col1 != NaN and Col2 == NaN in one row, and in the row directly after it Col1 == NaN and Col2 != NaN, then the values from Col1 and Col2 should be joined. This continues until reaching a row where Col1 != NaN and Col2 != NaN.
Is there a way to solve this in pandas?
The first step I am thinking of is to create an additional column containing True/False values to determine which columns to join. However, once I have done that, I am not sure how to assign the value in Col1 to all of the relevant values in Col2.
Any suggestions for arriving at the desired result?
If the pattern you identified is only a heuristic (I struggle to follow it exactly), you can instead use pd.Series.ffill and pd.Series.bfill to reach your desired result:
df[0] = df[0].ffill()
df[1] = df[1].bfill()
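To see why a de-duplication step is needed after the two fills, here is a minimal, self-contained sketch. The frame construction is an assumption reconstructed from the sample in the question (integer column labels 0 and 1, with NaN for the gaps), and `filled` is a name introduced only for illustration:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's sample data;
# columns get the default integer labels 0 and 1.
df = pd.DataFrame([
    ["shape", "square"],
    ["shape", "circle"],
    ["animal", np.nan],
    [np.nan, "dog"],
    [np.nan, "cat"],
    [np.nan, "fish"],
    ["color", "red"],
    ["color", "blue"],
])

filled = df.copy()
filled[0] = filled[0].ffill()  # carry "animal" down into the rows below it
filled[1] = filled[1].bfill()  # pull "dog" up into the header row
print(filled)
```

After both fills, rows 2 and 3 each read animal / dog, so one of them is redundant.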
Then drop duplicates:
df = df.drop_duplicates()
print(df)
0 1
0 shape square
1 shape circle
2 animal dog
4 animal cat
5 animal fish
6 color red
7 color blue
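One caveat: drop_duplicates would also collapse any pair that legitimately appears twice in the data. A possible variant, sketched here under the assumption that a "header" row is always one with a value in column 0 and NaN in column 1, is to forward-fill column 0 only and then drop the header rows with dropna. The sample frame and the name `result` are illustrative, not from the original post:

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame([
    ["shape", "square"],
    ["shape", "circle"],
    ["animal", np.nan],
    [np.nan, "dog"],
    [np.nan, "cat"],
    [np.nan, "fish"],
    ["color", "red"],
    ["color", "blue"],
])

result = df.copy()
result[0] = result[0].ffill()        # propagate each category downward
result = result.dropna(subset=[1])   # remove header rows that carry no value
print(result)
```

This keeps the same seven category/value pairs as the approach above, but the surviving index labels differ (row 3 is kept and row 2 is dropped). Call reset_index(drop=True) if a clean 0..n-1 index is wanted.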