I have a DataFrame where the first column holds the Color and the second holds the Description of an item. Unfortunately, some of the Color values ended up in the Description column instead, as you can see below:
import pandas as pd

data = {"Color": [None, "Red", "Blue", "Green", None],
"Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"]}
df = pd.DataFrame(data)
| Color | Description |
|-------|-------------|
|None |Red T-Shirt |
|Blue |Pants |
|Green |Underwear |
|None |Blue Cap |
First I split the Description column on spaces with:
df["Description"] = df["Description"].apply(lambda x: x.split(" "))
What I wanted to do was replace the None values in Color with the first element of Description, only in the rows where Color is None. The code I used was:
colors = ["Red", "Blue", "Green"]
df["Color"] = df["Color"].where(df["Color"] != None, df["Description"][0])
df["Color"] = df["Color"].apply(lambda x: x if x in colors else "Color N/A")
My code returns the following:
| Color | Description |
|-------|------------------|
|None |["Red", "T-Shirt"]|
|Blue |["Pants"] |
|Green |["Underwear"] |
|None |["Blue", "Cap"] |
But it should return:
| Color | Description |
|-------|------------------|
|Red |["Red", "T-Shirt"]|
|Blue |["Pants"] |
|Green |["Underwear"] |
|Blue |["Blue", "Cap"] |
Any idea what mistake I made?
Try this -
Split the second column on the space character, then use np.where to fill the null values in the 'Color' column.
import numpy as np
df['Description'] = df['Description'].str.split(' ')
df['Color'] = np.where(df['Color'].isna() , df['Description'].str[0], df['Color'])
print(df)
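For reference, here is a self-contained sketch of the same idea that also shows, as far as I can tell, why the original attempt leaves the None values in place. It uses the data from the question; the names `always_true` and `row_zero` are only there for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Color": [None, "Red", "Blue", "Green", None],
    "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"],
})
df["Description"] = df["Description"].str.split(" ")

# Comparing a Series to None with != is True for every row,
# so .where() keeps every original value, None included.
always_true = (df["Color"] != None).all()

# df["Description"][0] is the whole list stored in row 0,
# not the first word of each row.
row_zero = df["Description"][0]

# isna() finds the missing colors; .str[0] takes the first element of each list.
df["Color"] = np.where(df["Color"].isna(), df["Description"].str[0], df["Color"])
```

So the two things to change are the condition (`isna()` instead of `!= None`) and the replacement (`.str[0]` per row instead of `[0]` on the column).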
You can do this with apply() on each row by setting axis=1. The lambda returns the first element of the Description column when the Color value is None, and keeps the existing Color otherwise.
df["Description"] = df["Description"].apply(lambda x: x.split(" "))
df['Color'] = df.apply(lambda row: row['Description'][0] if row['Color'] is None else row['Color'], axis=1)
print(df)
Color Description
0 Red [Red, T-Shirt]
1 Red [Skirt]
2 Blue [Pants]
3 Green [Underwear]
4 Blue [Blue, Cap]
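If you also want to keep the "Color N/A" check from the question, a row-wise helper can do both steps at once. This is only a sketch: `fill_color` is a name I made up, and the "Striped Cap" row is an invented example to show what happens when the first word is not a known color.

```python
import pandas as pd

colors = ["Red", "Blue", "Green"]

def fill_color(row):
    # Keep an existing color as it is.
    if pd.notna(row["Color"]):
        return row["Color"]
    # Otherwise try the first word of the (already split) description.
    first_word = row["Description"][0]
    return first_word if first_word in colors else "Color N/A"

df = pd.DataFrame({
    "Color": [None, "Red", None],
    "Description": ["Red T-Shirt", "Skirt", "Striped Cap"],  # last row is hypothetical
})
df["Description"] = df["Description"].str.split(" ")
df["Color"] = df.apply(fill_color, axis=1)
```

Here the first row becomes "Red", the second keeps "Red", and the invented third row falls back to "Color N/A".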
You can use the loc indexer, which lets you update a column based on a condition (this assumes Description has already been split into lists, as in the question):
df.loc[df['Color'].isnull(), 'Color'] = df['Description'].str[0]
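A runnable sketch of the loc approach, using the data from the question. The variable names `first_word` and `missing` are mine. Note that it leaves Description as plain strings and only splits to get the first word.

```python
import pandas as pd

df = pd.DataFrame({
    "Color": [None, "Red", "Blue", "Green", None],
    "Description": ["Red T-Shirt", "Skirt", "Pants", "Underwear", "Blue Cap"],
})

# First word of every description, as a Series sharing df's index.
first_word = df["Description"].str.split(" ").str[0]

# Boolean mask of the rows whose color is missing.
missing = df["Color"].isna()

# pandas aligns first_word on the index, so only the masked rows are filled.
df.loc[missing, "Color"] = first_word
```

`df["Color"].fillna(first_word)` should give the same result, since fillna also aligns a Series argument on the index.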