How to add the values of two columns to a dictionary for mapping
I have data with two columns like this:
data.head()
GroupId Planet
0 0008 Europa
1 0008 Europa
2 0009 Mars
3 0010 Earth
4 0011 Earth
5 0012 Earth
6 0012 NaN
The Planet column has missing values, but rows with the same GroupId share the same Planet. How can I create a dictionary for mapping to fill the NaN values, like {'0008': 'Europa', '0009': 'Mars', ...}?
You can remove the NaN values and build the dictionary mapping from the remaining rows.
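# Build the mapping from rows with a known Planet; leave `data` itself intact so its NaN rows can still be filled
known = data.dropna(subset=['Planet'])
planet_map = dict(zip(known['GroupId'], known['Planet']))
planet_map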
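# Alternative: drop NaN rows first so a missing value is never kept as a group's first entry
data.dropna(subset=['Planet']).drop_duplicates(subset=['GroupId'], keep='first').set_index('GroupId')['Planet'].to_dict()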
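Once the dictionary exists, it still has to be applied to the missing rows. A minimal sketch of that step, assuming the sample values from the question and the hypothetical names `known` and `planet_map`; it fills only the missing entries and leaves existing values untouched:

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the question's data (GroupId kept as strings)
data = pd.DataFrame({
    "GroupId": ["0008", "0008", "0009", "0010", "0011", "0012", "0012"],
    "Planet": ["Europa", "Europa", "Mars", "Earth", "Earth", "Earth", np.nan],
})

# Build the mapping from rows where Planet is known
known = data.dropna(subset=["Planet"])
planet_map = dict(zip(known["GroupId"], known["Planet"]))

# Fill only the missing entries; a GroupId absent from the mapping stays NaN
data["Planet"] = data["Planet"].fillna(data["GroupId"].map(planet_map))
print(data)
```

Using `fillna` together with `map` means a group whose Planet is never known simply stays NaN rather than raising an error.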
Here's what you need. I have added a sample dataframe with NaN values, as per your description:
import numpy as np
import pandas as pd
# dataframe
planet_data = {
"GroupId": ["0008", "0008", "0009", "0010", "0008", "0011", "0012"],
"Planet": ["Europa", "Europa", "Mars", "Earth", np.nan, "Earth", "Earth"]
}
data = pd.DataFrame(planet_data)
# create a dictionary
df = data.copy(deep=True)
df = df.dropna()
df = df.drop_duplicates()
planet_map = dict(zip(df["GroupId"], df["Planet"]))
# replace missing values using the dictionary; .map leaves NaN (instead of raising KeyError) for any GroupId not in planet_map
data["Planet"] = data["GroupId"].map(planet_map)
data
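If the dictionary is only needed for the fill, the same result can probably be had without building it at all. The following is a sketch of an alternative (not the approach used in the answers above), relying on the groupby `first` aggregation returning the first non-null value in each group; the variable name `group_first` is made up for illustration:

```python
import numpy as np
import pandas as pd

# Same sample frame as in the answer above
data = pd.DataFrame({
    "GroupId": ["0008", "0008", "0009", "0010", "0008", "0011", "0012"],
    "Planet": ["Europa", "Europa", "Mars", "Earth", np.nan, "Earth", "Earth"],
})

# First non-null Planet of each group, broadcast back to every row of that group
group_first = data.groupby("GroupId")["Planet"].transform("first")

# Fill only the rows where Planet is missing
data["Planet"] = data["Planet"].fillna(group_first)
print(data)
```

This assumes, as the question states, that every row in a group shares the same Planet; if groups could disagree, the first known value would win silently.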