
How to add values of two columns to a dictionary for mapping

I have data with two columns like this:

data.head()
   GroupId  Planet
0    0008   Europa
1    0008   Europa
2    0009   Mars
3    0010   Earth
4    0011   Earth
5    0012   Earth
6    0012   NaN

The Planet column has missing values, but rows with the same GroupId share the same planet. How can I create a dictionary for mapping to fill the NaN values, like { '0008': 'Europa', '0009': 'Mars...}?

You can remove the NaN values and then build the dictionary mapping.

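# Keep rows with a known Planet in a separate frame so `data` retains its NaN rows
known = data.dropna(subset=['Planet'])
planet_map = dict(zip(known['GroupId'], known['Planet']))
planet_map

# Alternative: keep the first row per GroupId and convert the Planet column to a dict
known.drop_duplicates(subset=['GroupId'], keep='first').set_index('GroupId')['Planet'].to_dict()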
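
A minimal sketch, using the sample rows from the question, of the two dictionary-building approaches given above. The names `known`, `map_from_zip`, and `map_from_dedup` are illustrative and not from the original answers. Rows with a missing planet are filtered out first, so a NaN can never end up as a dictionary value.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; GroupId is stored as a string
# so the leading zeros are preserved.
data = pd.DataFrame({
    "GroupId": ["0008", "0008", "0009", "0010", "0011", "0012", "0012"],
    "Planet": ["Europa", "Europa", "Mars", "Earth", "Earth", "Earth", np.nan],
})

# Rows with a known planet, kept in a separate frame so `data` is unchanged.
known = data.dropna(subset=["Planet"])

# Approach 1: zip the two columns; repeated pairs collapse into one key.
map_from_zip = dict(zip(known["GroupId"], known["Planet"]))

# Approach 2: keep the first row per GroupId, then turn the Planet column into a dict.
map_from_dedup = (
    known.drop_duplicates(subset=["GroupId"], keep="first")
    .set_index("GroupId")["Planet"]
    .to_dict()
)

print(map_from_zip)
print(map_from_zip == map_from_dedup)
```

Both approaches rely on the assumption stated in the question, that every row of a group carries the same planet. If that did not hold, the zip version would keep the last value seen and the `drop_duplicates` version the first.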

Here's what you need. I have added a sample dataframe with NaN as per your description:

import numpy as np
import pandas as pd

# dataframe
planet_data = {
    "GroupId": ["0008", "0008", "0009", "0010", "0008", "0011", "0012"],
    "Planet":  ["Europa", "Europa", "Mars", "Earth", np.nan, "Earth", "Earth"]
}

data = pd.DataFrame(planet_data)

# create a dictionary
df = data.copy(deep=True)
df = df.dropna()
df = df.drop_duplicates()
planet_map = dict(zip(df["GroupId"], df["Planet"]))

# fill only the missing values; map() yields NaN (not KeyError) for unknown GroupIds
data["Planet"] = data["Planet"].fillna(data["GroupId"].map(planet_map))
data
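
One case worth checking: indexing a dictionary with a key it does not contain raises `KeyError`, which happens when a GroupId has no known planet anywhere in the data. The sketch below assumes a hypothetical group `"0013"` that is not in the original sample. It uses `Series.map`, which returns NaN for missing keys, together with `fillna` so that existing values stay untouched.

```python
import numpy as np
import pandas as pd

# "0013" is a hypothetical group added for illustration: its only row has no planet.
data = pd.DataFrame({
    "GroupId": ["0008", "0008", "0009", "0008", "0013"],
    "Planet": ["Europa", "Europa", "Mars", np.nan, np.nan],
})

# Build the lookup from rows where the planet is known.
known = data.dropna().drop_duplicates()
planet_map = dict(zip(known["GroupId"], known["Planet"]))

# Series.map yields NaN for keys absent from the dict instead of raising KeyError.
looked_up = data["GroupId"].map(planet_map)

# Fill only the missing cells; rows that already have a planet are unchanged.
data["Planet"] = data["Planet"].fillna(looked_up)
print(data)
```

The row for `"0013"` stays missing, which makes groups without any known planet easy to spot afterwards with `data["Planet"].isna()`.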

