

Fill NaN in column 2 with median string based on value in column 1 in Python

I have 2 columns of data: the first column holds product codes (all filled out) and the second column holds product descriptions.

Every row has a product code, but some rows are missing the product description (second column).

For example, row 200 has a product code of 145, but the description on that row is empty (NaN). However, there are other rows with product code 145 where the description exists, and it is "laptop". I would like the description of row 200 to be filled with "laptop", because that's the description for that product code.

I want to find a solution that fills in all NaN values in the second column (product description) based on the first column (product code).

Please help.

First, decide on a function that takes a set of descriptions and picks out one of them. You could use min, max, mode, define your own get_desc, etc. Then you can split the dataframe by product code with groupby and apply whatever function you decided on: df.groupby('product code').apply(get_desc) or df.groupby('product code')['product description'].apply(get_desc), depending on whether get_desc takes a dataframe or a column as input. Then you can merge the resulting dataframe with your original dataframe. You can either replace the entire original product description column with the product description column of the groupby output, or have merge create a new column and then fillna the old product description with the new product description.
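Here is a minimal sketch of the groupby, merge, and fillna route, using the column-input form of get_desc. The sample data, the choice of "most frequent description" as the rule inside get_desc, and the helper column name description_from_code are all assumptions made for illustration; only the column names 'product code' and 'product description' come from the text above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "product code": [145, 145, 145, 200, 200, 300],
    "product description": ["laptop", np.nan, "laptop", "mouse", np.nan, np.nan],
})


def get_desc(descriptions: pd.Series):
    """Pick the most frequent non-missing description, or NaN if none exists."""
    modes = descriptions.mode()  # Series.mode() ignores NaN by default
    return modes.iloc[0] if not modes.empty else np.nan


# Build a lookup with one description per product code.
lookup = (
    df.groupby("product code")["product description"]
    .apply(get_desc)
    .rename("description_from_code")  # assumed helper column name
    .reset_index()
)

# Merge the lookup back onto the original rows, then fill only the gaps.
filled = df.merge(lookup, on="product code", how="left")
filled["product description"] = filled["product description"].fillna(
    filled["description_from_code"]
)
filled = filled.drop(columns="description_from_code")

print(filled)
```

Rows whose product code never appears with a description (code 300 in this sample) stay NaN, since there is nothing to fill them from. If codes map to more than one description, the rule inside get_desc decides which one wins, so it is worth choosing deliberately.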
