Pandas to fill empty cells in column according to another column

A dataframe looks like this, and I want to fill the empty cells in the 'Date' column (when the "Area" is West or North) with the content of the "Year" column plus "0601".

[image not available]

The wanted result is as follows:

[image not available]

What I have tried:

from io import StringIO
import numpy as np
import pandas as pd


csvfile = StringIO(
"""
Name    Area    Date    Year
David   West        2014
Mike    North   20220919    2022
Kate    West        2017
Lilly   East    20221226    2022
Peter   North   20221226    2022
Cara    Middle      2016

""")

df = pd.read_csv(csvfile, sep = '\t', engine='python')


L1 = ['West','North']
m1 = df['Date'].isnull()
m2 = df['Area'].isin(L1)

df['Date'] = df['Date'].mask(m1 & m2, df['Year'] + '0601')      # Try_1

df['Date'] = np.where(np.where(m1 & m2, df['Year'] + '0601'))   # Try_2

Both Try_1 and Try_2 raise the same error.

What is the right way to write these lines? Thank you.

Traceback (most recent call last):
  File "C:\Python38\lib\site-packages\pandas\core\ops\array_ops.py", line 142, in _na_arithmetic_op
    result = expressions.evaluate(op, left, right)
  File "C:\Python38\lib\site-packages\pandas\core\computation\expressions.py", line 235, in evaluate
    return _evaluate(op, op_str, a, b)  # type: ignore[misc]
  File "C:\Python38\lib\site-packages\pandas\core\computation\expressions.py", line 69, in _evaluate_standard
    return op(a, b)
numpy.core._exceptions.UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('<U21'), dtype('<U21')) -> dtype('<U21')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\My Documents\Scripts\(Desktop) WSS 20200323\GG.py", line 336, in <module>
    df['Date'] = np.where(np.where(m1 & m2, df['Year'] + '0601'))                   # try 2
  File "C:\Python38\lib\site-packages\pandas\core\ops\common.py", line 65, in new_method
    return method(self, other)
  File "C:\Python38\lib\site-packages\pandas\core\arraylike.py", line 89, in __add__
    return self._arith_method(other, operator.add)
  File "C:\Python38\lib\site-packages\pandas\core\series.py", line 4998, in _arith_method
    result = ops.arithmetic_op(lvalues, rvalues, op)
  File "C:\Python38\lib\site-packages\pandas\core\ops\array_ops.py", line 189, in arithmetic_op
    res_values = _na_arithmetic_op(lvalues, rvalues, op)
  File "C:\Python38\lib\site-packages\pandas\core\ops\array_ops.py", line 149, in _na_arithmetic_op
    result = _masked_arith_op(left, right, op)
  File "C:\Python38\lib\site-packages\pandas\core\ops\array_ops.py", line 111, in _masked_arith_op
    result[mask] = op(xrav[mask], y)
numpy.core._exceptions.UFuncTypeError: ufunc 'add' did not contain a loop with signature matching types (dtype('<U21'), dtype('<U21')) -> dtype('<U21')

Your example works fine, provided you have strings:

from io import StringIO
import pandas as pd

csvfile = StringIO("""
Name    Area        Date    Year
David   West         NaN    2014
Mike    North   20220919    2022
Kate    West         NaN    2017
Lilly   East    20221226    2022
Peter   North   20221226    2022
Cara    Middle       NaN    2016
""")

# Raw string avoids an invalid-escape warning; dtype='str' keeps Year as text.
df = pd.read_csv(csvfile, sep=r'\s+', engine='python', dtype='str')


L1 = ['West','North']
m1 = df['Date'].isnull()
m2 = df['Area'].isin(L1)

df['Date'] = df['Date'].mask(m1 & m2, df['Year'] + '0601')

print(df)
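The error in the question comes from `df['Year'] + '0601'`: without `dtype='str'`, `read_csv` infers `Year` as an integer column, and an integer column cannot be added to a string. A minimal sketch of that behaviour, using a small hypothetical `year` Series in place of the real column:

```python
import pandas as pd

# Hypothetical stand-in for the 'Year' column as read_csv infers it without dtype='str'.
year = pd.Series([2014, 2022, 2017])

try:
    year + '0601'  # integer column + str: there is no matching 'add' operation
    raised = False
except TypeError:
    raised = True
print("TypeError raised:", raised)

# Casting to str first turns '+' into string concatenation.
as_text = year.astype(str) + '0601'
print(as_text.tolist())
```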

If Year is not a string:

df['Date'] = df['Date'].mask(m1 & m2, df['Year'].astype(str) + '0601')

Output:

    Name    Area      Date  Year
0  David    West  20140601  2014
1   Mike   North  20220919  2022
2   Kate    West  20170601  2017
3  Lilly    East  20221226  2022
4  Peter   North  20221226  2022
5   Cara  Middle       NaN  2016
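The second attempt in the question (Try_2) can probably be repaired in a similar way. `np.where` takes a condition, the value to use where the condition is true, and the value to use elsewhere, so the nested call is not needed. The sketch below assumes the same sample data, read as strings:

```python
from io import StringIO

import numpy as np
import pandas as pd

csvfile = StringIO("""
Name    Area        Date    Year
David   West         NaN    2014
Mike    North   20220919    2022
Kate    West         NaN    2017
Lilly   East    20221226    2022
Peter   North   20221226    2022
Cara    Middle       NaN    2016
""")

# Read every column as text so that Year + '0601' is a string concatenation.
df = pd.read_csv(csvfile, sep=r'\s+', dtype=str)

m1 = df['Date'].isna()                    # rows with an empty Date
m2 = df['Area'].isin(['West', 'North'])   # rows in the target areas

# np.where(condition, value_if_true, value_if_false)
df['Date'] = np.where(m1 & m2, df['Year'] + '0601', df['Date'])
print(df)
```

Rows outside West and North keep their original `Date`, so the empty cell for Cara (Middle) stays empty.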

If you have numeric data:

df['Date'] = df['Date'].mask(m1 & m2, df['Year'].mul(10000) + 601)

Output:

    Name    Area        Date  Year
0  David    West  20140601.0  2014
1   Mike   North  20220919.0  2022
2   Kate    West  20170601.0  2017
3  Lilly    East  20221226.0  2022
4  Peter   North  20221226.0  2022
5   Cara  Middle         NaN  2016
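The trailing `.0` appears because a numeric column containing NaN is stored as float. If whole numbers are preferred, one option is pandas' nullable `Int64` dtype. This is a sketch under that assumption, and it uses `.loc` assignment rather than `mask`:

```python
from io import StringIO

import pandas as pd

csvfile = StringIO("""
Name    Area        Date    Year
David   West         NaN    2014
Mike    North   20220919    2022
Kate    West         NaN    2017
Lilly   East    20221226    2022
Peter   North   20221226    2022
Cara    Middle       NaN    2016
""")

# Default parsing: Date becomes float64 (because of NaN), Year becomes an integer column.
df = pd.read_csv(csvfile, sep=r'\s+')

# Convert Date to the nullable integer dtype so missing values do not force floats.
df['Date'] = df['Date'].astype('Int64')

m = df['Date'].isna() & df['Area'].isin(['West', 'North'])

# Build YYYY0601 arithmetically and assign only to the selected rows.
df.loc[m, 'Date'] = df.loc[m, 'Year'] * 10000 + 601
print(df)
```

The missing value for Cara is then shown as `<NA>` instead of `NaN`, and the filled dates print without a decimal point.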

