
Fill empty values from a row with the value of next column on the same row on csv file with pandas

I have a DataFrame like this:

name     surname       middle
Frank    Doe           NaN
John     NaN           Wood
Jack     Putt          NaN
Frank    NaN           Joyce

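Wherever "surname" is NaN, I want to move the "middle" value from the same row into the "surname" column. How can I do this? I tried the fillna method but got no result. Here is my code: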

import os
from pandas.io.parsers import read_csv


for csvFilename in os.listdir('.'):
    if not csvFilename.endswith('.csv'):
        continue
    data=read_csv(csvFilename)
    filtered_data["surname"].fillna(filtered_data["middle"].mean(),inplace=True)
    filtered_data.to_csv('output.csv' , index=False)

Conditional column flip

Using pd.isnull(), you can rearrange the columns conditionally.

import pandas as pd
from io import StringIO

# Create fake DataFrame... you can read this in however you like
df = pd.read_csv(StringIO('''
name     surname       middle
Frank    Doe           NaN
John     NaN           Wood
Jack     Putt          NaN
Frank    NaN           Joyce'''), sep=r'\s+')

print('Original DataFrame:')
print(df)
print()

# Assign the middle name to any surname with a NaN
missing = pd.isnull(df['surname'])
df.loc[missing, 'surname'] = df.loc[missing, 'middle']

print('Manipulated DataFrame:')
print(df)
print()

Original DataFrame:
    name surname middle
0  Frank     Doe    NaN
1   John     NaN   Wood
2   Jack    Putt    NaN
3  Frank     NaN  Joyce

Manipulated DataFrame:
    name surname middle
0  Frank     Doe    NaN
1   John    Wood   Wood
2   Jack    Putt    NaN
3  Frank   Joyce  Joyce
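
The fillna attempt from the question can probably be made to work as well. This is a minimal sketch, assuming the same column names as above: Series.fillna accepts another Series and aligns it by index, so passing df['middle'] directly fills each missing surname from its own row. Calling .mean() on it, as the question does, is not meaningful for a text column. The sample frame is rebuilt here only to keep the snippet self-contained.

```python
import numpy as np
import pandas as pd

# Same sample data as above, built directly instead of parsed from text
df = pd.DataFrame({
    'name': ['Frank', 'John', 'Jack', 'Frank'],
    'surname': ['Doe', np.nan, 'Putt', np.nan],
    'middle': [np.nan, 'Wood', np.nan, 'Joyce'],
})

# fillna aligns on the index, so each NaN surname takes the
# middle value from the same row; existing surnames are untouched
df['surname'] = df['surname'].fillna(df['middle'])
print(df)
```

Assigning the result back to the column avoids relying on inplace=True on a column selection, which may not update the original DataFrame.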

I think there is a simpler way:

# Keep existing surnames; take the middle value only where surname is NaN
df['surname'] = df['surname'].combine_first(df['middle'])
print(df)

Output:

    name surname middle
0  Frank     Doe    NaN
1   John    Wood   Wood
2   Jack    Putt    NaN
3  Frank   Joyce  Joyce
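
To tie this back to the loop over CSV files in the question, here is one possible sketch, not a definitive implementation. The function names fill_surname and process_directory and the separate output directory are assumptions made for illustration. Each file is read, missing surnames are filled from the middle column as in the answers above, and the result is written under the same file name to another directory, so that one file's output does not overwrite another's.

```python
from pathlib import Path

import pandas as pd


def fill_surname(df):
    """Return a copy where missing 'surname' values are taken from 'middle'."""
    out = df.copy()
    out['surname'] = out['surname'].fillna(out['middle'])
    return out


def process_directory(src_dir, dst_dir):
    """Fix every .csv file in src_dir and write it to dst_dir under the same name."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for csv_path in sorted(Path(src_dir).glob('*.csv')):
        fixed = fill_surname(pd.read_csv(csv_path))
        target = dst / csv_path.name
        fixed.to_csv(target, index=False)
        written.append(target)
    return written
```

A call such as process_directory('.', 'fixed') would then process the current directory and leave the original files untouched.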
