Fill empty values in a row with the value of the next column on the same row in a CSV file with pandas
I have a DataFrame like this:
name surname middle
Frank Doe NaN
John NaN Wood
Jack Putt NaN
Frank NaN Joyce
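On every row where "surname" is NaN, I want to move the "middle" value into the "surname" column. How can I do this? I tried the fillna method but got no result. Here is my code: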
import os
from pandas.io.parsers import read_csv

for csvFilename in os.listdir('.'):
    if not csvFilename.endswith('.csv'):
        continue
    data=read_csv(csvFilename)
    filtered_data["surname"].fillna(filtered_data["middle"].mean(),inplace=True)
    filtered_data.to_csv('output.csv' , index=False)
Using pd.isnull(), you can conditionally copy values from one column into another.
from io import StringIO

import pandas as pd

# Create fake DataFrame... you can read this in however you like
df = pd.read_table(StringIO('''
name surname middle
Frank Doe NaN
John NaN Wood
Jack Putt NaN
Frank NaN Joyce'''), sep=r'\s+')
print('Original DataFrame:')
print(df)
print()
# Boolean mask of the rows whose surname is missing
missing = pd.isnull(df['surname'])
# Assign the middle name to any surname with a NaN
df.loc[missing, 'surname'] = df.loc[missing, 'middle']
print('Manipulated DataFrame:')
print(df)
print()
Original DataFrame:
name surname middle
0 Frank Doe NaN
1 John NaN Wood
2 Jack Putt NaN
3 Frank NaN Joyce
Manipulated DataFrame:
name surname middle
0 Frank Doe NaN
1 John Wood Wood
2 Jack Putt NaN
3 Frank Joyce Joyce
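The question loops over CSV files, so the same mask-based assignment could be wrapped in a small helper. The following is only a sketch: the function name fill_surname_from_middle and the in-memory sample data are hypothetical, and it assumes each file has "surname" and "middle" columns.

```python
from io import StringIO

import pandas as pd


def fill_surname_from_middle(csv_source):
    """Read a CSV and copy 'middle' into 'surname' where surname is missing.

    Hypothetical helper; csv_source may be a path or a file-like object.
    """
    df = pd.read_csv(csv_source)
    missing = df['surname'].isnull()
    df.loc[missing, 'surname'] = df.loc[missing, 'middle']
    return df


# In-memory stand-in for one of the CSV files (made-up sample data)
sample = StringIO(
    "name,surname,middle\n"
    "Frank,Doe,\n"
    "John,,Wood\n"
    "Jack,Putt,\n"
    "Frank,,Joyce\n"
)
result = fill_surname_from_middle(sample)
print(result)
```

Inside the loop from the question, the helper would be called with csvFilename, and the returned frame written out with to_csv(..., index=False). Using a separate output name per input file avoids overwriting the same 'output.csv' on every iteration.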
I think there is a simpler way:
# Keep existing surnames; fall back to the middle name only where surname is NaN
df['surname'] = df['surname'].combine_first(df['middle'])
print(df)
Output:
name surname middle
0 Frank Doe NaN
1 John Wood Wood
2 Jack Putt NaN
3 Frank Joyce Joyce
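The fillna attempt in the question can also be made to work: fillna accepts a Series, so the "middle" column itself can be passed instead of its mean (a mean is not meaningful for names, and filtered_data is never defined in that snippet). The sketch below also shows that the order of operands in combine_first matters once a row has both a surname and a middle name. The row for "Ann" is made up for illustration and is not part of the original data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['Frank', 'John', 'Ann'],
    'surname': ['Doe', np.nan, 'Lee'],   # 'Ann Lee May' is a hypothetical row
    'middle': [np.nan, 'Wood', 'May'],
})

# fillna with a Series fills each missing surname from the same row of 'middle'
filled = df['surname'].fillna(df['middle'])

# combine_first keeps the caller's values and only fills its gaps from the argument
surname_first = df['surname'].combine_first(df['middle'])
middle_first = df['middle'].combine_first(df['surname'])

print(filled.tolist())         # ['Doe', 'Wood', 'Lee']
print(surname_first.tolist())  # ['Doe', 'Wood', 'Lee']
print(middle_first.tolist())   # ['Doe', 'Wood', 'May']
```

Calling combine_first on "middle" replaces the existing surname "Lee" with "May", so calling it on "surname" (or using fillna) is the safer choice when only missing surnames should change.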