
New_target column based on 3rd value

I have a DataFrame:

source    target              
jan       feb                               
mar       apr                 
jun                       
feb       aug                                            
apr       jul                                            
oct       dec                     
aug       nov       
dec       may                               

The output DataFrame would be:

source    target    new_target              
jan       feb       aug                        
mar       apr       jul                  
jun                              
feb       aug       nov                                     
apr       jul       jul                                           
oct       dec       may              
aug       nov       nov
dec       may       may

So the new_target column will hold the 3rd value in the chain. For example, following the trace between source and target gives jan->feb->aug->nov; since aug is the 3rd value, it is the output in the new_target column for the jan row.

Edit:

source    target    new_target              
jan       feb       aug                        
mar       apr       jul                  
jun                              
feb       aug       nov                                     
apr       jul                                                  
oct       dec       may              
aug       nov       
dec       may       

Use Series.map with a Series created by DataFrame.set_index, then fill the unmatched values with Series.fillna:

s = df.set_index(['source'])['target']
#if possible duplicates in source
#s = df.drop_duplicates('source').set_index(['source'])['target']
df['new_target'] = df['target'].map(s).fillna(df['target'])
print (df)
  source target new_target
0    jan    feb        aug
1    mar    apr        jul
2    jun                  
3    feb    aug        nov
4    apr    jul        jul
5    oct    dec        may
6    aug    nov        nov
7    dec    may        may
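
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame construction is an assumption: the question does not say whether the missing target for jun is an empty string or NaN, so an empty string is used here.

```python
import pandas as pd

# Sample data from the question; the missing target for "jun" is assumed
# to be an empty string (it could equally be NaN in the real data).
df = pd.DataFrame({
    "source": ["jan", "mar", "jun", "feb", "apr", "oct", "aug", "dec"],
    "target": ["feb", "apr", "", "aug", "jul", "dec", "nov", "may"],
})

# Lookup Series: index = source, values = target.
s = df.set_index("source")["target"]

# Treat each target as a source and look up its own target; where there
# is no match (NaN), fall back to the original target.
df["new_target"] = df["target"].map(s).fillna(df["target"])

print(df)
```

The fillna step is what turns the unmatched rows (jul, nov, may) back into their own target instead of leaving them empty.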

EDIT:

s = df.set_index(['source'])['target']
#if possible duplicates in source
#s = df.drop_duplicates('source').set_index(['source'])['target']
df['new_target'] = df['target'].map(s)
print (df)
  source target new_target
0    jan    feb        aug
1    mar    apr        jul
2    jun               NaN
3    feb    aug        nov
4    apr    jul        NaN
5    oct    dec        may
6    aug    nov        NaN
7    dec    may        NaN
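
The commented-out drop_duplicates line matters when a source value repeats. A small sketch of that case, using hypothetical data (the repeated "jan" row is not from the question) and keeping the first row per source:

```python
import pandas as pd

# Hypothetical data with a repeated source ("jan" appears twice).
df = pd.DataFrame({
    "source": ["jan", "jan", "feb", "aug"],
    "target": ["feb", "mar", "aug", "nov"],
})

# Keep only the first row per source so the lookup index is unique.
s = df.drop_duplicates("source").set_index("source")["target"]

# Without fillna, targets that never occur as a source stay NaN.
df["new_target"] = df["target"].map(s)

print(df)
```

Which duplicate to keep is a design choice; drop_duplicates keeps the first occurrence by default, so here jan maps to feb rather than mar.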
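
Alternatively, build a dictionary from the rows without missing values and look each target up in it, falling back to the target itself when there is no match:

d = df.dropna().set_index('source').target.to_dict()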
df['new_target'] = df.target.apply(lambda x: d.get(x,x))

    source  target  new_target
0   jan     feb     aug
1   mar     apr     jul
2   jun 
3   feb     aug     nov
4   apr     jul     jul
5   oct     dec     may
6   aug     nov     nov
7   dec     may     may
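
The lookup logic of this dictionary version can be checked without pandas. The sketch below is only an illustration in plain Python: the pairs list and the use of None for the missing target are assumptions standing in for the DataFrame rows.

```python
# Source/target pairs from the question; None marks the missing target.
pairs = [
    ("jan", "feb"), ("mar", "apr"), ("jun", None), ("feb", "aug"),
    ("apr", "jul"), ("oct", "dec"), ("aug", "nov"), ("dec", "may"),
]

# Plays the role of df.dropna().set_index('source').target.to_dict():
# rows with a missing target are skipped.
d = {src: tgt for src, tgt in pairs if tgt is not None}

# Plays the role of d.get(x, x): use the mapped value when the target is
# itself a source, otherwise keep the target unchanged.
new_target = [d.get(tgt, tgt) for _, tgt in pairs]

print(new_target)
```

Using the target itself as the default in d.get is what makes the explicit fillna step unnecessary in this variant.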


