Merging data from a separate .csv file using Pandas
I want to create two new columns in job_transitions_sample.csv and add the wage data from wage_data_sample.csv for both Title 1 and Title 2:
job_transitions_sample.csv:
   Title 1                   Title 2                    Count
0  administrative assistant  office manager             20
1  accountant                cashier                    1
2  accountant                financial analyst          22
4  accountant                senior accountant          23
6  accounting clerk          bookkeeper                 11
7  accounts payable clerk    accounts receivable clerk  8
8  administrative assistant  accounting clerk           8
9  administrative assistant  administrative clerk       12
...
wage_data_sample.csv:
    title                   wage
0   cashier                 17.00
1   sandwich artist         18.50
2   dishwasher              20.00
3   babysitter              20.00
4   barista                 21.50
5   housekeeper             21.50
6   retail sales associate  23.00
7   bartender               23.50
8   cleaner                 23.50
9   line cook               23.50
10  pizza cook              23.50
...
I want the end result to look like this:
   Title 1                   Title 2            Count  Wage of Title 1  Wage of Title 2
0  administrative assistant  office manager     20     NaN              NaN
1  accountant                cashier            1      NaN              NaN
2  accountant                financial analyst  22     NaN              NaN
...
I'm thinking of building a dictionary and then iterating over every column, but is there a more elegant built-in solution? This is my code so far:
wage_data = pd.read_csv('wage_data_sample.csv')
dict = dict(zip(wage_data.title, wage_data.wage))
Use Series.map with a dictionary d. Do not use dict as the variable name, because that shadows Python's built-in dict type:
import pandas as pd

df = pd.read_csv('job_transitions_sample.csv')
wage_data = pd.read_csv('wage_data_sample.csv')
d = dict(zip(wage_data.title, wage_data.wage))
df['Wage of Title 1'] = df['Title 1'].map(d)
df['Wage of Title 2'] = df['Title 2'].map(d)
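As a variation on the same idea, Series.map also accepts a Series as the lookup, so the dictionary step can be skipped. The following is only a minimal sketch: the CSV files are replaced by small in-memory stand-ins built from a few of the sample rows above, and the names jobs_csv, wages_csv and wages are made up for illustration.

```python
import io
import pandas as pd

# In-memory stand-ins for the two CSV files (a few rows from the samples above)
jobs_csv = io.StringIO(
    "Title 1,Title 2,Count\n"
    "administrative assistant,office manager,20\n"
    "accountant,cashier,1\n"
    "accountant,financial analyst,22\n"
)
wages_csv = io.StringIO(
    "title,wage\n"
    "cashier,17.00\n"
    "barista,21.50\n"
)

df = pd.read_csv(jobs_csv)
wage_data = pd.read_csv(wages_csv)

# A Series indexed by title acts as the lookup table, like the dictionary d
wages = wage_data.set_index("title")["wage"]

for col in ["Title 1", "Title 2"]:
    # Titles that are missing from the lookup end up as NaN
    df[f"Wage of {col}"] = df[col].map(wages)

print(df)
```

This sketch assumes each title appears only once in the wage file. If titles can repeat, deduplicate first (for example with drop_duplicates on the title column) before building the lookup.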
You can also try two successive merge calls, one for each of the two Title columns.
For example, let
df1: job_transitions_sample.csv
df2: wage_data_sample.csv
# Rename the wage table before each merge so the new column gets the wanted name;
# how='left' keeps rows whose title has no wage entry (they get NaN)
result = (
    df1.merge(df2.rename(columns={'title': 'Title 1', 'wage': 'Wage of Title 1'}), on='Title 1', how='left')
       .merge(df2.rename(columns={'title': 'Title 2', 'wage': 'Wage of Title 2'}), on='Title 2', how='left')
)
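One thing to watch with merge is the join type. The default, how='inner', drops every row whose title has no match in the wage table, while the desired output in the question keeps those rows with NaN. A minimal sketch of the difference, using small hand-built frames with a few rows from the samples above instead of the real files:

```python
import pandas as pd

# Hand-built stand-ins for the two files (rows taken from the samples above)
df1 = pd.DataFrame({
    "Title 1": ["administrative assistant", "accountant", "accountant"],
    "Title 2": ["office manager", "cashier", "financial analyst"],
    "Count": [20, 1, 22],
})
df2 = pd.DataFrame({"title": ["cashier", "barista"], "wage": [17.00, 21.50]})

# Default inner join: only rows with a matching title survive
inner = df1.merge(df2, left_on="Title 2", right_on="title")

# Left join: every row of df1 is kept, unmatched wages become NaN
left = df1.merge(df2, left_on="Title 2", right_on="title", how="left")

print(len(inner), len(left))  # 1 3
print(left)
```

Note that with left_on/right_on the key column title from df2 is carried into the result as well, so it usually needs to be dropped or renamed afterwards.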