Merging data from separate .csv files using Pandas

Question

I want to create two new columns in job_transitions_sample.csv and add the wage data from wage_data_sample.csv for both Title 1 and Title 2:

job_transitions_sample.csv:

                     Title 1                    Title 2  Count
0   administrative assistant             office manager     20
1                 accountant                    cashier      1
2                 accountant          financial analyst     22
4                 accountant          senior accountant     23
6           accounting clerk                 bookkeeper     11
7     accounts payable clerk  accounts receivable clerk      8
8   administrative assistant           accounting clerk      8
9   administrative assistant       administrative clerk     12
...

wage_data_sample.csv:

                      title   wage
0                   cashier  17.00
1           sandwich artist  18.50
2                dishwasher  20.00
3                babysitter  20.00
4                   barista  21.50
5               housekeeper  21.50
6    retail sales associate  23.00
7                 bartender  23.50
8                   cleaner  23.50
9                 line cook  23.50
10               pizza cook  23.50
...

I want the end result to look like this:

                      Title 1             Title 2  Count  Wage of Title 1  Wage of Title 2
0    administrative assistant      office manager     20              NaN              NaN
1                  accountant             cashier      1              NaN              NaN
2                  accountant   financial analyst     22              NaN              NaN
...

I'm thinking of using dictionaries and then iterating over every column, but is there a more elegant built-in solution? This is my code so far:

wage_data = pd.read_csv('wage_data_sample.csv')
dict = dict(zip(wage_data.title, wage_data.wage))

Answer 1

Use Series.map with a dictionary d. Don't use dict as the variable name, because that shadows Python's built-in dict:

import pandas as pd

df = pd.read_csv('job_transitions_sample.csv')
wage_data = pd.read_csv('wage_data_sample.csv')

d = dict(zip(wage_data.title, wage_data.wage))
df['Wage of Title 1'] = df['Title 1'].map(d)
df['Wage of Title 2'] = df['Title 2'].map(d)
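
As a self-contained sketch of the same idea, the snippet below replaces the two CSV files with in-memory text (a few rows copied from the question) and uses a Series indexed by title as the lookup instead of a dictionary. The names jobs_csv, wages_csv and wage_by_title are made up for this illustration, and it assumes each title appears only once in the wage file.

```python
import io

import pandas as pd

# In-memory stand-ins for the two CSV files (rows copied from the question).
jobs_csv = io.StringIO(
    "Title 1,Title 2,Count\n"
    "administrative assistant,office manager,20\n"
    "accountant,cashier,1\n"
    "accountant,financial analyst,22\n"
)
wages_csv = io.StringIO(
    "title,wage\n"
    "cashier,17.00\n"
    "sandwich artist,18.50\n"
    "dishwasher,20.00\n"
)

df = pd.read_csv(jobs_csv)
wage_data = pd.read_csv(wages_csv)

# A Series indexed by title acts as the lookup table for Series.map.
# Assumption: titles are unique in the wage data.
wage_by_title = wage_data.set_index("title")["wage"]

# Titles with no entry in the lookup come out as NaN.
for col in ["Title 1", "Title 2"]:
    df[f"Wage of {col}"] = df[col].map(wage_by_title)

print(df)
```

With these rows, only cashier has a match in the wage sample, so every other wage cell is NaN.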

Answer 2

You can also try two successive merges, one for each of the two Title columns.

For example, let

df1: job_transitions_sample.csv
df2: wage_data_sample.csv

Rename the wage columns before each merge (suffixes only affect column names that overlap) and pass how='left' so that rows without a matching wage are kept:

w1 = df2.rename(columns={'title': 'Title 1', 'wage': 'Wage of Title 1'})
w2 = df2.rename(columns={'title': 'Title 2', 'wage': 'Wage of Title 2'})
df1.merge(w1, on='Title 1', how='left').merge(w2, on='Title 2', how='left')
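
One thing to watch with the merge approach is that merge defaults to an inner join, which drops every row whose title has no wage. The sketch below contrasts that with a left join, using in-memory text in place of the two files (rows copied from the question). The loop and the names inner, lookup and result are only for illustration, and validate="many_to_one" is an optional guard that assumes titles are unique in the wage data.

```python
import io

import pandas as pd

# In-memory stand-ins for the two CSV files (rows copied from the question).
df1 = pd.read_csv(io.StringIO(
    "Title 1,Title 2,Count\n"
    "administrative assistant,office manager,20\n"
    "accountant,cashier,1\n"
    "accountant,financial analyst,22\n"
))
df2 = pd.read_csv(io.StringIO(
    "title,wage\n"
    "cashier,17.00\n"
    "sandwich artist,18.50\n"
    "dishwasher,20.00\n"
))

# Default (inner) join: only rows whose Title 1 has a wage survive.
inner = df1.merge(df2, left_on="Title 1", right_on="title")

# Left join: every row of df1 is kept and missing wages become NaN.
result = df1
for col in ["Title 1", "Title 2"]:
    # Rename so the key matches the job column and the wage column
    # already carries its final name.
    lookup = df2.rename(columns={"title": col, "wage": f"Wage of {col}"})
    result = result.merge(lookup, on=col, how="left", validate="many_to_one")

print(len(inner), len(result))
print(result)
```

None of the Title 1 values in this sample appear in the wage data, so the inner join returns no rows, while the left join keeps all three.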

Answer 1: score 1, accepted, 2022-03-02 08:26:12
Answer 2: score 0, 2022-03-02 08:23:01
