How to merge 3 pandas DataFrames into a 4th DataFrame, matching on the Name column?

I have one main DataFrame called nflLineups.

I want to merge 3 more DataFrames (dfPass, dfRun, dfReceive) into the first one, nflLineups.

So far nothing I have tried has worked. I have tried appending, concatenating, and merging (merge with how='left', how='outer', on='Name', etc.).

My goal is one large output that merges the data on Name but keeps all of the columns and their respective values.

The main output should have the following columns: Name, Team, Position, passYrds, rushYrds, recYrds. I would just like the stat data (pass, rush, rec) to fill in the matching row next to each player's name in nflLineups. Not every player has data in every category, so those values should be left blank (n/a).

I see that there are some merging examples on Stack Overflow, but I have yet to find code that I can use successfully. I have spent the last 2 days on this and could use some help. I am still learning how to merge data and consider myself relatively new to Python.

Any help would be greatly appreciated.

Here is my code so far:

import pandas as pd

nflLineups = pd.DataFrame([{'Name': 'Teddy', 'Team': 'DEN', 'Position': 'QB'},
                        {'Name': 'Melvin', 'Team': 'DEN', 'Position': 'RB'},
                        {'Name': 'Courtland', 'Team': 'DEN', 'Position': 'WR'},
                        {'Name': 'Tim', 'Team': 'DEN', 'Position': 'WR'},
                        {'Name': 'Kendal', 'Team': 'DEN', 'Position': 'WR'},
                        {'Name': 'Noah', 'Team': 'DEN', 'Position': 'TE'},
                        
                        {'Name': 'Case', 'Team': 'CLE', 'Position': 'QB'},
                        {'Name': 'D Ernest', 'Team': 'CLE', 'Position': 'RB'},
                        {'Name': 'Odell', 'Team': 'CLE', 'Position': 'WR'},
                        {'Name': 'Jarvis', 'Team': 'CLE', 'Position': 'WR'},
                        {'Name': 'Donovan', 'Team': 'CLE', 'Position': 'WR'},
                        {'Name': 'Austin', 'Team': 'CLE', 'Position': 'TE'},])


dfPass = pd.DataFrame([{'Name': 'Teddy', 'Team': 'DEN', 'Position': 'QB', 'passYrds': 1500},
                        {'Name': 'Case', 'Team': 'CLE', 'Position': 'QB', 'passYrds': 1350}])


dfRun = pd.DataFrame([{'Name': 'Teddy', 'Team': 'DEN', 'Position': 'QB', 'rushYrds': 45},
                        {'Name': 'D Ernest', 'Team': 'CLE', 'Position': 'RB', 'rushYrds': 350}])


dfReceive = pd.DataFrame([{'Name': 'D Ernest', 'Team': 'CLE', 'Position': 'RB', 'recYrds': 68},
                        {'Name': 'Jarvis', 'Team': 'CLE', 'Position': 'WR', 'recYrds': 250}])

IIUC, one way is to call groupby(...).first() after pandas.concat.

Note that I assumed Team and Position are the same for each Name.

df = pd.concat([nflLineups, dfPass, dfRun, dfReceive])
df = df.groupby("Name", sort=False).first()

Output:

          Team Position  passYrds  rushYrds  recYrds
Name                                                
Teddy      DEN       QB    1500.0      45.0      NaN
Melvin     DEN       RB       NaN       NaN      NaN
Courtland  DEN       WR       NaN       NaN      NaN
Tim        DEN       WR       NaN       NaN      NaN
Kendal     DEN       WR       NaN       NaN      NaN
Noah       DEN       TE       NaN       NaN      NaN
Case       CLE       QB    1350.0       NaN      NaN
D Ernest   CLE       RB       NaN     350.0     68.0
Odell      CLE       WR       NaN       NaN      NaN
Jarvis     CLE       WR       NaN       NaN    250.0
Donovan    CLE       WR       NaN       NaN      NaN
Austin     CLE       TE       NaN       NaN      NaN
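
The approach above works because first() returns the first non-null value in each column within a group, so the NaN gaps created by concat are skipped. Below is a minimal sketch on a cut-down copy of the question's data; the names lineups, passing, rushing, stacked, conflicts and combined are made up for this illustration. It also checks the assumption that each Name has a single Team and Position, and uses reset_index() to turn Name back into a regular column.

```python
import pandas as pd

# Cut-down stand-ins for the question's frames (hypothetical names, values from the question).
lineups = pd.DataFrame({"Name": ["Teddy", "Melvin"],
                        "Team": ["DEN", "DEN"],
                        "Position": ["QB", "RB"]})
passing = pd.DataFrame({"Name": ["Teddy"], "Team": ["DEN"],
                        "Position": ["QB"], "passYrds": [1500]})
rushing = pd.DataFrame({"Name": ["Teddy"], "Team": ["DEN"],
                        "Position": ["QB"], "rushYrds": [45]})

# Stack all rows; columns missing from a frame are filled with NaN.
stacked = pd.concat([lineups, passing, rushing], ignore_index=True)

# Check the assumption: every Name should map to exactly one Team and one Position.
per_name = stacked.groupby("Name")[["Team", "Position"]].nunique()
conflicts = per_name[(per_name > 1).any(axis=1)]  # empty when the assumption holds

# first() keeps the first non-null value per column in each Name group.
combined = stacked.groupby("Name", sort=False).first().reset_index()
print(combined)
```

If conflicts is not empty (for example two different players who share a name), grouping on Name alone would silently mix their rows, and grouping on ["Name", "Team", "Position"] is probably the safer choice.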

Or with merge:

df_main = (
    nflLineups
    .merge(dfPass, how='left', on=['Name', 'Team', 'Position'])
    .merge(dfRun, how='left', on=['Name', 'Team', 'Position'])
    .merge(dfReceive, how='left', on=['Name', 'Team', 'Position'])
)

Output:

         Name Team Position  passYrds  rushYrds  recYrds
0       Teddy  DEN       QB    1500.0      45.0      NaN
1      Melvin  DEN       RB       NaN       NaN      NaN
2   Courtland  DEN       WR       NaN       NaN      NaN
3         Tim  DEN       WR       NaN       NaN      NaN
4      Kendal  DEN       WR       NaN       NaN      NaN
5        Noah  DEN       TE       NaN       NaN      NaN
6        Case  CLE       QB    1350.0       NaN      NaN
7    D Ernest  CLE       RB       NaN     350.0     68.0
8       Odell  CLE       WR       NaN       NaN      NaN
9      Jarvis  CLE       WR       NaN       NaN    250.0
10    Donovan  CLE       WR       NaN       NaN      NaN
11     Austin  CLE       TE       NaN       NaN      NaN
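
A likely reason the on='Name' attempts in the question looked wrong is that Team and Position exist in every frame, so pandas keeps both copies and renames them with the default _x / _y suffixes. Here is a sketch of one way around that, assuming Name alone identifies a player: drop the overlapping columns from each stat frame before merging, and fold the merges with functools.reduce. The helper attach and the variable stat_frames are hypothetical names for this example, and the frames are shortened copies of the question's data.

```python
from functools import reduce

import pandas as pd

# Shortened copies of the question's frames.
nflLineups = pd.DataFrame({"Name": ["Teddy", "D Ernest", "Jarvis", "Odell"],
                           "Team": ["DEN", "CLE", "CLE", "CLE"],
                           "Position": ["QB", "RB", "WR", "WR"]})
dfPass = pd.DataFrame({"Name": ["Teddy"], "Team": ["DEN"],
                       "Position": ["QB"], "passYrds": [1500]})
dfRun = pd.DataFrame({"Name": ["Teddy", "D Ernest"], "Team": ["DEN", "CLE"],
                      "Position": ["QB", "RB"], "rushYrds": [45, 350]})
dfReceive = pd.DataFrame({"Name": ["D Ernest", "Jarvis"], "Team": ["CLE", "CLE"],
                          "Position": ["RB", "WR"], "recYrds": [68, 250]})

stat_frames = [dfPass, dfRun, dfReceive]


def attach(left, right):
    """Left-merge `right` onto `left` by Name, bringing over only the new columns."""
    # Keep Name plus any column `left` does not have yet, so no _x/_y suffixes appear.
    new_cols = [c for c in right.columns if c == "Name" or c not in left.columns]
    # validate raises MergeError if a Name occurs more than once in `right`.
    return left.merge(right[new_cols], how="left", on="Name", validate="many_to_one")


# Fold the stat frames onto the lineup one at a time.
merged = reduce(attach, stat_frames, nflLineups)
print(merged)
```

Because every merge is a left join, the result keeps one row per row of nflLineups, and players without a stat get NaN in that column.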
