
How to append entire rows based on matching conditions in a pandas dataframe?

I have a dataframe which looks like this:

import pandas as pd
df_ref = pd.DataFrame({'district':['A Nzo DM','A Nzo DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM'],
'visit_date':['2021-07-31','2021-07-31','2021-07-31','2021-07-31','2021-08-31','2021-08-31','2021-08-31'],
'province':['EC','EC','NC','NC','NC','NC','NC'],
'age_group':['35-49','50-59','18-34','35-49','18-34','35-49','Unidentified'],
'sex':['Male','Female','Female','Male','Female','Male','Female'],
'vaccinations':[1,5,6,8,9,10,14]})

[Image: initial table]

The data is going to be used in data visualization software. For each district and each visit_date (already resampled to month), I need every sex (Male and Female) to have all of these age groups mapped to it: 18-34, 35-49, 50-59, 60+, Unidentified. The result would be:

maz = {'district':['A Nzo DM','A Nzo DM','A Nzo DM','A Nzo DM','A Nzo DM',
'A Nzo DM','A Nzo DM','A Nzo DM','A Nzo DM','A Nzo DM',
'uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM',
'uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM',
'uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM',
'uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM'],
'visit_date':['2021-07-31','2021-07-31','2021-07-31','2021-07-31','2021-07-31',
                '2021-07-31','2021-07-31','2021-07-31','2021-07-31','2021-07-31',
                '2021-07-31','2021-07-31','2021-07-31','2021-07-31','2021-07-31',
                '2021-07-31','2021-07-31','2021-07-31','2021-07-31','2021-07-31',
                '2021-08-31','2021-08-31','2021-08-31','2021-08-31','2021-08-31',
                '2021-08-31','2021-08-31','2021-08-31','2021-08-31','2021-08-31'],
'province':['EC','EC','EC','EC','EC',
            'EC','EC','EC','EC','EC',
            'NC','NC','NC','NC','NC',
            'NC','NC','NC','NC','NC',
            'NC','NC','NC','NC','NC',
            'NC','NC','NC','NC','NC'],
'age_group':['18-34','35-49','50-59','60+','Unidentified',
                '18-34','35-49','50-59','60+','Unidentified',
                '18-34','35-49','50-59','60+','Unidentified',
                '18-34','35-49','50-59','60+','Unidentified',
                '18-34','35-49','50-59','60+','Unidentified',
                '18-34','35-49','50-59','60+','Unidentified'],
'sex':['Male','Female','Male','Female',
       'Male','Female','Male','Female',
       'Male','Female','Male','Female',
       'Male','Female','Male','Female',
       'Male','Female','Male','Female',
       'Male','Female','Male','Female',
       'Male','Female','Male','Female',
       'Male','Female'],}
df_output = pd.DataFrame(maz)

[Image: output]

IIUC, you need the product of the 'age_group' and 'sex' columns, then a 'cross' merge with the rest of the columns, and finally a drop of the duplicates:

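import itertools

# Every combination of the age groups and sexes that occur in df_ref
t = pd.DataFrame(
    itertools.product(df_ref["age_group"], df_ref["sex"]), columns=["age_group", "sex"]
).drop_duplicates(ignore_index=True)
# Cross-join those combinations onto each district/date/province row, then de-duplicate
out = pd.merge(
    df_ref[["district", "visit_date", "province"]], t, how="cross"
).drop_duplicates(ignore_index=True)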

print(out):

Note: This doesn't have 60+ because the input dataframe doesn't contain it.

            district  visit_date province     age_group     sex
0           A Nzo DM  2021-07-31       EC         35-49    Male
1           A Nzo DM  2021-07-31       EC         35-49  Female
2           A Nzo DM  2021-07-31       EC         50-59    Male
3           A Nzo DM  2021-07-31       EC         50-59  Female
4           A Nzo DM  2021-07-31       EC         18-34    Male
5           A Nzo DM  2021-07-31       EC         18-34  Female
6           A Nzo DM  2021-07-31       EC  Unidentified    Male
7           A Nzo DM  2021-07-31       EC  Unidentified  Female
8   uMgungundlovu DM  2021-07-31       NC         35-49    Male
9   uMgungundlovu DM  2021-07-31       NC         35-49  Female
10  uMgungundlovu DM  2021-07-31       NC         50-59    Male
11  uMgungundlovu DM  2021-07-31       NC         50-59  Female
12  uMgungundlovu DM  2021-07-31       NC         18-34    Male
13  uMgungundlovu DM  2021-07-31       NC         18-34  Female
14  uMgungundlovu DM  2021-07-31       NC  Unidentified    Male
15  uMgungundlovu DM  2021-07-31       NC  Unidentified  Female
16  uMgungundlovu DM  2021-08-31       NC         35-49    Male
17  uMgungundlovu DM  2021-08-31       NC         35-49  Female
18  uMgungundlovu DM  2021-08-31       NC         50-59    Male
19  uMgungundlovu DM  2021-08-31       NC         50-59  Female
20  uMgungundlovu DM  2021-08-31       NC         18-34    Male
21  uMgungundlovu DM  2021-08-31       NC         18-34  Female
22  uMgungundlovu DM  2021-08-31       NC  Unidentified    Male
23  uMgungundlovu DM  2021-08-31       NC  Unidentified  Female
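
If the 60+ group is needed as well, one option is to spell out the categories explicitly instead of deriving them from the data. The following is only a sketch: the `age_groups` and `sexes` lists are assumed to be the complete set (taken from the question text), and filling the missing `vaccinations` with 0 is an assumption about what the visualization expects, not something the question states. Like the answer above, it relies on `how="cross"`, which requires pandas 1.2 or later.

```python
import itertools

import pandas as pd

df_ref = pd.DataFrame({
    'district': ['A Nzo DM', 'A Nzo DM', 'uMgungundlovu DM', 'uMgungundlovu DM',
                 'uMgungundlovu DM', 'uMgungundlovu DM', 'uMgungundlovu DM'],
    'visit_date': ['2021-07-31', '2021-07-31', '2021-07-31', '2021-07-31',
                   '2021-08-31', '2021-08-31', '2021-08-31'],
    'province': ['EC', 'EC', 'NC', 'NC', 'NC', 'NC', 'NC'],
    'age_group': ['35-49', '50-59', '18-34', '35-49', '18-34', '35-49', 'Unidentified'],
    'sex': ['Male', 'Female', 'Female', 'Male', 'Female', 'Male', 'Female'],
    'vaccinations': [1, 5, 6, 8, 9, 10, 14],
})

# Assumption: these lists are the full set of categories to map.
age_groups = ['18-34', '35-49', '50-59', '60+', 'Unidentified']
sexes = ['Male', 'Female']

# Every (age_group, sex) combination, whether or not it occurs in df_ref.
combos = pd.DataFrame(
    list(itertools.product(age_groups, sexes)), columns=['age_group', 'sex']
)

# One row per distinct district / visit_date / province.
keys = df_ref[['district', 'visit_date', 'province']].drop_duplicates(ignore_index=True)

# Cross-join: each key row is paired with all ten combinations.
full = keys.merge(combos, how='cross')

# Bring the known counts back; combinations absent from df_ref get 0 (assumption).
full = full.merge(
    df_ref,
    on=['district', 'visit_date', 'province', 'age_group', 'sex'],
    how='left',
)
full['vaccinations'] = full['vaccinations'].fillna(0).astype(int)

print(full)
```

De-duplicating the key columns before the cross merge keeps the intermediate frame small, so no `drop_duplicates` is needed afterwards.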

