How to extract rows of a pandas dataframe according to conditions based on another dataframe
How to append entire rows based on matching conditions in a pandas dataframe?
I have a dataframe that looks like this:
import pandas as pd
df_ref = pd.DataFrame({'district':['A Nzo DM','A Nzo DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM'],
'visit_date':['2021-07-31','2021-07-31','2021-07-31','2021-07-31','2021-08-31','2021-08-31','2021-08-31'],
'province':['EC','EC','NC','NC','NC','NC','NC'],
'age_group':['35-49','50-59','18-34','35-49','18-34','35-49','Unidentified'],
'sex':['Male','Female','Female','Male','Female','Male','Female'],
'vaccinations':[1,5,6,8,9,10,14]})
The data will be used in data visualization software. For each district, I need every visit_date (already sampled to month) to be mapped so that each sex (Male and Female) has all of these age groups mapped to it (18-34, 35-49, 50-59, 60+, Unidentified) for each month (visit_date). The result would be:
maz = {'district':['A Nzo DM','A Nzo DM','A Nzo DM','A Nzo DM','A Nzo DM',
'A Nzo DM','A Nzo DM','A Nzo DM','A Nzo DM','A Nzo DM',
'uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM',
'uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM',
'uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM',
'uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM','uMgungundlovu DM'],
'visit_date':['2021-07-31','2021-07-31','2021-07-31','2021-07-31','2021-07-31',
'2021-07-31','2021-07-31','2021-07-31','2021-07-31','2021-07-31',
'2021-07-31','2021-07-31','2021-07-31','2021-07-31','2021-07-31',
'2021-07-31','2021-07-31','2021-07-31','2021-07-31','2021-07-31',
'2021-08-31','2021-08-31','2021-08-31','2021-08-31','2021-08-31',
'2021-08-31','2021-08-31','2021-08-31','2021-08-31','2021-08-31'],
'province':['EC','EC','EC','EC','EC',
'EC','EC','EC','EC','EC',
'NC','NC','NC','NC','NC',
'NC','NC','NC','NC','NC',
'NC','NC','NC','NC','NC',
'NC','NC','NC','NC','NC'],
'age_group':['18-34','35-49','50-59','60+','Unidentified',
'18-34','35-49','50-59','60+','Unidentified',
'18-34','35-49','50-59','60+','Unidentified',
'18-34','35-49','50-59','60+','Unidentified',
'18-34','35-49','50-59','60+','Unidentified',
'18-34','35-49','50-59','60+','Unidentified'],
'sex':['Male','Female','Male','Female',
'Male','Female','Male','Female',
'Male','Female','Male','Female',
'Male','Female','Male','Female',
'Male','Female','Male','Female',
'Male','Female','Male','Female',
'Male','Female','Male','Female',
'Male','Female'],}
df_output = pd.DataFrame(maz)
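IIUC, you need the product of the "age_group" and "sex" columns, then a "cross" merge with the rest of the columns, and finally to drop the duplicates:
import itertools

# every (age_group, sex) pair that can be built from the values present in df_ref
t = pd.DataFrame(
    itertools.product(df_ref["age_group"], df_ref["sex"]), columns=["age_group", "sex"]
).drop_duplicates(ignore_index=True)
# attach each pair to every (district, visit_date, province) row, then de-duplicate
out = pd.merge(
    df_ref[["district", "visit_date", "province"]], t, how="cross"
).drop_duplicates(ignore_index=True)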
This prints the following. Note that there is no 60+ group, because the input dataframe does not contain it:
district visit_date province age_group sex
0 A Nzo DM 2021-07-31 EC 35-49 Male
1 A Nzo DM 2021-07-31 EC 35-49 Female
2 A Nzo DM 2021-07-31 EC 50-59 Male
3 A Nzo DM 2021-07-31 EC 50-59 Female
4 A Nzo DM 2021-07-31 EC 18-34 Male
5 A Nzo DM 2021-07-31 EC 18-34 Female
6 A Nzo DM 2021-07-31 EC Unidentified Male
7 A Nzo DM 2021-07-31 EC Unidentified Female
8 uMgungundlovu DM 2021-07-31 NC 35-49 Male
9 uMgungundlovu DM 2021-07-31 NC 35-49 Female
10 uMgungundlovu DM 2021-07-31 NC 50-59 Male
11 uMgungundlovu DM 2021-07-31 NC 50-59 Female
12 uMgungundlovu DM 2021-07-31 NC 18-34 Male
13 uMgungundlovu DM 2021-07-31 NC 18-34 Female
14 uMgungundlovu DM 2021-07-31 NC Unidentified Male
15 uMgungundlovu DM 2021-07-31 NC Unidentified Female
16 uMgungundlovu DM 2021-08-31 NC 35-49 Male
17 uMgungundlovu DM 2021-08-31 NC 35-49 Female
18 uMgungundlovu DM 2021-08-31 NC 50-59 Male
19 uMgungundlovu DM 2021-08-31 NC 50-59 Female
20 uMgungundlovu DM 2021-08-31 NC 18-34 Male
21 uMgungundlovu DM 2021-08-31 NC 18-34 Female
22 uMgungundlovu DM 2021-08-31 NC Unidentified Male
23 uMgungundlovu DM 2021-08-31 NC Unidentified Female
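If the 60+ group is needed even though it never appears in the input, one option is to list the categories explicitly instead of deriving them from the data. The following is only a sketch under a few assumptions: the lists age_groups and sexes are copied from the question text, the names keys, combos and out_full are made up for illustration, and treating missing combinations as 0 vaccinations is a guess about what the visualization needs.

```python
import itertools
import pandas as pd

df_ref = pd.DataFrame({
    'district': ['A Nzo DM', 'A Nzo DM'] + ['uMgungundlovu DM'] * 5,
    'visit_date': ['2021-07-31'] * 4 + ['2021-08-31'] * 3,
    'province': ['EC', 'EC', 'NC', 'NC', 'NC', 'NC', 'NC'],
    'age_group': ['35-49', '50-59', '18-34', '35-49', '18-34', '35-49', 'Unidentified'],
    'sex': ['Male', 'Female', 'Female', 'Male', 'Female', 'Male', 'Female'],
    'vaccinations': [1, 5, 6, 8, 9, 10, 14],
})

# Assumption: the complete category lists, taken from the question text
age_groups = ['18-34', '35-49', '50-59', '60+', 'Unidentified']
sexes = ['Male', 'Female']

# All (age_group, sex) pairs, independent of what the data happens to contain
combos = pd.DataFrame(
    list(itertools.product(age_groups, sexes)), columns=['age_group', 'sex']
)

# One row per distinct (district, visit_date, province)
keys = df_ref[['district', 'visit_date', 'province']].drop_duplicates(ignore_index=True)

# Cross merge (pandas >= 1.2) gives every key combined with every pair
full = keys.merge(combos, how='cross')

# Bring the known counts back; combinations absent from df_ref become NaN
out_full = full.merge(
    df_ref,
    on=['district', 'visit_date', 'province', 'age_group', 'sex'],
    how='left',
)
# Assumption: a missing combination means zero vaccinations
out_full['vaccinations'] = out_full['vaccinations'].fillna(0).astype(int)
```

De-duplicating the key columns before the cross merge keeps the intermediate frame small, and the left merge keeps the vaccinations column, which the cross merge alone would drop.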