Python Pandas: Read multiple SAS files from a list into separate dataframes
I'm reading a bunch of SAS files like so:
import pandas as pd

demography = pd.read_sas("demography.sas7bdat", encoding='latin-1')
adverse_event_ds = pd.read_sas("adverse_event_ds.sas7bdat", encoding='latin-1')
rpt10344 = pd.read_sas("rpt10344.sas7bdat", encoding='latin-1')
vaccine_administration = pd.read_sas("vaccine_administration.sas7bdat", encoding='latin-1')
lab_tests_blood_chemistry_ds = pd.read_sas("lab_tests_blood_chemistry_ds.sas7bdat", encoding='latin-1')
lab_tests_hematology_ds = pd.read_sas("lab_tests_hematology_ds.sas7bdat", encoding='latin-1')
lab_tests_miscellaneous_ds = pd.read_sas("lab_tests_miscellaneous_ds.sas7bdat", encoding='latin-1')
vital_signs = pd.read_sas("vital_signs.sas7bdat", encoding='latin-1')
I want to be able to replace it with something like this:
datasets = ["demography", "adverse_event_ds", "rpt10344", "vaccine_administration",
            "lab_tests_blood_chemistry_ds", "lab_tests_hematology_ds",
            "lab_tests_miscellaneous_ds", "vital_signs"]
for dataset in datasets:
    dataset = pd.read_sas(dataset + ".sas7bdat", encoding='latin-1')
But when I do something like:
demography.info()
I get:
NameError: name 'demography' is not defined
What's happening under the hood and how can I fix this?
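The loop assigns to dataset on every iteration rather than creating new variables (e.g. demography, rpt10344, etc.). The string held in dataset is only used to build the filename; rebinding dataset never defines a variable named after that string, so demography does not exist once the loop finishes.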
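The rebinding can be seen without pandas or any SAS files. The following is a minimal sketch in which a plain string stands in for the DataFrame that pd.read_sas would return; the two-element list and the stand-in string are illustrative assumptions, not part of the original code.

```python
datasets = ["demography", "vital_signs"]

for dataset in datasets:
    # Stand-in for pd.read_sas(...): the result is bound to the loop
    # variable `dataset`, replacing the string it held a moment ago.
    dataset = f"<DataFrame read from {dataset}.sas7bdat>"

print(datasets)  # ['demography', 'vital_signs']  (the list is unchanged)
print(dataset)   # <DataFrame read from vital_signs.sas7bdat>  (only the last result survives)

try:
    demography  # no variable with this name was ever created
except NameError as exc:
    print(exc)  # name 'demography' is not defined
```

Only one name, dataset, is ever bound, and each iteration overwrites the previous result.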
I'd use a dictionary of datasets, keyed by name, as follows:
dsd = {}
for dataset in datasets:
    dsd[dataset] = pd.read_sas(dataset + ".sas7bdat", encoding='latin-1')
Or, more Pythonically, with a dict comprehension:
dsd = {d: pd.read_sas(d + ".sas7bdat", encoding='latin-1') for d in datasets}
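With the dictionary in place, each DataFrame is reached by its key, so the call from the question becomes dsd["demography"].info(). As a rough sketch, the loading step could also be wrapped in a small helper. The name load_datasets and its directory and reader parameters are hypothetical, introduced here only for illustration; reader defaults to pd.read_sas and exists so the helper can be exercised without real SAS files.

```python
from pathlib import Path


def load_datasets(names, directory=".", reader=None, **kwargs):
    """Return {name: reader("<directory>/<name>.sas7bdat", **kwargs)}.

    `load_datasets`, `directory` and `reader` are hypothetical names used
    for this sketch; they are not part of pandas.
    """
    if reader is None:
        # Imported lazily so a stub reader can be passed in instead.
        import pandas as pd
        reader = pd.read_sas
    base = Path(directory)
    return {
        name: reader(str(base / f"{name}.sas7bdat"), **kwargs)
        for name in names
    }


# Intended usage, assuming the .sas7bdat files sit in the working directory:
#   dsd = load_datasets(datasets, encoding="latin-1")
#   dsd["demography"].info()
#   for name, df in dsd.items():
#       print(name, df.shape)
```

Keeping the frames in one dictionary also makes it easy to loop over all of them, which separate variables would not allow.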
I'd strongly advise against assigning to individual variable names, for reasons explained here and here, but if you absolutely must, you can use:
for d in datasets:
    globals()[d] = pd.read_sas(d + ".sas7bdat", encoding='latin-1')
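One concrete hazard of writing into globals() is that it accepts any string as a key and overwrites existing names without warning. The sketch below is one possible pre-check, not a recommended pattern; risky_names and the sample names passed to it are hypothetical and only illustrate the kinds of collisions that a dictionary avoids.

```python
import builtins
import keyword


def risky_names(names, namespace):
    """List (name, reason) pairs for names that are unsafe to inject.

    `risky_names` is a hypothetical helper written for this sketch.
    """
    problems = []
    for name in names:
        if not name.isidentifier() or keyword.iskeyword(name):
            # globals() would accept the key, but `name` could never be
            # written as an ordinary variable in code.
            problems.append((name, "not a valid identifier"))
        elif name in namespace or hasattr(builtins, name):
            # Assignment would silently replace an existing object.
            problems.append((name, "shadows an existing name"))
    return problems


sample = ["demography", "2019_visits", "class", "list", "pd"]
print(risky_names(sample, {"pd": object()}))
# [('2019_visits', 'not a valid identifier'), ('class', 'not a valid identifier'),
#  ('list', 'shadows an existing name'), ('pd', 'shadows an existing name')]
```

Dictionary keys have none of these restrictions, which is part of why the dictionary approach is the safer default.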