How do I loop variable names based on values in a list?
I have a list of five heights, and I want to loop over it to create five separate DataFrames, one per height. For each height, this involves building a list of column names that includes the height, reading a CSV file with those column names, and finally dropping the unused columns. I currently have multiple copies of the same code block to do this, but I want to learn how to do it with a loop so I can clean up my script.
I get NameError: name 'colNames' is not defined.
i = 0
height = ['0', '5', '15', '25', '50']
while i < len(height):
    colNames["height{}".format(i)] = ["A", "B_%s" % height, "C", "D"]
    df["height{}".format(i)] = pd.read_csv("test%s.csv" % height, names = colNames["height{}".format(i)])
    df["height{}".format(i)].drop(labels = ["A", "C"],axis = 1, inplace = True)
    i += 1
Expected results
colNames0 = ["A", "B_0", "C", "D"]
df0 = pd.read_csv("test0.csv", names = colNames0)
df0.drop(labels = ["A", "C"], axis = 1, inplace = True)
...
colNames50 = ["A", "B_50", "C", "D"]
df50 = pd.read_csv("test50.csv", names = colNames50)
df50.drop(labels = ["A", "C"], axis = 1, inplace = True)
Naming separate DataFrames this way is a bit unwieldy in Python, but here is how I might write a loop for the problem you pose:
import pandas as pd

dflist = []
for height in ['0', '5', '15', '25', '50']:
    # Name the columns while reading, then keep only the two that are needed.
    dflist.append(pd.read_csv('test{}.csv'.format(height), names=['A', 'B{}'.format(height), 'C', 'D'])[['B{}'.format(height), 'D']])
You would not have DataFrames named df0, df5, ..., but rather a list of DataFrames. Unless there is a reason to save the various column names, you can name your columns directly in the call to pd.read_csv. Additionally, selecting only the columns you want to keep at the end of the line is a little more streamlined than dropping the others in a separate command. As a side note,
df['newname'] = value
is a way to make a new column in an existing DataFrame, not a way to define a DataFrame.
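To make that side note concrete, here is a small sketch. The DataFrame contents and the column name "height0" are made up for illustration and are not taken from your files:

```python
import pandas as pd

# A small existing DataFrame with one column (illustrative values only).
df = pd.DataFrame({"D": [1.0, 2.0, 3.0]})

# Bracket assignment adds a new column to df; it does not create a
# separate DataFrame called "height0".
df["height0"] = [10, 20, 30]

print(list(df.columns))
```

After the assignment there is still only one DataFrame, now with two columns.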
The reason you are getting a NameError is that the syntax
colNames[x] = value
assumes you are trying to assign the value into a pre-existing object named "colNames". Since no such object was created before the loop, Python raises the error.
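If you do want to look up each result by its height, one possible approach is to create empty dictionaries before the loop and fill them as you go. This is only a sketch: the names col_names, frames, and fake_csv are my own, and fake_csv builds a small in-memory stand-in for your test<height>.csv files so the example runs without them. With real files you would pass the file name to pd.read_csv instead.

```python
import io
import pandas as pd

heights = ['0', '5', '15', '25', '50']

def fake_csv(height):
    # Hypothetical stand-in for "test<height>.csv": two rows, four columns.
    return io.StringIO("1,{0},3,4\n5,{0},7,8\n".format(height))

# Both dicts must exist before col_names[key] = ... or frames[key] = ...
# is used; otherwise Python raises NameError.
col_names = {}
frames = {}

for h in heights:
    col_names[h] = ["A", "B_{}".format(h), "C", "D"]
    df = pd.read_csv(fake_csv(h), names=col_names[h])
    # drop returns a new DataFrame here, so inplace=True is not needed.
    frames[h] = df.drop(columns=["A", "C"])

print(frames["50"].columns.tolist())
```

Each DataFrame is then available as frames['0'], frames['5'], and so on, which plays the role of df0, df5, ... without generating variable names.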