Pandas/Python: Replace multiple values in multiple columns

All, I have an analytical CSV file with 190 columns and 902 rows. I need to recode the values in several columns (18, to be exact) from their current 1-5 Likert scale to a 0-4 Likert scale.

I've tried using replace:

df.replace({'Job_Performance1': {1:0, 2:1, 3:2, 4:3, 5:4}}, inplace=True)

But that throws a ValueError: "Replacement not allowed with overlapping keys and values"

I can use map:

df['job_perf1'] = df.Job_Performance1.map({1:0, 2:1, 3:2, 4:3, 5:4})

But I know there has to be a more efficient way to accomplish this, since this use case is standard in statistical analysis and in statistical software, e.g. SPSS.

I've reviewed multiple questions on Stack Overflow, but none of them quite fits my use case, e.g. "Pandas - replacing column values", "pandas replace multiple values one column", and "Python pandas: replace values multiple columns matching multiple columns from another dataframe".

Suggestions?

You can simply subtract a scalar value from your column, which is in effect what you're doing here:

df['job_perf1'] = df['job_perf1'] - 1

Also, since you need to do this on 18 columns, I'd construct a list of the 18 column names and subtract 1 from all of them at once:

df[col_list] = df[col_list] - 1
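
To make the col_list step concrete, here is a minimal sketch on a tiny made-up frame. The column names, the values, and the idea of selecting columns by a shared "Job_" prefix are all assumptions for illustration; with the real file you would list the 18 names however suits your data.

```python
import pandas as pd

# Hypothetical survey data: column names and values are invented for illustration.
df = pd.DataFrame({
    "respondent_id": [101, 102, 103],
    "Job_Performance1": [1, 3, 5],
    "Job_Performance2": [2, 4, 5],
    "Job_Satisfaction1": [5, 1, 2],
})

# Build the list of columns to recode. Selecting by prefix is an assumption;
# an explicit list of names works just as well.
col_list = [c for c in df.columns if c.startswith("Job_")]

# Shift every selected column from the 1-5 scale to 0-4 in one vectorized step.
# Columns not in col_list (here respondent_id) are left untouched.
df[col_list] = df[col_list] - 1

print(df)
```

Writing the list out by hand is the safer choice if other columns happen to share the prefix but should keep their original coding.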

No need for a mapping. This can be done as a vector subtraction, since what you're effectively doing is subtracting 1 from each value. This works elegantly:

df['job_perf1'] = df['Job_Performance1'] - numpy.ones(len(df['Job_Performance1']))

Or, without numpy:

df['job_perf1'] = df['Job_Performance1'] - [1] * len(df['Job_Performance1'])
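
One thing worth checking before picking a variant: the approaches give the same values but not necessarily the same dtype. The sketch below compares them on a made-up integer Series (the data and variable names are hypothetical). Subtracting numpy.ones(...) promotes the result to float, because numpy.ones returns floats by default, while a plain scalar or a list of ints keeps integer scores as integers.

```python
import numpy as np
import pandas as pd

# Made-up Likert scores for illustration only.
scores = pd.Series([1, 2, 3, 4, 5], name="Job_Performance1")

by_scalar = scores - 1                    # scalar subtraction
by_ones = scores - np.ones(len(scores))   # vector of float 1.0 values
by_list = scores - [1] * len(scores)      # plain Python list of ints

# All three hold the same recoded values...
same_values = by_scalar.tolist() == by_ones.tolist() == by_list.tolist()

# ...but the numpy.ones variant is float, the other two stay integer.
print(same_values)
print(by_scalar.dtype, by_ones.dtype, by_list.dtype)
```

If the recoded columns should stay integer (for example, to be used as category codes later), the scalar form is the simplest way to get that.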
