Pandas: make the value of one column equal to the value of another

Question

Hopefully a very simple question from a Pandas newbie.

How can I make the value of one column equal the value of another in a dataframe? Replace the value in every row. No conditionals, etc.

Context:

I have two CSVs, loaded into dataframe 'a' and dataframe 'b' respectively.

These CSVs are basically the same, except 'a' has a field that was improperly carried forward from another process - floats were rounded to ints. Not my script, can't influence it, I just have the CSVs now.

In reality I probably have 2mil rows and about 60-70 columns in the merged dataframe - so if it's possible to address the columns by their header (in the example these are Col1 and xyz_Col1), that would sure help.

I have joined the CSVs on their common field, so now I have a dataframe that can be represented by the following:

+--------+------+--------+------------+----------+----------+
| CellID | Col1 |  Col2  | xyz_CellID | xyz_Col1 | xyz_Col2 |
+--------+------+--------+------------+----------+----------+
|      1 |    0 | apple  |          1 | 0.23     | apple    |
|      2 |    0 | orange |          2 | 0.45     | orange   |
|      3 |    1 | banana |          3 | 0.68     | banana   |
+--------+------+--------+------------+----------+----------+

The result should be such that Col1 = xyz_Col1:

+--------+------+--------+------------+----------+----------+
| CellID | Col1 |  Col2  | xyz_CellID | xyz_Col1 | xyz_Col2 |
+--------+------+--------+------------+----------+----------+
|      1 | 0.23 | apple  |          1 | 0.23     | apple    |
|      2 | 0.45 | orange |          2 | 0.45     | orange   |
|      3 | 0.68 | banana |          3 | 0.68     | banana   |
+--------+------+--------+------------+----------+----------+

What I have in code so far:

import pandas as pd

a = pd.read_csv('csv1.csv')
b = pd.read_csv('csv2.csv')
# b = b.dropna(axis=1)  # drop any unnamed fields

# mark the 'b' cols by adding an xyz_ prefix, as xyz is unique
b = b.add_prefix('xyz_')

# Join the dataframes on their common field (CellID) into a new dataframe named merged
merged = pd.merge(a, b, left_on='CellID', right_on='xyz_CellID')

merged.head(5)

#This is where the xyz_Col1 to Col1 code goes...

#drop unwanted cols
merged = merged[merged.columns.drop(list(merged.filter(regex='xyz')))]

#output to file
merged.to_csv("output.csv", index=False)

Thanks

Answer 1

merged['Col1'] = merged['xyz_Col1']

or

merged.loc[:, 'Col1'] = merged.loc[:, 'xyz_Col1']
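
To see it end to end, here is a minimal sketch that builds the three-row example from the question by hand (the literal values stand in for the real CSVs, and the name result is only for illustration), overwrites Col1 from xyz_Col1, and then drops the prefixed columns. Column labels are case-sensitive, so the target has to be spelled 'Col1' exactly as in the header; any other spelling would add a new column instead of replacing the existing one.

```python
import pandas as pd

# Hand-built stand-in for the merged dataframe shown in the question
merged = pd.DataFrame({
    "CellID": [1, 2, 3],
    "Col1": [0, 0, 1],
    "Col2": ["apple", "orange", "banana"],
    "xyz_CellID": [1, 2, 3],
    "xyz_Col1": [0.23, 0.45, 0.68],
    "xyz_Col2": ["apple", "orange", "banana"],
})

# Overwrite every row of Col1 with the value from xyz_Col1
merged["Col1"] = merged["xyz_Col1"]

# Drop the helper columns that carry the xyz_ prefix
result = merged.drop(columns=merged.filter(regex="^xyz_").columns)

print(result)
```

Since both columns are addressed by their header, the same assignment should carry over to the full 60-70 column frame without touching anything else.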
