
How to add a column to a copied pandas DataFrame without also adding it to the original DataFrame?

I copy a DataFrame and then add a column to the copy, but the column is also added to the original DataFrame.

X_train_1 = X_train  # intended as a copy
X_train_1["class_label"] = y_train
print(X_train.columns)  # "class_label" shows up here as well

As stated here, you need to copy the DataFrame. Check this minimal example:

import pandas as pd

X_train = pd.DataFrame([{'a': 1, 'b': 2}, {'a': 2, 'b': 3}, {'a': 3, 'b': 4}, {'a': 4, 'b': 5}])
X_train_1 = X_train.copy()  # independent copy, not a second name for X_train
print(X_train_1)
X_train_1["class_label"] = ['one', 'two', 'three', 'four']
print(X_train)  # still has only columns 'a' and 'b'

When you write

X_train_1 = X_train

it does not copy the data. It binds a second name to the same DataFrame object (assignment by reference, not by value), so any change you make through the new variable also modifies the original. You will observe the same behaviour with lists, for example. As others suggested, make a copy using

X_train_1 = X_train.copy()
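To make the difference between a second name and a real copy concrete, here is a small sketch. The DataFrame `original`, the names `alias` and `independent`, and the column names are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical data, only for illustration.
original = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

alias = original               # second name for the same object
independent = original.copy()  # new object with its own data

alias["via_alias"] = [10, 20]       # also visible through `original`
independent["via_copy"] = [30, 40]  # only visible through `independent`

print(alias is original)        # True
print(independent is original)  # False
print(list(original.columns))   # ['a', 'b', 'via_alias']

# Plain lists behave the same way under simple assignment.
nums = [1, 2, 3]
same_list = nums
same_list.append(4)
print(nums)  # [1, 2, 3, 4]
```

The `is` check is a quick way to tell whether two variables refer to the same object.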

When copying a DataFrame, use the copy method rather than just assigning the DataFrame to a new variable. The following code does not modify the original DataFrame.

X_train_1 = X_train.copy()
X_train_1["class_label"] = y_train
print(X_train.columns)
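As an alternative that the answers above do not mention, `DataFrame.assign` returns a new DataFrame with the extra column and leaves the one it is called on untouched. A minimal sketch, assuming `y_train` is a Series whose index lines up with `X_train` (the values below are invented for the example):

```python
import pandas as pd

# Made-up stand-ins for the question's X_train and y_train.
X_train = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
y_train = pd.Series(["one", "two", "three"])

# assign() builds and returns a new DataFrame; X_train is not modified.
X_train_1 = X_train.assign(class_label=y_train)

print(list(X_train.columns))    # ['a', 'b']
print(list(X_train_1.columns))  # ['a', 'b', 'class_label']
```

Note that `assign` aligns a Series on the index, so if `y_train` has a different index than `X_train`, the new column may contain missing values.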

