Example data in Python 3.5:
import pandas as pd
df=pd.DataFrame({"A":["x","y","z","t","f"],
"B":[1,2,1,2,4]})
This gives me a dataframe with 2 columns "A" and "B". I then want to add a third column "C" that contains the value of "A" and "B" concatenated and separated by "_".
Following the suggestion from this answer I can do it like this.
for i in range(0,len(df["A"])):
    df.loc[i,"C"]=df.loc[i,"A"]+"_"+str(df.loc[i,"B"])
I get the result I want but it seems convoluted for such a simple task.
In R this would be done like this:
df<-data.frame(A=c("x","y","z","t","f"),
B=c(1,2,1,2,4))
df$C<-paste(df$A,df$B,sep="_")
Another thread suggested the use of the "%" operator but I can't get it to work.
Is there a better alternative?
You can just add the columns together, but for 'B' you need to cast the type using astype(str):
In [115]:
df['C'] = df['A'] + '_' + df['B'].astype(str)
df
Out[115]:
   A  B    C
0  x  1  x_1
1  y  2  y_2
2  z  1  z_1
3  t  2  t_2
4  f  4  f_4
This is a vectorised approach and will scale much better than looping over every row for large DataFrames.
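For comparison, here is a minimal sketch of two other ways to build the same column, using the example data from the question. Series.str.cat with sep="_" is probably the closest analogue to R's paste, and the "%" operator mentioned in the question can be applied per row in a list comprehension. The column names "C_cat" and "C_fmt" are made up for this illustration only.

```python
import pandas as pd

df = pd.DataFrame({"A": ["x", "y", "z", "t", "f"],
                   "B": [1, 2, 1, 2, 4]})

# Vectorised concatenation with +, casting the integer column to str first
df["C"] = df["A"] + "_" + df["B"].astype(str)

# Same result with Series.str.cat; "B" still has to be cast to str
df["C_cat"] = df["A"].str.cat(df["B"].astype(str), sep="_")

# "%" string formatting works on scalars, so it is applied row by row here
# (this is a Python-level loop, not a vectorised operation)
df["C_fmt"] = ["%s_%s" % (a, b) for a, b in zip(df["A"], df["B"])]

print(df)
```

All three columns should hold the same strings. The first two operate on whole columns at once, while the "%" version loops in Python and is mainly useful when the format string is more involved.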