Dataframe - Merge columns from csv and excel file
Hi there Stack Overflow community,
I have the following dataframe in an Excel file:
sparte  sparten       status  stati         gesellschaft  gesellschaften
10      Krankenvoll   B       beantragt     0             - Allgemein -
11      Reisekranken  A       aktiv         10000         nordinvest
12      Krankenkasse  N       beitragsfrei  M552D         SV SparkassenVersicherung
and the following columns for merging in a CSV:
sparten  status  gesellschaft
10       B       0
11       A       10000
12       N       M552D
To merge some columns from the Excel and the CSV file, I'm using the following code:
import pandas as pd

df1 = pd.read_csv(r'path', sep=',').drop(columns=['risiko'])
df2 = pd.read_excel(r'path')
# Swap each code column for its label column from the Excel sheet
df3 = pd.merge(df1, df2[['status', 'stati']], on='status', how='left').drop(columns=['status'])
df4 = df3.merge(df2[['sparte', 'sparten']], on='sparte', how='left').drop(columns=['sparte'])
It works fine for me, but now I want to merge the following column:
df4 = df3.merge(df2[['gesellschaft','gesellschaften']],on='gesellschaft', how='left')
print(df4)
...and it does not work. It merges only the cells with this format, M552D, but leaves the cells with numbers untouched. I don't understand what I'm doing wrong. If I try to put how='right', the merge works, but the other columns disappear.
Maybe someone has an idea what is happening here! Thanks for any hint!
The problem is that the gesellschaft column contains only strings in df1, which is loaded with read_csv, because the column is not fully numeric. But in df2, which is loaded with read_excel, it contains a mix of int and string values. And at the pandas level, an int and a string cannot be equal.
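To make the mismatch visible, here is a minimal sketch that rebuilds both key columns in memory instead of reading the real files. The frame names `left` and `right` are made up for illustration, and the values are copied from the tables in the question. It assumes the CSV side ends up as strings and the Excel side as a mix of int and str, as described above.

```python
import pandas as pd

# Stand-in for the CSV side: every key is a string.
left = pd.DataFrame(
    {"gesellschaft": pd.Series(["0", "10000", "M552D"], dtype=object)}
)

# Stand-in for the Excel side: numeric cells are ints, text cells are strings.
right = pd.DataFrame({
    "gesellschaft": pd.Series([0, 10000, "M552D"], dtype=object),
    "gesellschaften": ["- Allgemein -", "nordinvest", "SV SparkassenVersicherung"],
})

# Python type of each key on the Excel side.
key_types = right["gesellschaft"].map(type).tolist()

# Left merge on the raw keys: "0" never equals 0, so only "M552D" finds a partner.
merged = left.merge(right, on="gesellschaft", how="left")
matched = merged["gesellschaften"].notna().tolist()

print(key_types)
print(matched)
```

Running `df2['gesellschaft'].map(type).value_counts()` on the real data should show whether the Excel column really is mixed in the same way.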
A possible workaround is to force a string conversion at merge time:一种可能的解决方法是在合并时强制进行字符串转换:
df4 = df3.merge(df2[['gesellschaft','gesellschaften']], left_on='gesellschaft',
right_on = df2['gesellschaft'].astype('str'), how='left')
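A variant of the same idea, sketched here on in-memory stand-ins for df3 and df2, is to cast the key to string on both sides before merging, so that a plain `on='gesellschaft'` can be used. The helper names `lookup` and `df3_keys` are hypothetical. It also assumes the Excel numbers come back as ints: a float such as 10000.0 would turn into '10000.0' and still not match.

```python
import pandas as pd

# Stand-ins for the real frames, with values taken from the question.
df3 = pd.DataFrame(
    {"gesellschaft": pd.Series(["0", "10000", "M552D"], dtype=object)}
)
df2 = pd.DataFrame({
    "gesellschaft": pd.Series([0, 10000, "M552D"], dtype=object),
    "gesellschaften": ["- Allgemein -", "nordinvest", "SV SparkassenVersicherung"],
})

# Build the lookup table with the key normalised to str (0 -> "0").
lookup = df2[["gesellschaft", "gesellschaften"]].copy()
lookup["gesellschaft"] = lookup["gesellschaft"].astype(str)

# Cast the left key too, so both sides have the same dtype.
df3_keys = df3.assign(gesellschaft=df3["gesellschaft"].astype(str))

df4 = df3_keys.merge(lookup, on="gesellschaft", how="left")
print(df4)
```

Another option that may be worth trying is to load the column as text in the first place, for example by passing `dtype={'gesellschaft': str}` to `pd.read_excel`.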