Merge two dataframes on two columns
I have two dataframes:
dfBB
Rank, Song, Artist, Year
and dfMap
Artist, Song, SongId, ArtistId
I would like to merge them on Artist and Song, i.e. where they match I add the extra columns, otherwise 0:
Artist, Song, SongId, ArtistId, Rank, Year
I am foreseeing another problem: the artist or song might be spelled incorrectly. Maybe I can check similarity? Not too sure how to go about it.
For the merging I tried:
merged = pd.merge(dfMap, dfBB, on='Artist' and 'Song', how='outer')
but got:
Artist_x, Song, SongId, ArtistId, Rank, Artist_y, Rank
merged = pd.merge(dfMap, dfBB, on=['Artist','Song'], how='outer')
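The on parameter of pd.merge accepts a list of column names, so pass both keys as a list. In your attempt, 'Artist' and 'Song' is a Python boolean expression that evaluates to 'Song', so the merge ran on Song alone, which is why Artist came back as Artist_x and Artist_y. I would recommend checking the documentation for pd.merge.
With regard to misspellings, you're going to need to do some cleaning on your own. You may want to check out difflib from the standard library.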
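To make the two-key merge concrete, here is a minimal sketch with made-up rows (the sample values are assumptions for illustration only, not data from the question). It also fills the unmatched cells with 0, which is what the question asks for; the choice of which columns to fill is an assumption on my part.

```python
import pandas as pd

# Toy data; all values are invented for illustration.
dfBB = pd.DataFrame({
    "Rank": [1, 2],
    "Song": ["Hello", "Sorry"],
    "Artist": ["Adele", "Justin Bieber"],
    "Year": [2015, 2015],
})
dfMap = pd.DataFrame({
    "Artist": ["Adele", "Drake"],
    "Song": ["Hello", "Hotline Bling"],
    "SongId": [10, 11],
    "ArtistId": [100, 101],
})

# Merge on both keys; how="outer" keeps rows that appear in only one frame.
merged = pd.merge(dfMap, dfBB, on=["Artist", "Song"], how="outer")

# Rows without a match have NaN in the other frame's columns; replace with 0.
num_cols = ["SongId", "ArtistId", "Rank", "Year"]
merged[num_cols] = merged[num_cols].fillna(0).astype(int)

print(merged)
```

If you only want the rows of dfMap, how="left" would be the variant to try instead of how="outer".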
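One possible way to use difflib here, as a rough sketch: map each name to its closest match in a list of known spellings before merging. The helper name closest, the cutoff of 0.8, and the sample names are all assumptions for illustration; you would need to tune the cutoff on your own data.

```python
import difflib

def closest(name, candidates, cutoff=0.8):
    """Return the most similar candidate, or name unchanged if none is close enough.

    Hypothetical helper; the 0.8 cutoff is an arbitrary starting point.
    """
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else name

# Made-up list of reference spellings.
known_artists = ["Adele", "Justin Bieber", "Drake"]

print(closest("Justin Beiber", known_artists))  # misspelling mapped to a known name
print(closest("Unknown Band", known_artists))   # no close match, left as-is
```

Applied to the frames from the question, something like `dfBB["Artist"].map(lambda a: closest(a, dfMap["Artist"].unique().tolist()))` would normalize the key column before calling pd.merge. Fuzzy matching can produce false positives, so it is worth reviewing the changed values by hand.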