
Is it possible to merge CSV files that have similar column names in Python?

I have 3 CSV files that I want to merge. The first has the column headers movies_title, release_date, genre; the second has show_id, type, title; the third has title, rating, ratingLevel.

Is there a way to merge them so that the result has the columns title, release_date, genre, show_id, type, rating, ratingLevel?

Assuming df1, df2 and df3 are the three files loaded as pandas DataFrames:
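For completeness, here is a minimal sketch of how the three DataFrames might be loaded. The sample rows and the in-memory buffers are hypothetical stand-ins for the real files; with actual CSV files you would pass a file path to pd.read_csv instead.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for the three CSV files.
# With real files, pass a path (for example "movies.csv") to pd.read_csv.
csv1 = io.StringIO(
    "movies_title,release_date,genre\n"
    "Inception,2010,Sci-Fi\n"
    "Coco,2017,Animation\n"
)
csv2 = io.StringIO(
    "show_id,type,title\n"
    "1,Movie,Inception\n"
    "2,Movie,Coco\n"
)
csv3 = io.StringIO(
    "title,rating,ratingLevel\n"
    "Inception,PG-13,Some violence\n"
    "Coco,PG,Mild themes\n"
)

df1 = pd.read_csv(csv1)
df2 = pd.read_csv(csv2)
df3 = pd.read_csv(csv3)

# Check that the headers match what the question describes
print(list(df1.columns))
print(list(df2.columns))
print(list(df3.columns))
```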

Solution 1:

First rename the title column of df1, then merge all three DataFrames on title:

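import pandas as pd
from functools import reduce

# Rename df1's key column so all three DataFrames share 'title'
df1.rename(columns={'movies_title': 'title'}, inplace=True)

# Merge the DataFrames pairwise on 'title' (pd.merge defaults to an inner join)
dfs = [df1, df2, df3]
df_final = reduce(lambda left, right: pd.merge(left, right, on='title'), dfs)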
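One thing to be aware of: pd.merge performs an inner join by default, so a title that is missing from any one of the DataFrames is dropped from the result. The sketch below uses hypothetical sample data (df3 deliberately lacks one title) to show the difference between the default and how="outer". Whether you want to keep such rows depends on your data.

```python
from functools import reduce

import pandas as pd

# Hypothetical sample data; df3 has no row for "Coco"
df1 = pd.DataFrame({
    "movies_title": ["Inception", "Coco"],
    "release_date": [2010, 2017],
    "genre": ["Sci-Fi", "Animation"],
})
df2 = pd.DataFrame({
    "show_id": [1, 2],
    "type": ["Movie", "Movie"],
    "title": ["Inception", "Coco"],
})
df3 = pd.DataFrame({
    "title": ["Inception"],
    "rating": ["PG-13"],
    "ratingLevel": ["Some violence"],
})

# Give df1 the same key column name as the other two
df1 = df1.rename(columns={"movies_title": "title"})
dfs = [df1, df2, df3]

# Default inner join: only titles present in all three DataFrames survive
inner = reduce(lambda left, right: pd.merge(left, right, on="title"), dfs)

# Outer join: every title is kept, with NaN where a DataFrame has no data
outer = reduce(
    lambda left, right: pd.merge(left, right, on="title", how="outer"), dfs
)

print(list(inner.columns))
print(len(inner), len(outer))
```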

Solution 2:

If you don't want to rename any of your columns, this should work:

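# Left-join df2 onto df1; the result keeps both key columns
df_final = df1.merge(df2, how="left", left_on="movies_title", right_on="title")
# 'title' is NaN for df1 rows with no match in df2, so copy df1's key before dropping it
df_final["title"] = df_final["movies_title"]
del df_final["movies_title"]
df_final = df_final.merge(df3, how="left", on="title")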
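With how="left", every row of df1 is kept, and rows that have no counterpart in df2 get NaN in the columns that come from df2, including title. A minimal sketch, using hypothetical sample data, of how the indicator=True option of merge can list those unmatched rows:

```python
import pandas as pd

# Hypothetical sample data; df2 has no row for "Coco"
df1 = pd.DataFrame({
    "movies_title": ["Inception", "Coco"],
    "release_date": [2010, 2017],
    "genre": ["Sci-Fi", "Animation"],
})
df2 = pd.DataFrame({
    "show_id": [1],
    "type": ["Movie"],
    "title": ["Inception"],
})

# indicator=True adds a '_merge' column: 'both', 'left_only' or 'right_only'
checked = df1.merge(
    df2, how="left", left_on="movies_title", right_on="title", indicator=True
)

# Titles from df1 that found no match in df2
unmatched = checked.loc[checked["_merge"] == "left_only", "movies_title"].tolist()
print(unmatched)
```

Checking this before dropping a key column helps avoid silently losing titles.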

Solution 3: I'm not sure this is the best way to find similar columns in two different DataFrames, but you can fuzzy-match every combination of column names from the two DataFrames and then merge on the best-scoring pair:

from fuzzywuzzy import fuzz

col1 = ["movies_title", "release_date", "genre"]  # df1.columns
col2 = ["show_id", "type", "title"]  # df2.columns

# Score every (df1 column, df2 column) pair by string similarity
lst_col, num = [], []
for i in col1:
    for j in col2:
        lst_col.append([i, j])
        num.append(fuzz.ratio(i, j))

# Pick the pair with the highest score
best_match = lst_col[num.index(max(num))]

# Output of best_match:
# ['movies_title', 'title']

df_final = df1.merge(df2, how="left", left_on=best_match[0], right_on=best_match[1])
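If you would rather not install a third-party package, the same idea can be sketched with difflib from the standard library. This swaps fuzzywuzzy for difflib.SequenceMatcher, so the scores are on a 0 to 1 scale and may not match fuzz.ratio exactly. The helper name best_column_match is hypothetical, introduced here only for illustration.

```python
from difflib import SequenceMatcher
from itertools import product


def best_column_match(cols_left, cols_right):
    """Return the (left, right) pair of column names with the highest similarity.

    Hypothetical helper; similarity is SequenceMatcher.ratio(), not fuzz.ratio().
    """
    return max(
        product(cols_left, cols_right),
        key=lambda pair: SequenceMatcher(None, pair[0], pair[1]).ratio(),
    )


col1 = ["movies_title", "release_date", "genre"]  # df1.columns
col2 = ["show_id", "type", "title"]  # df2.columns

best = best_column_match(col1, col2)
print(best)  # ('movies_title', 'title')
```

The resulting pair can be passed to merge as left_on=best[0], right_on=best[1]. As with any fuzzy match, it is worth checking the chosen pair by eye before merging, since the highest-scoring names are not guaranteed to hold the same data.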


 