
How to Copy the Matching Columns between CSV Files Using Pandas?

I have two dataframes (f1_df and f2_df):

f1_df looks like:

ID,Name,Gender
1,Smith,M
2,John,M

f2_df looks like:

name,gender,city,id

Problem:

I want the code to compare the header of f1_df with the header of f2_df automatically and copy the data of the matching columns using pandas.

Output:

The output should look like this:

name,gender,city,id  # name, gender, and id are the only matching columns between f1_df and f2_df
Smith,M, ,1          # the data copied for name, gender, and id columns 
John,M, ,2

I am new to Pandas and not sure how to handle this problem. I tried an inner join on the matching columns, but that did not work.

Here is what I have so far:

import pandas as pd

f1_df = pd.read_csv("file1.csv")
f2_df = pd.read_csv("file2.csv")

for i in f1_df:
    for j in f2_df:
        i = i.lower()
        if i == j:
            joined = f1_df.join(f2_df)
print(joined)

Any idea how to solve this?

Try this if you want to merge / join your DataFrames on their common columns:

First, let's convert all column names to lower case:

df1.columns = df1.columns.str.lower()
df2.columns = df2.columns.str.lower()

Now we can join on the common columns:

common_cols = df2.columns.intersection(df1.columns).tolist()
joined = df1.set_index(common_cols).join(df2.set_index(common_cols)).reset_index()
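
For the sample files in the question, the set of common columns can be checked with a small self-contained sketch. The in-memory CSV text below stands in for file1.csv and file2.csv (an assumption for illustration), and the result is sorted only to make the order predictable:

```python
import io

import pandas as pd

# In-memory stand-ins for file1.csv and file2.csv (sample data from the question).
df1 = pd.read_csv(io.StringIO("ID,Name,Gender\n1,Smith,M\n2,John,M\n"))
df2 = pd.read_csv(io.StringIO("name,gender,city,id\n"))

# Normalise header case so that "ID" and "id" compare equal.
df1.columns = df1.columns.str.lower()
df2.columns = df2.columns.str.lower()

# Columns present in both frames, sorted for a stable order.
common_cols = sorted(set(df1.columns) & set(df2.columns))
print(common_cols)  # ['gender', 'id', 'name']
```

Here city is left out because it exists only in the second file.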

Output:

In [259]: joined
Out[259]:
   id   name gender city
0   1  Smith      M  NaN
1   2   John      M  NaN

Export to CSV:

In [262]: joined.to_csv('c:/temp/joined.csv', index=False)

Contents of c:/temp/joined.csv:

id,name,gender,city
1,Smith,M,
2,John,M,
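
The joined result puts the columns in the order id,name,gender,city, while the question asks for name,gender,city,id. If the second file only supplies the target header, as in the question, one alternative is to skip the join and reindex the first frame to the second frame's columns. This is a sketch under that assumption and is not part of the answer above; the in-memory CSV text stands in for the real files:

```python
import io

import pandas as pd

# In-memory stand-ins for file1.csv and file2.csv (sample data from the question).
f1_df = pd.read_csv(io.StringIO("ID,Name,Gender\n1,Smith,M\n2,John,M\n"))
f2_df = pd.read_csv(io.StringIO("name,gender,city,id\n"))

# Lower-case f1's headers, then use f2's header as the target layout.
# Columns that f1 lacks (here: city) are created and filled with NaN.
result = f1_df.rename(columns=str.lower).reindex(columns=f2_df.columns)

# NaN is written as an empty field in the CSV text.
csv_text = result.to_csv(index=False)
print(csv_text)
```

Note that this only lower-cases the headers of the first file; if the second file's header used mixed case, both sides would need the same normalisation before reindexing.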


