How do I merge more than one column from CSVs in pandas without picking _x or _y, instead keeping whichever one has the information?

I am trying to merge two CSVs without having to pick values from the _x or _y columns.

MetaData1
Sample_name   TITLE
Cody        Chicken Pox
Claudia     Chicken Pox
Alex        Chicken Pox
Steven      Chicken Pox
Mom         Chicken Pox
Dad     

MetaData2
Sample_name    TITLE       Geo_Loc    DESCRIPTION
Dad         Chicken Pox     Earth       people
Me          Chicken Pox     Earth       people
Roger       Chicken Pox     Earth       people
Ben         Chicken Pox     Earth       people

Merged together, the result should look like this:

Merged Metadata 
Sample_name    TITLE             Geo_Loc                 DESCRIPTION
Cody        Chicken Pox   Missing:Not Applicable    Missing:Not Applicable
Claudia     Chicken Pox   Missing:Not Applicable    Missing:Not Applicable
Alex        Chicken Pox   Missing:Not Applicable    Missing:Not Applicable
Steven      Chicken Pox   Missing:Not Applicable    Missing:Not Applicable
Mom         Chicken Pox   Missing:Not Applicable    Missing:Not Applicable
Dad         Chicken Pox     Earth                   people
Me          Chicken Pox     Earth                   people
Roger       Chicken Pox     Earth                   people
Ben         Chicken Pox     Earth                   people

My code so far is as follows:

# Merging two or more csv files using pandas
# Duplicate the read_csv line for each additional csv file
import pandas as pd

File_one = pd.read_csv('/Users/c1carpenter/Desktop/Test.txt', sep='\t', header=0, dtype=str)
File_two = pd.read_csv('/Users/c1carpenter/Desktop/Test2.txt', sep='\t', header=0, dtype=str)
Merge_File = pd.merge(File_one, File_two, how='outer', on='Sample_name')

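However, if I have 100 columns, 50 of them end up duplicated. How do I merge them without losing data, and without having to type out each header separately, as below?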

# Cleanup to merge duplicate non-index column
mm['TITLE'] = mm[['TITLE_x', 'TITLE_y']].fillna('').sum(axis=1)
mm.drop(['TITLE_x','TITLE_y'], axis=1, inplace=True)

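Before merging, you can trim the second dataframe so that, apart from the merge key, it has no columns in common with the first.

df2_to_merge = df2[['Sample_name'] + [col for col in df2.columns if col not in df1.columns]]

Then merge df1 with df2_to_merge as before, with how='outer' and on='Sample_name'. Note that this keeps df1's version of every shared column, so a value that exists only in df2 (such as Dad's TITLE) is not carried over.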
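If values from both files need to survive (for example Dad's TITLE, which is only filled in MetaData2), one possible alternative is `DataFrame.combine_first`, which fills gaps in one frame from another frame aligned on the same index. The sketch below is only an illustration under some assumptions: the helper name `merge_keep_filled` and the in-memory strings standing in for the two files are hypothetical, `Sample_name` is assumed to be unique within each file, and where both files hold a value the first file's value wins.

```python
import io
import pandas as pd

# Hypothetical in-memory copies of the two tab-separated files from the question.
META1 = (
    "Sample_name\tTITLE\n"
    "Cody\tChicken Pox\n"
    "Claudia\tChicken Pox\n"
    "Alex\tChicken Pox\n"
    "Steven\tChicken Pox\n"
    "Mom\tChicken Pox\n"
    "Dad\t\n"  # Dad's TITLE is empty in the first file
)
META2 = (
    "Sample_name\tTITLE\tGeo_Loc\tDESCRIPTION\n"
    "Dad\tChicken Pox\tEarth\tpeople\n"
    "Me\tChicken Pox\tEarth\tpeople\n"
    "Roger\tChicken Pox\tEarth\tpeople\n"
    "Ben\tChicken Pox\tEarth\tpeople\n"
)


def merge_keep_filled(left, right, key="Sample_name",
                      missing="Missing:Not Applicable"):
    """Hypothetical helper: combine two frames on `key`, preferring values
    from `left` and falling back to `right` wherever `left` is empty."""
    a = left.set_index(key)
    b = right.set_index(key)
    # combine_first aligns on index and columns, so shared columns are
    # handled without naming them and no _x/_y suffixes are created.
    combined = a.combine_first(b)
    # combine_first may sort labels; restore first-seen row and column order.
    rows = list(dict.fromkeys([*a.index, *b.index]))
    cols = list(dict.fromkeys([*a.columns, *b.columns]))
    return (combined.reindex(index=rows, columns=cols)
            .fillna(missing)
            .rename_axis(key)
            .reset_index())


df1 = pd.read_csv(io.StringIO(META1), sep="\t", header=0, dtype=str)
df2 = pd.read_csv(io.StringIO(META2), sep="\t", header=0, dtype=str)
merged = merge_keep_filled(df1, df2)
print(merged.to_string(index=False))
```

Because the alignment is by column name, the same call should work unchanged for 100 columns of which 50 overlap. If `Sample_name` can repeat within a file, the rows would need to be de-duplicated first.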
