I have:
$ cat file1.csv (tab delimited)
R923E06 273911 2990492 2970203 F Resistant
R923F06 273910 2990492 2970203 F Resistant
R923H02 273894 2970600 2990171 M Resistant
and:
$ cat file2.txt (space delimited and it's a large file)
R923E06 CC GG TT AA ...
R923F06 GG TT AA CC ...
R923H02 TT GG CC AA ...
How can I replace the first column in file2.txt with all 6 columns of the matching line in file1.csv?
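Using join you can do this:
join <(sed 's/\t/ /g' file1.csv) file2.txt
sed changes the tabs in file1.csv to spaces, and join joins the lines of the two files on a common field (the first field by default). Note that join expects both inputs to be sorted on that field.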
Output:
R923E06 273911 2990492 2970203 F Resistant CC GG TT AA ...
R923F06 273910 2990492 2970203 F Resistant GG TT AA CC ...
R923H02 273894 2970600 2990171 M Resistant TT GG CC AA ...
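For readers who want to see what join does under the hood, here is a minimal Python sketch of a merge join, assuming both inputs are sorted on the first field and each ID appears once per file. The function name merge_join and the inline sample rows (copied from the question, without the elided columns) are illustrative only; the real join handles more cases than this.

```python
def merge_join(left_rows, right_rows):
    """Join two lists of field lists on their first field.

    Assumes both inputs are sorted on that field and have unique keys,
    which is a simplification of what join(1) supports.
    """
    joined = []
    i = j = 0
    while i < len(left_rows) and j < len(right_rows):
        left_key, right_key = left_rows[i][0], right_rows[j][0]
        if left_key == right_key:
            # Key and left fields first, then the remaining right fields
            joined.append(left_rows[i] + right_rows[j][1:])
            i += 1
            j += 1
        elif left_key < right_key:
            i += 1  # no partner on the right: the left row is dropped
        else:
            j += 1  # no partner on the left: the right row is dropped
    return joined


# Sample rows from the question: file1.csv is tab-delimited, file2.txt space-delimited
file1 = ["R923E06\t273911\t2990492\t2970203\tF\tResistant",
         "R923F06\t273910\t2990492\t2970203\tF\tResistant",
         "R923H02\t273894\t2970600\t2990171\tM\tResistant"]
file2 = ["R923E06 CC GG TT AA",
         "R923F06 GG TT AA CC",
         "R923H02 TT GG CC AA"]

left = [line.split("\t") for line in file1]
right = [line.split() for line in file2]
for row in merge_join(left, right):
    print(" ".join(row))
```

Because the two inputs are walked in step, a row whose ID is out of order can be missed, which is why the files should be sorted first.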
Take a look at this AWK example:
awk 'FNR == NR { d[$1] = $0; next } { $1 = d[$1] } 1' file1.csv file2.txt
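Here I replace the first column in file2.txt with the corresponding line (6 columns) of file1.csv. The inserted fields keep the tabs from file1.csv, while the rest of each line stays space-separated.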
Output:
R923E06 273911 2990492 2970203 F Resistant CC GG TT AA ...
R923F06 273910 2990492 2970203 F Resistant GG TT AA CC ...
R923H02 273894 2970600 2990171 M Resistant TT GG CC AA ...
If you want everything tab-separated in the result, you can add gsub(/[[:space:]]/,"\t") to replace every space or tab with a tab:
awk 'FNR == NR { d[$1] = $0; next } { $1 = d[$1]; gsub(/[[:space:]]/,"\t") } 1' file1.csv file2.txt
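The same lookup idea can be sketched in Python, in case awk is not available. This is only an illustration under a few assumptions: the function name replace_first_column is made up, the sample rows are the visible part of the question's data, and, unlike the awk one-liner, an ID with no match in file1.csv is kept as-is rather than replaced with an empty string.

```python
def replace_first_column(file1_lines, file2_lines, sep=" "):
    """Yield file2 rows with the ID replaced by the matching file1 row."""
    # First pass: remember every file1 row under its ID (first tab-separated field)
    lookup = {}
    for line in file1_lines:
        fields = line.rstrip("\n").split("\t")
        lookup[fields[0]] = fields

    # Second pass: stream file2 one line at a time, so a large file is never held in memory
    for line in file2_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Assumption: keep the original ID when file1 has no matching row
        replacement = lookup.get(fields[0], [fields[0]])
        yield sep.join(replacement + fields[1:])


# Sample rows from the question
file1 = ["R923E06\t273911\t2990492\t2970203\tF\tResistant",
         "R923F06\t273910\t2990492\t2970203\tF\tResistant",
         "R923H02\t273894\t2970600\t2990171\tM\tResistant"]
file2 = ["R923E06 CC GG TT AA",
         "R923F06 GG TT AA CC",
         "R923H02 TT GG CC AA"]

# sep="\t" gives fully tab-separated output, like the gsub variant
for row in replace_first_column(file1, file2, sep="\t"):
    print(row)
```

With real files, the two lists can be replaced by open file objects, for example open("file1.csv") and open("file2.txt"), since the function only iterates over lines.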
# import pandas
import pandas as pd

# read file1.csv (tab-delimited, no header row)
file1 = pd.read_csv('file1.csv', sep='\t', header=None)

# read file2.txt (space-delimited, no header row); read_csv can read .txt files as well
file2 = pd.read_csv('file2.txt', sep=' ', header=None)

# drop the first column (the ID) of file2
file2 = file2.drop(columns=0)

# concat both frames side by side
# note: this assumes both files list the same IDs in the same order
final = pd.concat([file1, file2], axis=1)

# the column labels are now duplicated integers; give them unique names
final.columns = ['col%d' % (i + 1) for i in range(final.shape[1])]

# save as tab-separated text, without the row index
final.to_csv('out.csv', sep='\t', index=False)
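If the two files are not guaranteed to list the IDs in the same order, a keyed merge is safer than a positional concat. The following is a rough sketch using pandas merge; the column names id, meta1, geno1 and so on are made up for the example, and the inline strings stand in for file1.csv and file2.txt (with file2 deliberately shuffled).

```python
import io

import pandas as pd

# Inline sample data standing in for file1.csv (tab-delimited) and file2.txt (space-delimited)
file1_text = ("R923E06\t273911\t2990492\t2970203\tF\tResistant\n"
              "R923F06\t273910\t2990492\t2970203\tF\tResistant\n"
              "R923H02\t273894\t2970600\t2990171\tM\tResistant\n")
file2_text = ("R923H02 TT GG CC AA\n"   # deliberately in a different order
              "R923E06 CC GG TT AA\n"
              "R923F06 GG TT AA CC\n")

file1 = pd.read_csv(io.StringIO(file1_text), sep="\t", header=None)
file2 = pd.read_csv(io.StringIO(file2_text), sep=" ", header=None)

# Give the columns distinct (hypothetical) names so the merge has a key and no clashing labels
file1.columns = ["id"] + ["meta%d" % i for i in range(1, file1.shape[1])]
file2.columns = ["id"] + ["geno%d" % i for i in range(1, file2.shape[1])]

# Inner join on the ID: file1 columns first, then the file2 columns of the same ID
merged = file1.merge(file2, on="id", how="inner")

# Print as tab-separated text without index or header
print(merged.to_csv(sep="\t", index=False, header=False), end="")
```

An inner join drops IDs that appear in only one file; how="left" or how="right" would keep them with empty cells instead.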