Merge the two files
I have two files in this format:
1.txt
1233445555,
4333333322,
12223344343
22333444337
33443445555
2.txt
42202123456,
42202234567,
42203234568,
42204356789,
What I want is the position of each first-column value of file 2, found by comparing it against the first column of file 1: if the first-column string from file 2 is found in file 1, the output should give the position of that row in file 1.
With my awk command I was able to sort the file according to column 1 of 2.csv, but I am not able to find the position of each row:
awk -F, 'FNR==NR {a[$1]=$0; next}; $1 in a {print a[$1]}' 1.csv 2.csv > 3.txt
cat 3.csv
38202123456
48202234567
672032345682
76204356789
88205443456
First create a dictionary mapping key => row index from the second file, using a dictionary comprehension, with indexes starting at 1.
Then open file 1 and look up each key in that dictionary. If it is found, write the data and the position, passing a generator expression to writerows so that the rows are written without building an intermediate list.
import csv
# create key => row index dictionary
with open("file2.csv") as f2:
    # we only need the first cell of each row
    # so we discard the rest using the *_ syntax when unpacking rows
    d = {first_cell:i+1 for i,(first_cell,*_) in enumerate(csv.reader(f2))}
# write output
with open("file1.csv") as f1, open("3.csv","w",newline='') as f3:
    csv.writer(f3).writerows([k,"{} -position {}".format(v,d[k])] for k,v in csv.reader(f1) if k in d)
Note: Python 2 users should replace
{first_cell:i+1 for i,(first_cell,*_) in enumerate(csv.reader(f2))}
with
{row[0]:i+1 for i,row in enumerate(csv.reader(f2))}
and
open("3.csv","w",newline='')
with
open("3.csv","wb")
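The dictionary lookup described above can be sketched as a small function that runs on in-memory data. This is only an illustration under assumptions: the function name positions_in_second and the sample values are hypothetical, not taken from the question's files, and unlike the answer's code it reports only the key and its position and skips empty rows.

```python
import csv
import io


def positions_in_second(first_rows, second_rows):
    """Yield (key, position) for each row of first_rows whose first cell
    also appears as a first cell in second_rows (positions start at 1)."""
    # Map first cell -> 1-based row number in the second input.
    index = {row[0]: i for i, row in enumerate(second_rows, start=1) if row}
    for row in first_rows:
        if row and row[0] in index:
            yield row[0], index[row[0]]


# Hypothetical sample data, not the files from the question.
file1 = io.StringIO("42203234568,a\n99999999999,b\n42202123456,c\n")
file2 = io.StringIO("42202123456,\n42202234567,\n42203234568,\n")

result = list(positions_in_second(csv.reader(file1), csv.reader(file2)))
print(result)
```

Because the index is a dictionary, each lookup takes constant time on average, however many rows the second file has.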
Below is a solution to the initial problem based on Python's list index method.
# Reading the CSV files
with open('1.csv') as f1:
    rows_1 = f1.read().split('\n')
with open('2.csv') as f2:
    rows_2 = f2.read().split('\n')
# Extracting the columns of each file
for i in range(len(rows_1)):
    rows_1[i] = rows_1[i].split(',')
# ...note that from the second CSV file we need only the first column
for i in range(len(rows_2)):
    rows_2[i] = rows_2[i].split(',')[0]
# Comparing the data
res = []
for r in rows_1:
    try:
        res.append(r[0] + ',' + r[1] +
                   ' -position ' + str(rows_2.index(r[0]) + 1))
    except (ValueError, IndexError):
        # ValueError: key not in file 2; IndexError: row has no second column
        pass
# Saving the result in a new file
with open('3.csv', 'w') as f3:
    for r in res:
        f3.write(r + '\n')
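One difference between the index-based approach and a dictionary built with a comprehension shows up when the second file contains duplicate keys. The short sketch below uses a hypothetical list of keys (not data from the question) to illustrate it, along with the ValueError that list.index raises for a missing value.

```python
keys = ["a", "b", "a"]  # hypothetical first-column values with a duplicate

# list.index returns the position of the first match (0-based, so add 1).
first_pos = keys.index("a") + 1

# A dict comprehension overwrites earlier entries, so the last match wins.
last_pos = {k: i for i, k in enumerate(keys, start=1)}["a"]

# list.index raises ValueError when the value is not present.
try:
    keys.index("z")
    missing = False
except ValueError:
    missing = True

print(first_pos, last_pos, missing)
```

Note also that list.index scans the list from the start on every call, so for large files the dictionary lookup is likely to be faster.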