Compare string in one txt file to string in another
I'm working on a script that will be used in a daily automation. I have 2 files: one is a static file containing a list of cusips. The second is a data file that looks something like this:
<Imports/>
<InterpretFXRates/>
<SCXList date="20170309">
<SCX type="cs" iso="USD" symbol="SPLS" cusip="855030102" name="STAPLES INC COM" issuer="us" record="20170324" maturity="20170413" intdiv=".48" sap="NR" moody="NR" apinternalid="USD855030102" action="a"/>
<SCX type="cs" iso="USD" symbol="ARE" cusip="015271109" name="ALEXANDRIA REAL ESTATE EQUITIES INC COM" issuer="us" record="20170331" maturity="20170417" intdiv="3.32" sap="NR" moody="NR" apinternalid="USD015271109" action="a"/>
<SCX type="cs" iso="USD" symbol="AMGN" cusip="031162100" name="AMGEN INC COM" issuer="us" record="20170517" maturity="20170608" intdiv="4.6" sap="NR" moody="NR" apinternalid="USD031162100" action="a"/>
What I'm trying to do is iterate through each cusip in the static file and check whether it appears in any of the lines above. If it is found, we delete that line from the new file.
import csv

bond_list = 'BondFilterList.txt'  # contains the list of cusips
dataport_file = 'test.scx'  # contains the <SCX... data
output_file = 'out.scx'
data = []
with open(bond_list, 'r') as bl, open(dataport_file, 'r') as df:
    for cusip in bl:
        lines = [y.strip() for y in df]
        for line in lines:
            if cusip in line:
                print("Matched")
            else:
                data.append(line)
with open(output_file, "w") as output:
    writer = csv.writer(output, lineterminator = '\n', escapechar = ' ', quoting = csv.QUOTE_NONE)
    for x in data:
        writer.writerow([x])
output.close
I'm definitely missing something, because my if statement always evaluates to False.
for cusip in bl:
    for line in (y.strip() for y in df):
This is a double loop over 2 file iterators. The inner loop works only the first time. On every later pass of the outer loop it is not even entered, because df has already reached end of file.
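The exhaustion described above can be reproduced without any files on disk. This is only a minimal sketch: io.StringIO stands in for the two opened files, and the contents are made up for illustration.

```python
import io

# io.StringIO objects behave like opened text files; the contents are made up.
cusips = io.StringIO("855030102\n031162100\n")
data = io.StringIO("line one\nline two\nline three\n")

passes = []
for cusip in cusips:
    # The first pass consumes every line of `data`; later passes get nothing.
    passes.append([line.strip() for line in data])

print(passes)  # the second inner list is empty
```

Calling df.seek(0) before each inner pass would also rewind the file, but reading the lines into a list once is usually simpler.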
Rewrite:
lines = [y.strip() for y in df]  # list comprehension, not a generator expression: builds a real list
for cusip in bl:
    for line in lines:
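A second detail may also keep the if test False: each cusip read by iterating over bl still ends with its newline, while the data lines have been stripped. The sketch below uses a made-up in-memory list file and a shortened sample line to show the difference that .strip() makes.

```python
import io

# Made-up stand-in for BondFilterList.txt, one cusip per line.
bl = io.StringIO("855030102\n031162100\n")
line = '<SCX type="cs" cusip="855030102" action="a"/>'

raw = [cusip for cusip in bl]               # each item still ends with "\n"
cleaned = [cusip.strip() for cusip in raw]  # trailing newline removed

raw_hits = [cusip in line for cusip in raw]
cleaned_hits = [cusip in line for cusip in cleaned]
print(raw_hits, cleaned_hits)
```

Only the stripped values can match, since the data line contains no newline character.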
Here is my final code. Thanks again for the help, Jean.
import csv

bond_list = 'BondFilterList.txt'  # contains the list of cusips
dataport_file = 'test.scx'  # contains the <SCX... data
output_file = 'out.scx'
data = []
with open(bond_list, 'r') as bl, open(dataport_file, 'r') as df:
    lines = [y.strip() for y in df]
    cusips = [cusip.upper().strip() for cusip in bl]
# keep only the lines that mention none of the listed cusips
for line in lines:
    if not any(cusip in line for cusip in cusips):
        data.append(line)
# the with block closes the file, so no explicit close() is needed
with open(output_file, "w") as output:
    writer = csv.writer(output, lineterminator = '\n', escapechar = ' ', quoting = csv.QUOTE_NONE)
    for x in data:
        writer.writerow([x])
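One possible variation on the final code, offered as a sketch rather than a drop-in replacement: pull the cusip attribute out of each line with a regular expression and test it against a set. The names filter_lines and CUSIP_ATTR are hypothetical, and the pattern assumes every data line carries a cusip="..." attribute as in the sample. Blank lines in the list file are skipped, because an empty string counts as a substring of every line and would otherwise cause every line to be dropped.

```python
import re

# Assumes each data line carries a cusip="..." attribute, as in the sample data.
CUSIP_ATTR = re.compile(r'\bcusip="([^"]*)"')

def filter_lines(lines, cusips):
    """Return the lines whose cusip attribute is not in `cusips`."""
    # Skip blank entries: '' is a substring of every string.
    blocked = {c.strip().upper() for c in cusips if c.strip()}
    kept = []
    for line in lines:
        match = CUSIP_ATTR.search(line)
        if match and match.group(1).upper() in blocked:
            continue  # listed cusip: drop this line
        kept.append(line)
    return kept

sample = [
    '<SCXList date="20170309">',
    '<SCX type="cs" symbol="SPLS" cusip="855030102" apinternalid="USD855030102"/>',
    '<SCX type="cs" symbol="AMGN" cusip="031162100" apinternalid="USD031162100"/>',
]
kept = filter_lines(sample, ["855030102\n", "\n"])
print(kept)
```

The kept lines can then be written with plain output.write(line + "\n") calls. csv.writer is designed for delimited records and may apply its escaping rules to characters in the XML, so a plain write is the more cautious choice here.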