AttributeError: 'list' object has no attribute 'strip'
I am writing some code to tag a file. It looks at the previous line to see if there is an SCI tag and, if so, tags the current line with SCI_NXT in a fifth column (in a tab-delimited file).
However, I get an AttributeError saying that I am trying to strip a list, at the line previous_line = split_line(previous_line), when the variable holds the first line that is not a one-item line. I understand this is because the lines are being stored as lists, not as strings, but I do not understand how to rectify this. I tried using "extend", but this resulted in the first line being written with each character as a separate element, which is also not what I want.
Here is the test text I am working on:
</s>
<s>
Diptera NP Diptera-n SCI
was VBD be-v
the DT the-x
most RBS most-a
common JJ common-j
prey NN prey-n
among IN among-i
the DT the-x
insects NNS insect-n
potentially RB potentially-a
available JJ available-j
to IN to-i
Here is the code:
"""Tags a file with NEXT_SCI in extra feature column. Reads and writes vert files.
"""
import json
#from pip._vendor.pyparsing import line

VFILE = 'test_next.vert'


def split_line(line):
    """Split a line into five parts, word, tag, lempos, ti, sci"""
    # TODO: Speak to Diana about the spaces in the vert file - do they mean
    # anything?
    line = line.strip().split()
    if len(line) == 1:
        word = line[0]
        pos, lempos, tag = None, None, None
    elif len(line) == 3:
        word, pos, lempos = line
        tag = None
    elif len(line) == 4:
        word, pos, lempos, tag = line
    return [word, pos, lempos, tag]


def tag_next_sci(lines):
    """Loops through lines of original document to add to new file (tagged)
    """
    taggedlines = []
    for line in lines:
        taggedlines.append(tagline_next_sci(line, taggedlines))
    return taggedlines


def tagline_next_sci(line, taggedlines):
    """Assigns an indicator tag to a line
    """
    #<> are structural and do not need to be considered for feature tags so can be committed directly
    if line.startswith('<'):
        return line
    #look back at previous line to see if SCI, if so tag current line
    previous_line = taggedlines[-1]
    previous_line = split_line(previous_line)
    line = split_line(line)
    #look at last column. if SCI, print line, go to next line and add tag in final column ("\t\t\tNXT_SCI\n")
    if previous_line[-1] == "SCI":
        if len(line) == 3:
            print(line + "\t\t\tSCI_MOD\n")
            return(line + "\t\t\tSCI_MOD\n")
        if len(line) == 4:
            print(line + "\t\tSCI_MOD\n")
            return(line + "\t\tSCI_MOD\n")
    return line


def read_vfile(fname):
    """Reads a vert file
    """
    with open(fname, 'r') as vfile:
        lines = vfile.readlines()
    return lines


def write_vfile(fname, taggedlines):
    """Writes a vert file
    """
    # write to file
    with open(fname, 'w') as outfile:
        outfile.writelines(taggedlines)


def tag_vert_sci_next(fname, fname_out):
    """Creates a new file with tags
    """
    # read vertical file
    lines = read_vfile(fname)
    # tag file
    taggedlines = tag_next_sci(lines)
    # call write file
    write_vfile(fname_out, taggedlines)


def main(fname, fname_out):
    #call sci_next tagging
    tag_vert_sci_next('test_next.vert', fname_out)


if __name__ == "__main__":
    main('test_next.vert', 'zenodo_tagged_SCI_MOD.vert')
And the traceback:
Traceback (most recent call last):
  File "/home/sandra/git/trophic/tagging/tagging_NEXT.py", line 123, in <module>
    main('test_next.vert', 'zenodo_tagged_SCI_MOD.vert')
  File "/home/sandra/git/trophic/tagging/tagging_NEXT.py", line 120, in main
    tag_vert_sci_next('test_next.vert', fname_out)
  File "/home/sandra/git/trophic/tagging/tagging_NEXT.py", line 78, in tag_vert_sci_next
    taggedlines = tag_next_sci(lines)
  File "/home/sandra/git/trophic/tagging/tagging_NEXT.py", line 31, in tag_next_sci
    taggedlines.append(tagline_next_sci(line, taggedlines))
  File "/home/sandra/git/trophic/tagging/tagging_NEXT.py", line 43, in tagline_next_sci
    previous_line = split_line(previous_line)
  File "/home/sandra/git/trophic/tagging/tagging_NEXT.py", line 14, in split_line
    line = line.strip().split()
AttributeError: 'list' object has no attribute 'strip'
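Your issue seems to be that tagline_next_sci sometimes returns a list and not a string. For any non-structural line, line = split_line(line) rebinds line to a list, and the final return line hands that list back, so it is appended to taggedlines and later passed to split_line as previous_line. For example, I tried putting a print inside the function to see what was going on: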
...
def tagline_next_sci(line, taggedlines):
    """Assigns an indicator tag to a line
    """
    print('taggedlines', taggedlines)
    ...
and got the output
taggedlines []
taggedlines ['</s>\n']
taggedlines ['</s>\n', '<s>\n']
taggedlines ['</s>\n', '<s>\n', ['Diptera', 'NP', 'Diptera-n', 'SCI']]
So you should check every return path of the function to make sure it always returns a string. If you need to join your list back into a string, you can use "\t".join(line). Note that join only accepts strings, so the None placeholders from split_line would need to become "", and that split_line strips the trailing newline, so you may want to add "\n" back. Something like:
return line if isinstance(line, str) else "\t".join(line)
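To make the failure and the fix concrete, here is a minimal sketch rather than the asker's exact code. The helper name to_text is hypothetical, and split_line is simplified to a plain whitespace split. It reproduces the error by re-splitting a list, then shows that converting back to a string first makes the round trip work:

```python
def split_line(line):
    """Split a whitespace-delimited line into a list of fields (simplified)."""
    return line.strip().split()


def to_text(fields):
    """Return a string: pass strings through, tab-join lists and add a newline.

    Hypothetical helper, not part of the original script.
    """
    if isinstance(fields, str):
        return fields
    return "\t".join(fields) + "\n"


# Reproduce the error: a list ended up where a string was expected.
stored = split_line("Diptera NP Diptera-n SCI\n")  # a list, not a str
try:
    split_line(stored)
    error_message = ""
except AttributeError as exc:
    error_message = str(exc)

# Fix: convert back to a string before storing, so it can be re-split later.
stored_text = to_text(stored)
reparsed = split_line(stored_text)
```

The key point is that whatever goes into taggedlines must have the same type that split_line expects to receive on the next iteration.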
Thank you all for your help. Here is the code I ended up with:
"""Tags a file with SCI_MOD in extra feature column. Reads and writes vert files.
"""
import json

VFILE = 'zenodotaggedWS_ALL.vert'


def split_line(line):
    """Split a line into its parts"""
    line = line.strip().split()
    if len(line) == 1:
        word = line[0]
        pos, lempos, tag = "", "", ""
    elif len(line) == 3:
        word, pos, lempos = line
        tag = ""
    elif len(line) == 4:
        word, pos, lempos, tag = line
    return [word, pos, lempos, tag]


def tag_next_sci(lines):
    """Loops through lines of original document to add to new file (tagged)
    """
    taggedlines = []
    for line in lines:
        taggedlines.append(tagline_next_sci(line, taggedlines))
    return taggedlines


def tagline_next_sci(line, taggedlines):
    """Assigns an indicator tag to a line
    """
    # <> are structural and do not need to be considered for feature tags so can be committed directly
    if line.startswith('<'):
        return line
    # look back at previous line to see if SCI, if so tag current line
    previous_line = taggedlines[-1]
    previous_line = split_line(previous_line)
    line = split_line(line)
    if previous_line[2] == "SCI-n":
        print("\t".join(line) + "\tSCI_MOD\n")
        return "\t".join(line) + "\tSCI_MOD\n"
    # always return a newline-terminated string, never a list
    return line + "\n" if isinstance(line, str) else "\t".join(line) + "\n"


def read_vfile(fname):
    """Reads a vert file
    """
    with open(fname, 'r') as vfile:
        lines = vfile.readlines()
    return lines


def write_vfile(fname, taggedlines):
    """Writes a vert file
    """
    # write to file
    with open(fname, 'w') as outfile:
        outfile.writelines(taggedlines)


def tag_vert_sci_next(fname, fname_out):
    """Creates a new file with tags
    """
    # read vertical file
    lines = read_vfile(fname)
    # tag file
    taggedlines = tag_next_sci(lines)
    # call write file
    write_vfile(fname_out, taggedlines)


def main(fname, fname_out):
    # call sci_next tagging, using the file name passed in
    tag_vert_sci_next(fname, fname_out)


if __name__ == "__main__":
    main('zenodotaggedWS_ALL.vert', 'zenodo_tagged_SCIMOD2.vert')
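A possible alternative, sketched here under assumptions rather than taken from the thread, is to remember the previous line's parsed fields instead of re-splitting the already formatted output string. The names pad_fields and tag_lines are hypothetical. The sketch follows the original question in checking the fourth column for "SCI" (the final code above checks the lempos column for "SCI-n" instead), and it passes blank lines through unchanged:

```python
def pad_fields(line, width=4):
    """Split a line on whitespace and pad with empty strings up to `width` fields."""
    fields = line.strip().split()
    return fields + [""] * (width - len(fields))


def tag_lines(lines, marker="SCI", new_tag="SCI_MOD"):
    """Return output strings, tagging a line when the previous token line's
    fourth column equals `marker` (an assumption based on the question)."""
    tagged = []
    previous_fields = None  # parsed fields of the previous token line, if any
    for line in lines:
        if line.startswith("<") or not line.strip():
            # Structural or blank lines pass through and reset the look-back.
            tagged.append(line)
            previous_fields = None
            continue
        fields = pad_fields(line)
        out = list(fields)
        if previous_fields is not None and previous_fields[3] == marker:
            out.append(new_tag)  # becomes the fifth column
        tagged.append("\t".join(out) + "\n")
        previous_fields = fields  # keep the untagged fields for the next line
    return tagged


sample = ["<s>\n", "Diptera NP Diptera-n SCI\n", "was VBD be-v\n", "the DT the-x\n"]
result = tag_lines(sample)
```

Because the output list only ever receives strings and the look-back uses the parsed fields directly, the list-versus-string mismatch cannot occur in this arrangement.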