Python: Why does list import using quotation marks, how can I avoid this/get rid of them?
Really simple question here, but it has been bugging me for long enough to ask. The code looks like this:
f4 = open("genomekey2.txt", 'rb')
keyline = f4.readline()
keygenomes = []
for keyline in f4:
    keygenomes.append(keyline[:-1])
The genomekey2.txt file looks like this:
['Prochlorococcus marinus str. MIT 9202']
['Prochlorococcus marinus str. NATL1A']
['Synechococcus sp. RS9917']
['Nostoc sp. PCC 7120']
['Synechococcus sp. JA-2-3B'a(2-13)']
The problem is that when I print the keygenomes list, it has all of the entries I want, but with quotation marks around each of the [ ] found within the list. I want to get rid of the quotation marks so I can compare it with another list, but so far I haven't found a way. I tried...
for a in keygenomes:
    a.replace('"', '')
But that didn't seem to work. I would rather have a solution where it just doesn't add the quotation marks at all. What are they for anyway, and which part of the code (.append, .readline()) is responsible for adding them? Massively beginner question here, but you guys seem pretty nice.
Edit: I eventually want to compare it with a list which is formatted like this:
[['Arthrospira maxima CS-328'], ['Prochlorococcus marinus str. MIT 9301'], ['Synechococcus sp. CC9605'], ['Synechococcus sp. WH 5701'], ['Synechococcus sp. CB0205'], ['Prochlorococcus marinus str. MIT 9313'], ['Synechococcus sp. JA-3-3Ab'], ['Trichodesmium erythraeum IMS101'], ['Synechococcus sp. PCC 7335'], ['Trichodesmium erythraeum IMS101'], ...
Edit: So I think I got something to work with a combination of answers; thank you all for your help. The quotation marks were interfering with the list comparison, so I just added them to the first list as well. Even though I think this only mimics the list being entered as a string (a distinction I now think I understand), it seems to work:
f4 = open("genomekey2.txt", 'rb')
keyline = f4.readline()
keygenomes = []
for keyline in f4:
    keygenomes.append(keyline[:-1])
specieslist = " ".join(["%s" % el for el in specieslist])
nonconservedlist = [i for i in keygenomes if i not in specieslist]
Edit: Yeah, the above worked, but after understanding the problem better thanks to your help, I found a more elegant solution here (http://forums.devshed.com/python-programming-11/convert-string-to-list-71857.html). It looks like this:
for keyline in f4:
    keyline = eval(keyline)
    keygenomes.append(keyline)
Thanks!
Based on what you want to compare your list to, it seems like you want a list of lists and not a list of strings. Maybe this?
f4 = open("genomekey2.txt", 'rb')
keygenomes = []
for keyline in f4.readlines():
    # Skip blank lines, which eval cannot parse
    if keyline.strip():
        keygenomes.append(eval(keyline.strip()))
You are going to have issues with lines like this:
['Synechococcus sp. JA-2-3B'a(2-13)']
The quotes are not correct, and this will break the eval. Is it possible to mix the quotes? Like this instead...
["Synechococcus sp. JA-2-3B'a(2-13)"]
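As a hedged aside: if the file can be written with mixed quotes as above, `ast.literal_eval` from the standard library parses each line as a Python literal without running arbitrary code the way `eval` can. A minimal sketch, assuming the lines are already available as text (the `sample_lines` data and the `parse_line` helper are made up for illustration):

```python
import ast

# Hypothetical sample data standing in for lines read from genomekey2.txt.
sample_lines = [
    "['Nostoc sp. PCC 7120']\n",
    "[\"Synechococcus sp. JA-2-3B'a(2-13)\"]\n",
    "['Synechococcus sp. JA-2-3B'a(2-13)']\n",  # badly quoted, cannot be parsed
]

def parse_line(line):
    """Return the list literal on this line, or None if it is not valid Python."""
    try:
        return ast.literal_eval(line.strip())
    except (SyntaxError, ValueError):
        return None

keygenomes = [parse_line(line) for line in sample_lines]
print(keygenomes)
```

Lines that come back as `None` are the ones whose quoting needs fixing in the file itself.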
A quick and dirty solution is to skip the first two and last two chars of the line:
f4 = open("genomekey2.txt", 'rb')
keyline = f4.readline()
keygenomes = []
for keyline in f4:
    # CHANGE HERE: strip the line ending first, then drop the leading [' and trailing ']
    keygenomes.append(keyline.strip()[2:-2])
Otherwise, use a regexp like:
import re
g = re.match(r"^\['(?P<value>.*)'\]", "['Synechococcus sp. JA-2-3B'a(2-13)']")
g.group(1)
# "Synechococcus sp. JA-2-3B'a(2-13)"
a.replace(...) returns the modified string; it doesn't modify a.
Therefore you need to actually replace the entries in your list, or fix them before you put them in your list.
keygenomes = [ a.replace('"', '') for a in keygenomes ]
Edit: I think I had not read the question carefully enough. The " appears when you print a string; it's not part of the string itself.
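To illustrate the point about printing, here is a small sketch (the sample line and variable names are only for illustration). Printing a list shows the repr of each element, which is where the quotes come from, while printing a single element shows the bare text. It also shows that `replace` returns a new string rather than changing the original.

```python
line = "['Nostoc sp. PCC 7120']"
keygenomes = [line]

print(keygenomes)     # the list display wraps the string in quotes
print(keygenomes[0])  # the string itself has no surrounding double quotes

# replace() returns a new string; the original is left untouched.
cleaned = line.replace("'", "")
print(line)
print(cleaned)
```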
Your replace is using the wrong string; you're trying to remove single quotes, but the string you pass to replace is a double quote. Also, the replacement isn't in-place: since strings aren't mutable, you have to use the return value.
keygenomes.append(keyline[:-1].replace("'", ""))
Try something like this:
keygenomes = []
f4 = open("genomekey2.txt", 'rb')
keyline = f4.readline()
for keyline in f4:
    keyline = keyline.strip()
    if keyline and keyline.startswith("['") and keyline.endswith("']"):
        keygenomes.append(keyline[2:-2])
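Since the list to compare against is a list of one-element lists, plain strings collected this way need to be matched against the inner values rather than the inner lists. A rough sketch, with made-up sample data standing in for both lists (`specieslist` and `nonconservedlist` are the names used in the question; the contents and the `known_species` helper are assumptions):

```python
# Hypothetical sample data; the real lists come from the file and the other source.
keygenomes = ["Synechococcus sp. RS9917", "Synechococcus sp. CC9605"]
specieslist = [["Arthrospira maxima CS-328"], ["Synechococcus sp. CC9605"]]

# Flatten the one-element lists into a set for fast membership tests.
known_species = {entry[0] for entry in specieslist}

# Keep only the genome names that do not appear in the species list.
nonconservedlist = [name for name in keygenomes if name not in known_species]
print(nonconservedlist)
```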