Python: Why does list import using quotation marks, how can I avoid this/get rid of them?

Question

Really simple question here but bugging me for long enough to ask. Code looks like this:

f4 = open("genomekey2.txt", 'rb')
keyline = f4.readline()
keygenomes = []
for keyline in f4:
   keygenomes.append(keyline[:-1])

The genomekey2.txt file looks like this:

['Prochlorococcus marinus str. MIT 9202']
['Prochlorococcus marinus str. NATL1A']
['Synechococcus sp. RS9917']
['Nostoc sp. PCC 7120']
['Synechococcus sp. JA-2-3B'a(2-13)']

The problem is that when I print the keygenomes list, it has all of the entries I want, but with quotation marks around each of the [ ] entries in the list. I want to get rid of the quotation marks so I can compare it with another list, but so far I haven't found a way. I tried:

for a in keygenomes:
    a.replace('"', '')

But that didn't seem to work. I would rather a solution where it just doesn't add the quotation marks on at all. What are they for anyway and which part of the code (.append, .readline()) is responsible for adding them? Massively beginner question here but you guys seem pretty nice.

Edit: I eventually want to compare it with a list which is formatted as such

[['Arthrospira maxima CS-328'], ['Prochlorococcus marinus str. MIT 9301'], ['Synechococcus sp. CC9605'], ['Synechococcus sp. WH 5701'], ['Synechococcus sp. CB0205'], ['Prochlorococcus marinus str. MIT 9313'], ['Synechococcus sp. JA-3-3Ab'], ['Trichodesmium erythraeum IMS101'], ['Synechococcus sp. PCC 7335'], ['Trichodesmium erythraeum IMS101'], ...

Edit: I think I got something to work by combining the answers; thank you all for your help. The quotation marks were interfering with the list comparison, so I just added them on to the first list as well. I think this only mimics the list being entered as a string (a distinction I now think I understand), but it seems to work:

f4 = open("genomekey2.txt", 'rb')
keyline = f4.readline()
keygenomes = []
for keyline in f4:
    keygenomes.append(keyline[:-1])

specieslist = " ".join(["%s" % el for el in specieslist])

nonconservedlist = [i for i in keygenomes if i not in specieslist]

Edit: The above worked, but after understanding the problem better thanks to your help, I found a more elegant solution here (http://forums.devshed.com/python-programming-11/convert-string-to-list-71857.html):

for keyline in f4:
    keyline = eval(keyline)
    keygenomes.append(keyline)

Thanks!

Answer 1

Based on what you want to compare your list to, it seems you want a list of lists and not a list of strings. Maybe this?

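f4 = open("genomekey2.txt", 'rb')
keygenomes = []
for keyline in f4.readlines():
    keyline = keyline.strip()
    if keyline:  # skip blank lines, which eval cannot parse
        keygenomes.append(eval(keyline))  # turn the "['name']" text into a real list
f4.close()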

You are going to have issues with lines like this:

['Synechococcus sp. JA-2-3B'a(2-13)']

The quotes are not correct and it will break the eval. Is it possible to mix the quotes? Like this instead...

["Synechococcus sp. JA-2-3B'a(2-13)"]
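If the file can be rewritten with mixed quotes as suggested, the standard library's ast.literal_eval could stand in for eval, since it only accepts Python literals. The following is a minimal sketch, assuming the lines are already decoded text; parse_key_lines and the sample data are hypothetical names used only for illustration.

```python
import ast

def parse_key_lines(lines):
    """Parse lines such as "['Nostoc sp. PCC 7120']" into one-element lists."""
    parsed = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # literal_eval only evaluates literals (lists, strings, numbers, ...),
        # so other code in the file is rejected rather than executed.
        parsed.append(ast.literal_eval(line))
    return parsed

# Hypothetical sample standing in for the contents of genomekey2.txt
sample = [
    "['Nostoc sp. PCC 7120']\n",
    "[\"Synechococcus sp. JA-2-3B'a(2-13)\"]\n",
    "\n",
]
keygenomes = parse_key_lines(sample)
print(keygenomes)
```

A line that still has the broken quoting, such as ['Synechococcus sp. JA-2-3B'a(2-13)'], is not a valid literal and raises a SyntaxError here, just as it breaks eval.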

Answer 2

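A quick and dirty solution is to strip the trailing newline, then skip the first two and last two chars of the line:

f4 = open("genomekey2.txt", 'rb')
keyline = f4.readline()  # as in the question: the first line is read and discarded
keygenomes = []
for keyline in f4:
   # CHANGE HERE: strip the newline first, then drop the leading [' and trailing ']
   keygenomes.append(keyline.strip()[2:-2])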

Otherwise, use a regexp like:

import re
g = re.match(r"^\['(?P<value>.*)'\]", "['Synechococcus sp. JA-2-3B'a(2-13)']")
g.group(1)
# "Synechococcus sp. JA-2-3B'a(2-13)"
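Applied to every line, the regexp idea might look like the sketch below. It assumes the lines are decoded text; extract_names, LINE_PATTERN and the sample list are hypothetical names, not part of the original code.

```python
import re

# The greedy .* keeps apostrophes inside the name; only the outer [' and '] are dropped.
LINE_PATTERN = re.compile(r"^\['(?P<value>.*)'\]$")

def extract_names(lines):
    """Return the text between [' and '] for every line that matches."""
    names = []
    for line in lines:
        match = LINE_PATTERN.match(line.strip())
        if match:  # lines in any other format are silently skipped
            names.append(match.group("value"))
    return names

# Hypothetical sample standing in for the contents of genomekey2.txt
sample = [
    "['Synechococcus sp. RS9917']\n",
    "['Synechococcus sp. JA-2-3B'a(2-13)']\n",
]
names = extract_names(sample)
print(names)
```

Unlike eval, this copes with the line whose quotes are unbalanced, because it never tries to interpret the text as Python.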

Answer 3

a.replace(...) returns the modified string; it doesn't modify a.

Therefore you need to actually replace the entries in your list, or fix them before you put them in your list.

keygenomes = [ a.replace('"', '') for a in keygenomes ]

Edit:

I think I had not read the question carefully enough. The " appears when you print a list that contains strings; it's not part of the string itself.
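A small sketch of where the quotation marks come from: printing a list shows the repr of each element, and the repr of a string includes quotes. The variable names below are made up for illustration.

```python
name = "['Nostoc sp. PCC 7120']"  # a string that merely looks like a list
names = [name]

shown_alone = str(name)     # what print(name) displays
shown_in_list = str(names)  # what print(names) displays

print(shown_alone)
print(shown_in_list)
print('"' in name)  # the string itself contains no double quote
```

So neither .append nor .readline() adds anything; the double quotes only exist in the printed representation of the list.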

Answer 4

Your replace is using the wrong string: you're trying to remove single quotes, but your string is a double quote. Also, the replacement isn't in-place since strings aren't mutable, so you have to use the return value.

keygenomes.append(keyline[:-1].replace("'", ""))
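To illustrate the point about return values, here is a short sketch with a made-up sample line. Note that this approach leaves the square brackets in place and would also remove an apostrophe inside a name.

```python
line = "['Nostoc sp. PCC 7120']\n"

line.replace("'", "")  # the result is thrown away; line itself is unchanged
unchanged = line

cleaned = line[:-1].replace("'", "")  # keep the returned string instead

print(repr(unchanged))
print(repr(cleaned))
```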

Answer 5

Try something like this:

keygenomes = []
f4 = open("genomekey2.txt", 'rb')
keyline = f4.readline()
for keyline in f4:
    keyline = keyline.strip()
    if keyline and keyline.startswith("['") and keyline.endswith("']"):
        keygenomes.append(keyline[2:-2])
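Since this leaves keygenomes as plain strings while the list in the question holds one-element lists, the comparison needs the two shapes to agree. One possible sketch, assuming every entry of specieslist is a one-element list as in the question; missing_from and the sample values are hypothetical.

```python
def missing_from(keygenomes, specieslist):
    """Return the names in keygenomes that do not appear in specieslist."""
    # Unwrap [['name'], ...] into a set of plain names for fast lookups.
    known = {entry[0] for entry in specieslist}
    return [name for name in keygenomes if name not in known]

# Hypothetical sample data in the two shapes described above
keygenomes = ["Nostoc sp. PCC 7120", "Synechococcus sp. CC9605"]
specieslist = [["Arthrospira maxima CS-328"], ["Synechococcus sp. CC9605"]]

nonconservedlist = missing_from(keygenomes, specieslist)
print(nonconservedlist)
```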

5 answers

solution1
2 ACCEPTED 2012-04-04 18:01:13

solution2
1 2012-04-04 16:11:51

solution3
1 2012-04-04 16:16:03

solution4
0 2012-04-04 16:16:55

solution5
0 2012-04-04 16:32:32
