
Python: Very Basic, Can't figure out why it is not splitting into the larger number listed but rather into individual integers

Really quick question here, some other people helped me on another problem but I can't get any of their code to work because I don't understand something very fundamental here.

8000.5   16745     0.1257
8001.0   16745     0.1242
8001.5   16745     0.1565
8002.0   16745     0.1595
8002.5   16745     0.1093
8003.0   16745     0.1644

I have a data file as such, and when I type

f1 = open(sys.argv[1], 'rt')
for line in f1:
    fields = line.split()
    print list(fields [0])

I get the output

['1', '6', '8', '2', '5', '.', '5']
['1', '6', '8', '2', '6', '.', '0']
['1', '6', '8', '2', '6', '.', '5']
['1', '6', '8', '2', '7', '.', '0']
['1', '6', '8', '2', '7', '.', '5']
['1', '6', '8', '2', '8', '.', '0']
['1', '6', '8', '2', '8', '.', '5']
['1', '6', '8', '2', '9', '.', '0']

Whereas I would have expected from trialling stuff like print list(fields) to get something like

[16825.5, 162826.0 ....] 

What obvious thing am I missing here?

thanks!

Remove the list() call; .split() already returns a list.

You are turning the first element of the fields into a list:

>>> fields = ['8000.5', '16745', '0.1257']
>>> fields[0]
'8000.5'
>>> list(fields[0])
['8', '0', '0', '0', '.', '5']

If you want to have the first column as a list, you can build a list as you go:

myfirstcolumn = []
for line in f1:
    fields = line.split()
    myfirstcolumn.append(fields[0])

This can be simplified into a list comprehension:

myfirstcolumn = [line.split()[0] for line in f1]

The last command is the problem.

print list(fields[0]) takes the zeroth item from your split list and then converts that item into a list.

Since you already have a list of strings, ['8000.5','16745','0.1257'], the zeroth item is a string, and applying list() to a string produces a list of its individual characters.

Your first problem is that you apply list to a string:

list("123") == ["1", "2", "3"]

Secondly, you print once per line in the file, but it seems you want to collect the first item of each line and print them all at once.

Third, in Python 2, the 't' in the mode passed to open is unnecessary (text mode is the default).

I think what you want is:

with open(sys.argv[1], 'r') as f:
    print [ line.split()[0] for line in f ]

The problem was that you were converting the first field, which you had correctly extracted, into a list.

Here's a solution to print the first column:

with open(sys.argv[1]) as f1:
    first_col = []
    for line in f1:
        fields = line.split()
        first_col.append(fields[0])

    print first_col

gives:

['8000.5', '8001.0', '8001.5', '8002.0', '8002.5', '8003.0']

Rather than doing f1 = open(sys.argv[1], 'rt'), consider using with, which will close the file when you are done or if an exception occurs. Also, I left off rt since open() defaults to read and text mode.

Finally, this could also be written using a list comprehension:

with open(sys.argv[1]) as f1:
    first_col = [line.split()[0] for line in f1]
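The column collected this way holds strings, while the expected output in the question looks numeric. A minimal sketch of the conversion follows, assuming every first field parses as a float. The in-memory `sample` is a hypothetical stand-in for the real file, and `print(...)` with a single argument is used so the snippet runs on both Python 2 and Python 3.

```python
import io

# Hypothetical stand-in for the data file, so the sketch runs on its own.
sample = io.StringIO(
    u"8000.5   16745     0.1257\n"
    u"8001.0   16745     0.1242\n"
    u"8001.5   16745     0.1565\n"
)

# float() turns each string field into a number; blank lines are skipped.
first_col = [float(line.split()[0]) for line in sample if line.strip()]

print(first_col)
```

With a real file, the same comprehension would go inside the with open(sys.argv[1]) block in place of `sample`.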

Others have already done a great job answering this question. The behavior that you're seeing is because you're using list on a string. list will take any object that you can iterate over and turn it into a list, one element at a time. This isn't really surprising, except that the object doesn't even have to have an __iter__ method (which is the case with strings in Python 2). There are a number of posts on SO about __iter__, so I won't focus on that part.

In any event, try the following code and see what it prints out:

>>> def enlighten_me(obj):
...     print (list(obj))
...     print (hasattr(obj, '__iter__'))
...
>>> enlighten_me("Hello World") 
>>> enlighten_me( (1,2,3,4) )  
>>> enlighten_me( {'red':'wagon',1:5} )

Of course, you can try the example with sets, lists, generators ... Anything you can iterate over.
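For anyone who would rather run a script than type into the interpreter, here is a rough sketch of the same idea. The helper name `show` is made up for this illustration; it just returns what list() produces for a few kinds of iterable.

```python
def show(obj):
    # list() walks the object and collects one element per step.
    return list(obj)

print(show("8000.5"))                  # a string yields its characters
print(show((1, 2, 3, 4)))              # a tuple yields its items
print(show({'red': 'wagon'}))          # a dict yields its keys
print(show(x * 2 for x in range(3)))   # a generator yields its values
```

The first call is exactly what happened in the question: the string '8000.5' was split into its characters.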

Levon posted a nice answer about how to create a column while reading your file. I will demonstrate the same thing using the built-in zip function.

rows=[]
for row in myfile:
    rows.append(row.split())

#now rows is stored as [ [col1,col2,...] , [col1,col2,...], ... ]

At this point we could get the first column by (Levon's answer):

column1=[]
for row in rows:
    column1.append(row[0])

or more succinctly:

column1=[row[0] for row in rows]  #<-- This is called a list comprehension

But what if you want all the columns (and what if you don't know how many columns there are)? This is a job for zip.

zip takes iterables as input and matches them up. In other words:

zip(iter1,iter2)

will take iter1[0] and match it with iter2[0], and match iter1[1] with iter2[1] and so on -- kind of like a zipper if you think about it. But, zip can take more than just 2 arguments ...

zip(iter1,iter2,iter3) #results in [ (iter1[0],iter2[0],iter3[0]) , (iter1[1],iter2[1],iter3[1]), ... ]

Now, the last piece of the puzzle that we need is argument unpacking with the star operator. If I have a function:

def foo(a,b,c):
    print a
    print b
    print c

I can call that function like this:

A=[1,2,3]
foo(A[0],A[1],A[2])

Or, I can call it like this:

foo(*A)

Hopefully this makes sense -- the star takes each element in the list and "unpacks" it before passing it to foo.

So, putting the pieces together (remember back to the list of rows), we can unpack the list of rows and pass it to zip, which will match corresponding indices in each row (i.e. the columns).

columns=zip(*rows)

Now to get the first column, we just do:

columns[0]  #first column

For lists of lists, I like to think of zip(*list_of_lists) as a sort of poor-man's transpose.
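Here is a small self-contained sketch of that transpose, using three rows of the question's data typed in directly rather than read from a file. One assumption to flag: in Python 3, zip returns an iterator instead of a list, so the sketch wraps it in list() to make indexing work on either version.

```python
rows = [
    "8000.5   16745     0.1257".split(),
    "8001.0   16745     0.1242".split(),
    "8001.5   16745     0.1565".split(),
]

# zip(*rows) pairs up the items found at each index across all rows.
# list() is needed in Python 3, where zip returns an iterator.
columns = list(zip(*rows))

print(columns[0])   # first column, as a tuple of strings
print(len(columns)) # number of columns found
```

Note that each column comes back as a tuple, not a list; wrap it in list() if you need to modify it.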

Hopefully this has been helpful.
