Reading a CSV file using Python 2

I'm running Python 2.7. I'm very new to Python. I'm trying to read a CSV file (the values are separated by spaces) and separate the values inside based on the header above the coordinates. The format of the file isn't what I'm used to and I'm having trouble getting the values to read correctly. Even if I could get them to read correctly, I don't understand how to put them in a list.

Here is what the CSV file looks like:

# image name
1.png
# probe locations
100 100
200 100
100 200
300 300

# another image name
2.png
100 200
200 100
300 300
135 322

# end

Here's the code I am playing with:

class CommentedFile:
    def __init__(self, f, commentstring="#"):
        self.f = f
        self.commentstring = commentstring
    def next(self):
        line = self.f.next()
        while line.startswith(self.commentstring):
            line = self.f.next()
        return line
    def __iter__(self):
        return self

#I did this in order to ignore the comments in the CSV file

tsv_file = csv.reader(CommentedFile(open("test.exp", "rb")),
                  delimiter=' ')


for row in tsv_file:
    if row != int:
        next(tsv_file)
    if row:
        print row

the code prints out:

['100', '100']
['100', '200']
['100', '200']
['300', '300']
Traceback (most recent call last):
  File "the path", line 57, in <module>
next(tsv_file)
StopIteration

So I'm trying to get the program to separate the coordinates based on the header and then put them into separate lists. Thank you for your help!

Take a look at pandas. It has a DataFrame object that can hold your data and lets you manipulate it in an intuitive way. It also has a read_csv function that takes a lot of the hassle out of dealing with CSV files.

For example:

import pandas as pd

# Reads your CSV file and returns a DataFrame object, as mentioned above.
df = pd.read_csv("your_csv.csv", sep=' ', names=['co_a','co_b'], header=None, skiprows=2)

# Extracts the two coordinate columns into separate lists.
list1 = df.co_a.tolist()
list2 = df.co_b.tolist()

You can print df or df.head() to see your DataFrame and how your data is laid out. It's also worth mentioning that df.co_a is a Series object (think of it as a cross between a list and a dict), so you can probably do your analysis or manipulation right from there.

Also, if you show me how the comments appear in the CSV file, I can show you how to ignore them with read_csv.

I know you were looking for an answer that uses the csv module, but pandas is a much more advanced tool and might help you out in the long run.

Hope it helps!

Your code actually ran without an error for me. I don't know why you're getting the traceback.

tmp.csv

# image name
1.png
# probe locations
100 100
200 100
100 200
300 300

# another image name
2.png
100 200
200 100
300 300
135 322

# end

tmp.py

import csv

class CommentedFile:
    def __init__(self, f, commentstring="#"):
        self.f = f
        self.commentstring = commentstring
    def next(self):
        line = self.f.next()
        while line.startswith(self.commentstring):
            line = self.f.next()
        return line
    def __iter__(self):
        return self

#I did this in order to ignore the comments in the CSV file

tsv_file = csv.reader(CommentedFile(open("tmp.csv", "rb")),
                  delimiter=' ')


for row in tsv_file:
    if row != int:
        next(tsv_file)
    if row:
        print row

Shell output

tmp$python tmp.py 
['1.png']
['200', '100']
['300', '300']
['2.png']
['200', '100']
['135', '322']
tmp$uname -mprsv
Darwin 12.4.0 Darwin Kernel Version 12.4.0: Wed May  1 17:57:12 PDT 2013; root:xnu-2050.24.15~1/RELEASE_X86_64 x86_64 i386
tmp$python --version
Python 2.7.2
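
For what it's worth, the shell output above contains only every other row, which hints at where the traceback comes from. `row != int` compares a list with the type `int`, so it is always true, and `next(tsv_file)` throws away one row on every pass through the loop. If the file yields an odd number of rows after the comments are filtered out, that extra `next()` call runs past the end and raises StopIteration. Here is a minimal sketch that drops the manual `next()` and groups the coordinates under each image name. It assumes that any non-comment line that is not a pair of integers is an image name; `parse_probe_file` and `sample` are names made up for this illustration.

```python
def parse_probe_file(lines, commentstring="#"):
    """Group coordinate rows under the image name that precedes them.

    Returns a list of (image_name, [(x, y), ...]) tuples in file order.
    """
    groups = []
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comment lines such as "# image name".
        if not line or line.startswith(commentstring):
            continue
        fields = line.split()
        if len(fields) == 2 and all(f.isdigit() for f in fields):
            if not groups:
                raise ValueError("coordinates found before any image name")
            # Append the (x, y) pair to the most recent image's list.
            groups[-1][1].append((int(fields[0]), int(fields[1])))
        else:
            # Assumption: anything else is an image name starting a new group.
            groups.append((line, []))
    return groups


# Same content as the file shown in the question.
sample = """\
# image name
1.png
# probe locations
100 100
200 100
100 200
300 300

# another image name
2.png
100 200
200 100
300 300
135 322

# end
"""

for name, coords in parse_probe_file(sample.splitlines()):
    print("%s: %s" % (name, coords))
```

To read from disk instead, pass the open file object, since iterating over it yields lines: `parse_probe_file(open("test.exp"))`. Each image then has its own list of coordinates, which you can unpack into separate x and y lists with `zip(*coords)` if you need them.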
