
Python: how to count columns in a CSV opened with DictReader

I tried to count the columns of a CSV file opened with DictReader using this code:

import csv
import sys

with open(sys.argv[1], "r") as myfile2:
    reader = csv.DictReader(myfile2)
    first_row = next(reader)   # consumes the first data row
    num_cols = len(first_row)

It works, but the problem comes when I then try to iterate using

for row in reader: ...

The loop starts at the second data row, not at the first row after the header. After reading the documentation, I suppose the problem is the call to next(reader), but I couldn't find another way to count the columns.

Any ideas?

Thanks!

You can use csv.DictReader.fieldnames to get the number of columns:

import csv
import sys

with open(sys.argv[1]) as csvfile:
    reader = csv.DictReader(csvfile)
    num_cols = len(reader.fieldnames)  # header names, no data row consumed
    for row in reader:
        ...  # process each row

You can try:

import csv

# file_name is the path to your CSV file
with open(file_name, "r") as myfile2:
    reader = csv.DictReader(myfile2)
    num_col = len(reader.fieldnames)
    for row in reader:
        ...  # do whatever you need with each row

As a side note, len(first_row) on a dictionary returns the number of keys, which is the same value as len(first_row.keys()). The second form is only more explicit, so your original code works in this case.
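To illustrate the side note, here is a minimal sketch using made-up in-memory data (the sample string and variable names are assumptions, not from the question). It shows that len(row) and len(row.keys()) agree, and also one case where counting a row's keys can differ from the header: when a row has more values than the header has names, DictReader stores the extras in a list under the restkey key, which defaults to None.

```python
import csv
import io

# Hypothetical sample: the second data row has one value too many.
data = "A,B,C\n1,2,3\n4,5,6,7\n"

reader = csv.DictReader(io.StringIO(data))
num_cols = len(reader.fieldnames)  # columns declared by the header
rows = list(reader)

# For a well-formed row, len(row) and len(row.keys()) both match the header.
same_len = len(rows[0]) == len(rows[0].keys()) == num_cols

# Extra values are collected in a list under restkey (None by default),
# so this row has one more key than the header has columns.
extra_len = len(rows[1])
extra_values = rows[1][None]
```

This is one reason len(reader.fieldnames) is probably the more robust way to count columns.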

When you create the reader, it is an iterator: each request for the next item returns the following row and moves the position forward.

When you call next(), it returns the first data row. If called again, it returns the second. And if you run a for loop after calling next(), the loop won't start at the top; it continues from where the next() calls left off.

So you can either:

  • not call next() at all, or
  • re-create the reader (after rewinding or reopening the file) before using it in a loop, or
  • pre-compute the output of the reader, for example as a list (only do this for small CSV files).
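The last two options can be sketched roughly as follows. This is only an illustration on made-up in-memory data; with a real file you would call seek(0) on the open file object (or reopen the file) in the same way.

```python
import csv
import io

# Hypothetical sample data standing in for the CSV file.
csvfile = io.StringIO("A,B,C\n1,2,3\n4,5,6\n")

# Re-create the reader: peek at one row, rewind, then build a fresh reader.
reader = csv.DictReader(csvfile)
first_row = next(reader)          # consumes the first data row
num_cols = len(first_row)
csvfile.seek(0)                   # rewind the underlying file object
reader = csv.DictReader(csvfile)  # the new reader re-reads the header
rows_after_rewind = list(reader)  # starts again at the first data row

# Pre-compute: load every row into a list (small files only).
csvfile.seek(0)
all_rows = list(csv.DictReader(csvfile))
num_cols_from_list = len(all_rows[0]) if all_rows else 0
```

Rewinding alone is not enough: the old reader has already read the header, so a new DictReader is needed to parse the header line again.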

I hope that explains it.

csv.DictReader accepts a fieldnames argument, which is a sequence of column names, or headings if you prefer. The docs say:

If fieldnames is omitted, the values in the first row of file [...] will be used as the fieldnames

In the code in the question, fieldnames is omitted, so the reader automatically consumes the first line of the file to get the column names. Calling next(reader) then consumes the second line (the first data row), so iterating over reader starts at the third line.

>>> import csv, io
>>> data = """A,B,C
... 1,2,3
... 4,5,6
... """
>>>
>>> reader = csv.DictReader(io.StringIO(data))
>>> print(next(reader))
{'A': '1', 'B': '2', 'C': '3'}
>>> for row in reader:print(row)
... 
{'A': '4', 'B': '5', 'C': '6'}

As kederrac points out in their answer, you can use reader.fieldnames to get the number of columns. It is populated automatically by the DictReader, without consuming any data rows:

>>> reader = csv.DictReader(io.StringIO(data))
>>> print(reader.fieldnames)
['A', 'B', 'C']
>>> for row in reader:print(row)
... 
{'A': '1', 'B': '2', 'C': '3'}
{'A': '4', 'B': '5', 'C': '6'}
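For contrast, here is a small sketch of the other case mentioned in the docs, where fieldnames is passed explicitly. The headerless sample data and the column names are made up for illustration. Because the names are supplied, the reader does not treat the first line as a header, and every line comes back as a data row.

```python
import csv
import io

# Hypothetical headerless data: every line is a data row.
data = "1,2,3\n4,5,6\n"

reader = csv.DictReader(io.StringIO(data), fieldnames=["A", "B", "C"])
num_cols = len(reader.fieldnames)  # taken from the argument, nothing is read
rows = list(reader)                # both lines are returned as rows
```

If the file did have a header line, it would show up here as the first row, so explicit fieldnames only fit files without one (or files where you skip the header yourself).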


 