
python csv DictReader type

I'm starting to code in Python and I now have the problem that csv.DictReader gives me the wrong data type.

The csv file looks like:

Col1, Col2, Col3

1,2,3

90,2,3

pol = csv.DictReader(open('..\data\data.csv'),dialect='excel')

Col1 = []

for row in pol:
    if row["Col1"] < 90:
        Col1.append(row["Col1"] * 1.5)
    else:
        Col1.append("Col1")

I get the following error:

if row["Col1"] < 90:
TypeError: unorderable types: str() < int()

I don't want to convert every single value by hand. Is it possible to define the types of the columns?

You could use a library like pandas; it will infer the types for you (it's a bit of overkill, but it does the job).

import pandas
data = pandas.read_csv(r'..\data\data.csv')
# if you just want to retrieve the first column as a list of int do
list(data.Col1)
>>> [1, 90]

# to convert the whole CSV file to a list of dicts use
data.to_dict('records')
>>> [{' Col2': 2, ' Col3': 3, 'Col1': 1}, {' Col2': 2, ' Col3': 3, 'Col1': 90}]

Alternatively here is an implementation of a typed DictReader:

from csv import DictReader

# Python 3 version: the iterator hook is __next__, and the built-in
# map/zip replace itertools.imap/izip.
class TypedDictReader(DictReader):
  def __init__(self, f, fieldnames=None, restkey=None, restval=None,
      dialect="excel", fieldtypes=None, *args, **kwds):

    DictReader.__init__(self, f, fieldnames, restkey, restval, dialect, *args, **kwds)
    self._fieldtypes = fieldtypes

  def __next__(self):
    d = DictReader.__next__(self)
    # only convert when a type was supplied for every field
    if self._fieldtypes is not None and len(self._fieldtypes) >= len(d):
      # extract the values in the same order as the csv header
      values = map(d.get, self.fieldnames)
      # apply type conversions
      converted = (x(y) for (x, y) in zip(self._fieldtypes, values))
      # pass the field names and the converted values to the dict constructor
      d = dict(zip(self.fieldnames, converted))

    return d

and here is how to use it:

reader = TypedDictReader(open(r'..\data\data.csv'), dialect='excel',
  fieldtypes=[int, int, int])
list(reader)
>>> [{' Col2': 2, ' Col3': 3, 'Col1': 1}, {' Col2': 2, ' Col3': 3, 'Col1': 90}]
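A lighter-weight variant of the same idea is to wrap an ordinary DictReader in a generator instead of subclassing it. The sketch below is only an illustration: the helper name typed_rows and the converters mapping are hypothetical, and an in-memory string stands in for the data file. It also passes skipinitialspace=True so that the spaces after the commas in the header do not end up in the keys.

```python
import csv
import io

def typed_rows(reader, converters):
    """Yield each row from `reader` with per-column converters applied.

    `converters` maps a column name to a callable such as int or float;
    columns without an entry are left as strings. (Hypothetical helper.)
    """
    for row in reader:
        yield {key: converters.get(key, str)(value) for key, value in row.items()}

# Stand-in for the data file from the question.
sample = io.StringIO("Col1, Col2, Col3\n1,2,3\n90,2,3\n")

# skipinitialspace=True drops the space that follows each delimiter,
# so the header keys are 'Col2' and 'Col3' rather than ' Col2' and ' Col3'.
reader = csv.DictReader(sample, dialect="excel", skipinitialspace=True)

rows = list(typed_rows(reader, {"Col1": int, "Col2": int, "Col3": float}))
print(rows)
```

Keying the converters by column name rather than by position means the column order in the file does not matter.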

If you quote the non-numeric values in the csv file (including the names in the header row) and initialize the reader by

pol = csv.DictReader(open(r'..\data\data.csv'),
    quoting=csv.QUOTE_NONNUMERIC, dialect="excel")

then all unquoted values will be automatically converted to floats.
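As a small illustration of that behaviour, the sketch below reads an in-memory stand-in for the file (an assumption made so the example is self-contained) in which the header names are quoted. With an unquoted header the reader would try to convert "Col1" to a float and raise a ValueError.

```python
import csv
import io

# Stand-in for the data file, with the non-numeric header values quoted.
sample = io.StringIO('"Col1","Col2","Col3"\n1,2,3\n90,2,3\n')

reader = csv.DictReader(sample, quoting=csv.QUOTE_NONNUMERIC, dialect="excel")
rows = list(reader)

# Every unquoted field has been converted to float by the csv module.
for row in rows:
    print(row["Col1"], type(row["Col1"]).__name__)
```

Note that the values come back as floats, not ints, so 90 is read as 90.0.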

I haven't used DictReader before, but you could just do this to the value:

...
for row in pol:
    col1 = float(row["Col1"]) # or int()
    ...

Then use col1 throughout. You could probably also edit the dictionary in place:

row["Col1"] = float(row["Col1"])

But it depends on what you want to use the row for later.

It looks like you want Col1 to be an array of numbers, so you'd need to convert row["Col1"] to a number whether or not you were comparing it to a number. So convert it!
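Put together, the loop from the question with an explicit conversion might look like the sketch below. It reads an in-memory stand-in for the file, and it assumes that the else branch was meant to keep the value unchanged rather than append the literal string "Col1"; both of those are guesses, not something stated in the question.

```python
import csv
import io

# Stand-in for the data file from the question.
sample = io.StringIO("Col1, Col2, Col3\n1,2,3\n90,2,3\n")

col1 = []
for row in csv.DictReader(sample, dialect="excel"):
    value = int(row["Col1"])  # DictReader always yields strings
    if value < 90:
        col1.append(value * 1.5)
    else:
        # Assumption: keep the original number here.
        col1.append(value)

print(col1)  # [1.5, 90]
```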

