
Python - How to get data types for all columns in CSV file?

I am trying to get the data type of every column in a CSV file.
The file has no documentation about its data types, and checking them manually would take a long time (it has 150 columns).

Started using this approach:

import pandas as pd
df = pd.read_csv('/tmp/file.csv')

>>> df.dtypes
a   int64
b   int64
c   object
d   float64

Is the above approach good enough, or is there a better way to figure out the data types?
Also, the file has 150 columns. When I type df.dtypes, I can see only 15 or so of them. How can I see them all?

Depending on the size of your file, you might be able to save some time by reading in only the first few rows, using the nrows argument of pd.read_csv:

df = pd.read_csv('/tmp/file.csv', nrows=25)

This is only useful if you know for sure that the types can be correctly inferred from the first n rows though, so be careful with this.
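To illustrate that caveat, here is a minimal sketch that compares the dtypes inferred from the first few rows against those inferred from the whole file. The CSV text, the column names and the nrows value are made up for the example; with a real file you would pass its path instead of the StringIO objects.

```python
import io

import pandas as pd

# Hypothetical data: column "b" looks numeric until the last row.
csv_text = "a,b\n1,10\n2,20\n3,30\n4,unknown\n"

# Dtypes inferred from the first 3 rows only, then from every row.
head_types = pd.read_csv(io.StringIO(csv_text), nrows=3).dtypes
full_types = pd.read_csv(io.StringIO(csv_text)).dtypes

# Collect the columns whose inferred dtype changes once all rows are read.
changed = {
    col: (head_types[col], full_types[col])
    for col in full_types.index
    if head_types[col] != full_types[col]
}

for col, (sampled, actual) in changed.items():
    print(col, "sampled as", sampled, "but full file gives", actual)
```

If the dictionary comes back empty for your file, the sample was probably large enough, although that is only evidence for the rows you actually compared.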

Once you have the data (or a subset of it) loaded into a DataFrame, you can view the types in a number of different ways, a few of which have been posted already, but I'll share another using a simple loop and items (the older iteritems was removed in pandas 2.0):

for name, dtype in df.dtypes.items():
    print(name, dtype)

a int64
b float64
c object

I think df.dtypes is a good way to do it. It returns a Series, so to see all 150 rows you can raise the display limit: pd.set_option('display.max_rows', 250)
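If you would rather not change the global option, two alternatives are sketched below. The 150-column DataFrame is a made-up stand-in for the real file; Series.to_string and pd.option_context are existing pandas calls.

```python
import pandas as pd

# Hypothetical wide frame standing in for the 150-column file.
df = pd.DataFrame({f"col_{i}": [i] for i in range(150)})

# Option 1: render the whole dtypes Series as text, without truncation.
all_types = df.dtypes.to_string()
print(all_types)

# Option 2: lift the row limit only inside this block.
with pd.option_context("display.max_rows", None):
    print(df.dtypes)
```

Both leave the display settings untouched for the rest of the session.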

You could raise the display.max_info_columns option (it must be at least the number of columns) and then use DataFrame.info():

pd.set_option('display.max_info_columns', 200)
df.info()
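As a variation, info() also accepts verbose=True, which asks for the per-column listing without touching the option. The sketch below assumes a made-up 150-column frame and captures the report in a buffer so it can be searched or saved.

```python
import io

import pandas as pd

# Hypothetical wide frame standing in for the 150-column file.
df = pd.DataFrame({f"col_{i}": [i] for i in range(150)})

# verbose=True requests the full per-column summary;
# buf redirects the report from stdout into a string buffer.
buffer = io.StringIO()
df.info(verbose=True, buf=buffer)
report = buffer.getvalue()

print(report)
```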

There are a few ways to do it. I like to use

df.dtypes

or, to list each column's position and name (note that this prints the names only, not the dtypes):

for i, v in enumerate(df.columns):
    print(i, v)
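With 150 columns it can also help to group the column names by dtype rather than reading one long list. A minimal sketch, using a small made-up frame with the same column names as the question:

```python
import pandas as pd

# Hypothetical small frame; replace with the DataFrame read from the CSV.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": ["x", "y"], "d": [0.5, 1.5]})

# Map each dtype name to the list of columns that have it.
columns_by_dtype = {}
for name, dtype in df.dtypes.items():
    columns_by_dtype.setdefault(str(dtype), []).append(name)

# Print one line per dtype: its name, how many columns, and which ones.
for dtype_name, names in columns_by_dtype.items():
    print(dtype_name, len(names), names)
```

This makes unexpected object columns easy to spot, since they all end up on one line.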

