Pandas read_csv: Columns are being imported as rows

Edit: I believe this was all user error. I had been typing df.T out of habit, and it just occurred to me that this displays the transpose of the data frame. Typing df shows the data frame normally (headers as columns). Thank you to those who stepped up to try to help. In the end, it was just my misunderstanding of pandas syntax.
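To illustrate the mix-up, here is a minimal sketch using a small made-up frame (the values are hypothetical, not taken from the real file). It shows that df keeps the headers as columns, while df.T swaps rows and columns so the headers become the row index:

```python
import pandas as pd

# Hypothetical miniature of the data: 3 rows, 2 columns
df = pd.DataFrame({
    "VCOMPNO_CURRENT": ["T92656", "T92656", "T7L672"],
    "AXLE": [1, 2, 3],
})

normal_shape = df.shape          # rows x columns as read
transposed = df.T                # .T is shorthand for df.transpose()
transposed_shape = transposed.shape

print(normal_shape)              # headers are columns
print(transposed_shape)          # headers are now the row labels
print(list(transposed.index))
```

Note that df.T returns a new object and does not modify df, so the data itself was imported correctly all along.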

Original Post

I'm not sure if I am making a simple mistake, but the columns in a .csv file are being imported as rows by pd.read_csv. The dataframe turns out to be 5 rows by 2000 columns. I am importing only 5 of the 14 columns, so I set up a list holding the names of the columns I want. They match those in the .csv file exactly. What am I doing wrong here?

import os
import numpy as np
import pandas as pd

fp = 'C:/Users/my/file/path'
os.chdir(fp)

cols_to_use = ['VCOMPNO_CURRENT', 'MEASUREMENT_DATETIME',
               'EQUIPMENT_NUMBER', 'AXLE', 'POSITION']

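df = pd.read_csv('measurement_file.csv',
                 usecols=cols_to_use,
                 # np.int was removed in NumPy 1.24; the built-in int is equivalent
                 dtype={'EQUIPMENT_NUMBER': int,
                        'AXLE': int},
                 # name the date column so the choice does not depend on position
                 parse_dates=['MEASUREMENT_DATETIME'],
                 infer_datetime_format=True)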

Output:

                                    0  ...            2603
VCOMPNO_CURRENT                T92656  ...          T5M247
MEASUREMENT_DATETIME  7/26/2018 13:04  ...  9/21/2019 3:21
EQUIPMENT_NUMBER                  208  ...             537
AXLE                                1  ...               6
POSITION                            L  ...               R

[5 rows x 2000 columns]

Thank you.

Edit: Note that if I import the entire .csv with a plain pd.read_csv('measurement_file.csv'), the columns are imported properly.

Edit 2: Sample csv:

VCOMPNO_CURRENT,MEASUREMENT_DATETIME,REPAIR_ORDER_NUMBER,EQUIPMENT_NUMBER,AXLE,POSITION,FLANGE_THICKNESS,FLANGE_HEIGHT,FLANGE_SLOPE,DIAMETER,RO_NUMBER_SRC,CL,VCOMPNO_AT_MEAS,VCOMPNO_SRC
T92656,10/19/2018 7:11,5653054,208,1,L,26.59,27.34,6.52,691.3,OPTIMESS_DATA,2MTA ,T71614 ,RO_EQUIP     
T92656,10/19/2018 7:11,5653054,208,1,R,26.78,27.25,6.64,691.5,OPTIMESS_DATA,2MTA ,T71614 ,RO_EQUIP     
T92656,10/19/2018 7:11,5653054,208,2,L,26.6,27.13,6.49,691.5,OPTIMESS_DATA,2MTA ,T71614 ,RO_EQUIP     
T92656,10/19/2018 7:11,5653054,208,2,R,26.61,27.45,6.75,691.6,OPTIMESS_DATA,2MTA ,T71614 ,RO_EQUIP     
T7L672,10/19/2018 7:11,5653054,208,3,L,26.58,27.14,6.58,644.4,OPTIMESS_DATA,2CTC ,T7L672 ,BOTH         
T7L672,10/19/2018 7:11,5653054,208,3,R,26.21,27.44,6.17,644.5,OPTIMESS_DATA,2CTC ,T7L672 ,BOTH             

A simple workaround here is to take the transpose of the dataframe (see DataFrame.transpose in the pandas documentation).

df = pd.DataFrame.transpose(df)

Can you try reading the whole file first and then selecting the columns you need?

import pandas as pd

dataset = pd.read_csv('yourfile.csv')

# Keep only the columns of interest (cols_to_use as defined in the question)
dataset = dataset[cols_to_use]
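As a rough check that both approaches give the same result, the sketch below reads the first three rows of the sample csv from the question out of an in-memory string (io.StringIO stands in for the real file, which is an assumption made for illustration) and compares usecols against filtering after the read:

```python
import io
import pandas as pd

# First three data rows of the sample csv from the question
sample_csv = (
    "VCOMPNO_CURRENT,MEASUREMENT_DATETIME,REPAIR_ORDER_NUMBER,EQUIPMENT_NUMBER,"
    "AXLE,POSITION,FLANGE_THICKNESS,FLANGE_HEIGHT,FLANGE_SLOPE,DIAMETER,"
    "RO_NUMBER_SRC,CL,VCOMPNO_AT_MEAS,VCOMPNO_SRC\n"
    "T92656,10/19/2018 7:11,5653054,208,1,L,26.59,27.34,6.52,691.3,OPTIMESS_DATA,2MTA ,T71614 ,RO_EQUIP\n"
    "T92656,10/19/2018 7:11,5653054,208,1,R,26.78,27.25,6.64,691.5,OPTIMESS_DATA,2MTA ,T71614 ,RO_EQUIP\n"
    "T92656,10/19/2018 7:11,5653054,208,2,L,26.6,27.13,6.49,691.5,OPTIMESS_DATA,2MTA ,T71614 ,RO_EQUIP\n"
)

cols_to_use = ['VCOMPNO_CURRENT', 'MEASUREMENT_DATETIME',
               'EQUIPMENT_NUMBER', 'AXLE', 'POSITION']

# Approach 1: let read_csv drop the unwanted columns while parsing
by_usecols = pd.read_csv(io.StringIO(sample_csv), usecols=cols_to_use)

# Approach 2: read everything, then select the columns afterwards
by_filter = pd.read_csv(io.StringIO(sample_csv))[cols_to_use]

print(by_usecols.shape)
print(list(by_usecols.columns) == list(by_filter.columns))
```

One difference worth knowing: usecols returns the columns in file order regardless of the order of the list, whereas dataset[cols_to_use] follows the list. Here the two orders happen to coincide. On a large file, usecols also avoids loading the columns that are not needed.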

