Wrong parsing when importing csv file in python

I am trying to import a CSV file containing tick trading data. The file looks like this:

0,2017-09-18 02:00:06,12568.00,1,201,12567.00,12568.00,5462,0,0,C,
0,2017-09-18 02:00:06,12568.50,2,203,12567.00,12568.00,5463,0,0,C,
0,2017-09-18 02:00:06,12569.00,1,204,12567.00,12569.00,5468,0,0,C,
0,2017-09-18 02:00:06,12569.00,1,205,12567.00,12569.00,5470,0,0,C,
0,2017-09-18 02:00:06,12569.50,3,208,12567.00,12569.00,5471,0,0,C,

I am using this python code:

import pandas as pd
df = pd.read_csv("XG#/20170918.txt", names=['empty', 'date time', 'last', 'last size', 'bid', 'ask'])
print(df.head(1))

my output is this:

                                                     empty  date time  last  \
0 2017-09-18 02:00:06 12567.0 200.0 200.0 12567.0  12567.0     5430.0   0.0

                                                   last size bid  ask
0 2017-09-18 02:00:06 12567.0 200.0 200.0 12567.0        0.0   C  NaN

Process finished with exit code 0

My questions are:

  1. Why do my "names" (headers) not start at the first column?
  2. How do I parse the 2nd column as date-time and use it as the index?
  3. How do I widen the output so that I see all the data on one line (I am using PyCharm)? Also, since I need date-time as the index, I need to remove column 0, but df.drop(df.index[0]) does nothing.

Any help is welcome!

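Each row has 11 values plus a trailing comma, so pandas sees 12 fields, but you supplied only 6 names. When there are fewer names than fields, pandas appears to treat the leading extra columns as the index and apply the names to the remaining ones, which is why your headers do not start at the first column. Restrict the read to the columns you are naming: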

df = pd.read_csv('lol.csv',usecols = list(range(0,6)),names=['empty', 'date_time', 'last', 'last_size', 'bid', 'ask'])

I used the first 6 columns; adapt the example below to select and name the columns you want.

usecols takes a list of the column positions to read; names then supplies one name for each selected column.

For example, if you want columns 1, 3 and 4 to be named name, gender and address, the code looks like this:

pd.read_csv('lol.csv',usecols = [1,3,4],names=['name','gender','address'])
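As a rough check of the above, the sketch below reads two rows copied from the question out of an in-memory string instead of a file. The names SAMPLE, raw, shifted and n_fields are only illustrative and do not appear in the original post.

```python
import io
import pandas as pd

# Two rows copied from the question; note the trailing comma on each line.
SAMPLE = (
    "0,2017-09-18 02:00:06,12568.00,1,201,12567.00,12568.00,5462,0,0,C,\n"
    "0,2017-09-18 02:00:06,12568.50,2,203,12567.00,12568.00,5463,0,0,C,\n"
)
names = ['empty', 'date_time', 'last', 'last_size', 'bid', 'ask']

# header=None reads every field as-is, so the column count shows how many
# fields pandas actually finds per row.
raw = pd.read_csv(io.StringIO(SAMPLE), header=None)
n_fields = raw.shape[1]

# Passing only six names for those fields: the leading extra fields end up
# in the index and the names land on the trailing columns.
shifted = pd.read_csv(io.StringIO(SAMPLE), names=names)

# usecols keeps only the first six fields, so the six names line up with them.
df = pd.read_csv(io.StringIO(SAMPLE), usecols=list(range(6)), names=names)

print(n_fields)
print(shifted.index.nlevels)
print(df)
```

Comparing shifted with df should make the effect of usecols visible: in df the 'empty' column holds the leading 0 and 'date_time' holds the timestamp.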

For the second question (using date-time as the index):

df = pd.read_csv('lol.csv',usecols = list(range(0,6)),names=['empty','date_time', 'last', 'last_size', 'bid', 'ask'],index_col = 'date_time' ) 

The index_col parameter tells pandas which column to use as the index.
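Note that index_col on its own leaves the index values as strings. If the index should hold real datetimes, one option is to also pass parse_dates, as in this minimal sketch. It again uses an in-memory copy of two rows from the question; SAMPLE is an illustrative name.

```python
import io
import pandas as pd

# Two rows copied from the question.
SAMPLE = (
    "0,2017-09-18 02:00:06,12568.00,1,201,12567.00,12568.00,5462,0,0,C,\n"
    "0,2017-09-18 02:00:06,12568.50,2,203,12567.00,12568.00,5463,0,0,C,\n"
)
names = ['empty', 'date_time', 'last', 'last_size', 'bid', 'ask']

df = pd.read_csv(
    io.StringIO(SAMPLE),
    usecols=list(range(6)),
    names=names,
    index_col='date_time',      # use this column as the index
    parse_dates=['date_time'],  # convert it from text to datetimes
)

print(type(df.index))
print(df.head(1))
```

With a datetime index, time-based operations such as df.loc['2017-09-18'] or df.resample(...) become available.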

To drop a column after reading the CSV into a DataFrame (for example, df), use:

df.drop('empty', axis=1, inplace=True)
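The df.drop(df.index[0]) call from the question seems to do nothing for two reasons: without axis=1, drop removes rows rather than columns, and it returns a new DataFrame instead of modifying df. The sketch below shows this on a small hand-built frame (the values are illustrative), and also sets two pandas display options that may help with the wrapped output. How wide the PyCharm console actually renders is an assumption that depends on your setup.

```python
import pandas as pd

# Small stand-in frame; the values are illustrative, not read from the file.
df = pd.DataFrame({
    'empty': [0, 0],
    'last': [12568.0, 12568.5],
    'bid': [201, 203],
})

# Drops the first ROW and returns a new frame; df itself is unchanged.
without_first_row = df.drop(df.index[0])

# axis=1 targets columns; reassigning is an alternative to inplace=True.
df = df.drop('empty', axis=1)

# Let pandas use wider lines and show every column before wrapping.
pd.set_option('display.width', 200)
pd.set_option('display.max_columns', None)

print(df.head(1))
```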
