Python: error reading txt file using pandas

I have a txt file "TempData.txt" which has the following format:

CODE    O/F Valid Date  MAX MIN AVG
K3T5    O   1995/01/01  51  36  44
K3T5    O   1995/01/02  45  33  39
K3T5    O   1995/01/03  48  38  43

I am trying to create a dictionary with 'ValidDates', 'Max' and 'Min' elements in it.

I am trying the following:

import pandas as pd
df = pd.read_csv(r'C:\TempData.txt', sep = "\t", header = 0)

df.columns.tolist() #prints: 'CODE', 'O/F', 'Valid Date', 'MAX', 'MIN', 'AVG'
Max = df([4])

I get this error when I try to extract the MAX column:

TypeError: 'DataFrame' object is not callable

I think you can select the column by name, using square brackets instead of parentheses:

max_col = df['MAX']

print (max_col)
0    51
1    45
2    48
Name: MAX, dtype: int64

If you want to select the fourth column by position, use iloc:

max_col = df.iloc[:, 3] # index 3, because Python counts from 0: 0, 1, 2, 3

print (max_col)
0    51
1    45
2    48
Name: MAX, dtype: int64
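
The error in the question comes from the parentheses: df([4]) tries to call the DataFrame like a function, while selection uses square brackets. The following is a minimal sketch of both selection styles and of the failing call. It assumes the file is tab-separated as in the question, and it reads an in-memory copy of the sample rows instead of C:\TempData.txt; the names sample, by_label, by_position and call_error are illustrative and not from the original post.

```python
import io
import pandas as pd

# In-memory stand-in for TempData.txt (assumed to be tab-separated).
sample = (
    "CODE\tO/F\tValid Date\tMAX\tMIN\tAVG\n"
    "K3T5\tO\t1995/01/01\t51\t36\t44\n"
    "K3T5\tO\t1995/01/02\t45\t33\t39\n"
    "K3T5\tO\t1995/01/03\t48\t38\t43\n"
)
df = pd.read_csv(io.StringIO(sample), sep="\t")

by_label = df["MAX"]         # select by column name
by_position = df.iloc[:, 3]  # select the fourth column by 0-based position

# Parentheses attempt a function call, which a DataFrame does not support.
try:
    df([4])
    call_error = None
except TypeError as exc:
    call_error = str(exc)

print(by_label.equals(by_position))
print(call_error)
```

Selecting by name is usually more robust than selecting by position, because it keeps working if the column order in the file changes.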

First, you can omit header=0, because it is the default value in read_csv, and add parse_dates to convert the Valid Date column to datetime.

If you need a dict built from the columns Valid Date, MAX and MIN, use to_dict. If you want a different dict layout, try the orient parameter:

df = pd.read_csv(r'C:\TempData.txt', sep = "\t", parse_dates=[2])
print (df)
   CODE O/F Valid Date  MAX  MIN  AVG
0  K3T5   O 1995-01-01   51   36   44
1  K3T5   O 1995-01-02   45   33   39
2  K3T5   O 1995-01-03   48   38   43
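
To confirm that parse_dates actually converted the column, the dtype can be checked directly. This is a small sketch under the same assumptions as the question (tab-separated data); it uses an in-memory copy of the sample rows, passes the column name instead of its position (parse_dates accepts either), and the names sample, is_datetime and first_year are illustrative.

```python
import io
import pandas as pd

# In-memory stand-in for TempData.txt (assumed to be tab-separated).
sample = (
    "CODE\tO/F\tValid Date\tMAX\tMIN\tAVG\n"
    "K3T5\tO\t1995/01/01\t51\t36\t44\n"
    "K3T5\tO\t1995/01/02\t45\t33\t39\n"
    "K3T5\tO\t1995/01/03\t48\t38\t43\n"
)

# parse_dates takes column positions or column names; the name is used here.
df = pd.read_csv(io.StringIO(sample), sep="\t", parse_dates=["Valid Date"])

# True when the column holds datetime64 values rather than plain strings.
is_datetime = pd.api.types.is_datetime64_any_dtype(df["Valid Date"])

# The .dt accessor is only available on datetime-like columns.
first_year = int(df["Valid Date"].dt.year.iloc[0])

print(is_datetime, first_year)
```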


print (df[['Valid Date','MAX','MIN']])
  Valid Date  MAX  MIN
0 1995-01-01   51   36
1 1995-01-02   45   33
2 1995-01-03   48   38

print (df[['Valid Date','MAX','MIN']].to_dict())
{'MAX': {0: 51, 1: 45, 2: 48}, 
'MIN': {0: 36, 1: 33, 2: 38}, 
'Valid Date': {0: Timestamp('1995-01-01 00:00:00'), 1: Timestamp('1995-01-02 00:00:00'), 2: Timestamp('1995-01-03 00:00:00')}}

print (df[['Valid Date','MAX','MIN']].to_dict(orient='split'))
{'data': [[Timestamp('1995-01-01 00:00:00'), 51, 36], [Timestamp('1995-01-02 00:00:00'), 45, 33], [Timestamp('1995-01-03 00:00:00'), 48, 38]], 'index': [0, 1, 2], 'columns': ['Valid Date', 'MAX', 'MIN']}
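
If the goal is a dictionary keyed by 'ValidDates', 'Max' and 'Min', as described in the question, one possible approach is to rename the columns and use orient='list', which maps each column name to a plain list of its values. This is a sketch, not the only way to do it; it assumes tab-separated data and reads an in-memory copy of the sample rows, and the names sample and result are illustrative.

```python
import io
import pandas as pd

# In-memory stand-in for TempData.txt (assumed to be tab-separated).
sample = (
    "CODE\tO/F\tValid Date\tMAX\tMIN\tAVG\n"
    "K3T5\tO\t1995/01/01\t51\t36\t44\n"
    "K3T5\tO\t1995/01/02\t45\t33\t39\n"
    "K3T5\tO\t1995/01/03\t48\t38\t43\n"
)
df = pd.read_csv(io.StringIO(sample), sep="\t", parse_dates=["Valid Date"])

# Keep the three columns, rename them to the keys named in the question,
# then convert each column to a list.
result = (
    df[["Valid Date", "MAX", "MIN"]]
    .rename(columns={"Valid Date": "ValidDates", "MAX": "Max", "MIN": "Min"})
    .to_dict(orient="list")
)

print(result["Max"])
print(result["Min"])
```

With orient='list' the row index is dropped, so the three lists line up by position: result['Max'][0] and result['Min'][0] belong to result['ValidDates'][0].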
