
Pandas read_fwf difficulty interpreting a date-like string

I'm reading several hundred fixed-width files into a PostgreSQL database, parsing them with pandas read_fwf.

My stumbling block is pulling the period's end date from the last characters of one of the lines.

An example file can be found at this link at the NOAA website:

The critical code snippet from my Python/pandas script:

import os
import time
import requests
import pandas as pd
import datetime
from dateutil.parser import *

## Load adapters
import psycopg2
import psycopg2.extensions

df = pd.read_fwf(ddFname, header=None, )

if str(df[0:1]).find('COOLING') >= 0:
    amtType = 'CDD'
elif str(df[0:1]).find('HEATING') >= 0:
    amtType = 'HDD'

prDate = str(df[3:4])[-10:-1]
print(prDate)

When I invoke the last line I get the following:

SEP 24,...

when I need the following:

SEP 24, 2016

Many thanks for any and all help.

Using the example file you posted, the following works for me:

df = pd.read_fwf(ddFname, header=None, )
str(df.at[4, 0])[-12:]
# Out[99]: 'SEP 24, 2016'
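
If the goal is a real date rather than text (for a PostgreSQL date column, say), one possible follow-up is to parse the extracted string with the standard library. This is only a sketch and not part of the original answer: the variable names are made up, and it assumes an English locale, since %b matches month abbreviations according to the current locale.

```python
from datetime import datetime

# The string extracted above; hard-coded here so the snippet runs on its own.
pr_date_text = "SEP 24, 2016"

# %b = abbreviated month name, %d = day of month, %Y = four-digit year.
# strptime matches month names case-insensitively, so "SEP" is accepted.
pr_date = datetime.strptime(pr_date_text, "%b %d, %Y").date()

print(pr_date.isoformat())
```

A datetime.date value like this can be passed to psycopg2 as a query parameter directly, which avoids formatting the date by hand.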

When you do something like:

str(df[3:5])

You are invoking the __repr__ method of a pandas DataFrame (str() falls back to it). The repr truncates long cell values for readability, with the cutoff governed by the display.max_colwidth option, and that is what happens here. For this case it looks like:

repr(df[3:5])
Out[106]: '                                                   0    1\n3                                                NaN  NaN\n4  LAST DATE OF DATA COLLECTION PERIOD IS SEP 24,...  NaN'

and str(_)[-10:-1] gives:

Out[107]: '4,...  Na'
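
The difference between the display text and the stored value can be reproduced without the NOAA file. The sketch below uses a hypothetical one-cell DataFrame holding the same line, and pins display.max_colwidth to 50 (the usual default, as far as I know) so the behaviour does not depend on local settings.

```python
import pandas as pd

# Hypothetical stand-in for the parsed file: one row, one column.
line = "LAST DATE OF DATA COLLECTION PERIOD IS SEP 24, 2016"
df = pd.DataFrame({0: [line]})

# str(df) is the display form; cells longer than max_colwidth are
# shortened and end in "...".
with pd.option_context("display.max_colwidth", 50):
    shown = str(df)

# Reading the cell itself returns the full Python string.
stored = df.at[0, 0]

print(shown)
print(stored[-12:])
```

Slicing the display text therefore slices whatever pandas chose to print, while slicing the stored value slices the data.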

The row indexes in your file and mine don't quite match up, but hopefully this makes clearer what's going on. Using df.at accesses the actual value at a particular row and column, so the value won't be truncated.
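
Since the row number apparently differs from file to file, one option is to find the row by its text instead of hard-coding an index. This is a rough sketch under assumptions: the DataFrame is a made-up stand-in for the read_fwf result, the first row is invented filler, and the marker string is taken from the repr output shown earlier.

```python
import pandas as pd

# Made-up stand-in for a parsed file; a real one would come from pd.read_fwf.
df = pd.DataFrame({0: [
    "SOME HEADER LINE",  # invented filler row
    None,                # blank line, parsed as missing
    "LAST DATE OF DATA COLLECTION PERIOD IS SEP 24, 2016",
]})

marker = "LAST DATE OF DATA COLLECTION PERIOD IS"

# Boolean mask of rows whose first column contains the marker text;
# na=False treats missing cells as non-matches.
mask = df[0].str.contains(marker, regex=False, na=False)
row_label = df.index[mask][0]

# Take everything after the marker from the actual (untruncated) cell value.
pr_date = df.at[row_label, 0].split(marker, 1)[1].strip()
print(pr_date)
```

Splitting on the marker rather than slicing a fixed number of characters also keeps working if the date is a different length, for example a single-digit day.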
