TypeError: data type ' int64' not understood

Question

I have the following function to load data in my Jupyter notebook:

#function to load data
def load_dataset(x_path, y_path):
    x = pd.read_csv(os.sep.join([DATA_DIR, x_path]),
                    dtype=DTYPES,
                    index_col="ID")
    y = pd.read_csv(os.sep.join([DATA_DIR, y_path]))

    return x, y

The column types are defined as follows:

DTYPES = {
    'ID':'int64',
    'columnA':'str',
    'columnB':'float32',
    'columnC':'float64',
    'columnD':'datetime64[ns]'}

The header and first row of the CSV look like this:

ID       columnA   columnB   columnC   columnD
941215   SALE      15000     56        10/1/2018

When I call the function in my notebook:

from model import load_dataset
X_train, y_train = load_dataset("X_train.zip", "y_train.zip")

I get the following error:

2055 raise TypeError("data type '{}' not understood".format(dtype))
2057     # Any invalid dtype (such as pd.Timestamp) should raise an error.
TypeError: data type ' int64' not understood

Answer 1

I think you need to specify the dtypes as NumPy types:

import numpy as np

DTYPES = {
    'ID':np.int64,
    'columnA':'str',
    'columnB':np.float32,
    'columnC':np.float64}
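
It may also be worth checking the mapping for stray whitespace: the error message quotes the type as ' int64', with a leading space, which suggests the string that reached read_csv was not exactly 'int64'. A minimal sketch, assuming that is the cause; clean_dtypes is a hypothetical helper, not part of pandas:

```python
def clean_dtypes(dtypes):
    """Return a copy of a dtype mapping with whitespace stripped from string keys and values."""
    cleaned = {}
    for column, dtype in dtypes.items():
        # Only strings can carry stray spaces; leave type objects untouched.
        key = column.strip() if isinstance(column, str) else column
        cleaned[key] = dtype.strip() if isinstance(dtype, str) else dtype
    return cleaned


# Hypothetical mapping with a stray leading space, as the error message suggests.
raw = {"ID": " int64", "columnA": "str", "columnB": "float32"}
print(clean_dtypes(raw))
```

Passing the cleaned mapping as dtype= would rule out whitespace as the source of the error.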

Datetimes need a different approach: pass a list of column names to the parse_dates parameter of read_csv:

def load_dataset(x_path, y_path):
    x = pd.read_csv(os.sep.join([DATA_DIR, x_path]),
                    dtype=DTYPES,
                    index_col="ID",
                    parse_dates=['columnD'])
    y = pd.read_csv(os.sep.join([DATA_DIR, y_path]))

    return x, y
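
To check the combination of dtype and parse_dates without the original files, here is a small sketch that reads an in-memory CSV. The sample text is an assumption that mirrors the row shown in the question, written comma-separated:

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the single row shown in the question.
CSV_TEXT = (
    "ID,columnA,columnB,columnC,columnD\n"
    "941215,SALE,15000,56,10/1/2018\n"
)

# Numeric and string columns go in dtype; the date column is left out on purpose.
DTYPES = {
    "ID": np.int64,
    "columnA": "str",
    "columnB": np.float32,
    "columnC": np.float64,
}

df = pd.read_csv(
    io.StringIO(CSV_TEXT),
    dtype=DTYPES,
    index_col="ID",
    parse_dates=["columnD"],  # a list of column names, not a bare string
)
print(df.dtypes)
```

Dates are read month-first by default; if the file uses day/month order, read_csv also accepts dayfirst=True.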
