Unable to read from object of type: <class 'numpy.ndarray'>

I have the following Python code, which I am trying to use to write output to a directory based on a timestamp.

import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq
import uuid

data = {'date': ['2018-03-04T14:12:15.653Z', '2018-03-03T14:12:15.653Z', '2018-03-02T14:12:15.653Z', '2018-03-05T14:12:15.653Z'],
    'battles': [34, 25, 26, 57],
    'citys': ['london', 'newyork', 'boston', 'boston']}
df = pd.DataFrame(data, columns=['date', 'battles', 'citys'])
df['date'] = df['date'].map(lambda t: pd.to_datetime(t, format="%Y-%m-%dT%H:%M:%S.%fZ"))
df.groupby(by=['citys'])
dst_path = "logs/year=" + df['date'].dt.year.astype('str').unique() + "/month=" + df['date'].dt.month.astype('str').unique() + "/day=" + df['date'].dt.day.astype('str').unique() + "/" + str(uuid.uuid4()) + ".parq"
table = pa.Table.from_pandas(df)
pq.write_table(table, dst_path)

But I am seeing the following error:

python3 test.py
Traceback (most recent call last):
File "test.py", line 15, in <module>
  pq.write_table(table, dst_path)
File "/usr/local/lib/python3.6/site-packages/pyarrow/parquet.py", line 943, in write_table
**kwargs)
File "/usr/local/lib/python3.6/site-packages/pyarrow/parquet.py", line 286, in __init__
**options)
File "pyarrow/_parquet.pyx", line 837, in pyarrow._parquet.ParquetWriter.__cinit__ (/Users/travis/build/BryanCutler/arrow-dist/arrow/python/build/temp.macosx-10.6-intel-3.6/_parquet.cxx:14606)
File "pyarrow/io.pxi", line 835, in pyarrow.lib.get_writer (/Users/travis/build/BryanCutler/arrow-dist/arrow/python/build/temp.macosx-10.6-intel-3.6/lib.cxx:59078)
TypeError: Unable to read from object of type: <class 'numpy.ndarray'>

How can I create a directory from a pandas timestamp?

Your dst_path is a NumPy array, not a string.

print(type(dst_path))

Output:

<class 'numpy.ndarray'>

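It should be a string. Each .unique() call returns a NumPy array, and concatenating a string with an array produces another array (one element per unique value) rather than a single string. I added the following line below the dst_path assignment and it worked. It's not elegant, so you may want to investigate a better way; note in particular that it keeps only the first element, so every row is written under the first date's directory. The point here is that write_table needs a string.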

dst_path = str(dst_path[0])
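One way to avoid the array altogether is to build the path from a single timestamp instead of from whole columns. The following is a minimal sketch using only the standard library; partition_path is a hypothetical helper name introduced here for illustration, not part of pandas or pyarrow. Because pd.Timestamp is a subclass of datetime.datetime, the same helper should also accept a value such as df['date'].iloc[0].

```python
import os
import uuid
from datetime import datetime


def partition_path(ts, root="logs"):
    """Return root/year=Y/month=M/day=D/<uuid>.parq for one timestamp.

    Hypothetical helper for illustration; `ts` only needs .year, .month
    and .day attributes (datetime.datetime or pandas.Timestamp).
    """
    return os.path.join(
        root,
        f"year={ts.year}",
        f"month={ts.month}",
        f"day={ts.day}",
        f"{uuid.uuid4()}.parq",
    )


# Parse one of the timestamps from the question and build its path.
ts = datetime.strptime("2018-03-04T14:12:15.653Z", "%Y-%m-%dT%H:%M:%S.%fZ")
dst_path = partition_path(ts)
print(type(dst_path))  # a plain str, which is what write_table expects
```

The result is an ordinary string, so no indexing or str() conversion is needed afterwards.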

Note that the directory must already exist or you will get an error, so create it before calling write_table:

import os
# Create the target directory (and any missing parents) if it is not there yet.
dst_dir = os.path.dirname(dst_path)
os.makedirs(dst_dir, exist_ok=True)
