Pandas read_csv: is there a way to convert SQLAlchemy data types to pandas dtypes?
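I'm trying to read a CSV that I will eventually push to a database. I have a predefined data model that the data will end up in. Also, the CSV has no header row.
For example, if I define a model: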
class SamFoo(Base):
    __tablename__ = 'foo'
    duns = Column(Text)
    duns_plus_four = Column(Text)
    cage_code = Column(Text)
    dodaac = Column(Text)
    sam_extract_code = Column(Text)
    purpose_of_registration = Column(Text)
    initial_registration_date = Column(DateTime)
    ...
If I then try read_csv:
sam_name_type_dict = {c.name: c.type for c in SamFoo.__table__.c}
sam_name_type_dict.pop('id', None)  # id isn't in csv data.
raw_data = pd.read_csv(
    data,
    sep='|',
    skiprows=1,
    header=None,
    names=list(sam_name_type_dict.keys()),
    dtype=sam_name_type_dict,
)
I get TypeError: data type not understood, so my question is: is there a way to map SQLAlchemy data types to pandas dtypes?
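For context, the dtype argument needs values that NumPy or pandas can interpret as a dtype, and the dict above holds SQLAlchemy type instances such as Text(). A minimal sketch of that check, where is_valid_numpy_dtype is a hypothetical helper written for illustration and not part of either library:

```python
import numpy as np
from sqlalchemy.types import Text


def is_valid_numpy_dtype(candidate):
    """Return True if NumPy can interpret `candidate` as a dtype."""
    try:
        np.dtype(candidate)
        return True
    except TypeError:
        return False


# A SQLAlchemy type instance, which is what c.type yields for a Text column.
sqlalchemy_type_ok = is_valid_numpy_dtype(Text())
# A plain Python type that NumPy does understand.
python_object_ok = is_valid_numpy_dtype(object)
```

Here sqlalchemy_type_ok should come out False and python_object_ok True, which is why some translation step between the two type systems is needed.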
Here is how I solved it:
from sqlalchemy.types import DateTime, Integer, Boolean, Text
import numpy as np
import pandas as pd

# Map each SQLAlchemy type class to a dtype that read_csv understands.
types_map = {
    Text: object,
    DateTime: np.datetime64,
    Integer: np.float64,  # NA integer gotcha https://pandas.pydata.org/pandas-docs/stable/gotchas.html#support-for-integer-na
    Boolean: bool,
}
# Had to split dates and regular dtypes into separate dicts,
# because dates go through parse_dates rather than dtype.
sam_sql_types = {c.name: c.type for c in SamFoo.__table__.c}
sam_sql_types.pop('id', None)  # no id in csv
sam_dtypes = {k: types_map[type(v)] for k, v in sam_sql_types.items()}
sam_dates = {k: v for k, v in sam_dtypes.items() if v == np.datetime64}
sam_dtypes_non_dates = {k: v for k, v in sam_dtypes.items() if v != np.datetime64}
raw_data = pd.read_csv(
    data,
    sep='|',
    skiprows=1,
    header=None,
    names=list(sam_sql_types.keys()),
    dtype=sam_dtypes_non_dates,
    parse_dates=list(sam_dates.keys()),
)
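The same idea can be wrapped in a small helper. This is only a sketch under a few assumptions: the demo table, TYPE_RULES, and read_csv_kwargs are hypothetical names made up for illustration, the lookup uses isinstance so that subclasses of the listed types also match, and it swaps in pandas' nullable "Int64" and "boolean" dtypes (available in recent pandas versions) instead of float64 and bool.

```python
import io

import pandas as pd
from sqlalchemy import Boolean, Column, DateTime, Integer, MetaData, String, Table, Text

# Hypothetical stand-in table; the real model has more columns.
demo = Table(
    "demo",
    MetaData(),
    Column("id", Integer, primary_key=True),
    Column("duns", Text),
    Column("employee_count", Integer),
    Column("initial_registration_date", DateTime),
)

# Checked in order with isinstance, so subclasses (Text is a String) match too.
TYPE_RULES = [
    (DateTime, "datetime"),  # sentinel: these columns go to parse_dates
    (Boolean, "boolean"),    # pandas nullable boolean
    (Integer, "Int64"),      # pandas nullable integer, keeps missing values
    (String, object),        # keep text as-is, e.g. leading zeros
]


def read_csv_kwargs(table, skip=("id",)):
    """Build names, dtype and parse_dates for pd.read_csv from a Table."""
    names, dtypes, dates = [], {}, []
    for col in table.columns:
        if col.name in skip:
            continue
        names.append(col.name)
        for sql_type, target in TYPE_RULES:
            if isinstance(col.type, sql_type):
                if target == "datetime":
                    dates.append(col.name)
                else:
                    dtypes[col.name] = target
                break
        else:
            raise TypeError(f"no pandas dtype rule for column {col.name!r}")
    return {"names": names, "dtype": dtypes, "parse_dates": dates}


# Made-up sample data: pipe-delimited, no header, one missing integer.
sample = io.StringIO("001234567|12|2015-03-01\n007654321||2016-07-15\n")
df = pd.read_csv(sample, sep="|", header=None, **read_csv_kwargs(demo))
```

Raising on an unmapped column type is a deliberate choice here: it surfaces a gap in the mapping at load time instead of letting pandas silently infer a dtype.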