read_sql_table in Dask returns NoSuchTableError

I have a read_sql call using pandas and it works fine. However, when I try to re-create the same dataframe in Dask using the same logic, it gives me NoSuchTableError. I know for sure the table exists in my SQL database.

pandas (works):

import urllib.parse
import sqlalchemy as sa
import pandas as pd

sql = "SELECT * FROM my_table"
params = urllib.parse.quote_plus("DRIVER={SQL Server Native Client 11.0};\
                             SERVER=my_server;\
                             DATABASE=db_name;\
                             Trusted_Connection=yes;")
engine = sa.create_engine('mssql+pyodbc:///?odbc_connect=%s' % params)
df = pd.read_sql(sql, engine)
print(df.head())

Since Dask uses the full URL from SQLAlchemy, I also tried to re-create the same connection in SQLAlchemy directly, and it works. It just puzzles me why it does not work in Dask.

SQLAlchemy (works):

import pyodbc
import sqlalchemy as sal
from sqlalchemy import create_engine

engine = sal.create_engine(
    'mssql+pyodbc://my_server/db_name'
    '?driver=SQL+Server+Native+Client+11.0&trusted_connection=yes'
)

result = engine.execute("select * from my_table")

for row in result:
    print(row[0])

Dask (raises NoSuchTableError):

import urllib.parse
import sqlalchemy as sa
import dask.dataframe as dd
from sqlalchemy.engine.url import make_url

params = urllib.parse.quote_plus("DRIVER={SQL Server Native Client 11.0};\
                             SERVER=my_server;\
                             DATABASE=db_name;\
                             Trusted_Connection=yes;")
conn_str = 'mssql+pyodbc:///?odbc_connect={}'.format(params)
url = make_url(conn_str)
df = dd.read_sql_table('my_table', url, index_col='ID')
print(df.head())

Has anyone come across the same or a similar issue? Any thoughts are much appreciated. Thanks in advance.

Without knowing further details about how your SQL Server is set up, I believe this is specific to SQL Server. According to the Dask documentation, you need to provide the schema= keyword, like this:

dftest = dd.read_sql_table(table="table_name_only", uri=uri, index_col="somekey", schema="schema_name", divisions=[1,2,3])

Note that uri here is the SQLAlchemy connection string, not a connection object.

https://docs.dask.org/en/latest/dataframe-api.html#dask.dataframe.read_sql_table
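Putting the two points of the answer together (pass a plain string, and name the schema), a minimal sketch might look like the following. The helper names build_mssql_uri and read_table are made up for illustration, and "dbo" is only an assumed schema name (it is the usual SQL Server default, but the question does not say which schema my_table lives in). The table name and URI are passed positionally because the keyword names for those two arguments have differed between Dask releases.

```python
import urllib.parse


def build_mssql_uri(server, database, driver="SQL Server Native Client 11.0"):
    """Return a SQLAlchemy URI string for a trusted pyodbc connection."""
    # Assemble the ODBC string from adjacent literals so that no stray
    # whitespace from backslash line continuations ends up inside it.
    odbc = (
        f"DRIVER={{{driver}}};"
        f"SERVER={server};"
        f"DATABASE={database};"
        "Trusted_Connection=yes;"
    )
    return "mssql+pyodbc:///?odbc_connect=" + urllib.parse.quote_plus(odbc)


def read_table(table, uri, index_col, schema):
    """Load one table lazily with Dask; needs dask, sqlalchemy and pyodbc."""
    import dask.dataframe as dd  # imported here so the URI helper works without Dask

    # The URI is a plain string, not an Engine or a make_url() object.
    return dd.read_sql_table(table, uri, index_col=index_col, schema=schema)


uri = build_mssql_uri("my_server", "db_name")
# Hypothetical call; "dbo" is an assumed schema name, not taken from the question:
# df = read_table("my_table", uri, index_col="ID", schema="dbo")
```

If the table still cannot be found, checking which schema it belongs to in SQL Server and passing that name as schema is probably the first thing to try.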
