
Azure Synapse dedicated pools: pulling data into a Jupyter notebook

I have data saved in a view created in an Azure Synapse dedicated pool. I need to access this data in a Jupyter notebook for further processing. Is there any way to access or extract the data from the dedicated pool in a Jupyter notebook written in Python?

The Azure Synapse Dedicated SQL Pool Connector for Apache Spark in Azure Synapse Analytics enables efficient transfer of large data sets between the Apache Spark runtime and the Dedicated SQL pool. The connector is shipped as a default library with Azure Synapse Workspace.

Sample code:

# Add required imports
import com.microsoft.spark.sqlanalytics
from com.microsoft.spark.sqlanalytics.Constants import Constants
from pyspark.sql.functions import col

# Read from existing internal table
dfToReadFromTable = (spark.read
                     # If `Constants.SERVER` is not provided, the `<database_name>` from the three-part table name argument
                     # to `synapsesql` method is used to infer the Synapse Dedicated SQL End Point.
                     .option(Constants.SERVER, "<sql-server-name>.sql.azuresynapse.net")
                     # Defaults to storage path defined in the runtime configurations
                     .option(Constants.TEMP_FOLDER, "abfss://<container_name>@<storage_account_name>.dfs.core.windows.net/<some_base_path_for_temporary_staging_folders>")
                     # Three-part table name from where data will be read.
                     .synapsesql("<database_name>.<schema_name>.<table_name>")
                     # Column-pruning i.e., query select column values.
                     .select("<some_column_1>", "<some_column_5>", "<some_column_n>")
                     # Push-down filter criteria that gets translated to SQL Push-down Predicates.
                     .filter(col("Title").contains("E"))
                     # Fetch a sample of 10 records
                     .limit(10))

# Show contents of the dataframe
dfToReadFromTable.show()
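
Since the question asks for further processing in Python, one option is to collect a bounded slice of the Spark DataFrame into pandas. The sketch below is only an illustration: `to_pandas_sample` and its `max_rows` parameter are hypothetical names, not part of the connector. It assumes the object passed in is a PySpark DataFrame such as `dfToReadFromTable` from the read above, and that the limited result fits in driver memory.

```python
def to_pandas_sample(spark_df, max_rows=1000):
    """Collect at most `max_rows` rows of a Spark DataFrame into pandas.

    Hypothetical helper for illustration; it relies only on the standard
    PySpark DataFrame methods `limit` and `toPandas`.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # limit() bounds the result size; toPandas() then pulls it to the driver.
    return spark_df.limit(max_rows).toPandas()


# Usage inside a Synapse notebook, after the read above (assumed context):
# pdf = to_pandas_sample(dfToReadFromTable, max_rows=10)
# pdf.head()
```

Capping the row count before calling `toPandas()` avoids pulling an entire dedicated-pool table onto the driver node; for full-size data it is usually better to keep working with the Spark DataFrame.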

You can refer to this link for more information.


