Is there any way to write data from Azure Databricks to Azure Cosmos DB Gremlin API?
I am trying to write vertices and edges to the Cosmos DB Gremlin API from Azure Databricks, but unfortunately I am getting an error. I have tried different cluster versions and Maven library versions with no luck.

Cluster: Databricks Runtime 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12)

Maven library installed: com.azure.cosmos.spark:azure-cosmos-spark113-2_2-12.4:

This is the documentation I followed:
https://github.com/Azure/azure-cosmosdb-spark#using-databricks-notebooks

There may be a library conflict, since that documentation only covers older version configurations. If anyone has run into this, please help.
```python
cosmosDbConfig = {
    "Endpoint": "https://xxxxxxxx.gremlin.documents.azure.com:443/",
    "Masterkey": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "Database": "sample-database",
    "Collection": "sample-graph",
    "Upsert": "true"
}

cosmosDbFormat = "com.microsoft.azure.cosmosdb.spark"

cosmosDbVertices.write.format(cosmosDbFormat).mode("append").options(**cosmosDbConfig).save()
```
Error:

```
Py4JJavaError: An error occurred while calling o1113.save.
: java.lang.ClassNotFoundException:
Failed to find data source: com.microsoft.azure.cosmosdb.spark. Please find packages at
http://spark.apache.org/third-party-projects.html
	at org.apache.spark.sql.errors.QueryExecutionErrors$.failedToFindDataSourceError(QueryExecutionErrors.scala:557)
	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:758)
	at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:808)
	at org.apache.spark.sql.DataFrameWriter.lookupV2Provider(DataFrameWriter.scala:983)
	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:293)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:258)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
```
I tried to reproduce the same scenario in my environment and got the same error. The old `com.microsoft.azure.cosmosdb.spark` connector is not compatible with Spark 3.x runtimes, so the data source class is not found. To resolve the error, install the com.azure.cosmos.spark:azure-cosmos-spark_3-2_2-12:4.12.2 library instead and use the code below.
Code:

```python
cosEndpoint = "https://xxxxxx.dxx.azure.com:443/"
cosMasterkey = "xxxx"
cosDatabase = "xxxx"
cosContainer = "xxxx"

# Configuration keys used by the Spark 3 Cosmos DB OLTP connector
cfg1 = {
    "spark.cosmos.accountEndpoint": cosEndpoint,
    "spark.cosmos.accountKey": cosMasterkey,
    "spark.cosmos.database": cosDatabase,
    "spark.cosmos.container": cosContainer,
}

# Sample dataframe
cosmosDbVertices = spark.createDataFrame(
    [("ss1", "cat", 2, True), ("cc1", "dog", 2, False)]
).toDF("id", "name", "age", "isAlive")

# Write the data to Cosmos DB with the new connector's "cosmos.oltp" format
cosmosDbVertices.write.format("cosmos.oltp").options(**cfg1).mode("append").save()
```
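One caveat: the `cosmos.oltp` format writes plain JSON documents, while a Gremlin API account stores vertices and edges in an internal GraphSON-like layout. As a sketch only (the exact field names below, such as `_isEdge`, `_vertexId`, and `_sink`, are assumptions based on how Cosmos DB Gremlin accounts are commonly observed to store graph data, and are not confirmed in this thread), hypothetical helpers like these could shape the rows before writing:

```python
import uuid


def vertex_doc(vid, label, pk_field, pk_value, **props):
    """Build a dict shaped like an assumed Cosmos DB Gremlin vertex document.

    Assumption: each vertex property is stored as a list of
    {"id": ..., "_value": ...} entries. Illustrative only.
    """
    doc = {"id": vid, "label": label, pk_field: pk_value}
    for name, value in props.items():
        doc[name] = [{"id": str(uuid.uuid4()), "_value": value}]
    return doc


def edge_doc(eid, label, src_id, src_label, dst_id, dst_label,
             pk_field, pk_value):
    """Build a dict shaped like an assumed Cosmos DB Gremlin edge document.

    Assumption: edges carry _isEdge plus the source (_vertexId)
    and sink (_sink) vertex ids. Illustrative only.
    """
    return {
        "id": eid,
        "label": label,
        "_isEdge": True,
        "_vertexId": src_id,
        "_vertexLabel": src_label,
        "_sink": dst_id,
        "_sinkLabel": dst_label,
        pk_field: pk_value,
    }


# Build one vertex and one edge between the two sample ids above
v = vertex_doc("ss1", "animal", "pk", "pk1", name="cat", age=2)
e = edge_doc("e1", "knows", "ss1", "animal", "cc1", "animal", "pk", "pk1")
```

Rows built this way could then be turned into a DataFrame with `spark.createDataFrame(...)` and written with the same `cosmos.oltp` write shown above; verify the actual document shape against your own graph account before relying on it.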