
Is there any way to write data from Azure Databricks to the Azure Cosmos DB Gremlin API?

I am trying to write vertices and edges to the Cosmos DB Gremlin API from Azure Databricks, but unfortunately I am running into an error. I have tried different cluster versions and Maven libraries, but it still does not work.

Cluster configuration: Databricks Runtime 10.4 LTS (includes Apache Spark 3.2.1, Scala 2.12)

Maven library installed: com.azure.cosmos.spark:azure-cosmos-spark113-2_2-12.4:

This is the documentation I followed:

https://github.com/Azure/azure-cosmosdb-spark#using-databricks-notebooks

Some library conflict issue may be occurring, since the documentation only covers configurations for older versions. If anyone has faced this, kindly help.

cosmosDbConfig = {
  "Endpoint" : "https://xxxxxxxx.gremlin.documents.azure.com:443/",
  "Masterkey" : "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
  "Database" : "sample-database",
  "Collection" : "sample-graph",
  "Upsert" : "true"
}

cosmosDbFormat = "com.microsoft.azure.cosmosdb.spark"

(cosmosDbVertices.write.format(cosmosDbFormat).mode("append").options(**cosmosDbConfig).save())

Error: 
Py4JJavaError: An error occurred while calling o1113.save.
: java.lang.ClassNotFoundException: 
Failed to find data source: com.microsoft.azure.cosmosdb.spark. Please find packages at
http://spark.apache.org/third-party-projects.html
       
    at org.apache.spark.sql.errors.QueryExecutionErrors$.failedToFindDataSourceError(QueryExecutionErrors.scala:557)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:758)
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:808)
    at org.apache.spark.sql.DataFrameWriter.lookupV2Provider(DataFrameWriter.scala:983)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:293)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:258)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)



I tried to reproduce the same scenario in my environment and got the same error.
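The `ClassNotFoundException` occurs because the old `com.microsoft.azure.cosmosdb.spark` connector only supports Spark 2.x, so its data source is never found on a Spark 3.2 cluster. The replacement connector's Maven artifact encodes the Spark minor version and must match the cluster runtime. As a rough illustration (the coordinates produced below are real artifact names, but the version-matching helper itself is just a sketch, not an official compatibility table):

```python
def cosmos_connector_for(spark_version: str, connector_version: str = "4.12.2") -> str:
    """Return the Maven coordinate of the Cosmos DB Spark 3 OLTP connector
    whose artifact name matches the cluster's Spark minor version."""
    major, minor = spark_version.split(".")[:2]
    if major != "3":
        raise ValueError("azure-cosmos-spark requires Spark 3.x")
    # Artifact pattern: azure-cosmos-spark_<spark major>-<spark minor>_2-12
    return f"com.azure.cosmos.spark:azure-cosmos-spark_{major}-{minor}_2-12:{connector_version}"

# DBR 10.4 LTS ships Spark 3.2.1:
print(cosmos_connector_for("3.2.1"))
# com.azure.cosmos.spark:azure-cosmos-spark_3-2_2-12:4.12.2
```

With the matching artifact installed on the cluster, the `cosmos.oltp` data source resolves and the `Failed to find data source` error goes away.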


To resolve this error, install the com.azure.cosmos.spark:azure-cosmos-spark_3-2_2-12:4.12.2 library and use the code below.

Code:

cosEndpoint = "https://xxxxxx.dxx.azure.com:443/"
cosMasterkey = "xxxx"
cosDatabase = "xxxx"
cosContainer = "xxxx"

cfg1 = {
  "spark.cosmos.accountEndpoint" : cosEndpoint,
  "spark.cosmos.accountKey" : cosMasterkey,
  "spark.cosmos.database" : cosDatabase,
  "spark.cosmos.container" : cosContainer,
}

# Sample DataFrame of vertices
cosmosDbVertices = spark.createDataFrame([("ss1", "cat", 2, True), ("cc1", "dog", 2, False)])\
  .toDF("id", "name", "age", "isAlive")

# Write the data into Cosmos DB
cosmosDbVertices.write.format("cosmos.oltp").options(**cfg1).mode("append").save()
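One caveat: the `cosmos.oltp` data source writes plain JSON documents. For those documents to show up as traversable vertices in a Gremlin API account, they have to follow Cosmos DB's internal graph layout, where `id`, `label`, and the partition key stay as top-level scalars and every other property is stored as a list of `{"id", "_value"}` entries. This is an assumption based on the graph layout used in the old connector's graph samples, not behavior guaranteed by the new connector; the `to_gremlin_vertex` helper and the `animal` label below are hypothetical names for illustration:

```python
import uuid

def to_gremlin_vertex(doc: dict, label: str, pk_field: str = "id") -> dict:
    """Encode a flat row as a document in Cosmos DB's Gremlin vertex layout.

    Each ordinary property becomes [{"id": <property id>, "_value": <value>}];
    "id", "label", and the partition-key field remain plain top-level values.
    """
    vertex = {"id": doc["id"], "label": label}
    for key, value in doc.items():
        if key in ("id", "label", pk_field):
            continue
        vertex[key] = [{"id": str(uuid.uuid4()), "_value": value}]
    return vertex

row = {"id": "ss1", "name": "cat", "age": 2, "isAlive": True}
print(to_gremlin_vertex(row, label="animal"))
```

A DataFrame of such documents can then be written with the same `cosmos.oltp` writer shown above; if the encoding is skipped, the rows land in the container but may not be visible to Gremlin traversals.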

Output:

