Azure-specific problem reading local files on Spark

I am struggling with Azure wasb on Spark.

I am reading a .json.gz file from local disk and loading it into HDFS. I have used the following code extensively on other systems.

val file_a_raw = sqlContext.read.json("/home/users/repo_test/file_a.json.gz")

However, on Azure, this returns:

java.io.FileNotFoundException: Filewasb://server-2017-03-07t08-13-41-314z@server.blob.core.windows.net/home/users/repo_test/file_a.json.gz does not exist.

I have checked this location and the file is there and correct.

I think there should be a : between .net and the file path, but I get a Java error when I try to add it manually:

java.lang.IllegalArgumentException: java.net.URISyntaxException: Expected scheme name at index 0:

I've also tried:

Filewasb:///home/users/repo_test/file_a.json.gz

But that returns:

java.io.IOException: No FileSystem for scheme: Filewasb

This code works fine on non-Azure Spark.

For Azure, you'll need to configure Spark with the proper credentials. Databricks has documentation on this: https://docs.databricks.com/user-guide/faq/azure-blob-storage.html
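
As a rough illustration of the above, here is a minimal sketch rather than a definitive setup. It assumes the hadoop-azure connector is on the classpath and that the storage account key is available in an environment variable. The names myaccount, mycontainer and AZURE_STORAGE_KEY are hypothetical placeholders, not values from the question. The error in the question suggests that a path with no scheme is being resolved against the cluster's default wasb filesystem, so the sketch also shows a read with an explicit file:// scheme, which is the usual way to point Spark at local disk.

```scala
import org.apache.spark.sql.SparkSession

object ReadJsonOnAzure {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("read-json-on-azure").getOrCreate()

    // Hypothetical placeholders: substitute your own storage account and container.
    val account   = "myaccount"
    val container = "mycontainer"

    // Assumption: the account key is supplied through an environment variable.
    // sys.env(...) throws NoSuchElementException if the variable is not set.
    val accountKey = sys.env("AZURE_STORAGE_KEY")
    spark.sparkContext.hadoopConfiguration.set(
      s"fs.azure.account.key.$account.blob.core.windows.net", accountKey)

    // Option 1: read from blob storage using a fully qualified wasb URI.
    val fromBlob = spark.read.json(
      s"wasb://$container@$account.blob.core.windows.net/home/users/repo_test/file_a.json.gz")
    fromBlob.printSchema()

    // Option 2: read from local disk by giving the scheme explicitly, so the path
    // is not resolved against the default (wasb) filesystem.
    // On a multi-node cluster the file has to exist at this path on every node.
    val fromLocal = spark.read.json("file:///home/users/repo_test/file_a.json.gz")
    fromLocal.printSchema()

    spark.stop()
  }
}
```

In practice you would keep only one of the two reads, depending on where the file actually lives. If it is only on the head node's disk, copying it into blob storage first and using the wasb URI is likely the more robust choice.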
