I am trying to load a JSON file in an EMR notebook with a Spark kernel. I am using a very large, proven EMR cluster that I have worked with before, so the cluster size/computation power is not the issue. The simple code below is enough to reproduce my issue:
val df = spark.read.json("s3a://src/main/resources/zipcodes.json")
Here is the JSON file I am trying to load. It is extremely small. https://raw.githubusercontent.com/spark-examples/spark-scala-examples/71d2db89ffb24db6f01eb1fa12286bfbb37c44c4/src/main/resources/zipcodes.json
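As an aside, one thing worth knowing about this one-liner: when no schema is supplied, `spark.read.json` triggers a Spark job just to scan the file and infer the schema, so even this single line blocks until executors actually run a task. A hedged sketch that sidesteps the inference pass by declaring the schema up front (the bucket path is copied from the question, and the field list is only an illustrative subset of the file's columns):

```scala
import org.apache.spark.sql.types._

// Declaring a schema lets spark.read.json skip the eager
// schema-inference job it would otherwise launch.
val zipSchema = StructType(Seq(
  StructField("RecordNumber", LongType),
  StructField("Zipcode", LongType),
  StructField("ZipCodeType", StringType),
  StructField("City", StringType),
  StructField("State", StringType)
))

val df = spark.read
  .schema(zipSchema)
  .json("s3a://src/main/resources/zipcodes.json")

df.printSchema() // prints the declared schema without running a job
```

This does not fix a cluster that cannot schedule tasks at all, but it can help distinguish "hanging on schema inference" from "hanging on any task at all."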
I let it run for 1 hour. In the bottom-left corner, the status reads `Spark | Busy`, and the circle in the top right is full, indicating that the kernel is working. However, the Spark Job Progress panel shows a Task Progress bar that never advances. Any advice?
The problem was not the JSON file. In an attempt to fix the issue, I simply cloned my problematic EMR cluster with the exact same steps/configuration, attached my EMR notebook to the clone, and re-ran the exact same code against the exact same file. It worked nearly instantaneously. So the problem was with the original cluster, although I never determined what exactly was wrong with it.