
How to read stderr logs from the AWS EMR log files in S3

I am using EMR steps to run my jobs. Typically, when I want to analyze the performance of a job or understand why it failed, I look at the Spark History Server for DAG visualizations, job errors, and so on. For example, if the job failed due to a heap error or a FetchFailed exception, I can see that clearly in the Spark History Server. However, I can't find such descriptions in the stderr log files that are written to the LogUri S3 bucket. Is there a way to obtain that information? I use PySpark and set the log level with

sc = spark.sparkContext
sc.setLogLevel('DEBUG') 

Any insight as to what I am doing wrong?

I haven't really tested this, but since it's a bit too long to fit in a comment, I'm posting it here as an answer.

As pointed out in my comment, the logs you're viewing in the Spark History Server UI aren't the same as the Spark driver logs that EMR saves to S3.
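If all you want is the raw stderr that EMR already ships to your LogUri bucket, you can read it straight from S3. The sketch below is an untested example using boto3; the bucket name, key prefix, and cluster/step IDs are placeholders, and the steps/<step-id>/stderr.gz layout is EMR's usual default rather than something taken from your setup:

import boto3
import gzip
import io

# Placeholders: bucket, prefix, cluster id (j-...) and step id (s-...).
# EMR's default layout puts each step's driver output at
#   <LogUri>/<cluster-id>/steps/<step-id>/stderr.gz
s3 = boto3.client('s3')
bucket = 'your_bucket'
key = 'emr_logs/j-XXXXXXXXXXXXX/steps/s-XXXXXXXXXXXXX/stderr.gz'

obj = s3.get_object(Bucket=bucket, Key=key)
with gzip.GzipFile(fileobj=io.BytesIO(obj['Body'].read())) as gz:
    print(gz.read().decode('utf-8', errors='replace'))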

To get the Spark event logs that the History Server reads written to S3, you'll have to add some additional configuration to your cluster. These configuration options are described in the Monitoring and Instrumentation section of the Spark documentation.

In AWS EMR, you could try adding something like this to your cluster configuration:

...

{
  'Classification': 'spark-defaults',
  'Properties': {
    'spark.eventLog.dir': 's3a://your_bucket/spark_logs',
    'spark.history.fs.logDirectory': 's3a://your_bucket/spark_logs',
    'spark.eventLog.enabled': 'true'
  }
}

...
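If you create the cluster programmatically, here's a rough, untested sketch of where that classification could go in a boto3 run_job_flow call. The bucket, release label, instance types, roles, and LogUri are placeholders, not values from your setup:

import boto3

emr = boto3.client('emr', region_name='us-east-1')  # region is a placeholder

response = emr.run_job_flow(
    Name='spark-with-event-logs',            # placeholder name
    ReleaseLabel='emr-6.10.0',               # placeholder EMR release
    Applications=[{'Name': 'Spark'}],
    LogUri='s3://your_bucket/emr_logs/',     # step stderr/stdout land under this prefix
    Configurations=[
        {
            'Classification': 'spark-defaults',
            'Properties': {
                'spark.eventLog.enabled': 'true',
                'spark.eventLog.dir': 's3a://your_bucket/spark_logs',
                'spark.history.fs.logDirectory': 's3a://your_bucket/spark_logs',
            },
        },
    ],
    Instances={
        'InstanceGroups': [
            {'InstanceRole': 'MASTER', 'InstanceType': 'm5.xlarge', 'InstanceCount': 1},
            {'InstanceRole': 'CORE', 'InstanceType': 'm5.xlarge', 'InstanceCount': 2},
        ],
        'KeepJobFlowAliveWhenNoSteps': True,
    },
    JobFlowRole='EMR_EC2_DefaultRole',
    ServiceRole='EMR_DefaultRole',
)
print(response['JobFlowId'])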

I found an interesting post that describes how to set this up for a Kubernetes cluster; you may want to check it for further details.
