How to avoid \r\n in Spark SQL jobs using Python?

Question

I have hundreds of YAML files containing Hive queries that I am migrating to Spark SQL using a Python script I wrote. My goal was to keep the Spark SQL queries properly formatted, so I kept the tab (\t), space, and newline (\n) characters in them.

The problem is that when I submit this code, I get the following error (image). I can fix it by replacing \r\n with a space, but that ruins the formatting because the entire query ends up on a single line. I am looking for a robust way to deal with \r\n in my code without affecting the formatting.

My workarounds:

When I replace the \r\n characters with a space, it works fine, but the query becomes unformatted.
When I use tr -d '\r' < input > output, I then get an error for \n, as below:

 Parsing Error [line 5]: '(\n' [line 46]: ')\n'

I am spending a lot of time manually debugging each file and am looking for ideas to automate the process.
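For reference, the goal described above (drop the \r, keep the line structure) can be sketched in Python as follows. This is only a sketch under assumptions: the function name normalize_sql_newlines and the unescape_literals flag are made up for illustration, and the flag covers the guess that the '(\n' in the parsing error means the file contains a literal backslash followed by n rather than a real newline. Turning it on would also rewrite any \n that appears inside a SQL string literal, so it is off by default.

```python
def normalize_sql_newlines(text: str, unescape_literals: bool = False) -> str:
    """Convert every line ending in text to a single LF, keeping indentation."""
    if unescape_literals:
        # Assumption: the text holds two-character escapes (backslash + r,
        # backslash + n) instead of real control characters.
        text = (
            text.replace("\\r\\n", "\n")
            .replace("\\r", "\n")
            .replace("\\n", "\n")
        )
    # Real control characters: CRLF and bare CR both become LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop trailing spaces and tabs on each line; leading tabs/spaces stay.
    return "\n".join(line.rstrip(" \t") for line in text.split("\n"))


# Hypothetical query text with Windows line endings.
raw_query = "SELECT a,\r\n\tb\r\nFROM t (\r\n)"
clean_query = normalize_sql_newlines(raw_query)
```

If the cleaned text is written back to disk from Python, opening the output file with open(path, "w", newline="\n") keeps Python from translating \n back to \r\n on Windows.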

Answer 1

Use a backslash (\) at the end of a line to show that the text continues on the next line.
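The answer does not say where the backslash goes. One possible reading, assuming the query is embedded as a Python string literal, is that a backslash at the end of each source line makes Python drop that line break, so the source stays formatted while the resulting string contains no \r or \n at all. The table and column names below are placeholders.

```python
# Each trailing backslash tells Python to ignore the line break that follows,
# so QUERY ends up as one line even though the source is laid out on several.
QUERY = """\
SELECT id, \
       name \
FROM some_table \
WHERE id > 10\
"""

print(repr(QUERY))
```

Because the query becomes a single line, a SQL line comment (--) inside it would comment out everything after it, so this approach only suits queries without such comments.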

solution1
0 2020-06-12 05:13:14
