I need to load a pure txt RDD in spark. But for some reasons, the filename of the file to be loaded must be named as "xxx.gz". This file, by default, is recognized as a gz file when using sc.textFile. How can I tell spark to recognize the file as a pure txt file?
您可以使用gzip 。
gzip.open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.