I am trying to send the logs from PySpark to a file. I've looked at countless questions here with no success. I am using Python 3.7 with pyspark 2.4.7.
I get the logger using:
log4j_logger = spark_session.sparkContext._jvm.org.apache.log4j
spark_logger = log4j_logger.LogManager.getRootLogger()
The spark_logger is of type org.apache.log4j.spi.RootLogger, so I proceed to add a file appender "Java style" using pyjnius:
from jnius import autoclass  # pyjnius
rolling_appender = autoclass("org.apache.log4j.RollingFileAppender")
appender = rolling_appender()
appender.setFile(str(self._log_file))
appender.activateOptions()
spark_logger.addAppender(appender)
But I get this error:
File "*****/lib/python3.7/site-packages/py4j/java_gateway.py", line 1218, in _build_args
[get_command_part(arg, self.pool) for arg in new_args])
File "*****/lib/python3.7/site-packages/py4j/java_gateway.py", line 1218, in <listcomp>
[get_command_part(arg, self.pool) for arg in new_args])
File "*****/lib/python3.7/site-packages/py4j/protocol.py", line 298, in get_command_part
command_part = REFERENCE_TYPE + parameter._get_object_id()
AttributeError: 'org.apache.log4j.RollingFileAppender' object has no attribute '_get_object_id'
I know the spark logger is fetched successfully because I can log to it and it will appear in the console.
Is there an easier way (one that actually works) to do this?
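For reference, I was able to do it by creating the appender from log4j through the same py4j gateway (sparkContext._jvm) instead of pyjnius. The pyjnius object is not a py4j proxy, so py4j cannot pass it to addAppender, which is what the '_get_object_id' error points to.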
log4j = self._spark_session.sparkContext._jvm.org.apache.log4j
spark_logger = log4j.LogManager.getLogger("org.apache.spark")
appender = log4j.RollingFileAppender()
appender.setAppend(True)
layout = log4j.PatternLayout()
layout.setConversionPattern("%d{yyyy-MM-dd HH:mm:ss} %-5p %c:%L - %m%n")
appender.setLayout(layout)
appender.setFile(str(self._log_file))
appender.activateOptions()
spark_logger.removeAllAppenders()
spark_logger.addAppender(appender)
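The same steps can be wrapped in a small helper so they are reusable outside the class. This is only a sketch, assuming log4j 1.x as bundled with Spark 2.4; the function name add_rolling_file_appender and its parameters are made up for illustration, and jvm is expected to be spark_session.sparkContext._jvm.

```python
DEFAULT_PATTERN = "%d{yyyy-MM-dd HH:mm:ss} %-5p %c:%L - %m%n"


def add_rolling_file_appender(jvm, log_file, logger_name="org.apache.spark",
                              pattern=DEFAULT_PATTERN, replace_existing=False):
    """Attach a log4j 1.x RollingFileAppender to a logger via the py4j gateway.

    Hypothetical helper: `jvm` is assumed to be
    `spark_session.sparkContext._jvm`, so every object created here is a
    py4j proxy and can be passed back to other JVM methods.
    """
    log4j = jvm.org.apache.log4j
    logger = log4j.LogManager.getLogger(logger_name)

    # Layout controlling how each log line is rendered.
    layout = log4j.PatternLayout()
    layout.setConversionPattern(pattern)

    # Create the appender on the JVM side, through py4j (not pyjnius).
    appender = log4j.RollingFileAppender()
    appender.setAppend(True)
    appender.setLayout(layout)
    appender.setFile(str(log_file))
    appender.activateOptions()  # opens the file with the options set above

    if replace_existing:
        # Drops appenders attached directly to this logger.
        logger.removeAllAppenders()
    logger.addAppender(appender)
    return appender
```

It would be called as add_rolling_file_appender(spark_session.sparkContext._jvm, "spark.log"). Leaving replace_existing at False keeps whatever appenders the logger already has; passing True mirrors the removeAllAppenders() call above.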