
How to suppress Spark logging in unit tests?

So, thanks to easily googleable blog posts, I tried:

import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}
import org.specs2.mutable.Specification

class SparkEngineSpecs extends Specification {
  sequential

  // Set the given level on each named logger, returning the previous levels
  def setLogLevels(level: Level, loggers: Seq[String]): Map[String, Level] = loggers.map(loggerName => {
    val logger = Logger.getLogger(loggerName)
    val prevLevel = logger.getLevel
    logger.setLevel(level)
    loggerName -> prevLevel
  }).toMap

  setLogLevels(Level.WARN, Seq("spark", "org.eclipse.jetty", "akka"))

  val sc = new SparkContext(new SparkConf().setMaster("local").setAppName("Test Spark Engine"))

  // ... my unit tests
}

But unfortunately it doesn't work; I still get a lot of Spark output, e.g.:

14/12/02 12:01:56 INFO MemoryStore: Block broadcast_4 of size 4184 dropped from memory (free 583461216)
14/12/02 12:01:56 INFO ContextCleaner: Cleaned broadcast 4
14/12/02 12:01:56 INFO ContextCleaner: Cleaned shuffle 4
14/12/02 12:01:56 INFO ShuffleBlockManager: Deleted all files for shuffle 4

Add the following to a log4j.properties file inside the src/test/resources directory (create the file/directory if they don't exist):

# Change this to set Spark log level
log4j.logger.org.apache.spark=WARN

# Silence akka remoting
log4j.logger.Remoting=WARN

# Ignore messages below warning level from Jetty, because it's a bit verbose
log4j.logger.org.eclipse.jetty=WARN

When I run my unit tests (I'm using JUnit and Maven), I only receive WARN-level logs; in other words, no more clutter from INFO-level logs (though they can be useful at times for debugging).
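
Note that the snippet above only adjusts individual logger levels and relies on whatever root logger and appender your existing configuration provides. If you need a complete file, a sketch along these lines should work (the console appender and pattern here are just reasonable defaults, not anything Spark requires):

# Root logger: only WARN and above, sent to the console
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# Quieten the chattiest packages
log4j.logger.org.apache.spark=WARN
log4j.logger.Remoting=WARN
log4j.logger.org.eclipse.jetty=WARN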

I hope this helps.

In my case one of my own libraries brought logback-classic into the mix. This manifested as a warning at startup:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/alex/.ivy2/cache/ch.qos.logback/logback-classic/jars/logback-classic-1.1.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/alex/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]

I solved this by excluding it from the dependency:

"com.mystuff" % "mylib" % "1.0.0" exclude("ch.qos.logback", "logback-classic")

Now I could add a log4j.properties file in test/resources, which gets picked up by Spark.

After struggling with Spark log output for some time as well, I found a blog post with a solution I particularly liked.

If you use slf4j, you can simply exchange the underlying log implementation. A good candidate for the test scope is slf4j-nop, which carefully takes the log output and puts it where the sun never shines.

When using Maven you can add the following to the top of your dependencies list:

<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-api</artifactId>
  <version>1.7.12</version>
  <scope>provided</scope>
</dependency>

<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-nop</artifactId>
  <version>1.7.12</version>
  <scope>test</scope>
</dependency>

Note that it might be important to place these at the beginning of the dependencies list, to make sure they are used instead of bindings that might come with other packages (which you may also want to exclude, to keep your classpath tidy and avoid unexpected conflicts).
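
If you build with sbt instead (as the sbt-style exclusion earlier in this thread suggests), the same idea would look roughly like this (version numbers copied from the Maven example above):

libraryDependencies ++= Seq(
  "org.slf4j" % "slf4j-api" % "1.7.12" % "provided",
  "org.slf4j" % "slf4j-nop" % "1.7.12" % "test"
)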

You can use a separate Logback config for tests. Depending on your environment, it may be enough to create conf/logback-test.xml with something that hides the logs. I think this should do that:

<configuration>
  <root level="debug">
  </root>
</configuration>

As I understand it, this captures all logs (level debug and higher) but assigns no appender to them, so they get discarded. A better option is to configure a file appender for them, so you can still access the logs if you want to.
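
For example, a sketch of a logback-test.xml that writes everything to a file instead of the console might look like this (the file path is just an example):

<configuration>
  <appender name="FILE" class="ch.qos.logback.core.FileAppender">
    <!-- Example path; adjust to wherever you want the test logs -->
    <file>target/test.log</file>
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <root level="debug">
    <appender-ref ref="FILE" />
  </root>
</configuration>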

See http://logback.qos.ch/manual/configuration.html for the detailed documentation.

A little late to the party, but I found this in the Spark example code:

// From Spark's streaming examples; logInfo comes from Spark's Logging trait
import org.apache.log4j.{Level, Logger}

def setStreamingLogLevels() {
    val log4jInitialized = Logger.getRootLogger.getAllAppenders.hasMoreElements
    if (!log4jInitialized) {
        // We first log something to initialize Spark's default logging, then we override the
        // logging level.
        logInfo("Setting log level to [WARN] for streaming example." +
            " To override add a custom log4j.properties to the classpath.")
        Logger.getRootLogger.setLevel(Level.WARN)
    }
}

I also found that with your code, if you call setLogLevels like below, it cut out a lot of the output for me.

setLogLevels(Level.WARN, Seq("spark", "org", "akka"))

The easiest solution working for me is:

cp $SPARK_HOME/conf/log4j.properties.template $YOUR_PROJECT/src/test/resources/log4j.properties
sed -i -e 's/log4j.rootCategory=INFO/log4j.rootCategory=WARN/g' $YOUR_PROJECT/src/test/resources/log4j.properties
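
Assuming the stock template (which, at least in Spark 1.x, sets the root category to INFO with a console appender), the sed command simply turns that line in the copied file into:

log4j.rootCategory=WARN, console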
