
How to load Typesafe ConfigFactory from a file on HDFS?

I am using Typesafe's ConfigFactory to load the config into my Scala application. I do not want to include the config files in my jar, but instead load them from an external HDFS filesystem. However, I cannot find a simple way to load the config from the FSDataInputStream object I get from Hadoop:

//get HDFS file
val hadoopConfig: Configuration = sc.hadoopConfiguration
val fs: FileSystem = org.apache.hadoop.fs.FileSystem.get(hadoopConfig)
val file: FSDataInputStream = fs.open(new Path("hdfs://SOME_URL/application.conf"))
//read config from hdfs
val config: Config = ConfigFactory.load(file.readUTF())

However, this throws an EOFException. Is there an easy way to convert the FSDataInputStream object into the required java.io.File? I found Converting from FSDataInputStream to FileInputStream, but that would be pretty cumbersome for such a simple task.

Using ConfigFactory.parseReader should work (but I haven't tested it):

import java.io.InputStreamReader
import com.typesafe.config.{Config, ConfigFactory}

val reader = new InputStreamReader(file)
val config = try {
  ConfigFactory.parseReader(reader)
} finally {
  reader.close()
}
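One caveat worth noting: the parse* methods (parseReader, parseString, parseFile) only parse; unlike ConfigFactory.load(), they do not resolve ${...} substitutions. If your file uses them, call resolve() on the parsed config. A minimal self-contained sketch using parseString (the keys here are made up for illustration):

```scala
import com.typesafe.config.{Config, ConfigFactory}

// parse* leaves ${...} substitutions unresolved until resolve() is called.
val parsed: Config = ConfigFactory.parseString(
  """base-dir = "/data"
    |input-dir = ${base-dir}"/input"
    |""".stripMargin)

val resolved: Config = parsed.resolve()
// resolved.getString("input-dir") yields "/data/input"
```

The same applies to the HDFS case: use ConfigFactory.parseReader(reader).resolve() if the file contains substitutions.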

I could fix the issue with the code below. Assume configPath is the HDFS path where the .conf file is available, e.g. hdfs://mount-point/abc/xyz/details.conf

import java.io.InputStreamReader
import com.typesafe.config._
import org.apache.hadoop.fs.{FileSystem, Path}

val configPath = "hdfs://sparkmainserver:8020/file.conf"
val fs = FileSystem.get(new org.apache.hadoop.conf.Configuration())
val reader = new InputStreamReader(fs.open(new Path(configPath)))
val config: Config = ConfigFactory.parseReader(reader)

Then you can use config.getString("variable_name") to extract and use the variables/parameters. Before this, you should have the Typesafe Config dependency declared in your sbt or Maven build file.
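For illustration, a hypothetical details.conf might look like this (HOCON syntax, which Typesafe Config parses; all keys here are made-up examples):

```hocon
# hypothetical contents of details.conf
spark {
  app-name = "my-app"
  master   = "yarn"
}
db-url = "jdbc:postgresql://dbhost:5432/mydb"
```

With that file, config.getString("spark.app-name") would return "my-app"; nested keys are addressed with dot-separated paths.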

Here is what I did with Spark application:

  /**
    * Load Typesafe config from an HDFS file location
    * @param sparkContext the active SparkContext
    * @param confHdfsFileLocation HDFS path of the .conf file
    * @return the parsed Config
    */
  def loadHdfsConfig(sparkContext: SparkContext, confHdfsFileLocation: String) : Config = {
    // Array of 1 element (fileName, fileContent)
    val appConf: Array[(String, String)] = sparkContext.wholeTextFiles(confHdfsFileLocation).collect()
    val appConfStringContent = appConf(0)._2
    ConfigFactory.parseString(appConfStringContent)
  }

Now in the code, just use

val config = loadHdfsConfig(sparkContext, confHdfsFileLocation)
config.getString("key-here")

I hope it helps.

You should be able to load a .conf file using the following code:

import java.io.File
import com.typesafe.config.ConfigFactory

val config = ConfigFactory.parseFile(new File("application.conf"))

Please keep in mind that the .conf file should be placed in the same directory as your application file (e.g. the jar file in Spark).
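As a hedged sketch of that setup: if the file sits in the application's working directory (for example, shipped with spark-submit --files so it lands next to the driver), you can parse it directly and fall back to the classpath when it is absent. The file name "application.conf" in the working directory is an assumption here:

```scala
import java.io.File
import com.typesafe.config.{Config, ConfigFactory}

// Assumed location: application.conf in the working directory,
// e.g. shipped alongside the jar with spark-submit --files.
val localConf = new File("application.conf")
val config: Config =
  if (localConf.exists()) ConfigFactory.parseFile(localConf)
  else ConfigFactory.load() // fall back to resources on the classpath
```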
