
How to load and write a properties file in a Spark Scala job?

I have a Spark job and I need to read properties from a file "config.properties" in this format:

```
var1=1
var2=12/10/2021
```

At the end of the process, I need to update var1 and var2, so I have to overwrite the "config.properties" file. How can I do this?

This code would be part of the driver, so you write it as you would any Java/Scala app that reads a configuration file, whether in the properties format or using JSON.
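For the local case, a minimal sketch using java.util.Properties might look like the following; the file path and the way the new values are computed are placeholders for your own job logic:

```scala
import java.io.{FileInputStream, FileOutputStream}
import java.util.Properties

object ConfigUpdater {
  def main(args: Array[String]): Unit = {
    val path = "config.properties" // hypothetical local path

    // Load the existing properties from the file
    val props = new Properties()
    val in = new FileInputStream(path)
    try props.load(in) finally in.close()

    val var1 = props.getProperty("var1").toInt
    val var2 = props.getProperty("var2")

    // ... run the Spark job and compute the new values ...

    // Overwrite the same file with the updated values
    props.setProperty("var1", (var1 + 1).toString) // placeholder update
    props.setProperty("var2", var2)                // placeholder update
    val out = new FileOutputStream(path)
    try props.store(out, "updated by the Spark driver") finally out.close()
  }
}
```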

What you need to keep in mind:

  • when you run in local mode (when you create your session with setMaster("local")) or client mode (setting the master to a known cluster), the driver runs locally. This means it will access your local file system. Make sure the user running the app has the rights to do so.
  • when in cluster mode and you submit your application via spark-submit or a similar tool, you do not control the path and you may not be able to access a local file on the cluster. In this scenario, depending on your infrastructure, you may want to point to a cloud drive (S3 or equivalent), a network mount (SMB, NFS…), or a virtual drive (Google Drive, ownCloud, Dropbox…); see the sketch after this list.
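For the cluster-mode case, here is a minimal sketch that reads and rewrites the same properties file through the Hadoop FileSystem API, so the same code works against HDFS, S3 (via s3a), or a local path. The bucket name and URI are hypothetical, and you would need the relevant Hadoop connector (e.g. hadoop-aws for s3a) on the classpath:

```scala
import java.util.Properties
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

object ClusterConfigUpdater {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("config-demo").getOrCreate()

    // Hypothetical location; any s3a://, hdfs:// or file:// URI works
    val path = new Path("s3a://my-bucket/config/config.properties")
    val fs = FileSystem.get(path.toUri, spark.sparkContext.hadoopConfiguration)

    // Load the properties from the distributed file system
    val props = new Properties()
    val in = fs.open(path)
    try props.load(in) finally in.close()

    // ... run the Spark job and update var1/var2 as needed ...

    // Overwrite the file in place (overwrite = true)
    val out = fs.create(path, true)
    try props.store(out, "updated by the Spark driver") finally out.close()

    spark.stop()
  }
}
```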
