
How to use saveAsTextFile on AWS?

I hoped to get my results via println, but on AWS that may not work. How can I save the content printed by println to a file on AWS using "saveAsTextFile"? The original println code is as follows:

println("\n[ First output is ]")
output1.foreach(a => println("(" + a +"," + titles(a - 1) + ")"));
println("\n[ Second output is ]")
output2.foreach(a => println("(" + a +"," + titles(a - 1) + ")"));

output1 and output2 are both lists of numbers, and titles is also a list. Thanks.

Well, if both are Lists, you can convert them into RDDs using SparkContext's parallelize method:

val rdd1 = sc.parallelize(List("[ First output is ]") ++ output1.map(a => "(" + a + "," + titles(a - 1) + ")"))
val rdd2 = sc.parallelize(List("[ Second output is ]") ++ output2.map(a => "(" + a + "," + titles(a - 1) + ")"))

After this you can use saveAsTextFile with your desired S3 path:

// "yourBucket" is a placeholder for your S3 bucket name
rdd1.saveAsTextFile("s3://yourAccessKey:yourSecretKey@yourBucket/out1.txt")
rdd2.saveAsTextFile("s3://yourAccessKey:yourSecretKey@yourBucket/out2.txt")
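Two caveats worth knowing. First, saveAsTextFile writes a *directory* of part files (part-00000, part-00001, ...), one per partition, not a single text file, so treat the path as a directory name. Second, embedding credentials in the URI exposes your secret key in logs; with the s3a connector you can set them through the Hadoop configuration instead. A minimal sketch, assuming a running SparkContext `sc` and the hadoop-aws module on the classpath (bucket and key values are placeholders):

```scala
// Supply AWS credentials via Hadoop configuration rather than the URI.
sc.hadoopConfiguration.set("fs.s3a.access.key", "yourAccessKey")
sc.hadoopConfiguration.set("fs.s3a.secret.key", "yourSecretKey")

// Each call creates a directory containing part files, one per partition.
rdd1.saveAsTextFile("s3a://yourBucket/out1")
rdd2.saveAsTextFile("s3a://yourBucket/out2")
```

If you need a single output file, call `rdd1.coalesce(1)` before saving so the RDD is written as one partition (fine for small outputs like these).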

I recommend you read this blog; it might help you understand important things about S3 and Apache Spark: Writing s3 data with Apache Spark.

