How to use saveAsTextFile on AWS?
I hope to get something by println, but on AWS that may not work. How can I save the content of println as a file on AWS using "saveAsTextFile"? The original content of println is as follows:
println("\n[ First output is ]")
output1.foreach(a => println("(" + a + "," + titles(a - 1) + ")"))
println("\n[ Second output is ]")
output2.foreach(a => println("(" + a + "," + titles(a - 1) + ")"))
output1 and output2 are both lists made up of numbers. titles is also a list. Thanks.
Well, if both are Lists, you may convert them into RDDs using SparkContext's parallelize method.
val rdd1 = sc.parallelize(List("[ First output is ]") ++ output1.map(a => "(" + a + "," + titles(a - 1) + ")"))
val rdd2 = sc.parallelize(List("[ Second output is ]") ++ output2.map(a => "(" + a + "," + titles(a - 1) + ")"))
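The formatting step itself can be checked without Spark, since the map runs on plain Scala lists before parallelize is called. A minimal sketch, with made-up sample values for output1 and titles (the question doesn't give their contents):

```scala
object FormatCheck {
  def main(args: Array[String]): Unit = {
    // Hypothetical sample data standing in for the question's lists
    val titles  = List("Intro", "Methods", "Results")
    val output1 = List(1, 3)

    // Same formatting expression as in the parallelize call above;
    // note the 1-based indices, hence titles(a - 1)
    val lines = "[ First output is ]" :: output1.map(a => "(" + a + "," + titles(a - 1) + ")")

    lines.foreach(println)
  }
}
```

Passing the resulting lines to sc.parallelize then gives one RDD element per output line, which is what saveAsTextFile will write.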
After this you can use saveAsTextFile with your desired S3 path.
rdd1.saveAsTextFile("s3://yourAccessKey:yourSecretKey@/out1.txt")
rdd2.saveAsTextFile("s3://yourAccessKey:yourSecretKey@/out2.txt")
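Two caveats worth knowing: saveAsTextFile writes a directory of part files, not a single file, and embedding keys in the URL breaks when they contain characters like "/". As an alternative sketch (assuming the s3a connector and a placeholder bucket name), the credentials can go into the Hadoop configuration instead:

```scala
// Sketch: supply S3 credentials through the Hadoop configuration
// rather than the path. "your-bucket" is a placeholder.
sc.hadoopConfiguration.set("fs.s3a.access.key", "yourAccessKey")
sc.hadoopConfiguration.set("fs.s3a.secret.key", "yourSecretKey")

// Each call creates a directory (out1/part-00000, _SUCCESS, ...)
rdd1.saveAsTextFile("s3a://your-bucket/out1")
rdd2.saveAsTextFile("s3a://your-bucket/out2")
```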
I recommend you read this blog; it might help you understand important things about S3 and Apache Spark: Writing s3 data with Apache Spark