
How to format the output of saveAsTextFile?

I am working on an ETL process in Scala. My raw log files have many columns (around 70). I try to save the data to a file using Row() objects:

val base_RDD = rawData.map{r => if(r(13) == null || r(13).trim.isEmpty) Row(
    r(2), r(3), r(4), "", r(6), r(7), r(8), r(9), r(10), r(11), r(12), r(13), r(14), r(15), r(16), 
    r(18), r(21), r(27), r(29), r(30), r(32), r(33), r(34), r(35), r(36), r(37), r(38), r(39), r(40),
    r(41), r(42), r(43), r(44), r(45), r(46), r(47), r(48), r(49), r(50), r(51), r(52), r(53), r(54),
    r(55), r(56), r(57), r(58), r(59), r(60), r(61), r(62), r(63), r(64), r(65), r(66), r(67), r(68),
    r(69), r(70), r(71), r(72), r(73), r(74), r(75), "", "", "", "", "", "", "", r(76), r(77), r(78), r(1))
  else Row(r(2), r(3), r(4), "", r(6), r(7), r(8), r(9), r(10), r(11), r(12), r(13), r(14), r(15), r(16), 
    r(18), r(21), r(27), r(29), r(30), r(32), r(33), r(34), r(35), r(36), r(37), r(38), r(39), r(40), r(41), 
    r(42), r(43), r(44), r(45), r(46), r(47), r(48), r(49), r(50), r(51), r(52), r(53), r(54), r(55), r(56), r(57), r(58), 
    r(59), r(60), r(61), r(62), r(63), r(64), r(65), r(66), r(67), r(68), r(69), r(70), r(71), r(72), r(73), r(74), r(75), 
    r(13).split("_")(0), r(13).split("_")(1), r(13).split("_")(2), r(13).split("_")(3), r(5), r(13).split("_")(5), 
    r(13).split("_")(6),r(76), r(77), r(78), r(1))}

The exception is gone now. However, "[" and "]" show up in the data after saving it to disk with base_RDD.saveAsTextFile("hdfs://nameservice1:8020/tmp/manish/tmpData"). Is my approach correct? Please suggest what is going wrong, if anything.

Sample output:

[6035233,500212680,50013723,,,ddd.com,,,,,,,1,0,0,0,,0,,,,,,,,,,0,0,0,0,0,0,-1x-1,,,0,0,0,0,0,0,0,0,,0,0,,0,0,0,0,,,,,0,0,,0,0,0,0,0,,,,,,,,,0,0,]
[6035233,500212680,50013723,,,d.com,,,,,,,1,0,0,0,,0,,,,,,,,,,0,0,0,0,0,0,-1x-1,,,0,0,0,0,0,0,0,0,,0,0,,0,0,0,0,,,,,0,0,,0,0,0,0,0,,,,,,,,,0,0,]

I don't want the "[" and "]".

saveAsTextFile writes the toString of each element, and a Row prints itself wrapped in "[" and "]". Just use plain Lists and build the strings yourself before you call saveAsTextFile:

rawData.map{r =>
  if(r(13) == null || r(13).trim.isEmpty) Seq(r(2), r(3), ...).mkString(",")
  else Seq(r(2), r(3), ...).mkString(",")
}
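To make the idea concrete without listing all 70 columns, here is a minimal sketch in plain Scala (no Spark needed to try it). The object name FormatLine, the column positions in keep, and splitCol are hypothetical and chosen only to keep the example short; the real job would use the indices from the question.

```scala
object FormatLine {
  // Hypothetical column positions to copy through unchanged (assumption, not the real layout).
  val keep: Seq[Int] = Seq(2, 3, 4)
  // Hypothetical position of the "_"-separated column (r(13) in the question).
  val splitCol: Int = 1

  // Builds one comma-separated output line from a parsed record.
  def toLine(r: Seq[String]): String = {
    val base = keep.map(i => r(i))
    val extra =
      if (r(splitCol) == null || r(splitCol).trim.isEmpty) Seq("", "")
      // Take the first two parts and pad, so every line has the same number of fields.
      else r(splitCol).split("_").take(2).toSeq.padTo(2, "")
    // mkString(",") joins the fields without the surrounding brackets.
    (base ++ extra).mkString(",")
  }
}
```

Assuming rawData is an RDD whose elements can be indexed like a sequence of strings, the call would look something like rawData.map(r => FormatLine.toLine(r)).saveAsTextFile(path). If the Row objects have to stay, Row also has a mkString method, so base_RDD.map(_.mkString(",")).saveAsTextFile(path) should produce the same unbracketed lines.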
