How to create Map[Int,Set[String]] in Scala by reading a CSV file?

I want to create a Map[Int,Set[String]] in Scala by reading input from a CSV file.

My file.csv is:

sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
overcast,cool,normal,TRUE,yes

I want the output to be:

var Attributes: Map[Int,Set[String]] = Map()

Attributes += (0 -> Set("sunny","overcast","rainy"))
Attributes += (1 -> Set("hot","mild","cool"))
Attributes += (2 -> Set("high","normal"))
Attributes += (3 -> Set("false","true"))
Attributes += (4 -> Set("yes","no"))

The keys 0, 1, 2, 3, 4 are the column numbers, and each Set contains the distinct values in that column.

I want to add each (Int -> Set[String]) pair to my variable "Attributes", i.e., printing Attributes.size should display 5 in this case.

Use one of the existing answers to read in the CSV file. You'll have a two-dimensional array or vector of strings. Then build your map.

// row vectors
val rows = io.Source.fromFile("file.csv").getLines.map(_.split(",")).toVector
// column vectors
val cols = rows.transpose
// convert each vector to a set
val sets = cols.map(_.toSet)
// convert vector of sets to map
val attr = sets.zipWithIndex.map(_.swap).toMap
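
To check the steps above end to end, here is a minimal self-contained sketch that runs them on the seven sample rows from the question. The names AttributeMapSketch, buildAttributes and readLines are illustrative, not part of the answer. The sketch assumes every row has the same number of fields (transpose rejects ragged rows) and that no field contains a quoted comma. Unlike the one-liner above, readLines closes the file once the lines have been read.

```scala
import scala.io.Source

object AttributeMapSketch {
  // Hypothetical helper: read all lines of a file, then close the source.
  def readLines(path: String): Vector[String] = {
    val source = Source.fromFile(path)
    try source.getLines().toVector // forced before the finally block runs
    finally source.close()
  }

  // Hypothetical helper: column index -> distinct values in that column.
  // Assumes equally long rows and no quoted commas inside fields.
  def buildAttributes(lines: Seq[String]): Map[Int, Set[String]] = {
    val rows = lines.map(_.split(",").toVector).toVector // row vectors
    val cols = rows.transpose                             // column vectors
    cols.zipWithIndex.map { case (xs, i) => i -> xs.toSet }.toMap
  }

  // Sample rows copied from file.csv in the question.
  val sample: Vector[String] = Vector(
    "sunny,hot,high,FALSE,no",
    "sunny,hot,high,TRUE,no",
    "overcast,hot,high,FALSE,yes",
    "rainy,mild,high,FALSE,yes",
    "rainy,cool,normal,FALSE,yes",
    "rainy,cool,normal,TRUE,no",
    "overcast,cool,normal,TRUE,yes"
  )

  def main(args: Array[String]): Unit = {
    val attr = buildAttributes(sample) // or buildAttributes(readLines("file.csv"))
    println(attr.size) // 5
    attr.toSeq.sortBy(_._1).foreach { case (i, values) => println(s"$i -> $values") }
  }
}
```

With the sample rows, column 3 holds "FALSE" and "TRUE" exactly as they appear in the file. Getting the lower-case values shown in the question would need an extra step such as xs.map(_.toLowerCase).toSet.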

The last line is a bit ugly because there is no direct .toMap method that keys the elements by their index. You could also write

val attr = Vector.tabulate(sets.size)(i => (i, sets(i))).toMap

Or you could do the last two steps in one go:

val attr = cols.zipWithIndex.map { case (xs, i) => 
  (i, xs.toSet) 
} (collection.breakOut): Map[Int,Set[String]]
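
Note that collection.breakOut was removed in Scala 2.13, so the snippet above only compiles on 2.12 and earlier. A possible version-independent sketch of the same idea is to go through an iterator, which also avoids building an intermediate vector of pairs. The names OnePassSketch and toAttributeMap are illustrative, and cols is assumed to be the transposed column vector from the earlier steps.

```scala
object OnePassSketch {
  // Hypothetical helper: `cols` holds one inner vector per CSV column.
  def toAttributeMap(cols: Vector[Vector[String]]): Map[Int, Set[String]] =
    cols.iterator.zipWithIndex                 // lazily pair each column with its index
      .map { case (xs, i) => i -> xs.toSet }   // index -> distinct values
      .toMap                                   // build the Map directly from the iterator
}
```

For example, OnePassSketch.toAttributeMap(Vector(Vector("a", "b", "a"), Vector("x", "x", "x"))) yields Map(0 -> Set("a", "b"), 1 -> Set("x")).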
