Reading one file as Map(K,V) and passing V as keys while reading the second file as Map

I have two files: a text file and a CSV file. I want to read the text file as a Map(key, value) and then use the values from that Map as the keys when I read the second (CSV) file.

I am able to read the first file and get a Map(key, value). I extracted the values from this Map and used them as keys when reading the second file, but I didn't get the desired result.

First file (text file):

sdp:field(0)
meterNumber:field(1)
date:field(2)
time:field(3)
value:field(4),field(5),field(6),field(7),field(8),field(9),field(10),field(11),field(12),field(13),field(14),field(15),field(16),field(17)

Second file (CSV file):

SDP,METERNO,READINGDATE,TIME,Reset Count.,Kilowatt-Hour Last Reset .,Kilowatt-Hour Rate A Last Reset.,Kilowatt-Hour Rate B Last Reset.,Kilowatt-Hour Rate C Last Reset.,Max Kilowatt Rate A Last Reset.,Max Kilowatt Rate B Last Reset.,Max Kilowatt Rate C Last Reset.,Accumulate Kilowatt Rate A Current.,Accumulate Kilowatt Rate B Current.,Accumulate Kilowatt Rate C Current.,Total Kilovar-Hour Last Reset.,Max Kilovar Last Reset.,Accumulate Kilovar Last Reset.
9000000001,500001,02-09-2018,00:00:00,2,48.958,8.319333333,24.31933333,16.31933333,6,24,15,10,9,6,48.958,41,40

This is what I have done to read the first file:

val lines = scala.io.Source.fromFile("D:\\JSON_READER\\dailymapping.txt", "UTF8")
  .getLines
  .map(line => line.split(":"))
  .map(fields => (fields(0), fields(1)))
  .toMap
val sdp = lines("sdp")
val meterNumber = lines("meterNumber")
val date = lines("date")
val time = lines("time")
val values = lines("value")

Now I can see that sdp has field(0), meterNumber has field(1), date has field(2), time has field(3), and values has field(4) through field(17).

I am reading the second file using the code below:

val keyValuePairs = scala.io.Source.fromFile("D:\\JSON_READER\\Daily.csv")
  .getLines
  .drop(1)
  .map(_.stripLineEnd.split(",", -1))
  .map { field => (field(0), field(1), field(2), field(3)) -> (field(4), field(5)) }
  .toList

val map = Map(keyValuePairs: _*)
println(map)

The above code gives me the following output, which is the desired output:

Map((9000000001,500001,02-09-2018,00:00:00) -> (2,48.958))

But I want to replace field(0), field(1), field(2), field(3) in the above code with sdp, meterNumber, date, time. That way I don't have to spell out the keys when I read the second file; the keys will come from the first file.

I tried the replacement, but I got the output below, which is not the desired output:

Map((field(0),field(1),field(2),field(3)) -> (,))

Can somebody please guide me on how I can achieve the desired output?

This might get you close to what you're after. The first Map is used to look up the correct index into the CSV data.

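// Matches "field(N)" and captures the number N.
val fieldRE = raw"field\((\d+)\)".r

// Map of identifier -> column index, e.g. "sdp" -> 0.
// Note: toMap keeps only one index per key, so "value" ends up with a single index.
val idx = io.Source
            .fromFile(<txt_file>, "UTF8")
            .getLines
            .map(_.split(":"))
            .flatMap(fields => fieldRE.replaceAllIn(fields(1), _.group(1))
                                      .split(",")
                                      .map(fields(0) -> _.toInt))
            .toMap

// Read the CSV (skipping the header) and pick the key columns via idx.
val resMap = io.Source
               .fromFile(<csv_file>)
               .getLines
               .drop(1)
               .map(_.stripLineEnd.split(",", -1))
               .map{ fld =>
                 (fld(idx("sdp")),fld(idx("meterNumber")),fld(idx("date")),fld(idx("time"))) ->
                   (fld(4),fld(5)) }
               .toMap

//resMap: Map((9000000001,500001,02-09-2018,00:00:00) -> (2,48.958))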
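To try the approach without creating the two files, here is a minimal self-contained sketch that applies the same lookup to in-memory strings. The names `IndexLookupDemo`, `mappingLines`, and `csvLines` are made up for illustration, the CSV header is shortened, and only the four key entries of the mapping are included.

```scala
object IndexLookupDemo {
  // Matches "field(N)" and captures the number N.
  private val fieldRE = raw"field\((\d+)\)".r

  // Assumed stand-in for the mapping text file (key entries only).
  val mappingLines: Seq[String] = Seq(
    "sdp:field(0)",
    "meterNumber:field(1)",
    "date:field(2)",
    "time:field(3)"
  )

  // Assumed stand-in for the CSV file: a shortened header plus the sample row.
  val csvLines: Seq[String] = Seq(
    "SDP,METERNO,READINGDATE,TIME,Reset Count.,Kilowatt-Hour Last Reset .",
    "9000000001,500001,02-09-2018,00:00:00,2,48.958"
  )

  // identifier -> column index, e.g. "sdp" -> 0
  val idx: Map[String, Int] = mappingLines
    .map(_.split(":"))
    .flatMap { fields =>
      fieldRE.replaceAllIn(fields(1), _.group(1))
        .split(",")
        .map(n => fields(0) -> n.toInt)
    }
    .toMap

  // Skip the header, then build (key columns) -> (first two value columns).
  val resMap: Map[(String, String, String, String), (String, String)] = csvLines
    .drop(1)
    .map(_.split(",", -1))
    .map { fld =>
      (fld(idx("sdp")), fld(idx("meterNumber")), fld(idx("date")), fld(idx("time"))) ->
        (fld(4), fld(5))
    }
    .toMap

  def main(args: Array[String]): Unit = println(resMap)
}
```

The point of the sketch is that the strings `field(0)`, `field(1)`, and so on are never used as keys themselves; they are parsed into integer positions, and those positions select the columns of each CSV row.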

UPDATE

Changing the Map of (String identifiers -> Int index values) into a Map of (String identifiers -> collection of Int index values) can be done. I'm not sure what that buys you, but it's doable.

// Matches "field(N)" and captures the number N.
val fieldRE = raw"field\((\d+)\)".r

// Map of identifier -> all of its column indexes, e.g. "value" -> Seq(4, 5, ...).
val idx = io.Source
            .fromFile(<txt_file>, "UTF8")
            .getLines
            .map(_.split(":"))
            .flatMap(fields => fieldRE.replaceAllIn(fields(1), _.group(1))
                                      .split(",")
                                      .map(fields(0) -> _.toInt))
            .foldLeft(Map[String,Seq[Int]]()){ case (m,(k,v)) =>
               m + (k -> (m.getOrElse(k,Seq()) :+ v))
            }

// Each key identifier has exactly one index, so .head retrieves it.
val resMap = io.Source
               .fromFile(<csv_file>)
               .getLines
               .drop(1)
               .map(_.stripLineEnd.split(",", -1))
               .map{fld => (fld(idx("sdp").head)
                           ,fld(idx("meterNumber").head)
                           ,fld(idx("date").head)
                           ,fld(idx("time").head)) -> (fld(4),fld(5))}
               .toMap
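One thing the multi-index Map does make possible is taking the value columns from the mapping as well, instead of hard-coding `fld(4)` and `fld(5)`. The following is only a sketch under that assumption, using in-memory strings in place of the two files. `MultiIndexDemo`, `mappingLines`, and `csvLines` are hypothetical names, and the mapping's `value` entry and the CSV row are cut down to three value columns to keep the example short.

```scala
object MultiIndexDemo {
  // Matches "field(N)" and captures the number N.
  private val fieldRE = raw"field\((\d+)\)".r

  // Assumed stand-in for the mapping file, with a shortened "value" entry.
  val mappingLines: Seq[String] = Seq(
    "sdp:field(0)",
    "meterNumber:field(1)",
    "date:field(2)",
    "time:field(3)",
    "value:field(4),field(5),field(6)"
  )

  // Assumed stand-in for the CSV file; the header is dropped below.
  val csvLines: Seq[String] = Seq(
    "SDP,METERNO,READINGDATE,TIME,Reset Count.,Kilowatt-Hour Last Reset .,Kilowatt-Hour Rate A Last Reset.",
    "9000000001,500001,02-09-2018,00:00:00,2,48.958,8.319333333"
  )

  // identifier -> every column index listed for it, in file order
  val idx: Map[String, Seq[Int]] = mappingLines
    .map(_.split(":"))
    .flatMap { fields =>
      fieldRE.replaceAllIn(fields(1), _.group(1))
        .split(",")
        .map(n => fields(0) -> n.toInt)
    }
    .foldLeft(Map.empty[String, Seq[Int]]) { case (m, (k, v)) =>
      m + (k -> (m.getOrElse(k, Seq.empty[Int]) :+ v))
    }

  // Key columns come from the single-index entries; the value columns
  // come from every index listed under "value".
  val resMap: Map[(String, String, String, String), Seq[String]] = csvLines
    .drop(1)
    .map(_.split(",", -1))
    .map { fld =>
      (fld(idx("sdp").head), fld(idx("meterNumber").head),
       fld(idx("date").head), fld(idx("time").head)) ->
        idx("value").map(i => fld(i))
    }
    .toMap

  def main(args: Array[String]): Unit = println(resMap)
}
```

With this variant the right-hand side of each entry is a Seq of strings rather than a fixed-size tuple, so adding or removing `field(N)` entries in the mapping changes the result without touching the code.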
