
In Scala, how do I create a column of date arrays of monthly dates between a start and end date?

In Spark Scala, I am trying to create a column that contains an array of monthly dates between a start date and an end date (inclusive).

For example, given 2018-02-07 and 2018-04-28, the array should contain [2018-02-01, 2018-03-01, 2018-04-01].

Besides the monthly version, I would also like to create a quarterly version, i.e. [2018-1, 2018-2].

Example input data:

id startDate endDate
1_1 2018-02-07 2018-04-28
1_2 2018-05-06 2018-05-31
2_1 2017-04-13 2017-04-14

Expected (monthly) output 1:

id startDate endDate dateRange
1_1 2018-02-07 2018-04-28 [2018-02-01, 2018-03-01, 2018-04-01]
1_2 2018-05-06 2018-05-31 [2018-05-01]
2_1 2017-04-13 2017-04-14 [2017-04-01]

Ultimate expected (monthly) output 2:

id Date
1_1 2018-02-01 
1_1 2018-03-01
1_1 2018-04-01
1_2 2018-05-01
2_1 2017-04-01

I am using Spark 2.1.0.167, Scala 2.10.6, and Java HotSpot 1.8.0_172.

I have tried to implement several answers to similar (day-level) questions on here, but I am struggling to get a monthly/quarterly version to work.

The code below creates an array from startDate and endDate and explodes it. However, I need to explode a column that contains all the monthly (or quarterly) dates in between.

import org.apache.spark.sql.functions.{array, explode}

val df1 = df.select($"id", $"startDate", $"endDate").
  // This just creates an array of the start and end date
  withColumn("start_end_array", array($"startDate", $"endDate")).
  // One output row per array element
  withColumn("start_end_array", explode($"start_end_array"))

Thank you for any leads.

import java.time.LocalDate
import java.time.temporal.ChronoUnit

case class MyData(id: String, startDate: String, endDate: String, list: List[String])

val inputData = Seq(("1_1", "2018-02-07", "2018-04-28"), ("1_2", "2018-05-06", "2018-05-31"), ("2_1", "2017-04-13", "2017-04-14"))

inputData.map { case (id, start, end) =>
  // Truncate both dates to the first of their month so partial months are counted
  val firstMonth = LocalDate.parse(start).withDayOfMonth(1)
  val lastMonth = LocalDate.parse(end).withDayOfMonth(1)
  val diff = ChronoUnit.MONTHS.between(firstMonth, lastMonth).toInt
  // plusMonths rolls over year boundaries; LocalDate.toString yields yyyy-MM-dd
  val result = (0 to diff).map(i => firstMonth.plusMonths(i).toString).toList
  MyData(id, start, end, result)
}.foreach(println)
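
The question also asks for a quarterly version and for a Spark column rather than a plain Scala collection. Here is a minimal sketch of one way to get there, assuming the dates arrive as yyyy-MM-dd strings. The names `DateRanges`, `monthStarts` and `quarterLabels` are made up for illustration, and the Spark wiring in the trailing comments is untested here: it assumes a DataFrame `df` with string columns `startDate` and `endDate`, and relies on a UDF because this was written against Spark 2.1.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

// Hypothetical helper object; not part of Spark or the original answer.
object DateRanges {
  // First day of every month touched by [start, end], in ascending order.
  // Returns an empty sequence if end falls in a month before start.
  def monthStarts(start: String, end: String): Seq[String] = {
    val first = LocalDate.parse(start).withDayOfMonth(1)
    val last = LocalDate.parse(end).withDayOfMonth(1)
    val months = ChronoUnit.MONTHS.between(first, last).toInt
    (0 to months).map(i => first.plusMonths(i).toString)
  }

  // "year-quarter" labels such as "2018-1", one per quarter touched.
  def quarterLabels(start: String, end: String): Seq[String] =
    monthStarts(start, end).map { d =>
      val date = LocalDate.parse(d)
      val quarter = (date.getMonthValue - 1) / 3 + 1
      s"${date.getYear}-$quarter"
    }.distinct
}

// Assumed Spark wiring (needs a SparkSession and its implicits for $"..."):
//   import org.apache.spark.sql.functions.{udf, explode}
//   val monthStartsUdf = udf(DateRanges.monthStarts _)
//   df.withColumn("dateRange", monthStartsUdf($"startDate", $"endDate"))
//     .withColumn("Date", explode($"dateRange"))
```

If `startDate` and `endDate` are DateType columns rather than strings, they would probably need to be cast to string before being passed to the UDF. Swapping `monthStarts` for `quarterLabels` in the wiring should give the quarterly column.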
