How to create a sequential numeric column based on two columns in R?

Question

My dataframe "fsp" has 1,702,551 observations and 3 variables. It looks like this:

tibble [1,702,551 x 3] 
 $ date       : Date[1:1702551], format: "2011-04-12" "2011-04-12" "2011-04-12" ...
 $ wavelength : num [1:1702551] 350 351 352 353 354 355 356 357 358 359 ...
 $ ID         : chr [1:1702551] "c01" "c01" "c01" "c01" ...

Quick explanation of the data: for each "date" and "ID" I have spectral data (not shown) across the whole wavelength interval (350 to 2300 nm). I want to create a new column "target_ID" containing a sequence of repeating numbers that increases to the next consecutive number each time date or ID changes. For example, for the first ID, "c01", and date "2011-04-12", the column will hold the number 1 for every wavelength from 350 to 2300. The next ID will get the number 2, and so on ("date" also changes further along the dataframe).

Example of what I want to achieve (see "target_ID"):

|date      |wavelength|ID  |target_ID|
|:---------|:---------|:---|:--------|
|2011-04-12|350       |c01 |1        |
|2011-04-12|351       |c01 |1        |
|2011-04-12|352       |c01 |1        |
|2011-04-12|353       |c01 |1        |
|...       |...       |... |...      |
|2011-04-12|350       |c03 |2        |
|2011-04-12|351       |c03 |2        |
|...       |...       |... |...      |
|2011-04-13|350       |c01 |3        |
|2011-04-13|351       |c01 |3        |

This is the code that I have already tried, without success:

fsp<-fsp %>%
group_by(date, ID) %>%
mutate(target_ID, count=n())

Any help will be much appreciated.

Thank you in advance.

Answer 1

This is a perfect use case for the rleid function from the data.table package:

# example data
xx <- rep(Sys.Date(), 5)
xx <- c(xx, xx + lubridate::days(1))
id <- rep(c(1:4), c(2,3,3,2))
dat <- data.frame(date = xx, id = id)

#          date id
# 1  2021-03-29  1
# 2  2021-03-29  1
# 3  2021-03-29  2
# 4  2021-03-29  2
# 5  2021-03-29  2
# 6  2021-03-30  3
# 7  2021-03-30  3
# 8  2021-03-30  3
# 9  2021-03-30  4
# 10 2021-03-30  4

library(data.table)
dat_dt <- as.data.table(dat)
dat_dt[,target_id := rleid(date, id)]

 #          date id target_id
 # 1: 2021-03-29  1         1
 # 2: 2021-03-29  1         1
 # 3: 2021-03-29  2         2
 # 4: 2021-03-29  2         2
 # 5: 2021-03-29  2         2
 # 6: 2021-03-30  3         3
 # 7: 2021-03-30  3         3
 # 8: 2021-03-30  3         3
 # 9: 2021-03-30  4         4
 #10: 2021-03-30  4         4
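
One caveat worth checking on your own data: rleid numbers runs of consecutive identical rows, so it gives the result you describe only when all rows for a given date and ID sit next to each other. The sketch below uses made-up data (the table `dt` and the column names `run_id` and `group_id` are just for illustration) to contrast rleid with data.table's `.GRP`, which numbers each distinct combination regardless of row order:

```r
library(data.table)

# Hypothetical data: the pair (2021-03-29, "c01") appears in two separate runs
dt <- data.table(
  date = as.Date(rep("2021-03-29", 4)),
  id   = c("c01", "c03", "c03", "c01")
)

# rleid() starts a new number each time the (date, id) pair differs from the previous row
dt[, run_id := rleid(date, id)]

# .GRP numbers each distinct (date, id) group, in order of first appearance
dt[, group_id := .GRP, by = .(date, id)]

print(dt)
```

If the dataframe is already sorted by date and ID, both approaches should agree; if not, sorting first (for example with `setorder(dt, date, id)`) or using `.GRP` avoids the same combination receiving two different numbers.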

And here's how you could use %>% and mutate to solve it:

library(tidyverse)
dat %>%
    mutate(target_id = data.table::rleid(date, id))
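
As for why the attempt in the question did not work: `mutate(target_ID, count = n())` refers to a column `target_ID` that does not exist yet, and `n()` returns the number of rows in each group rather than a number for the group. If you would rather not load data.table, a dplyr-only version might look like the sketch below. It assumes dplyr 1.1.0 or later for `consecutive_id()` (`cur_group_id()` is available from 1.0.0), and the small data frame `dat2` is invented to mirror the layout in the question:

```r
library(dplyr)

# Hypothetical miniature of the question's data
dat2 <- data.frame(
  date = as.Date(c("2011-04-12", "2011-04-12", "2011-04-12", "2011-04-13")),
  ID   = c("c01", "c01", "c03", "c01")
)

# consecutive_id() increments whenever date or ID differs from the previous row
res <- dat2 %>%
  mutate(target_ID = consecutive_id(date, ID))

# cur_group_id() numbers each (date, ID) group; groups are ordered by the sorted keys
res_grp <- dat2 %>%
  group_by(date, ID) %>%
  mutate(target_ID = cur_group_id()) %>%
  ungroup()

print(res)
```

`consecutive_id()` behaves like rleid and depends on row order, while `cur_group_id()` depends only on the values of date and ID, so the two can differ on unsorted data.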

Solution 1
0 Accepted 2021-03-29 13:55:02
