How to create a sequential numeric column based on two columns in R?
My dataframe "fsp" has 1702551 obs and 3 variables. It looks like this:
tibble [1,702,551 x 3]
$ date : Date[1:1702551], format: "2011-04-12" "2011-04-12" "2011-04-12" ...
$ wavelength : num [1:1702551] 350 351 352 353 354 355 356 357 358 359 ...
$ ID : chr [1:1702551] "c01" "c01" "c01" "c01" ...
Quick explanation of the data: for each "date" and "ID" I have spectral data (not shown) across the whole wavelength interval (350 to 2300 nm). I want to create a new column "target_ID" holding a sequence of repeating numbers that increases to the next consecutive number each time date or ID changes. For example, for the first ID, "c01", and date "2011-04-12", the column will hold the number 1 for wavelengths 350 to 2300. The next ID will get the number 2, and so on (the "date" also changes further along the dataframe).
Example of what I want to achieve (see "target_ID"):
|date |wavelength|ID |target_ID|
|:---------|:---------|:---|:--------|
|2011-04-12|350 |c01 |1 |
|2011-04-12|351 |c01 |1 |
|2011-04-12|352 |c01 |1 |
|2011-04-12|353 |c01 |1 |
|...       |...       |... |...      |
|2011-04-12|350 |c03 |2 |
|2011-04-12|351 |c03 |2 |
|...       |...       |... |...      |
|2011-04-13|350 |c01 |3 |
|2011-04-13|351 |c01 |3 |
This is the code that I already tried, without success:
fsp<-fsp %>%
group_by(date, ID) %>%
mutate(target_ID, count=n())
Any help will be much appreciated.
Thank you in advance.
This is a perfect use case for the rleid function from the data.table package:
# example data
xx <- rep(Sys.Date(), 5)
xx <- c(xx, xx + lubridate::days(1))
id <- rep(c(1:4), c(2,3,3,2))
dat <- data.frame(date = xx, id = id)
# date id
# 1 2021-03-29 1
# 2 2021-03-29 1
# 3 2021-03-29 2
# 4 2021-03-29 2
# 5 2021-03-29 2
# 6 2021-03-30 3
# 7 2021-03-30 3
# 8 2021-03-30 3
# 9 2021-03-30 4
# 10 2021-03-30 4
library(data.table)
dat_dt <- as.data.table(dat)
dat_dt[,target_id := rleid(date, id)]
# date id target_id
# 1: 2021-03-29 1 1
# 2: 2021-03-29 1 1
# 3: 2021-03-29 2 2
# 4: 2021-03-29 2 2
# 5: 2021-03-29 2 2
# 6: 2021-03-30 3 3
# 7: 2021-03-30 3 3
# 8: 2021-03-30 3 3
# 9: 2021-03-30 4 4
#10: 2021-03-30 4 4
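One caveat worth sketching: rleid() is a run-length ID, so it starts a new number at every change between consecutive rows. That matches the question as long as the rows of each date/ID combination sit together. The toy data below is hypothetical (not the asker's data) and assumes a combination that reappears later in the table; it contrasts rleid() with data.table's .GRP, which numbers groups by first appearance.

```r
library(data.table)

# Hypothetical toy data: the (date, id) pair from rows 1-2 comes back in row 5
toy <- data.table(
  date = as.Date(rep("2011-04-12", 5)),
  id   = c("c01", "c01", "c03", "c03", "c01")
)

# rleid() increments on every change between consecutive rows,
# so the last row starts a third run
toy[, run_id := rleid(date, id)]

# .GRP is the group counter within `by`; groups are numbered in order of
# first appearance, so the last row is assigned to group 1 again
toy[, grp_id := .GRP, by = .(date, id)]

print(toy)
```

If the data is sorted by date and ID, as in the question's example, both columns come out the same and rleid() is the simpler choice.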
And here's how you could use %>% and mutate to solve it:
library(tidyverse)
dat %>%
mutate(target_id = data.table::rleid(date, id))
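If you would rather stay within dplyr, the attempt from the question can be adapted along these lines. This is a minimal sketch on hypothetical toy data, assuming dplyr 1.0.0 or later for cur_group_id(). Note that it numbers groups by the sorted order of the grouping keys (date, then ID), not by the order in which they appear, so it only matches rleid() when the data is already sorted that way. In dplyr 1.1.0 or later, consecutive_id(date, ID) should behave like rleid().

```r
library(dplyr)

# Hypothetical toy data mirroring the column names in the question
toy <- tibble(
  date = as.Date(c("2011-04-12", "2011-04-12", "2011-04-12", "2011-04-13")),
  ID   = c("c01", "c01", "c03", "c01")
)

toy <- toy %>%
  group_by(date, ID) %>%
  # cur_group_id() returns the index of the current group
  mutate(target_ID = cur_group_id()) %>%
  ungroup()

print(toy)
```

The ungroup() at the end keeps later operations on the tibble from silently running per group.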