R loop over levels of a factor to create a sequence of numbers for each level
I'm working on a data frame with GPS data from beavers. The data frame includes one column with the animal's ID (see $id below), which is a factor with 26 levels. For each beaver we have several GPS values; the number differs from animal to animal.
I now want to create a separate column with "time after capture" per individual in 15-minute intervals, starting at 0 min. For the 15-minute interval I tried to create a sequence:
TimePostRel <- seq(from = 0, along = x, by = 15)
Now I'm not sure how to define x so that it refers to each individual. Should I use the split function to split up the data frame? We do have a date/time column too, but the problem is that we have no GPS points during daytime (when the animals are sleeping), resulting in breaks that we want to exclude from the TimePostRel calculations (we just want to refer to "active time" after capture).
This is the data frame:
'data.frame': 6425 obs. of 22 variables:
$ nb : int 1 2 3 4 5 6 7 8 9 10 ...
$ x : num 517710 517680 NA 517625 517624 ...
$ y : num 6587730 6587759 NA 6587929 6588014 ...
$ date : POSIXct, format: "2010-04-10 05:15:00" "2010-04-10 05:30:00" "2010-04-10 05:45:00" "2010-04-10 06:00:00" ...
$ dx : num -30.2 NA NA -0.4 -39.2 ...
$ dy : num 28.8 NA NA 85.7 126.8 ...
$ dist : num 41.7 NA NA 85.7 132.7 ...
$ dt : num 900 900 900 900 900 900 900 900 NA 900 ...
$ R2n : num 0 1743 NA 46880 88416 ...
$ abs.angle : num 2.38 NA NA 1.58 1.87 ...
$ rel.angle : num NA NA NA NA 0.295 ...
$ id : Factor w/ 26 levels "Andreas","Apple",..: 1 1 1 1 1 1 1 1 1 1 ...
$ burst : Factor w/ 329 levels "Andreas.1","Andreas.2",..: 1 1 1 1 1 1 1 1 1 2 ...
$ sex : int 2 2 NA 2 2 2 NA 2 2 2 ...
$ season : int 2 2 NA 2 2 2 NA 2 2 2 ...
$ try : int 33 34 NA 36 37 38 NA 39 40 41 ...
$ x.sats : int 5 5 NA 5 5 5 NA 6 5 6 ...
$ hdop : num 2.1 4.2 NA 2.7 3.3 2.1 NA 2.5 2.8 2.2 ...
$ lodge.x : num 517595 517595 NA 517595 517595 ...
$ lodge.y : num 6587806 6587806 NA 6587806 6587806 ...
$ NSD_lodge : num 19039 9440 NA 15909 44268 ...
$ nsd_1stGPSpoint : num 0 1743 NA 46880 88416 ...
Does anybody know how to solve this? Thanks in advance!
Cheers, Patricia
You can do this very quickly with data.table. I assume your data is called dta:
library(data.table)
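setDT(dta) ## convert the data frame to a data.table by reference
## one sequence per animal: group by id, .N is the number of rows in each group
dta[, TimePostRel := seq(from = 0, by = 15, length.out = .N), by = id]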
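If you would rather not load an extra package, the same per-animal counter can be sketched in base R with ave(). This is only a minimal illustration on made-up data: the toy dta below has nothing but an id column, "Bram" is an invented name, and the row counts are arbitrary. It also assumes the rows are already sorted by time within each animal.

```r
# Toy stand-in for the GPS data: three animals with different numbers of fixes.
# The names and counts are assumptions made up for this illustration.
dta <- data.frame(
  id = factor(c(rep("Andreas", 4), rep("Apple", 2), rep("Bram", 3)))
)

# ave() applies FUN to the row indices of each id group and puts the results
# back in the original row order. seq_along() counts 1, 2, 3, ... within a
# group; subtracting 1 and multiplying by 15 starts every animal at 0 minutes.
dta$TimePostRel <- ave(seq_len(nrow(dta)), dta$id,
                       FUN = function(i) (seq_along(i) - 1) * 15)

print(dta)
```

Because the counter only depends on the row position within each animal, daytime gaps in the date column do not enter the calculation.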
The plyr package can also accomplish this task. For a data frame that has a factor column, use ddply with transform:
library(plyr)
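# create a data frame where column x is a factor
# (stringsAsFactors = TRUE is needed in R >= 4.0, where strings stay character by default)
df <- data.frame(x=c(rep("b",6),rep("a",3),rep("c",4)), stringsAsFactors=TRUE)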
# apply sequence to each level within x
df <- ddply(df,"x",transform,t=seq(from=0,by=15,length.out=length(x)))
Note that the rows of the new data frame are ordered to match the factor levels of column x:
print(df)
   x  t
1  a  0
2  a 15
3  a 30
4  b  0
5  b 15
6  b 30
7  b 45
8  b 60
9  b 75
10 c  0
11 c 15
12 c 30
13 c 45
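As for the split function mentioned in the question: it works too, and combined with unsplit() it keeps the rows in their original order instead of sorting them by factor level. The following is a rough base R sketch on the same toy data; the name df_ordered is made up for this example.

```r
# Same toy data as above: column x is a factor with levels in a, b, c order,
# but the rows appear in b, a, c order.
df <- data.frame(x = c(rep("b", 6), rep("a", 3), rep("c", 4)),
                 stringsAsFactors = TRUE)

# Split into one data frame per level and add the 15-minute sequence to each.
pieces <- split(df, df$x)
pieces <- lapply(pieces, function(piece) {
  piece$t <- seq(from = 0, by = 15, length.out = nrow(piece))
  piece
})

# unsplit() reassembles the pieces using the original grouping vector,
# so the rows come back in the order they had in df.
df_ordered <- unsplit(pieces, df$x)
print(df_ordered)
```

This is slower than data.table on large data, but it can be convenient when the row order of the original data frame has to be preserved.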