Recoding data in R that is annotated in intervals
I have a data set that records depth in intervals.
Depth
0-3
3-6
6-9
9-10
10-11
etc
The first three intervals are in 3-unit increments, and so are the last five (60-63, 63-66, 66-69, 69-72, 72-75).
Because of this notation, I cannot plot the depth against my independent variable. I want to recode the column that contains the depth intervals into the upper value of each interval, i.e. for 0-3 it would read as 3.
Is there a shortcut for doing this that handles both the 3-unit increments and the single-unit increments?
I tried
df$depth <- 1:nrow(wor)
but this only gives me sequential numbers,
and when I try
df$depth <- dplyr::recode(df$depth, "1=3; 2=6; 3=9; 4:54 = 9:60; 55=63; 56=66; 57=69; 58=72; 59=75; 60=78")
I get the warning:
Warning message:
Unreplaced values treated as NA as .x is not compatible. Please specify replacements exhaustively or supply .default
Any help would be greatly appreciated. Tack så mycket! (Swedish)
Try using regular expressions to extract the last number from those strings.
sub("^[[:digit:]]{1,}-([[:digit:]]{1,})", "\\1", "0-3")
[1] "3"
sub("^[[:digit:]]{1,}-([[:digit:]]{1,})", "\\1", "10-11")
[1] "11"
df$depth <- as.numeric(sub("^[[:digit:]]{1,}-([[:digit:]]{1,})", "\\1", df$depth))
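As a quick check, here is a minimal sketch that applies the same pattern to a whole vector at once. The vector name depth and its values are only for illustration (copied from the listing in the question), not the asker's real data.

```r
# Hypothetical sample values, copied from the listing in the question
depth <- c("0-3", "3-6", "6-9", "9-10", "10-11")

# Keep only the digits after the dash, then convert the result to numeric
upper <- as.numeric(sub("^[[:digit:]]{1,}-([[:digit:]]{1,})", "\\1", depth))

print(upper)
```

The result is a numeric vector, so it can be passed straight to a plotting function.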
You could use regular expressions to solve this:
dd <- data.frame(depth=c("0-3", "3-6", "6-9", "9-10", "10-11"), stringsAsFactors=FALSE)
# gsub() returns character strings, so convert to numeric before plotting
dd$max_depth <- as.numeric(gsub("([0-9]+)-([0-9]+)", "\\2", dd$depth))
You can use the separate() function from the tidyr package:
library(tidyr)
tidyr::separate(data, col_name, into = c("first_num", "second_num"), sep = "-")
You then have two columns, one for each bound of the interval, and you can do calculations with them:
library(dplyr)
df %>%
  tidyr::separate(depth_var, into = c("first_num", "second_num"), sep = "-") %>%
  mutate(first_num = as.double(first_num),
         second_num = as.double(second_num),
         intervals = abs(first_num - second_num))  # width of each interval
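For anyone who would rather not load extra packages, the same split can be sketched in base R with strsplit(). This is only an illustration under assumptions: the data frame below is made up, and the column name depth_var simply mirrors the placeholder used above.

```r
# Hypothetical sample data; depth_var mirrors the placeholder column name above
df <- data.frame(depth_var = c("0-3", "3-6", "6-9", "9-10", "10-11"),
                 stringsAsFactors = FALSE)

# Split each "start-end" string at the dash; gives a list of character pairs
parts <- strsplit(df$depth_var, "-", fixed = TRUE)

# Extract the lower and upper bound of each interval as numbers
df$first_num  <- as.numeric(sapply(parts, `[`, 1))
df$second_num <- as.numeric(sapply(parts, `[`, 2))

# Width of each interval
df$intervals <- abs(df$first_num - df$second_num)

print(df)
```

Here second_num holds the upper value the question asks for, and intervals shows which rows are 3-unit and which are 1-unit steps.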
I would use the tidyr package and split the numbers at the dash in the middle:
set.seed(1)
df <- data.frame(Depth = c("0-3", "3-6", "6-9", "9-12"),
val = sample(x=4, replace = F))
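library(tidyr)
library(dplyr)  # select() comes from dplyr, not tidyr
df %>%
  # convert = TRUE turns the split pieces into numbers instead of strings
  separate(Depth, c("start", "finish_dep"), sep = "-", convert = TRUE) %>%
  select(-start)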