Convert string column values to numeric and find the maximum of those values in R
My data frame has a column called "XYZ", and this column is of string (character) type. The values of the "XYZ" column look like this:
Example:
XYZ
new_value_1
new_value_2
new_value_4
new_value_3
I need to extract the trailing number from each string, convert it to a numeric value, and find the maximum of those numbers. After finding the maximum in that column, I need to generate a sequence that starts right after that maximum and has one value per row (n rows).
For example, every string in "XYZ" above ends in a number, and the maximum of those numbers is 4. I then have to mutate an ID column whose values start at the number after the maximum, here 5.
Output:
XYZ ID
new_value_1 5
new_value_2 6
new_value_4 7
new_value_3 8
In the future, please provide a minimal reproducible input data set using dput. I've recreated the data set for convenience.
Using the dplyr package for ease:
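library(dplyr)
# Recreate the example data; stringsAsFactors = FALSE keeps XYZ as character in R < 4.0
raw_data <- data.frame(XYZ = c("new_value_1", "new_value_2", "new_value_3", "new_value_4"), stringsAsFactors = FALSE)
# Get the max value: split each string on "_" and convert the third piece to a number
max_value <- max(sapply(raw_data$XYZ, function(x) as.numeric(strsplit(x, "_")[[1]][3])))
# Build the result: IDs start at max_value + 1, one per row
final_data <- raw_data %>% mutate(ID = (max_value + 1):(max_value + nrow(raw_data)))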
Let me know if dplyr is not allowed.
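As a possible variation on the same idea, the sketch below does the extraction with a vectorised `sub()` call inside `mutate()` instead of `sapply()` and `strsplit()`. It assumes every value ends in one or more digits; the temporary column name `suffix` is made up for illustration, and the rows are listed in the question's order.

```r
library(dplyr)

# Same values as in the question, in the question's row order
raw_data <- data.frame(
  XYZ = c("new_value_1", "new_value_2", "new_value_4", "new_value_3"),
  stringsAsFactors = FALSE
)

final_data <- raw_data %>%
  mutate(
    # Drop everything up to the last non-digit character, keeping the trailing digits
    suffix = as.integer(sub("^.*[^0-9]", "", XYZ)),
    # Continue numbering right after the largest suffix
    ID = max(suffix) + row_number()
  ) %>%
  select(-suffix)

final_data
```

Because the pattern keys on the last non-digit character rather than on the position of the underscores, it should also cope with multi-digit suffixes such as "new_value_12".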
Here is a base R way. It uses a regex to extract the trailing digit or digits, and seq.int to create a sequence like the one in the question.
m <- max(as.integer(sub("^[^[:digit:]]*([[:digit:]]+$)", "\\1", df1$XYZ)))
df1$ID <- m + seq.int(nrow(df1))
df1
# XYZ ID
#1 new_value_1 5
#2 new_value_2 6
#3 new_value_4 7
#4 new_value_3 8
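The code above assumes that every value ends in digits. As a rough sketch of how the same approach might tolerate values without a numeric suffix, the helper below ignores such values when taking the maximum. The function name `add_next_ids()`, its arguments, the sample data `df2`, and the choice to start from 0 when no value carries a number are all assumptions made for illustration, not part of the answer.

```r
# Hypothetical helper: append an ID column that continues after the largest
# trailing number found in column `col`.
add_next_ids <- function(df, col = "XYZ", id_col = "ID") {
  values <- as.character(df[[col]])
  # Only values that end in digits contribute to the maximum
  has_num <- grepl("[0-9]+$", values)
  nums <- rep(NA_integer_, length(values))
  nums[has_num] <- as.integer(sub("^.*[^0-9]", "", values[has_num]))
  # Assumption: start from 0 when no value carries a number
  m <- if (any(has_num)) max(nums, na.rm = TRUE) else 0L
  df[[id_col]] <- m + seq_len(nrow(df))
  df
}

# Made-up sample data with one value that has no numeric suffix
df2 <- data.frame(XYZ = c("new_value_9", "no_suffix", "new_value_12"),
                  stringsAsFactors = FALSE)
add_next_ids(df2)$ID
# ID values: 13, 14, 15
```

Wrapping the steps in a function also makes it easy to reuse them on a differently named column by passing `col` and `id_col`.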
Data
df1 <- read.table(text = "
XYZ
new_value_1
new_value_2
new_value_4
new_value_3
", header = TRUE)