I have a data frame with a column (chr type) like this:
col
"1,3,4,5"
"1,7,2,5"
"8,2,2,9"
How can I create two new variables holding the first and last element of each string in col, using dplyr?
col first last
"1,3,4,5" 1 5
"1,7,2,5" 1 5
"8,2,2,9" 8 9
You can use a regular expression that replaces everything from the first comma through the last comma, leaving only the first and last elements.
read.table(text=sub(",.*,",' ', col))
V1 V2
1 1 5
2 1 5
3 8 9
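To see why this works: `.*` is greedy, so `,.*,` matches from the first comma through the last one, and `sub()` replaces that whole stretch with a single space. A small sketch using the same three strings (the intermediate name `collapsed` is just for illustration):

```r
col <- c("1,3,4,5", "1,7,2,5", "8,2,2,9")

# Greedy `.*` spans from the first comma to the last comma,
# so only the first and last elements survive, separated by a space.
collapsed <- sub(",.*,", " ", col)
collapsed
#> [1] "1 5" "1 5" "8 9"

# Caveat: a string with only one comma has no ",.*," match
# and is returned unchanged.
sub(",.*,", " ", "1,5")
#> [1] "1,5"
```

So the pattern relies on every string containing at least two commas, which holds for the data in the question.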
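The same pattern also works as the separator in tidyr's separate():
library(tidyr)  # provides separate() and re-exports %>%
data.frame(col) %>%
  separate(col, c('v1', 'v2'), ',.*,')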
v1 v2
1 1 5
2 1 5
3 8 9
Another way:
a <- read.csv(text = col, header = FALSE)
a[c(1,ncol(a))]
V1 V4
1 1 5
2 1 5
3 8 9
A possible solution:
library(tidyverse)
df %>%
  mutate(first = str_extract(col, "^\\d+"),
         last = str_extract(col, "\\d+$"))
#> col first last
#> 1 1,3,4,5 1 5
#> 2 1,7,2,5 1 5
#> 3 8,2,2,9 8 9
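Note that `str_extract()` returns character vectors, so `first` and `last` above are strings. If numbers are needed, a conversion step can be added. A minimal sketch, assuming every element is an integer; it swaps in base R's `sub()` for `str_extract()` purely to stay dependency-free, and wrapping the `str_extract()` calls in `as.integer()` should work the same way:

```r
col <- c("1,3,4,5", "1,7,2,5", "8,2,2,9")

# Drop everything from the first comma onward, then convert to integer.
first <- as.integer(sub(",.*", "", col))

# Drop everything up to and including the last comma (greedy `.*`),
# then convert to integer.
last <- as.integer(sub(".*,", "", col))

first
#> [1] 1 1 8
last
#> [1] 5 5 9
```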
Another possible solution:
library(tidyverse)
df %>%
  mutate(id = row_number()) %>%
  separate_rows(col, sep = ",") %>%
  group_by(id) %>%
  summarise(first = first(col), last = last(col)) %>%
  bind_cols(df, .) %>%
  select(-id)
#> col first last
#> 1 1,3,4,5 1 5
#> 2 1,7,2,5 1 5
#> 3 8,2,2,9 8 9
Here's a possible base R option:
df$first <- sapply(strsplit(df$col,','), "[", 1)
df$last <- sapply(strsplit(df$col,','), \(x) x[length(x)])
Output
col first last
1 1,3,4,5 1 5
2 1,7,2,5 1 5
3 8,2,2,9 8 9
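The two lines above call `strsplit()` twice. A variant that splits once and reuses the result is sketched below; it uses `vapply()` so the result type is checked, and it also copes with strings of differing lengths. The ragged sample data here is hypothetical and not from the question:

```r
# Hypothetical data with a different number of elements per row.
df <- data.frame(col = c("1,3,4,5", "7", "8,2"), stringsAsFactors = FALSE)

# Split each string once; fixed = TRUE treats "," as a literal, not a regex.
parts <- strsplit(df$col, ",", fixed = TRUE)

# Take the first and last piece of each split; vapply() guarantees
# a character vector with one value per row.
df$first <- vapply(parts, function(x) x[1], character(1))
df$last  <- vapply(parts, function(x) x[length(x)], character(1))

df$first
#> [1] "1" "7" "8"
df$last
#> [1] "5" "7" "2"
```

For a single-element string such as "7", the first and last element are the same value.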
Or it can all be done in one statement:
setNames(
  cbind(df, do.call(rbind, lapply(strsplit(df$col, ","),
                                  function(x) c(x[1], x[length(x)])))),
  c("col", "first", "last")
)
Data
df <- structure(list(col = c("1,3,4,5", "1,7,2,5", "8,2,2,9")),
                class = "data.frame", row.names = c(NA, -3L))