I am looking for a smart way to modify my data set. If I manage to do that, it will save me a lot of time.
My data set looks like this:
column1
1.0
1.0
2.0
2.0.15
0.0
1.0.30
I would like to create a new "parent" column that keeps the first part of column1 (everything before the last dot):
column1 column2
1.0 1
1.0 1
2.0 2
2.0.15 2.0
0.0 0
1.0.30 1.0
I want to do this to recreate a parent-offspring relationship among the elements: column2 is supposed to hold the parents and column1 their offspring. Any help is highly appreciated.
One option using the tidyverse and regex:
library(tidyverse)
orig <- tribble(
~column1,
"1.0",
"1.0",
"2.0",
"2.0.15",
"0.0",
"1.0.30"
)
orig
#> # A tibble: 6 x 1
#> column1
#> <chr>
#> 1 1.0
#> 2 1.0
#> 3 2.0
#> 4 2.0.15
#> 5 0.0
#> 6 1.0.30
orig %>%
mutate(parent = str_replace(column1, "\\.\\d+$", ""))
#> # A tibble: 6 x 2
#> column1 parent
#> <chr> <chr>
#> 1 1.0 1
#> 2 1.0 1
#> 3 2.0 2
#> 4 2.0.15 2.0
#> 5 0.0 0
#> 6 1.0.30 1.0
Created on 2020-08-05 by the reprex package (v0.3.0)
The regex \\.\\d+$ matches a literal dot . followed by one or more digits, followed by the end of the string $, and str_replace() replaces this match with nothing "". See also https://regexr.com/59lnl (where the end of line $ is replaced with a newline \n).
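To see what the pattern actually picks up, here is a minimal base R sketch. The vector x is made up for illustration: it reuses some values from the question and adds "7", a hypothetical value with no dot, to show that strings without a match are left alone. perl = TRUE is set only so that \\d is handled by the PCRE engine.

```r
# Illustrative input: values from the question plus a hypothetical "7" (no dot)
x <- c("1.0", "2.0.15", "1.0.30", "7")

# The part of each string the pattern matches (non-matching strings are dropped)
matched <- regmatches(x, regexpr("\\.\\d+$", x, perl = TRUE))
matched

# Removing that match leaves the parent; "7" has no match and stays as it is
parent <- sub("\\.\\d+$", "", x, perl = TRUE)
parent
```

Only the last dot-plus-digits group is removed, because the $ anchor ties the match to the end of the string.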
Try this base R approach, which captures everything before the last dot:
#Data
df <- structure(list(column1 = c("1.0", "1.0", "2.0", "2.0.15", "0.0",
"1.0.30")), row.names = c(NA, -6L), class = "data.frame")
#Code
#Create column
df$column2 <- sub("^(.*)[.].*", "\\1", df$column1)
Output:
column1 column2
1 1.0 1
2 1.0 1
3 2.0 2
4 2.0.15 2.0
5 0.0 0
6 1.0.30 1.0
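The pattern "^(.*)[.].*" works because (.*) is greedy: it first grabs the whole string and then backtracks to the last dot, so the captured group \\1 is everything before that dot. The sketch below compares it with a digit-anchored pattern such as "\\.[0-9]+$" on a few edge cases. The inputs "7", "2.0.a" and "3." are hypothetical and do not appear in the question; they are only there to show where the two patterns would differ.

```r
# Hypothetical edge cases next to one value from the question
x <- c("2.0.15", "7", "2.0.a", "3.")

# Greedy capture: keep everything before the last dot, whatever follows it
greedy <- sub("^(.*)[.].*", "\\1", x)

# Digit-anchored: drop the last segment only if it is a dot followed by digits
anchored <- sub("\\.[0-9]+$", "", x)

data.frame(x, greedy, anchored)
```

On the data in the question both give the same result. If the last segment can be non-numeric or empty, the greedy version still strips it, while the anchored version leaves the string unchanged, so the choice depends on how strict the parent extraction should be.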
df$column2 <- sub("\\.[0-9]+$", "", df$column1)
df
# column1 column2
# 1 1.0 1
# 2 1.0 1
# 3 2.0 2
# 4 2.0.15 2.0
# 5 0.0 0
# 6 1.0.30 1.0
Data
df <- data.frame(column1 = c("1.0", "1.0", "2.0", "2.0.15", "0.0", "1.0.30"))