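R: Extracting part of string in a column
Hello Stack Overflow community,
I have the following problem with my dataset. I need to extract the first four characters of each string in the "BMI" column and then convert them to numeric.
For example: "23.3 [21.9-24.6]" -> 23.3
You can find the data here: https://github.com/tanaytuncer/LifeExpectancy_BMI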
library(mosaic)
library(tidyverse)
path <- "/Users/tanaytuncer/Desktop/Quantitative Datenanalyse/BMI.csv"
df_BMI <- read.csv(path)
df_BMI <- df_BMI[-1:-3, ]
df_BMI <- df_BMI %>%
  rename(country = "X",
         "2000" = "X2000",
         "2001" = "X2001",
         "2002" = "X2002",
         "2003" = "X2003",
         "2004" = "X2004",
         "2005" = "X2005",
         "2006" = "X2006",
         "2007" = "X2007",
         "2008" = "X2008",
         "2009" = "X2009",
         "2010" = "X2010",
         "2011" = "X2011",
         "2012" = "X2012",
         "2013" = "X2013",
         "2014" = "X2014",
         "2015" = "X2015"
  )
df_BMI <- df_BMI %>%
  gather("year", "BMI", 2:17)
We can use substr to take the first four characters and convert the result with as.numeric:
df_BMI$BMI <- as.numeric(substr(df_BMI$BMI, 1, 4))
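As a minimal sketch of what the substr call does, the snippet below applies it to a few made-up values in the same "estimate [low-high]" layout. The vector bmi_raw and the "No data" entry are assumptions for illustration, not rows taken from the actual file. It also shows a base R variant that keeps everything before the first space, which may be safer if the estimate is not always four characters wide.

```r
# Hypothetical sample values in the same layout as the BMI column
bmi_raw <- c("23.3 [21.9-24.6]", "9.8 [9.1-10.5]", "No data")

# Fixed-width approach: first four characters, then convert.
# Non-numeric text becomes NA (the warning is suppressed here).
bmi_fixed <- suppressWarnings(as.numeric(substr(bmi_raw, 1, 4)))

# Width-independent approach: drop everything from the first space onward
bmi_split <- suppressWarnings(as.numeric(sub(" .*$", "", bmi_raw)))

print(bmi_fixed)
print(bmi_split)
```

Entries that do not start with a number come back as NA in both variants, so it is worth checking for them with is.na before doing any calculations.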
Or use parse_number from readr, which extracts the first number it finds in each string:
library(readr)
df_BMI$BMI <- parse_number(df_BMI$BMI)
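A small, self-contained check of parse_number on made-up strings may help here; bmi_raw is a hypothetical vector, not data from the linked file. The function reads the first number in each string and ignores the bracketed interval that follows.

```r
library(readr)

# Hypothetical sample values in the same layout as the BMI column
bmi_raw <- c("23.3 [21.9-24.6]", "9.8 [9.1-10.5]")

# parse_number keeps the first number and drops the surrounding text
bmi_num <- parse_number(bmi_raw)

print(bmi_num)
```

Unlike the substr approach, this does not depend on the estimate having a fixed number of characters.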
Without gathering the data into "long" format, we can also use across to apply parse_number to every column except country:
library(dplyr)
df_BMI1 <- df_BMI %>%
  mutate(across(-country, parse_number))
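To see the wide-format approach in isolation, here is a sketch on a tiny data frame. The name toy, the country labels, and all values are invented for illustration; only the column layout (one country column plus one column per year) mirrors the question.

```r
library(dplyr)
library(readr)

# Hypothetical wide-format data: one row per country, one column per year
toy <- data.frame(
  country = c("A", "B"),
  `2000`  = c("23.3 [21.9-24.6]", "25.1 [24.0-26.2]"),
  `2001`  = c("23.5 [22.1-24.8]", "25.3 [24.2-26.4]"),
  check.names = FALSE  # keep "2000" and "2001" as column names
)

# Convert every column except country to numeric in one step
toy_num <- toy %>%
  mutate(across(-country, parse_number))

str(toy_num)
```

The year columns end up numeric while country stays character, so the result can still be reshaped to long format later if needed.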
If we read the file with check.names = FALSE, the year columns keep their original names and the rename step can be avoided:
df_BMI <- read.csv("https://raw.githubusercontent.com/tanaytuncer/LifeExpectancy_BMI/main/BMI.csv", check.names = FALSE)
df_BMI <- df_BMI[-1:-3, ]
names(df_BMI)[1] <- "country"