
How to add a column to a data.frame with a name extracted from an existing column in R?

I have a data.frame DF. I would like to add another column (i.e., call it station_no) that extracts the number after the underscore from the Variables column.

library(lubridate)
library(tidyverse)

set.seed(123)

DF <- data.frame(Date = seq(as.Date("1979-01-01"), to = as.Date("1979-12-31"), by = "day"),
                 Grid_2 = runif(365,1,10), Grid_20 = runif(365,5,15)) %>% 
      pivot_longer(-Date, names_to = "Variables", values_to = "Values")

Desired output:

DF_out <- data.frame(Date = c("1979-01-01","1979-01-01"),Variables = c("Grid_2","Grid_20"), 
                     Values = c(0.95,1.3),    Station_no = c(2,20))

A simple option is parse_number, which extracts the number and returns it as a numeric value:

library(dplyr)
DF %>% 
   mutate(Station_no  = readr::parse_number(Variables))

Or use str_extract (in case we want to match a specific pattern); note that it returns a character vector:

library(stringr)
DF %>%
   mutate(Station_no  = str_extract(Variables, "(?<=_)\\d+"))
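The difference between the two options matters when a name contains more than one number. The following is a minimal sketch under stated assumptions: the vector `variables` and the name "Grid1_20" are made up for illustration and do not appear in the question's data. parse_number takes the first number it finds, while the lookbehind pattern only picks up digits that follow an underscore.

```r
library(readr)
library(stringr)

# Hypothetical names; "Grid1_20" is invented to show the difference
variables <- c("Grid_2", "Grid1_20")

# parse_number() parses the first number found in each string (numeric result)
first_number <- parse_number(variables)

# str_extract() with a lookbehind keeps only the digits after "_" (character result)
after_underscore <- str_extract(variables, "(?<=_)\\d+")

# Convert to integer if a numeric column is wanted
after_underscore_int <- as.integer(after_underscore)
```

For names such as Grid_2 and Grid_20 in the question, both approaches point at the same digits; only the result type differs.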

Or use base R:

DF$Station_no <-  trimws(DF$Variables, whitespace = '\\D+')

A base R solution is:

#Code
DF$Station_no <- sub("^[^_]*_", "", DF$Variables)

Output (some rows):

# A tibble: 730 x 4
   Date       Variables Values Station_no
   <date>     <chr>      <dbl> <chr>     
 1 1979-01-01 Grid_2      3.59 2         
 2 1979-01-01 Grid_20    12.8  20        
 3 1979-01-02 Grid_2      8.09 2         
 4 1979-01-02 Grid_20     6.93 20        
 5 1979-01-03 Grid_2      4.68 2         
 6 1979-01-03 Grid_20     5.18 20        
 7 1979-01-04 Grid_2      8.95 2         
 8 1979-01-04 Grid_20     9.07 20        
 9 1979-01-05 Grid_2      9.46 2         
10 1979-01-05 Grid_20     9.83 20        
# ... with 720 more rows
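In the output above, Station_no is a character column (<chr>), whereas the desired output in the question shows it as numeric. A minimal base R sketch of the conversion, using a small made-up data frame `df_small` in place of DF (an assumption for illustration only):

```r
# Hypothetical stand-in for DF with just two rows
df_small <- data.frame(
  Variables = c("Grid_2", "Grid_20"),
  stringsAsFactors = FALSE
)

# Remove everything up to and including the first underscore
station_chr <- sub("^[^_]*_", "", df_small$Variables)

# sub() returns character; convert to integer for a numeric column
df_small$Station_no <- as.integer(station_chr)
```

If any name lacks digits after the underscore, as.integer would yield NA with a warning, so it may be worth checking for NA values afterwards.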

