
Subset a dataframe with non-existing column names

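I have this line of code in R:

newDF <- oldDF[subsettingColumns]

subsettingColumns contains some column names that may or may not exist in oldDF. If a column does not exist, I want newDF to contain a column filled with NA in that same position. How can I do this in R?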

Let's take an example:

df <- data.frame(a = c(1, 2, 3, 4), b = c(4, 5, 6, 7))
df

#  a b
#1 1 4
#2 2 5
#3 3 6
#4 4 7

#Columns to take subset of
subsettingColumns <- c('a', 'd', 'e')

#Columns which are already present
cols <- subsettingColumns[subsettingColumns %in% names(df)]

#Add them in the new dataframe
newdf <- df[cols]

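#Assign NA to the columns which are not defined in the original dataframe
newdf[setdiff(subsettingColumns, cols)] <- NA

#Restore the requested column order (new columns are appended at the end)
newdf <- newdf[subsettingColumns]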

newdf
#  a  d  e
#1 1 NA NA
#2 2 NA NA
#3 3 NA NA
#4 4 NA NA
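
The steps above can be wrapped in a small reusable function. This is only a minimal sketch: the name subset_with_na is hypothetical, not something from base R or from the answer above. It fills each missing column with NA and then returns the columns in the requested order, which matters when a missing column is not the last one requested.

```r
# Hypothetical helper (the name is an assumption, not part of base R):
# select `cols` from `df`, creating an all-NA column for each absent name.
subset_with_na <- function(df, cols) {
  present <- intersect(cols, names(df))
  missing <- setdiff(cols, names(df))

  # Start from the columns that already exist
  out <- df[present]

  # Add one NA column per missing name; rep() keeps this safe for 0-row input
  for (m in missing) out[[m]] <- rep(NA, nrow(df))

  # Return the columns in the order they were requested
  out[cols]
}

df <- data.frame(a = c(1, 2, 3, 4), b = c(4, 5, 6, 7))
res <- subset_with_na(df, c('d', 'a', 'e'))
names(res)
```

Here 'd' is requested before 'a', so the final out[cols] step is what puts the NA column first.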

If the columns do not exist in the data frame, you can add them with a function, like this:

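AddColumn <- function(oldDF, subsettingColumns) {
  # Requested columns that are not yet in oldDF
  addCol <- subsettingColumns[!subsettingColumns %in% names(oldDF)]

  # Add them as NA columns (appended at the end), then return the data frame
  if (length(addCol) != 0) oldDF[addCol] <- NA
  oldDF
}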

Testing this function on example data:

# Example data
oldDF <- data.frame(A = c(1, 2, 3, 4, 5), B = c(11, 12, 13, 14, 15))

AddColumn(oldDF, "testColumn")

#   A   B  testColumn
#1  1  11         NA
#2  2  12         NA
#3  3  13         NA
#4  4  14         NA
#5  5  15         NA

AddColumn(oldDF, c("testColumn1", "testColumn2"))

#  A   B  testColumn1  testColumn2
#1 1  11           NA          NA
#2 2  12           NA          NA
#3 3  13           NA          NA
#4 4  14           NA          NA
#5 5  15           NA          NA
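
Note that AddColumn returns all of oldDF with the new columns appended at the end. To get the subset the question asks for, one option (a sketch, not necessarily what the answer's author intended) is to index the result by subsettingColumns afterwards. The column names requested below are made up for illustration.

```r
# AddColumn as defined in the answer above, repeated so this runs on its own
AddColumn <- function(oldDF, subsettingColumns) {
  addCol <- subsettingColumns[!subsettingColumns %in% names(oldDF)]
  if (length(addCol) != 0) oldDF[addCol] <- NA
  oldDF
}

oldDF <- data.frame(A = c(1, 2, 3, 4, 5), B = c(11, 12, 13, 14, 15))

# Illustrative request: one missing column followed by one existing column
subsettingColumns <- c("testColumn1", "B")

# Add the missing columns first, then keep only the requested ones, in order
newDF <- AddColumn(oldDF, subsettingColumns)[subsettingColumns]
newDF
```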

Data

oldDF <- mtcars
subsettingColumns <- c("am","IDontExist","gear","IAlsoDontExist")

Get the unknown columns

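# Requested names that are not columns of oldDF
unknownCol <- setdiff(subsettingColumns, names(oldDF))

# Build a one-column NA data frame per unknown name and bind them onto oldDF
tempDF <- lapply(unknownCol, function(x) { df <- data.frame(A = NA); names(df) <- x; df })
oldDF <- Reduce(cbind, c(list(oldDF), tempDF))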

newDF <- oldDF[subsettingColumns]
newDF

Result

> head(newDF)
                  am IDontExist gear IAlsoDontExist
Mazda RX4          1         NA    4             NA
Mazda RX4 Wag      1         NA    4             NA
Datsun 710         1         NA    4             NA
Hornet 4 Drive     0         NA    3             NA
Hornet Sportabout  0         NA    3             NA
Valiant            0         NA    3             NA

Building on the basic structure of Andre Elrico's answer, you can do the following:

newDf <- data.frame(sapply(subsettingColumns,
                           function(x) if(x %in% names(oldDF)) oldDF[[x]] else NA))

The first 6 rows are:

head(newDf)
  am IDontExist gear IAlsoDontExist
1  1         NA    4             NA
2  1         NA    4             NA
3  1         NA    4             NA
4  0         NA    3             NA
5  0         NA    3             NA
6  0         NA    3             NA
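
As the output shows, this approach drops the original row names. It also relies on sapply not simplifying its result: when every requested column exists, sapply returns a matrix and all columns are coerced to one common type. A possible variant, sketched here with lapply and under the assumption that keeping row names and column types is wanted, looks like this (the intermediate name cols is only for illustration):

```r
oldDF <- mtcars
subsettingColumns <- c("am", "IDontExist", "gear", "IAlsoDontExist")

# lapply always returns a list, so each column keeps its own type
cols <- lapply(subsettingColumns, function(x) {
  if (x %in% names(oldDF)) oldDF[[x]] else rep(NA, nrow(oldDF))
})
names(cols) <- subsettingColumns

# Rebuild the data frame, carrying over the original row names
newDf <- data.frame(cols, row.names = rownames(oldDF), check.names = FALSE)
head(newDf)
```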

Data

oldDF <- mtcars
subsettingColumns <- c("am","IDontExist","gear","IAlsoDontExist")
