
Create a new column in a dataframe with values based on another column

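I want to create a new column in an existing dataframe. The new column should be filled in based on the values in another column of the same dataframe, according to specific conditions.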

RAVE_ITN_BVAS_ADVIS3$subtype_ANCA_type_abr <-
  apply(
    RAVE_ITN_BVAS_ADVIS3$Subtype_ANCA_type,
    1,
    FUN = function(x) {
      if (x == "Wegener's Granulomatosis (WG)-PR3") {
        return("GPA_PR3")
      }

      if (x == "Wegener's Granulomatosis (WG)-MPO") {
        return("GPA_MPO")
      }

      if (x == "Microscopic Polyangiitis (MPA)-PR3") {
        return("MPA_PR3")
      }

      if (x == "Microscopic Polyangiitis (MPA)-MPO") {
        return("MPA_MPO")
      }

    }
  )

View(RAVE_ITN_BVAS_ADVIS3$Subtype_ANCA_type_abr)

I have tried the code above (apologies for the poor formatting; Stack Overflow seems to have changed everything recently).

I keep getting this error:

Error in apply(RAVE_ITN_BVAS_ADVIS3$Subtype_ANCA_type, 1, FUN = function(x) { : 
  dim(X) must have a positive length

Any help here would be much appreciated. Thank you.

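This is an odd place to use apply. R has many other functions that can help here. I would look up some tutorials on this topic; there are plenty of similar questions on Stack Overflow.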

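Your problem with apply: apply needs an object with dimensions, such as a dataframe (or matrix), and with a margin of 1 it runs through that object row by row. You are passing a vector as the input to apply. A vector has no rows, which is why you get the error. When you pass the whole dataframe instead, each x in the function is a "named vector", and you can use x["Subtype_ANCA_type"] to pull the appropriate item out of it. I would not solve the problem this way, though. I just wanted to explain how to use apply, since that is the question you asked.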
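To see where the error comes from, here is a minimal sketch. The vector `v` and the dataframe `df` are made-up stand-ins for the real column and dataframe, not objects from the question.

```r
# Hypothetical stand-in for RAVE_ITN_BVAS_ADVIS3$Subtype_ANCA_type
v <- c("Wegener's Granulomatosis (WG)-PR3",
       "Microscopic Polyangiitis (MPA)-MPO")

# A plain vector has no dim attribute, and apply() needs one
dim(v)
is.null(dim(v))

# A dataframe does have dimensions (rows, columns)
df <- data.frame(Subtype_ANCA_type = v, stringsAsFactors = FALSE)
dim(df)

# Calling apply() on the vector fails; capture the error instead of stopping
res <- tryCatch(
  apply(v, 1, function(x) x),
  error = function(e) e
)
inherits(res, "error")
conditionMessage(res)
```

Extracting a single column with `$` gives back a plain vector, so the dimension information that apply relies on is gone by the time apply sees it.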

#************************************************************#
# The original question
RAVE_ITN_BVAS_ADVIS3 <- data.frame(
  Subtype_ANCA_type = rep(
    c("Wegener's Granulomatosis (WG)-PR3",
      "Wegener's Granulomatosis (WG)-MPO",
      "Microscopic Polyangiitis (MPA)-PR3",
      "Microscopic Polyangiitis (MPA)-MPO"),
    2
  ),
  stringsAsFactors = FALSE)

RAVE_ITN_BVAS_ADVIS3$subtype_ANCA_type_abr <-
  apply(
    RAVE_ITN_BVAS_ADVIS3,
    1,
    FUN = function(x) {
      if (x["Subtype_ANCA_type"] == "Wegener's Granulomatosis (WG)-PR3") {
        return("GPA_PR3")
      }

      if (x["Subtype_ANCA_type"] == "Wegener's Granulomatosis (WG)-MPO") {
        return("GPA_MPO")
      }

      if (x["Subtype_ANCA_type"] == "Microscopic Polyangiitis (MPA)-PR3") {
        return("MPA_PR3")
      }

      if (x["Subtype_ANCA_type"] == "Microscopic Polyangiitis (MPA)-MPO") {
        return("MPA_MPO")
      }
    }
  )
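If you do want a function that is called once per value, one alternative is to loop over the column itself rather than over the rows of the dataframe. The sketch below uses vapply with switch. The helper name `abbreviate_subtype`, the dataframe `df`, and the "Something else" label are assumptions made up for illustration, not part of the original code.

```r
# Hypothetical helper: map one full label to its abbreviation
abbreviate_subtype <- function(x) {
  switch(
    x,
    "Wegener's Granulomatosis (WG)-PR3"  = "GPA_PR3",
    "Wegener's Granulomatosis (WG)-MPO"  = "GPA_MPO",
    "Microscopic Polyangiitis (MPA)-PR3" = "MPA_PR3",
    "Microscopic Polyangiitis (MPA)-MPO" = "MPA_MPO",
    NA_character_  # fallback for labels not listed above
  )
}

# Small made-up dataframe, including one label with no abbreviation
df <- data.frame(
  Subtype_ANCA_type = c("Wegener's Granulomatosis (WG)-PR3",
                        "Microscopic Polyangiitis (MPA)-MPO",
                        "Something else"),
  stringsAsFactors = FALSE
)

# vapply() walks over the vector and checks that each result is one string
df$subtype_ANCA_type_abr <- vapply(
  df$Subtype_ANCA_type,
  abbreviate_subtype,
  character(1),
  USE.NAMES = FALSE
)

df
```

The explicit fallback means an unexpected label becomes NA, whereas the chain of if statements above falls through and returns NULL for it.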

If you want to do this by hand (as above), you can simply use the [] notation to say where the new column's data should go.

#************************************************************#
# Manually add new column for alternative variables
RAVE_ITN_BVAS_ADVIS3 <- data.frame(
  Subtype_ANCA_type = rep(
    c("Wegener's Granulomatosis (WG)-PR3",
      "Wegener's Granulomatosis (WG)-MPO",
      "Microscopic Polyangiitis (MPA)-PR3",
      "Microscopic Polyangiitis (MPA)-MPO"),
    2
  ),
  stringsAsFactors = FALSE)

# For the rows in the dataframe where Subtype_ANCA_type == "something", fill the next column.
RAVE_ITN_BVAS_ADVIS3[RAVE_ITN_BVAS_ADVIS3$Subtype_ANCA_type == "Wegener's Granulomatosis (WG)-PR3" ,"subtype_ANCA_type_abr"] <- "GPA_PR3"
RAVE_ITN_BVAS_ADVIS3[RAVE_ITN_BVAS_ADVIS3$Subtype_ANCA_type == "Wegener's Granulomatosis (WG)-MPO" ,"subtype_ANCA_type_abr"] <- "GPA_MPO"
RAVE_ITN_BVAS_ADVIS3[RAVE_ITN_BVAS_ADVIS3$Subtype_ANCA_type == "Microscopic Polyangiitis (MPA)-PR3","subtype_ANCA_type_abr"] <- "MPA_PR3"
RAVE_ITN_BVAS_ADVIS3[RAVE_ITN_BVAS_ADVIS3$Subtype_ANCA_type == "Microscopic Polyangiitis (MPA)-MPO","subtype_ANCA_type_abr"] <- "MPA_MPO"

RAVE_ITN_BVAS_ADVIS3
#                    Subtype_ANCA_type subtype_ANCA_type_abr
# 1  Wegener's Granulomatosis (WG)-PR3               GPA_PR3
# 2  Wegener's Granulomatosis (WG)-MPO               GPA_MPO
# 3 Microscopic Polyangiitis (MPA)-PR3               MPA_PR3
# 4 Microscopic Polyangiitis (MPA)-MPO               MPA_MPO
# 5  Wegener's Granulomatosis (WG)-PR3               GPA_PR3
# 6  Wegener's Granulomatosis (WG)-MPO               GPA_MPO
# 7 Microscopic Polyangiitis (MPA)-PR3               MPA_PR3
# 8 Microscopic Polyangiitis (MPA)-MPO               MPA_MPO
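The four assignments above can also be collapsed into a single indexing step with a named character vector. This is only a sketch of one possible alternative. The names `abbr` and `df` are made up for the example.

```r
# Named character vector: names are the full labels, values the abbreviations
abbr <- c(
  "Wegener's Granulomatosis (WG)-PR3"  = "GPA_PR3",
  "Wegener's Granulomatosis (WG)-MPO"  = "GPA_MPO",
  "Microscopic Polyangiitis (MPA)-PR3" = "MPA_PR3",
  "Microscopic Polyangiitis (MPA)-MPO" = "MPA_MPO"
)

# Made-up dataframe with each label repeated twice
df <- data.frame(
  Subtype_ANCA_type = rep(names(abbr), 2),
  stringsAsFactors = FALSE
)

# Index the lookup vector by label; unname() drops the labels that would
# otherwise be carried along as element names
df$subtype_ANCA_type_abr <- unname(abbr[df$Subtype_ANCA_type])

df
```

Indexing by name only works this way when the column is a character vector. That holds here because the dataframe is built with stringsAsFactors = FALSE.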

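If you are going to have a lot of these, you may want to build a lookup table. You can even create the lookup table as a csv file, using Excel or some other tool, and read it in with read.csv.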

#************************************************************#
# Add new column from a lookup table
abv_lookup <- data.frame(
  Subtype_ANCA_type = c(
    "Wegener's Granulomatosis (WG)-PR3",
    "Wegener's Granulomatosis (WG)-MPO",
    "Microscopic Polyangiitis (MPA)-PR3",
    "Microscopic Polyangiitis (MPA)-MPO"
  ),
  subtype_ANCA_type_abr = c(
    "GPA_PR3",
    "GPA_MPO",
    "MPA_PR3",
    "MPA_MPO"
  ),
  stringsAsFactors = FALSE
)

RAVE_ITN_BVAS_ADVIS3 <- data.frame(
  Subtype_ANCA_type = rep(
    c("Wegener's Granulomatosis (WG)-PR3",
      "Wegener's Granulomatosis (WG)-MPO",
      "Microscopic Polyangiitis (MPA)-PR3",
      "Microscopic Polyangiitis (MPA)-MPO"),
    2
  ),
  stringsAsFactors = FALSE)

# Merge the two dataframes together by any common columns (Subtype_ANCA_type)
RAVE_ITN_BVAS_ADVIS3 <- merge(RAVE_ITN_BVAS_ADVIS3,abv_lookup)

RAVE_ITN_BVAS_ADVIS3
#                    Subtype_ANCA_type subtype_ANCA_type_abr
# 1 Microscopic Polyangiitis (MPA)-MPO               MPA_MPO
# 2 Microscopic Polyangiitis (MPA)-MPO               MPA_MPO
# 3 Microscopic Polyangiitis (MPA)-PR3               MPA_PR3
# 4 Microscopic Polyangiitis (MPA)-PR3               MPA_PR3
# 5  Wegener's Granulomatosis (WG)-MPO               GPA_MPO
# 6  Wegener's Granulomatosis (WG)-MPO               GPA_MPO
# 7  Wegener's Granulomatosis (WG)-PR3               GPA_PR3
# 8  Wegener's Granulomatosis (WG)-PR3               GPA_PR3
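Note that merge sorts the result by the key column by default, which is why the rows above come back in a different order. If the original row order matters, one option is to look up positions with match instead. The sketch below also reads the lookup table from a csv file, as suggested earlier. The temporary file and the names `lookup_path`, `df` and `idx` are stand-ins invented for the example.

```r
# Write a small lookup table to a temporary csv file
# (a stand-in for a file prepared in Excel)
lookup_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(
    Subtype_ANCA_type = c(
      "Wegener's Granulomatosis (WG)-PR3",
      "Wegener's Granulomatosis (WG)-MPO",
      "Microscopic Polyangiitis (MPA)-PR3",
      "Microscopic Polyangiitis (MPA)-MPO"
    ),
    subtype_ANCA_type_abr = c("GPA_PR3", "GPA_MPO", "MPA_PR3", "MPA_MPO")
  ),
  lookup_path,
  row.names = FALSE
)

# Read the lookup table back in
abv_lookup <- read.csv(lookup_path, stringsAsFactors = FALSE)

# Made-up dataframe with each label repeated twice
df <- data.frame(
  Subtype_ANCA_type = rep(abv_lookup$Subtype_ANCA_type, 2),
  stringsAsFactors = FALSE
)

# For each row of df, find the position of its label in the lookup table,
# then pull the abbreviation at that position; row order is left untouched
idx <- match(df$Subtype_ANCA_type, abv_lookup$Subtype_ANCA_type)
df$subtype_ANCA_type_abr <- abv_lookup$subtype_ANCA_type_abr[idx]

df
```

Labels that do not appear in the lookup table get NA, since match returns NA for them.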

