Create a new column in a data frame based on values from other columns

Question

I have a data frame in R:

         word     positive.polarity    negative.polarity 
1 interesting                 1                 0                         
2      boring                 0                 1

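I'm trying to add a new column named positive.ponderate.polarity that contains positive.polarity * 3 if the context of the word contains a special character, and positive.polarity / 3 if it does not.

Any ideas on how to do this?

Thanks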

Answer 1

I don't know what your "special characters" are, so I'll use the following condition: "[o]{2}|[y]$", or in plain terms:

If the word contains two consecutive "o"s or ends with "y": multiply by 3; otherwise, divide by 3.
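
To make the pattern concrete, here is a small check of which words it matches. Only "interesting" and "boring" come from the question; the other words are my own picks for illustration.

```r
# Illustrative words (assumed); only the first two appear in the question
words <- c("interesting", "boring", "too", "very", "only", "not")

# TRUE where the word has two consecutive "o"s or ends in "y"
matches <- grepl("[o]{2}|[y]$", words)

# Show each word next to its match result
print(setNames(matches, words))
```

Swapping in your real "special character" rule should only require changing the pattern string.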

Using the tm package for stopwords (as sample words) and the dplyr package:

  library(dplyr)  # provides %>% and mutate()

  # Create some data to mimic yours; tm::stopwords() supplies the sample words
  var_df <- data.frame(word = tm::stopwords(),
                       stringsAsFactors = FALSE) %>%
    mutate(
      # Random 0/1 polarity per row, with negative as its complement
      positive.polarity = sample(0:1, n(), replace = TRUE),
      negative.polarity = ifelse(positive.polarity == 1, 0, 1)
    ) %>%
    # Apply the condition: multiply by 3 on a match, divide by 3 otherwise
    mutate(
      positive.ponderate.polarity = ifelse(
        grepl("[o]{2}|[y]$", word),
        positive.polarity * 3,
        positive.polarity / 3)
    )

tail(var_df, 10)

    word positive.polarity negative.polarity positive.ponderate.polarity
165   no                 0                 1                   0.0000000
166  nor                 0                 1                   0.0000000
167  not                 1                 0                   0.3333333
168 only                 1                 0                   3.0000000
169  own                 1                 0                   0.3333333
170 same                 1                 0                   0.3333333
171   so                 0                 1                   0.0000000
172 than                 1                 0                   0.3333333
173  too                 1                 0                   3.0000000
174 very                 1                 0                   3.0000000
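
If you would rather skip dplyr and tm, the same idea can be sketched in base R on the two rows from the question. The pattern below is still only my stand-in, since the real "special character" rule isn't given, and the names df, special_pattern, and is_special are made up for this example.

```r
# The two rows shown in the question
df <- data.frame(
  word = c("interesting", "boring"),
  positive.polarity = c(1, 0),
  negative.polarity = c(0, 1),
  stringsAsFactors = FALSE
)

# Placeholder pattern (an assumption): replace with your real rule
special_pattern <- "[o]{2}|[y]$"

# Logical vector: does each word match the pattern?
is_special <- grepl(special_pattern, df$word)

# Multiply by 3 where the word matches, divide by 3 otherwise
df$positive.ponderate.polarity <- ifelse(
  is_special,
  df$positive.polarity * 3,
  df$positive.polarity / 3
)

print(df)
```

Neither of the two words matches this placeholder pattern, so both rows take the divide-by-3 branch.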
