Remove negative values and one matching positive value from an R data frame

Question

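I have a data frame in which one column is the amount spent. The Spent column holds the amounts spent, plus negative values for any returns. For example: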

ID    Store    Spent
123    A        18.50
123    A       -18.50
123    A        18.50

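I want to remove each negative value and then one of its positive counterparts. The idea is to keep only fully completed purchases, so that I can look at total spending.

Right now I am thinking of something like this, with the data frame sorted by amount spent: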

if spend < 0 {
  take absolute value of spend
  if diff between abs(spend) and spend+1 = 0 then both are NA}

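I was thinking of something along these lines:

df$Spent[df$Spent < 0] <- NA

where I could also set one positive counterpart of each negative value to NA. Any suggestions?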

Answer 1

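There should be a simpler solution, but here is one way. I also created my own example, because the one shared does not have enough data points to test with.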

# Original vector
x <- c(1, 2, -2, 1, -1, -1, 2, 3, -4, 1, 4)
# Count how often each magnitude occurs as a negative number,
# keeping a (sorted) level for every magnitude so the two tables line up
vals <- table(factor(abs(x[x < 0]), levels = sort(unique(abs(x)))))
# Count how often each magnitude occurs in the original vector
vals1 <- table(abs(x))
# Each negative cancels itself and one positive, so subtract twice its count
new_val <- vals1 - (vals * 2)
# Rebuild the vector from the remaining counts
as.numeric(rep(names(new_val), new_val))
#[1] 1 2 3
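
The counting idea above can be wrapped into a small helper. This is only a sketch: the function name cancel_pairs is hypothetical, and it assumes that a negative value with no positive counterpart should simply be dropped and that zeros can be ignored.

```r
# Hypothetical helper: drop each negative value together with one matching positive.
cancel_pairs <- function(x) {
  keys <- sort(unique(abs(x)))
  # How many times each magnitude occurs as a positive / as a negative value
  n_pos <- as.vector(table(factor(x[x > 0], levels = keys)))
  n_neg <- as.vector(table(factor(-x[x < 0], levels = keys)))
  # Positives left once each negative has cancelled one of them.
  # Assumption: surplus negatives are dropped rather than reported.
  rep(keys, pmax(n_pos - n_neg, 0))
}

x <- c(1, 2, -2, 1, -1, -1, 2, 3, -4, 1, 4)
res_x <- cancel_pairs(x)
res_spent <- cancel_pairs(c(18.5, -18.5, 18.5))
print(res_x)
print(res_spent)
```

Note that the result comes back sorted by magnitude, so the original row order is lost. That is fine for totals, but not if the remaining rows of a data frame are needed.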

Answer 2

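If you add a row-id column, you can do this with data.table anti-joins.

Here is an example that takes the ID into account, so a "positive counterpart" is only removed when it has the same ID.

First, create some more interesting sample data: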

library(data.table)

df <- fread('
ID    Store    Spent
123    A        18.50
123    A       -18.50
123    A        18.50
123    A       -19.50
123    A        19.50
123    A       -99.50
124    A       -94.50
124    A        99.50
124    A        94.50
124    A        94.50
')

Now remove all negative values that have a positive counterpart, and remove those counterparts as well:

negs <- df[Spent < 0][, Spent := -Spent][, rid := rowid(ID, Spent)]
pos <- df[Spent > 0][, rid := rowid(ID, Spent)]
pos[!negs, on = .(ID, Spent, rid), -'rid']
#     ID Store Spent
# 1: 123     A  18.5
# 2: 124     A  99.5
# 3: 124     A  94.5
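
The anti-join relies on rowid() numbering repeated occurrences within each (ID, Spent) group, so that the n-th refund of an amount cancels only the n-th purchase of that amount. A minimal illustration, using made-up toy data rather than the sample above:

```r
library(data.table)

# Toy data (made up for illustration): the same amount three times for one ID,
# and once for another ID
dt <- data.table(ID = c(123, 123, 123, 124),
                 Spent = c(18.5, 18.5, 18.5, 18.5))

# rowid() counts occurrences within each combination of its arguments
dt[, rid := rowid(ID, Spent)]
print(dt)
```

Without the rid column, a single refund of 18.50 would match, and remove, every 18.50 purchase for that ID.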

And applied to Ronak's example vector x:

x <- c(1, 2, -2, 1, -1, -1, 2, 3, -4, 1, 4)
negs <- data.table(x = -x[x<0])[, rid := rowid(x)]
pos <- data.table(x = x[x>0])[, rid := rowid(x)]
pos[!negs, on = names(pos), -'rid']

#    x
# 1: 2
# 2: 3
# 3: 1
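
For anyone who would rather not depend on data.table, the same occurrence-matching idea can be sketched in base R. The names occ, key and kept and the small data frame below are illustrative only. As in the anti-join above, a negative value without a positive counterpart is simply dropped.

```r
# Made-up sample data in the same shape as the question
df <- data.frame(
  ID    = c(123, 123, 123, 124, 124),
  Store = "A",
  Spent = c(18.5, -18.5, 18.5, -94.5, 99.5)
)

# Occurrence counter within each (ID, sign, magnitude) group
occ <- ave(seq_len(nrow(df)), df$ID, sign(df$Spent), abs(df$Spent),
           FUN = seq_along)

# A purchase and a refund share a key when ID, magnitude and counter agree
key <- paste(df$ID, abs(df$Spent), occ)

# Keep positive rows whose key is not claimed by any negative row
pos  <- df$Spent > 0
kept <- df[pos & !(key %in% key[!pos]), ]
print(kept)
```

Here the first 18.50 purchase for ID 123 is cancelled by the refund, while the second 18.50 and the 99.50 for ID 124 remain.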

Answer 3

I used the code below.

library(dplyr)
store <- rep(LETTERS[1:3], 3)
id <- c(1:4, 1:3, 1:2)
expense <- runif(9, -10, 10)
tibble(store, id, expense) %>%
  group_by(store) %>%
  summarise(net_expenditure = sum(expense))

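This produces output like the following (expense is drawn with runif() and no seed is set, so the exact numbers differ from run to run):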

# A tibble: 3 x 2
  store net_expenditure
  <chr>           <dbl>
1 A               13.3 
2 B                8.17
3 C               16.6

Alternatively, if you want the net expenditure for each store/ID pair, you can use the following code:

tibble(store, id, expense) %>%
  group_by(store, id) %>%
  summarise(net_expenditure = sum(expense))
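
One thing to keep in mind with netting by sum: it gives the same total as removing matched pairs only when every negative value has a positive counterpart. A small base R sketch with made-up data (aggregate() is used here instead of dplyr only to keep the example dependency-free):

```r
# Made-up data: ID 124 has a refund (-94.5) with no matching purchase
df <- data.frame(
  ID    = c(123, 123, 123, 124, 124),
  Spent = c(18.5, -18.5, 18.5, -94.5, 99.5)
)

# Net spend per ID: matched pairs cancel, but the unmatched refund still counts
net <- aggregate(Spent ~ ID, data = df, FUN = sum)
print(net)
```

For ID 123 the net is 18.5 either way. For ID 124 the sum nets to 5, whereas an approach that discards unmatched negatives would leave 99.5.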

I have answered your question from a slightly different angle. I am not sure my code answers your question, but it may help.

Answer 1
2, accepted, 2019-07-22 15:22:33

Answer 2
2, 2019-07-22 16:52:28

Answer 3
0, 2019-07-22 15:35:24