
dplyr: using values subset from matrix to create new column with case_when

I am trying to create a new column in a data frame using mutate and case_when, but I get unexpected results.

Here is a dput of a subset of my data: Pastebin.

The aim is to calculate own- and cross-price elasticities for products in multiple completely separate markets. My idea was to use case_when to apply different expressions for own and cross elasticities, and to use the unique product identifiers (IDprod_un_j and IDprod_un_l) to subset the matching rows from another matrix, share_i_small. This is the code I am using:

library(dplyr)
library(magrittr)  # provides the %<>% assignment pipe

elast_small %<>% 
  mutate(
    eta_jlm_rc = case_when(
      # own-price elasticity
      IDprod_j == IDprod_l ~ (-price_j/share_j) * rowMeans(-alpha_i_rc * share_i_small[IDprod_un_j,] * (1-share_i_small[IDprod_un_j,])),
      # cross-price elasticity
      IDprod_j != IDprod_l ~ (-price_l/share_j) * rowMeans(alpha_i_rc * share_i_small[IDprod_un_j,] * share_i_small[IDprod_un_l,])
    )
  )

This runs without errors, but when I verify the result for the first row by hand, I get a different value:

> -elast_small$price_j[1] / elast_small$share_j[1] * mean(-alpha_i_rc * share_i_small[1,] * (1-share_i_small[1,]))
[1] -10.02669
> elast_small$eta_jlm_rc[1]
[1] -14.83231

What am I missing here?

What I was missing is that case_when does not evaluate the right-hand side row by row. Each RHS is evaluated once for the whole column, so share_i_small[IDprod_un_j,] returns a matrix with more than one row rather than a single row. When R multiplies a vector by a matrix, it recycles the vector down the columns (column-major order), so alpha_i_rc is not lined up with the columns of the matrix and the product is wrong. The manual check works because it subsets a single row, which gives a plain vector that multiplies alpha_i_rc element by element.
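To make the recycling behaviour concrete, here is a minimal sketch on toy data. The names shares and alpha and all the numbers are made up for illustration; they only stand in for share_i_small and alpha_i_rc, assuming one column per simulated consumer.

```r
# Toy stand-ins (hypothetical values): 2 products (rows) x 3 consumers (columns)
shares <- matrix(c(0.1, 0.2,
                   0.3, 0.4,
                   0.5, 0.6), nrow = 2)  # filled column by column
alpha <- c(1, 10, 100)                   # one coefficient per consumer (column)

# Naive product: alpha is recycled down the columns of the matrix,
# so consumer 2's share of product 1 (0.3) is multiplied by 100, not 10
naive <- alpha * shares
naive[1, ]        # 0.1, 30, 5

# Double transpose: after t(), consumers are rows, so recycling lines up
by_consumer <- t(t(shares) * alpha)
by_consumer[1, ]  # 0.1, 3, 50

# A single row is a plain vector, so element-wise multiplication is already right
alpha * shares[1, ]  # 0.1, 3, 50
```

This is why the single-row check and the mutate result disagree: the two only coincide when the subset has exactly one row.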

This solves the issue:

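# Applied to the full data (elast, share_i) rather than the *_small subset
elast %<>%
  mutate(
    eta_jlm_rc = case_when(
      # t(t(m) * v) multiplies each column of m by the matching element of v
      IDprod_j == IDprod_l ~ (-price_j/share_j) * rowMeans(t(t(share_i[IDprod_un_j,] * (1-share_i[IDprod_un_j,])) * -alpha_i_rc)),
      IDprod_j != IDprod_l ~ (-price_l/share_j) * rowMeans(t(t(share_i[IDprod_un_j,] * share_i[IDprod_un_l,]) * alpha_i_rc))
    )
  )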
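As a possible alternative to the double transpose, base R's sweep() can scale each column by the matching element of a vector. The sketch below uses made-up toy values; shares, alpha and ids are hypothetical stand-ins for share_i, alpha_i_rc and IDprod_un_j. It also uses drop = FALSE so the subset stays a matrix even when only one row is selected, which rowMeans() requires.

```r
# Toy stand-ins (hypothetical values)
shares <- matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), nrow = 2)  # 2 products x 3 consumers
alpha <- c(1, 10, 100)   # one coefficient per consumer (column)
ids <- c(1, 2, 1)        # row lookups, one per row of the data frame

# Repeated row indices are fine; drop = FALSE keeps a matrix for a single id too
picked <- shares[ids, , drop = FALSE]

# Multiply column j of picked by alpha[j]
scaled <- sweep(picked, 2, alpha, "*")

# Same result as the double transpose
identical_to_tt <- isTRUE(all.equal(scaled, t(t(picked) * alpha)))

# One mean per looked-up row
means <- rowMeans(scaled)
```

Whether sweep() or t(t(...)) reads better is a matter of taste; both avoid relying on column-major recycling.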

It looks like it might work if you group by the product identifiers j and l and then, in the same mutate() call, compute beforehand the averaged terms that (-price/share) gets multiplied by:

tmp <- elast_small %>% 
  group_by(IDprod_un_j, IDprod_un_l) %>% 
  mutate(
    # assumes each (j, l) pair appears once, so each lookup returns a single row (a vector)
    newvar1 = mean(-alpha_i_rc * share_i_small[IDprod_un_j, ] * (1-share_i_small[IDprod_un_j, ])), 
    newvar2 = mean(alpha_i_rc * share_i_small[IDprod_un_j, ] * share_i_small[IDprod_un_l, ]), 
    eta_jlm_rc2 = case_when(
      IDprod_j == IDprod_l ~ (-price_j/share_j) * newvar1,
      IDprod_j != IDprod_l ~ (-price_l/share_j) * newvar2
    )
  ) %>% 
  ungroup()

tmp %>% 
  select(IDprod_un_j, IDprod_un_l, eta_jlm_rc2) %>% 
  as.data.frame %>% 
  head
# IDprod_un_j IDprod_un_l   eta_jlm_rc2
# 1           1           1 -10.026692702
# 2           1           2   0.001446025
# 3           1           3   0.005316131
# 4           1           4   0.133027210
# 5           1           5   0.017306581
# 6           1           6   0.063833755
