R ifelse multiple conditions
My data looks like this:
username | compound score | age | Date
---|---|---|---
A | 0.5 | 26 | 2020-08-10
A | 0.6 | 26 | 2020-09-01
B | 0.3 | 27 | 2020-11-15
structure(list(age = c(24L, 28L, 25L, 27L, 30L, 25L, 47L, 23L,
26L, 23L), compound = c(-0.765, 0.743, 0.1901, 0, 0, 0.743, 0.2732,
-0.2263, 0.3612, -0.2263), Date = structure(c(18551, 18544, 18544,
18541, 18540, 18538, 18536, 18536, 18534, 18533), class = "Date")), row.names = c(1L,
2L, 3L, 4L, 5L, 6L, 7L, 9L, 10L, 12L), class = "data.frame")
The code I have turns the data into long format with two conditions.
twitter_wide = twitter_d %>% mutate(Date = as.Date(Date)) %>%
  group_by(username,age,Trial = ifelse(day(Date) >= 3 & month(Date) >11, 2, 1)) %>%
  summarise(compound = mean(compound), .groups = 'drop')
How can I change this to use three conditions and produce a wide format?
Condition 1: mean score (compound) for dates from 8.16 to 11.2 (T1)
Condition 2: mean score (compound) for dates from 11.3 to 1.20 (T2)
Condition 3: mean score (compound) for dates from 1.21 to 4.9 (T3)
A desired table may look like this:
username | age | Mean_compound score_T1 | Mean_compound score_T2 | Mean_compound score_T3
---|---|---|---|---
A | 26 | 0.5 | 0.3 | 0.7
B | 26 | 0.3 | 0.5 | 0.3
Here is an example with fake data:
# example data
twitter_d <- structure(list(username = c("A", "A", "A", "A", "B", "B", "B", "B"),
`compound score` = c(0.5, 0.6, 0.7, 0.8, 0.3, 0.2, 0.1, 0),
age = c(26L, 26L, 26L, 26L, 27L, 27L, 27L, 27L),
Date = c("2020-08-16", "2020-09-01", "2020-11-14", "2021-01-20", "2020-09-12", "2020-11-02", "2020-11-15", "2021-04-09")),
class = "data.frame", row.names = c(NA, -8L))
# solution
library(tidyverse)
twitter_wide <- twitter_d %>% mutate(
Condition = ifelse(Date >= "2020-08-16" & Date <= "2020-11-02", "T1",
ifelse(Date >= "2020-11-03" & Date <= "2021-01-20", "T2",
ifelse(Date >= "2021-01-21" & Date <= "2021-04-09", "T3", NA)))
) %>%
group_by(username, age, Condition) %>%
summarise(compound = mean(`compound score`, na.rm = TRUE), .groups = "drop") %>%
pivot_wider(names_from = Condition, names_prefix = "Mean_compound_score_", values_from = compound)
twitter_wide
## A tibble: 2 x 5
# username age Mean_compound_score_T1 Mean_compound_score_T2 Mean_compound_score_T3
# <chr> <int> <dbl> <dbl> <dbl>
#1 A 26 0.55 0.75 NA
#2 B 27 0.25 0.1 0
Explanation:
I mutate() a new column called Condition. It contains one of c("T1", "T2", "T3"), depending on which of the specified Date ranges the row falls into; dates outside all three ranges get NA. Basically, this step groups the dates into time periods.
group_by() and summarise() compute the mean you want (this part you already have).
pivot_wider() changes the data to wide format, taking the values from compound and spreading them into one column per level of the date-grouping column Condition.
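As a side note, the nested ifelse() calls can get hard to read as the number of periods grows. Below is a minimal sketch of the same date-grouping step written with dplyr's case_when(), under the assumption that Date is first converted with as.Date() so the comparisons are made on real dates rather than on strings. The name twitter_wide2 is only an illustrative choice; the data are the same fake data as above.

```r
library(dplyr)
library(tidyr)

# Same fake data as in the answer; check.names = FALSE keeps the
# space in the column name `compound score`
twitter_d <- data.frame(
  username = c("A", "A", "A", "A", "B", "B", "B", "B"),
  `compound score` = c(0.5, 0.6, 0.7, 0.8, 0.3, 0.2, 0.1, 0),
  age = c(26L, 26L, 26L, 26L, 27L, 27L, 27L, 27L),
  Date = c("2020-08-16", "2020-09-01", "2020-11-14", "2021-01-20",
           "2020-09-12", "2020-11-02", "2020-11-15", "2021-04-09"),
  check.names = FALSE
)

twitter_wide2 <- twitter_d %>%
  mutate(
    Date = as.Date(Date),  # compare as dates, not as character strings
    Condition = case_when(
      Date >= as.Date("2020-08-16") & Date <= as.Date("2020-11-02") ~ "T1",
      Date >= as.Date("2020-11-03") & Date <= as.Date("2021-01-20") ~ "T2",
      Date >= as.Date("2021-01-21") & Date <= as.Date("2021-04-09") ~ "T3"
      # rows matching none of the ranges get NA
    )
  ) %>%
  group_by(username, age, Condition) %>%
  summarise(compound = mean(`compound score`, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Condition,
              names_prefix = "Mean_compound_score_",
              values_from = compound)

twitter_wide2
```

On this fake data it should give the same two-row table as the ifelse() version. Each additional period is just one more line inside case_when().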
For more info on pivot_wider(), which is from the tidyr package, see https://tidyr.tidyverse.org/reference/pivot_wider.html .
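To see what pivot_wider() does on its own, here is a small sketch that starts from an already summarised long table. The data frame long is hypothetical and simply mirrors what the group_by() and summarise() step would return for the fake data above.

```r
library(tidyr)

# Hypothetical long-format summary: one row per user and time period
long <- data.frame(
  username  = c("A", "A", "B", "B", "B"),
  age       = c(26L, 26L, 27L, 27L, 27L),
  Condition = c("T1", "T2", "T1", "T2", "T3"),
  compound  = c(0.55, 0.75, 0.25, 0.10, 0.00)
)

# One output column per level of Condition, filled from compound
wide <- pivot_wider(
  long,
  names_from   = Condition,
  names_prefix = "Mean_compound_score_",
  values_from  = compound
)

wide
```

User A has no T3 row in the long table, so that cell comes out as NA in the wide table. If you would rather have another value there, pivot_wider() has a values_fill argument for that.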