
R: I am trying to subset my data frame by decades, so I want to subset using the values of a column

I have a year column running from 1921 to 2020. I want to do an analysis by decade, so I want to subset the data frame into decades. I tried a couple of snippets, but they keep giving errors.

decade1=data_all%>%filter(data_all$year%>%1920:1929)

Error: Problem with filter() input ..1.
x 3 arguments passed to ':' which requires 2
ℹ Input ..1 is data_all$year %>% 1920:1929.
Run rlang::last_error() to see where the error occurred.

decade1=data_all%>%filter(data_all$year==1920:1929)

Warning message:
In data_all$year == 1920:1929 : longer object length is not a multiple of shorter object length

What code should I be using?

We can change the syntax to use %in% within filter, and refer to the column as year rather than data_all$year:

library(dplyr)
data_all%>% 
      filter(year %in% 1920:1929) 
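
To see why == fails here where %in% works: == compares the two vectors element by element and recycles the shorter one, so it never tests membership. A minimal base R sketch, using a made-up years vector (an assumption for illustration, not the asker's data):

```r
# Hypothetical example data, not taken from the original question
years <- c(1929, 1921, 1935, 1920, 1924)

# `==` pairs elements up positionally, recycling `years` to length 10:
# 1929 vs 1920, 1921 vs 1921, 1935 vs 1922, ... so most real matches are missed
eq_result <- years == 1920:1929

# `%in%` asks, for each year, whether it appears anywhere in 1920:1929
in_result <- years %in% 1920:1929

# An equivalent range test that also works for non-integer years
range_result <- years >= 1920 & years <= 1929

print(in_result)  # TRUE TRUE FALSE TRUE TRUE
```

Either in_result or range_result can be used directly as the condition inside filter().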
It may help to group the years first.

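library(dplyr)

# Example data: 50 random years between 1920 and 2020
df <- data.frame(
    year = sample(1920:2020, 50, replace = TRUE)
)

df %>% 
  # cut() uses right-closed intervals by default, so (1919, 1929] becomes "1920s";
  # the first break is 1909 so that 1910 itself would fall into "1910s"
  mutate(decade = cut(year, breaks = c(1909,1919,1929,1939,1949,1959,1969,1979,1989,1999,2009,2019,2029), 
          labels = c("1910s","1920s","1930s","1940s","1950s","1960s","1970s","1980s","1990s","2000s","2010s","2020s"))) %>%
  arrange(year)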

My solution to the exercise was (quite similar to yours):

Create a new feature, 'decade' (base 10: integer-divide the year by 10, then multiply by 10)

decades <- seq(1890,2010, by=10)

data$decade <- as.factor(data$Year %/% 10 * 10)

print(data$decade)
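
Once a decade column exists, one possible way to get a separate data frame per decade is base R's split(). This is a sketch under assumptions: the data frame below and its value column are invented for illustration, and Year follows the column name used in the answer above.

```r
# Hypothetical example data; `value` is a made-up column
data <- data.frame(
  Year  = c(1921, 1929, 1930, 1985, 2020),
  value = 1:5
)

# Round each year down to the start of its decade (1929 -> 1920)
data$decade <- data$Year %/% 10 * 10

# split() returns a named list with one data frame per decade
by_decade <- split(data, data$decade)

names(by_decade)     # "1920" "1930" "1980" "2020"
by_decade[["1920"]]  # the rows for 1921 and 1929
```

Keeping the pieces in a list means each decade can be analysed with lapply(by_decade, ...) instead of creating decade1, decade2, and so on by hand.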
