
Using indexing to perform mathematical operations on a data frame in R

I'm struggling with basic indexing on a data frame in order to perform mathematical operations. I have a data frame containing all 50 US states with an entry for each month of the year, so there are 600 observations. For each state, I wish to find the December value minus the January value. My data looks like this:

> head(df)
  state year month             value
1    AL 2020    01               2.7
2    AK 2020    01                 5
3    AZ 2020    01               4.8
4    AR 2020    01               3.7
5    CA 2020    01               4.2
7    CO 2020    01               2.7

For instance, AL has a Dec value of 4.7 and a Jan value of 2.7, so I'd like to return 2 for that state.

I have been trying to do this with the group_by and summarize functions, but I can't figure out the indexing needed to grab the values that correspond to a condition. I couldn't find a resource on performing these mathematical operations by indexing into a data frame, and would appreciate assistance, as I have other transformations to apply.

With dplyr:

library(dplyr)
df %>%
  group_by(state) %>%
  summarize(year_change = value[month == "12"] - value[month == "01"])

This assumes that your data is as you describe: every state has a single value for every month. If you have missing rows, or multiple observations for a state in a given month, I would not expect this code to work.
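If missing months are a possibility, one way to make that case visible is to take the first match explicitly, so that a state with no matching row yields NA instead of an empty result. This is only a sketch under assumptions: the toy data below is hypothetical (AK is deliberately given no December row), and it does not address duplicate observations within a month.

```r
library(dplyr)

# Hypothetical toy data: AL is complete, AK has no December row
df <- data.frame(
  state = c("AL", "AL", "AK"),
  year  = 2020,
  month = c("01", "12", "01"),
  value = c(2.7, 4.7, 5.0)
)

year_change <- df %>%
  group_by(state) %>%
  summarize(
    # [1] keeps the first match; indexing an empty vector with [1] gives NA
    year_change = value[month == "12"][1] - value[month == "01"][1]
  )

year_change
```

States with a missing January or December then show up with an NA year_change, which is easy to filter or inspect afterwards.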

Another approach, based on row order rather than month value, might look like this:

library(dplyr)
df %>%
  ## make sure things are in the right order
  arrange(state, month) %>% 
  group_by(state) %>%
  summarize(year_change = last(value) - first(value))
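Since the question asks about indexing specifically, the same calculation can also be sketched in base R with logical indexing and merge(). The toy data is an assumption for illustration: the AL values come from the question, while the AK December value is made up.

```r
# Toy data: AL values are from the question, AK's December value is hypothetical
df <- data.frame(
  state = c("AL", "AL", "AK", "AK"),
  month = c("01", "12", "01", "12"),
  value = c(2.7, 4.7, 5.0, 5.5)
)

# Logical indexing: keep only the January rows, then only the December rows
jan <- df[df$month == "01", c("state", "value")]
dec <- df[df$month == "12", c("state", "value")]

# Line the two subsets up by state, then subtract column-wise
both <- merge(jan, dec, by = "state", suffixes = c("_jan", "_dec"))
both$year_change <- both$value_dec - both$value_jan

both
```

Because merge() matches on state, this does not depend on row order. By default, a state missing either month is dropped from the result; passing all = TRUE to merge() would keep it with NA instead.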

