I think my question is fairly simple to answer, but I'm learning R, so I'd like to know the best way to do it.
I have a dataset that looks like this:
> print(agg_df41367)
# A tibble: 72 x 3
# Groups: hour [24]
hour predicted y
1 0 Feeding 0.121
2 0 Foraging 0.632
3 0 Standing 0.300
4 1 Feeding 0.141
5 1 Foraging 0.727
6 1 Standing 0.183
7 2 Feeding 0.0932
8 2 Foraging 0.817
9 2 Standing 0.133
10 3 Feeding 0.214
I would like to fit a GLM, so I need my data to look like this:
head(agg_df41361_GLM)
hour Foraging Standing Feeding
0 0.632 0.300 0.121
1 0.727 0.183 0.141
2 0.817 0.133 0.0932
3 etc. etc. 0.214
Any ideas on the most compact way to do this? Ideally, I would like to use a for-loop to apply this transformation to multiple datasets. All my datasets follow the name format agg_df4136*. Any input is appreciated!
Here's a way to reshape the dataset you posted.
library(tidyr)
# example data
dt = read.table(text = "
hour predicted y
1 0 Feeding 0.121
2 0 Foraging 0.632
3 0 Standing 0.300
4 1 Feeding 0.141
5 1 Foraging 0.727
6 1 Standing 0.183
7 2 Feeding 0.0932
8 2 Foraging 0.817
9 2 Standing 0.133
", header = TRUE)
spread(dt, predicted, y)
# hour Feeding Foraging Standing
# 1 0 0.1210 0.632 0.300
# 2 1 0.1410 0.727 0.183
# 3 2 0.0932 0.817 0.133
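As a side note, spread() is marked as superseded in current tidyr, and pivot_wider() (available from tidyr 1.0.0) is the suggested replacement. The sketch below is one possible equivalent, assuming the same three columns as in the question. The ungroup() call is there because the tibble in the question is grouped by hour, and the select() step only reorders the columns to match the layout asked for. The data frame is rebuilt by hand here so the block runs on its own.

```r
library(dplyr)
library(tidyr)

# Same example values as above, rebuilt so this block is self-contained
dt <- data.frame(
  hour      = rep(0:2, each = 3),
  predicted = rep(c("Feeding", "Foraging", "Standing"), times = 3),
  y         = c(0.121, 0.632, 0.300, 0.141, 0.727, 0.183, 0.0932, 0.817, 0.133)
)

wide <- dt %>%
  ungroup() %>%                                          # no-op here; drops grouping on a grouped tibble
  pivot_wider(names_from = predicted, values_from = y) %>%  # one column per behaviour
  select(hour, Foraging, Standing, Feeding)              # column order requested in the question

wide
```

The result is a tibble with one row per hour, which is the shape a modelling function would typically expect.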
If you have multiple datasets it's better to create a list of them and apply the reshaping process to each one of them:
library(tidyverse)
# example of list of dataframes
l = list(dt, dt, dt)
map(l, ~spread(., predicted, y))
# [[1]]
# hour Feeding Foraging Standing
# 1 0 0.1210 0.632 0.300
# 2 1 0.1410 0.727 0.183
# 3 2 0.0932 0.817 0.133
#
# [[2]]
# hour Feeding Foraging Standing
# 1 0 0.1210 0.632 0.300
# 2 1 0.1410 0.727 0.183
# 3 2 0.0932 0.817 0.133
#
# [[3]]
# hour Feeding Foraging Standing
# 1 0 0.1210 0.632 0.300
# 2 1 0.1410 0.727 0.183
# 3 2 0.0932 0.817 0.133
Note that here I'm using the same dataset (dt) as my 3 list elements, but it will work with different datasets, as long as you have the same column names.
If you want to create a list of all your datasets that start with the name pattern you provided you can do this:
# get objects that start with this name pattern
input_names = ls()[grepl("^agg_df4136", ls())]
# get the data that match those names
list_datasets = map(input_names, get)
So, list_datasets is a list of all dataframes in your environment with a name that starts with "agg_df4136".
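Putting the two steps together, here is a rough sketch of the whole workflow. It uses mget() instead of map(..., get) so the list keeps the object names, and pivot_wider() for the reshaping. Two things are assumptions on my part: the stand-in objects agg_df41361 and agg_df41362 (made-up copies of the example data), and the stricter pattern "^agg_df4136[0-9]+$", which assumes the names end in digits. A plain prefix match would also pick up already-reshaped objects such as agg_df41361_GLM if they exist in the same environment.

```r
library(purrr)
library(tidyr)

# Environment that holds the datasets (the global environment at top level)
env <- environment()

# Stand-in datasets for illustration only; the second is just a copy of the first
agg_df41361 <- data.frame(
  hour      = rep(0:2, each = 3),
  predicted = rep(c("Feeding", "Foraging", "Standing"), times = 3),
  y         = c(0.121, 0.632, 0.300, 0.141, 0.727, 0.183, 0.0932, 0.817, 0.133)
)
agg_df41362 <- agg_df41361

# Names that match the pattern (assumes the names end in digits)
input_names <- ls(envir = env, pattern = "^agg_df4136[0-9]+$")

# mget() returns a named list, so each result stays linked to its source object
list_datasets <- mget(input_names, envir = env)

# Reshape every dataset to one row per hour
list_wide <- map(list_datasets,
                 ~ pivot_wider(.x, names_from = predicted, values_from = y))

names(list_wide)
list_wide$agg_df41361
```

Because the list is named, an individual reshaped dataset can be pulled out by its original name, which tends to be easier to work with than a for-loop that creates new objects with assign().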