R: Creating new data.table with specified rows of a single column from an old data.table
I have the following data.table:
Month Day Lat Long Temperature
1: 10 01 80.0 180 -6.383330333333309
2: 10 01 77.5 180 -6.193327999999976
3: 10 01 75.0 180 -6.263328333333312
4: 10 01 72.5 180 -5.759997333333306
5: 10 01 70.0 180 -4.838330999999976
---
117020: 12 31 32.5 310 11.840003833333355
117021: 12 31 30.0 310 13.065001833333357
117022: 12 31 27.5 310 14.685003333333356
117023: 12 31 25.0 310 15.946669666666690
117024: 12 31 22.5 310 16.578336333333358
For every location (given by Lat and Long), I have a temperature for each day from 1 October to 31 December.
There are 1,272 locations, consisting of each pairwise combination of Lat:
Lat
1 80.0
2 77.5
3 75.0
4 72.5
5 70.0
--------
21 30.0
22 27.5
23 25.0
24 22.5
and Long:
Long
1 180.0
2 182.5
3 185.0
4 187.5
5 190.0
---------
49 300.0
50 302.5
51 305.0
52 307.5
53 310.0
I'm trying to create a data.table that consists of 1,272 rows (one per location) and 92 columns (one per day). Each element of that data.table will then contain the temperature at that location on that day.
Any advice about how to accomplish that goal without using a for loop?
Here we use ChickWeight as the data, with "Chick-Diet" standing in for your "Lat-Long" and "Time" for your "Date":
dcast.data.table(data.table(ChickWeight), Chick + Diet ~ Time)
Produces:
Chick Diet 0 2 4 6 8 10 12 14 16 18 20 21
1: 18 1 1 1 NA NA NA NA NA NA NA NA NA NA
2: 16 1 1 1 1 1 1 1 1 NA NA NA NA NA
3: 15 1 1 1 1 1 1 1 1 1 NA NA NA NA
4: 13 1 1 1 1 1 1 1 1 1 1 1 1 1
5: ... 46 rows omitted
You will likely need Lat + Long ~ Month + Day or some such for your formula.
In the future, please make your question reproducible, as I did here, by using a built-in data set.
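As a rough sketch of how the dcast approach might look on data shaped like the question's, the snippet below builds a small made-up table (the values and the name `dt` are hypothetical, not the asker's data) and casts it wide. Naming `value.var` explicitly is worth doing: when it is omitted, dcast guesses the value column from the column order, which may not be the one you want.

```r
library(data.table)

# Hypothetical toy data: 2 days x 2 latitudes x 2 longitudes (values are made up)
dt <- CJ(Month = 10, Day = 1:2, Lat = c(77.5, 80), Long = c(180, 182.5))
dt[, Temperature := seq_len(.N) * 0.5]

# One row per location (Lat, Long), one column per Month/Day combination.
# value.var is stated explicitly so dcast does not have to guess it.
wide <- dcast(dt, Lat + Long ~ Month + Day, value.var = "Temperature")
print(wide)
```

On the full data set the same call should give one row per location and one column per day, provided each location/day pair occurs only once.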
First create a date value using the lubridate package (I assumed year = 2014; adjust as necessary):
library(lubridate)
df$datetext <- paste(df$Month,df$Day,"2014",sep="-")
df$date <- mdy(df$datetext)
Then one option is to use the tidyr package to spread the columns:
library(tidyr)
spread(df[,-c(1:2,6)],date,Temperature)
Lat Long 2014-10-01 2014-12-31
1 22.5 310 NA 16.57834
2 25.0 310 NA 15.94667
3 27.5 310 NA 14.68500
4 30.0 310 NA 13.06500
5 32.5 310 NA 11.84000
6 70.0 180 -4.838331 NA
7 72.5 180 -5.759997 NA
8 75.0 180 -6.263328 NA
9 77.5 180 -6.193328 NA
10 80.0 180 -6.383330 NA
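In current tidyr releases spread() is superseded by pivot_wider(), so a similar reshape could be sketched as follows. The toy data frame and its values are made up for illustration, the date is built with base R instead of lubridate, and the year 2014 is the same assumption as above.

```r
library(tidyr)

# Hypothetical toy data in the same shape as the question (values are made up)
df <- data.frame(
  Month = c(10, 10, 12, 12),
  Day = c(1, 1, 31, 31),
  Lat = c(80, 77.5, 80, 77.5),
  Long = c(180, 180, 180, 180),
  Temperature = c(-6.4, -6.2, 11.8, 13.1)
)

# Build a Date column with base R; the year 2014 is an assumption
df$date <- as.Date(sprintf("2014-%02d-%02d",
                           as.integer(df$Month), as.integer(df$Day)))

# One row per (Lat, Long), one column per date
wide <- pivot_wider(df[, c("Lat", "Long", "date", "Temperature")],
                    names_from = date, values_from = Temperature)
print(wide)
```

Selecting columns by name rather than by position avoids breaking the call if the column order changes.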