R: Creating a new data.table with specified rows of a single column from an old data.table

I have the following data.table:

        Month Day  Lat Long        Temperature
     1:    10  01 80.0  180 -6.383330333333309
     2:    10  01 77.5  180 -6.193327999999976
     3:    10  01 75.0  180 -6.263328333333312
     4:    10  01 72.5  180 -5.759997333333306
     5:    10  01 70.0  180 -4.838330999999976
    ---
117020:    12  31 32.5  310 11.840003833333355
117021:    12  31 30.0  310 13.065001833333357
117022:    12  31 27.5  310 14.685003333333356
117023:    12  31 25.0  310 15.946669666666690
117024:    12  31 22.5  310 16.578336333333358

For every location (given by Lat and Long), I have a temperature for each day from 1 October to 31 December.

There are 1,272 locations, one for each pairwise combination of Lat:

    Lat
1   80.0
2   77.5
3   75.0
4   72.5
5   70.0
--------
21  30.0
22  27.5
23  25.0
24  22.5

and Long:

    Long
1   180.0
2   182.5
3   185.0
4   187.5
5   190.0
---------
49  300.0
50  302.5
51  305.0
52  307.5
53  310.0

I'm trying to create a data.table that consists of 1,272 rows (one per location) and 92 columns (one per day). Each element of that data.table will then contain the temperature at that location on that day.
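As a sanity check on those dimensions, here is a minimal base R sketch. The 2.5-degree grid spacing is read off the Lat and Long listings above, and the year 2014 is only an assumption (any year gives the same number of days from October to December).

```r
# Rebuild the grid described above: 24 latitudes x 53 longitudes.
lat  <- seq(80, 22.5, by = -2.5)
long <- seq(180, 310, by = 2.5)
locations <- expand.grid(Lat = lat, Long = long)

# One entry per day from 1 October to 31 December (year is an assumption).
days <- seq(as.Date("2014-10-01"), as.Date("2014-12-31"), by = "day")

nrow(locations)                  # 1272 locations
length(days)                     # 92 days
nrow(locations) * length(days)   # 117024, the number of rows in the long table
```

So the long table holds exactly one row per location per day, which is what a reshape to 1,272 rows by 92 day columns requires.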

Any advice about how to accomplish that goal without using a for loop?

Here we use ChickWeight as the data, with "Chick-Diet" standing in for your "Lat-Long" and "Time" for your date:

library(data.table)
dcast(as.data.table(ChickWeight), Chick + Diet ~ Time)

Produces the following. No value.var is given, so dcast guesses the last column (Diet) as the value column, which is why the cells repeat the diet number; pass value.var = "weight" to get the weights instead:

     Chick Diet 0 2  4  6  8 10 12 14 16 18 20 21
 1:    18    1 1 1 NA NA NA NA NA NA NA NA NA NA
 2:    16    1 1 1  1  1  1  1  1 NA NA NA NA NA
 3:    15    1 1 1  1  1  1  1  1  1 NA NA NA NA
 4:    13    1 1 1  1  1  1  1  1  1  1  1  1  1
 5:   ... 46 rows omitted

You will likely need something like Lat + Long ~ Month + Day for your formula.
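A minimal sketch of that formula on data shaped like the question's table. The toy table and its temperature values are made up for illustration, and Day is stored as an integer here, so the new columns are named "10_1" rather than "10_01":

```r
library(data.table)

# Toy stand-in for the real table: 2 latitudes x 2 longitudes x 3 days.
# The temperatures are invented values, not the asker's data.
toy <- CJ(Month = 10L, Day = 1:3, Lat = c(80, 77.5), Long = c(180, 182.5))
toy[, Temperature := seq_len(.N) / 10]

# One row per location, one column per Month_Day combination.
# value.var is named explicitly so dcast does not have to guess it.
wide <- dcast(toy, Lat + Long ~ Month + Day, value.var = "Temperature")

dim(wide)     # 4 locations; 2 id columns plus 3 day columns
names(wide)
```

On the full table the same call should give 1,272 rows and 2 + 92 columns, assuming every location has exactly one temperature per day.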

In the future, please make your question reproducible, as I did here by using a built-in data set.

First create a date value using the lubridate package (I assumed year = 2014, adjust as necessary):

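library(lubridate)
# Build a month-day-year string; the year 2014 is an assumption
df$datetext <- paste(df$Month, df$Day, "2014", sep = "-")
# Parse the string into a Date
df$date <- mdy(df$datetext)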
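If you would rather not add a dependency for this step, the same date column can probably be built with base R alone. This is a sketch on a two-row toy data frame (the rows and the year 2014 are assumptions), not a replacement for the lubridate code above:

```r
# Toy stand-in for df with the same Month and Day columns as the question.
df <- data.frame(Month = c(10, 12), Day = c("01", "31"))

# Paste year-month-day so the string matches as.Date's default "%Y-%m-%d" format.
df$date <- as.Date(paste("2014", df$Month, df$Day, sep = "-"))

df$date
```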

Then one option is to use the tidyr package to spread the columns:

library(tidyr)
# Drop Month, Day and datetext (columns 1, 2 and 6), then make one column per date
spread(df[, -c(1:2, 6)], date, Temperature)

    Lat Long 2014-10-01 2014-12-31
1  22.5  310         NA   16.57834
2  25.0  310         NA   15.94667
3  27.5  310         NA   14.68500
4  30.0  310         NA   13.06500
5  32.5  310         NA   11.84000
6  70.0  180  -4.838331         NA
7  72.5  180  -5.759997         NA
8  75.0  180  -6.263328         NA
9  77.5  180  -6.193328         NA
10 80.0  180  -6.383330         NA
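In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). A minimal sketch of the equivalent call, assuming a data frame that already has the date column; the toy rows and temperatures below are made up for illustration:

```r
library(tidyr)

# Toy stand-in for df after the date column has been added (invented values).
df <- data.frame(
  Lat  = c(80, 80, 77.5, 77.5),
  Long = c(180, 180, 180, 180),
  date = as.Date(c("2014-10-01", "2014-10-02", "2014-10-01", "2014-10-02")),
  Temperature = c(-6.4, -6.1, -6.2, -6.0)
)

# One row per (Lat, Long) pair, one column per date.
wide <- pivot_wider(
  df,
  id_cols     = c(Lat, Long),
  names_from  = date,
  values_from = Temperature
)

wide
```

Naming id_cols explicitly means there is no need to drop Month, Day and datetext by position first.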
