簡體   English   中英

R:兩個條件的列求和

[英]R: sum columns with two conditions

我有兩個數據表。 第一個表是具有坐標和降水量的矩陣。 它由四列組成,分別是緯度,經度,降水量和監測日期。 該表的示例是:

latitude_1 longitude_1 precipitation day_mon 54.17 62.15 5 34 69.61 48.65 3 62 73.48 90.16 7 96 66.92 90.27 7 19 56.19 96.46 9 25 72.23 74.18 5 81 88.00 95.20 7 97 92.44 44.41 6 18 95.83 52.91 9 88 99.68 96.23 8 6 81.91 48.32 8 96 54.66 52.70 0 62 95.31 91.82 2 84 60.32 96.25 9 71 97.39 47.91 7 76 65.21 44.63 9 3第二個表由5列組成:站號,經度,緯度,開始監視的日期,結束監視的日期。 看起來像:

station latitude_2 longitude_2 day_begin day_end 15 50.00 93.22 34 46 11 86.58 85.29 15 47 14 93.17 63.17 31 97 10 88.56 61.28 15 78 13 45.29 77.10 24 79 6 69.73 99.52 13 73 4 45.60 77.36 28 95 13 92.88 62.38 9 51 1 65.10 64.13 7 69 10 60.57 86.77 34 64 3 53.62 60.76 23 96 16 87.82 59.41 38 47 1 47.83 95.89 21 52 11 75.42 46.20 38 87 3 55.71 55.26 2 73 16 71.65 96.15 36 93

我想總結1張桌子的降水量。 但是我有兩個條件:

  1. day_begin <day_mon <day_end 監視的天數(從1個表開始的day_mon)應小於結束日期且大於開始的天數(從2個表開始)

  2. 從比其他地方更近的點算出降水 監視點(坐標由經度_1和緯度_1組成)與測站(坐標由經度_2和緯度_2組成)之間的距離應最小。 距離的計算公式為: R = 6400*arccos(sin(latitude_1)*sin(latitude_2)+cos(latitude_1)*cos(latitude_2))*cos(longitude_1-longitude_2))

最后我想得到如下表所示的結果: station latitude_2 longitude_2 day_begin day_end Sum 15 50 93.22 34 46 188 11 86.58 85.29 15 47 100 14 93.17 63.17 31 97 116 10 88.56 61.28 15 78 182 13 45.29 77.1 24 79 136 6 69.73 99.52 13 73 126 4 45.6 77.36 28 95 108 13 92.88 62.38 9 51 192 1 65.1 64.13 7 69 125 10 60.57 86.77 34 64 172 3 53.62 60.76 23 96 193 16 87.82 59.41 38 47 183 1 47.83 95.89 21 52 104 11 75.42 46.2 38 87 151 3 55.71 55.26 2 73 111 16 71.65 96.15 36 93 146我知道如何在C ++中進行計算。 我應該在R中使用什么功能? 謝謝您的幫助!

我不確定我是否能正確解決您的問題...但是到了。

我使用了data.table方法。

library( tidyverse )
library( data.table )


#step 1. join days as periods
#create a dummy variables to create a virtual period in dt1
dt1[, point_id := .I]
dt1[, day_begin := day_mon]
dt1[, day_end := day_mon]
setkey(dt2, day_begin, day_end)
#overlap join finding all stations for each point that overlap periods
dt <- foverlaps( dt1, dt2, type = "within" )

#step 2. calculate the distance station for each point based on TS-privided formula
dt[, distance := 6400 * acos( sin( latitude_1 ) * sin( latitude_2 ) + cos( latitude_1 ) * cos( latitude_2 ) ) * cos( longitude_1 - longitude_2 ) ]

#step 3. filter (absolute) minimal distance based on point_id
dt[ , .SD[which.min( abs( distance ) )], by = point_id ]

#    point_id station latitude_2 longitude_2 day_begin day_end latitude_1 longitude_1 precipitation day_mon i.day_begin i.day_end    distance
# 1:        1       1      47.83       95.89        21      52      54.17       62.15             5      34          34        34  -248.72398
# 2:        2       6      69.73       99.52        13      73      69.61       48.65             3      62          62        62   631.89228
# 3:        3      14      93.17       63.17        31      97      73.48       90.16             7      96          96        96 -1519.84886
# 4:        4      11      86.58       85.29        15      47      66.92       90.27             7      19          19        19  1371.54757
# 5:        5      11      86.58       85.29        15      47      56.19       96.46             9      25          25        25  1139.46849
# 6:        6      14      93.17       63.17        31      97      72.23       74.18             5      81          81        81   192.99264
# 7:        7      14      93.17       63.17        31      97      88.00       95.20             7      97          97        97  5822.81529
# 8:        8       3      55.71       55.26         2      73      92.44       44.41             6      18          18        18  -899.71206
# 9:        9       3      53.62       60.76        23      96      95.83       52.91             9      88          88        88    45.16237
# 10:       10       3      55.71       55.26         2      73      99.68       96.23             8       6           6         6   -78.04484
# 11:       11      14      93.17       63.17        31      97      81.91       48.32             8      96          96        96 -5467.77459
# 12:       12       3      53.62       60.76        23      96      54.66       52.70             0      62          62        62 -1361.57863
# 13:       13      11      75.42       46.20        38      87      95.31       91.82             2      84          84        84  -445.18765
# 14:       14      14      93.17       63.17        31      97      60.32       96.25             9      71          71        71  -854.86321
# 15:       15       3      53.62       60.76        23      96      97.39       47.91             7      76          76        76  1304.41634
# 16:       16       3      55.71       55.26         2      73      65.21       44.63             9       3           3         3 -7015.57516

樣本數據

dt1 <- read.table( text = "latitude_1     longitude_1 precipitation    day_mon
54.17   62.15   5   34
69.61   48.65   3   62
73.48   90.16   7   96
66.92   90.27   7   19
56.19   96.46   9   25
72.23   74.18   5   81
88.00   95.20   7   97
92.44   44.41   6   18
95.83   52.91   9   88
99.68   96.23   8   6
81.91   48.32   8   96
54.66   52.70   0   62
95.31   91.82   2   84
60.32   96.25   9   71
97.39   47.91   7   76
65.21   44.63   9   3", header = TRUE ) %>% 
  setDT()

dt2 <- read.table( text = "station    latitude_2  longitude_2  day_begin   day_end 
15  50.00   93.22   34  46
                   11  86.58   85.29   15  47
                   14  93.17   63.17   31  97
                   10  88.56   61.28   15  78
                   13  45.29   77.10   24  79
                   6   69.73   99.52   13  73
                   4   45.60   77.36   28  95
                   13  92.88   62.38   9   51
                   1   65.10   64.13   7   69
                   10  60.57   86.77   34  64
                   3   53.62   60.76   23  96
                   16  87.82   59.41   38  47
                   1   47.83   95.89   21  52
                   11  75.42   46.20   38  87
                   3   55.71   55.26   2   73
                   16  71.65   96.15   36  93", header = TRUE ) %>% 
  setDT()

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM