

Extracting data from dataframe using different dataframe without headers (R)

I have gridded data in a data frame that holds daily temperatures (in K) for 30 years. I need to extract the days that match the dates in another data frame, while keeping the first and second columns (lon and lat).

Data example:

Gridded data, from which I need to remove the days that do not match the dates in the second data frame (df2$Dates):

    lon lat 1991-05-01 1991-05-02 1991-05-03 1991-05-04 1991-05-05 1991-05-06 1991-05-07 1991-05-08 1991-05-09
1 5.000  60   278.2488   280.1225   280.3909   279.4138   276.6809   276.2085   276.6250   277.7930   276.9693
2 5.125  60   278.2514   280.1049   280.3789   279.4395   276.7141   276.2467   276.6571   277.8264   277.0225
3 5.250  60   278.2529   280.0871   280.3648   279.4634   276.7437   276.2849   276.6918   277.8608   277.0740
4 5.375  60   278.2537   280.0687   280.3488   279.4858   276.7691   276.3238   276.7289   277.8960   277.1232
5 5.500  60   278.2537   280.0493   280.3319   279.5066   276.7909   276.3633   276.7688   277.9313   277.1701
6 5.625  60   278.2539   280.0294   280.3143   279.5264   276.8090   276.4042   276.8111   277.9666   277.2147
  1991-05-10 1991-05-11 1991-05-12 1991-05-13 1991-05-14 1991-05-15 1991-05-16 1991-05-17 1991-05-18 1991-05-19
1   276.9616   277.3436   273.3149   274.4931   274.6967   275.6298   272.2511   271.5413   271.7289   271.7964
2   276.9689   277.2988   273.3689   274.5399   274.6801   275.6307   272.2214   271.4445   271.6410   271.7023
3   276.9720   277.2533   273.4225   274.5811   274.6646   275.6241   272.1858   271.3391   271.5424   271.5989
4   276.9716   277.2080   273.4726   274.6146   274.6507   275.6109   272.1456   271.2274   271.4340   271.4872
5   276.9689   277.1632   273.5163   274.6382   274.6380   275.5917   272.1022   271.1121   271.3168   271.3693
6   276.9645   277.1190   273.5507   274.6501   274.6263   275.5672   272.0571   270.9955   271.1919   271.2469
  1991-05-20 1991-05-21 1991-05-22 1991-05-23 1991-05-24 1991-05-25 1991-05-26 1991-05-27 1991-05-28 1991-05-29
1   272.2633   268.0039   268.5981   269.4139   267.7836   265.8771   263.5669   266.1666   269.7285   272.5083
2   272.2543   268.0218   268.5847   269.4107   267.7886   265.8743   263.5125   266.1031   269.6471   272.4676
3   272.2434   268.0369   268.5716   269.4089   267.7910   265.8669   263.4592   266.0332   269.5697   272.4217
4   272.2308   268.0507   268.5597   269.4090   267.7925   265.8559   263.4066   265.9581   269.4987   272.3714
5   272.2164   268.0642   268.5505   269.4112   267.7936   265.8425   263.3546   265.8797   269.4355   272.3175
6   272.2005   268.0793   268.5451   269.4154   267.7962   265.8276   263.3039   265.7997   269.3818   272.2614
  1991-05-30 1991-05-31 1991-06-01 1991-06-02 1991-06-03 1991-06-04 1991-06-05 1991-06-06 1991-06-07 1991-06-08
1   274.2950   273.4715   274.5197   274.7548   273.8259   272.4433   274.1811   274.4135   274.3999   276.0327
2   274.2205   273.4638   274.5292   274.8316   273.8658   272.4700   274.1992   274.4426   274.4650   276.0698
3   274.1421   273.4549   274.5373   274.9027   273.9028   272.4980   274.2160   274.4781   274.5309   276.1012
4   274.0609   273.4452   274.5438   274.9665   273.9365   272.5273   274.2322   274.5211   274.5969   276.1255
5   273.9784   273.4353   274.5482   275.0216   273.9660   272.5576   274.2481   274.5725   274.6617   276.1417
6   273.8960   273.4253   274.5508   275.0668   273.9912   272.5887   274.2649   274.6334   274.7239   276.1487
  1991-06-09 1991-06-10 1991-06-11 1991-06-12 1991-06-13 1991-06-14 1991-06-15 1991-06-16 1991-06-17 1991-06-18
1   276.5216   277.1812   277.8093   278.3013   278.5323   278.5403   277.9563   278.3461   275.8296   273.8277
2   276.5531   277.1925   277.8261   278.3409   278.4956   278.5317   277.9148   278.3234   275.8167   273.8302
3   276.5861   277.2065   277.8457   278.3748   278.4503   278.5181   277.8654   278.2939   275.8057   273.8358
4   276.6204   277.2239   277.8684   278.4029   278.3988   278.4996   277.8080   278.2583   275.7966   273.8427
5   276.6564   277.2466   277.8945   278.4253   278.3423   278.4759   277.7429   278.2171   275.7888   273.8504
6   276.6938   277.2753   277.9242   278.4414   278.2834   278.4472   277.6715   278.1714   275.7819   273.8570
  1991-06-19 1991-06-20 1991-06-21 1991-06-22 1991-06-23 1991-06-24 1991-06-25 1991-06-26 1991-06-27 1991-06-28
1   275.1738   274.6805   275.6100   274.8936   273.5818   273.2099   273.1788   271.2747   273.2458   276.9931
2   275.1808   274.7123   275.7043   274.9494   273.5861   273.1770   273.2280   271.2435   273.2662   276.9822
3   275.1859   274.7478   275.7993   275.0009   273.5956   273.1439   273.2730   271.2133   273.2803   276.9678
4   275.1891   274.7879   275.8941   275.0467   273.6107   273.1106   273.3130   271.1840   273.2886   276.9502
5   275.1902   274.8337   275.9870   275.0857   273.6318   273.0777   273.3472   271.1556   273.2918   276.9307
6   275.1891   274.8864   276.0776   275.1168   273.6589   273.0454   273.3752   271.1285   273.2905   276.9101
  1991-06-29 1991-06-30
1   272.0784   273.5677
2   272.0577   273.5973
3   272.0339   273.6237
4   272.0075   273.6476
5   271.9794   273.6701
6   271.9500   273.6925

Second data frame, which I'm using for the extraction (via the Dates variable):

       Dates Temp Wind.S Wind.D
1   5/1/1991   18      4    238
2   5/2/1991   18      8     93
3   5/4/1991   22      8    229
4   5/6/1991   21      4     81
5   5/7/1991   21      8    192
6   5/9/1991   17      8     32
7  5/13/1991   22      8    229
8  5/18/1991   21      4     81
9   6/2/1991   21      8    192
10  6/7/1991   17      8     32

The header of the final data I'm looking for is something like this:

   lon lat 1991-05-01 1991-05-02 1991-05-04 1991-05-06 1991-05-09 1991-05-13

Example data following the format of yours:

Daily.df <- data.frame(lon=1:5,lat=1:5,A=1:5,B=1:5,C=1:5,D=1:5)
colnames(Daily.df) <- c("lon","lat","1991-05-01","1991-05-02","1991-05-03","1991-05-04")

  lon lat 1991-05-01 1991-05-02 1991-05-03 1991-05-04
1   1   1          1          1          1          1
2   2   2          2          2          2          2
3   3   3          3          3          3          3
4   4   4          4          4          4          4
5   5   5          5          5          5          5

df2 <- data.frame(Dates = c("5/1/1991","5/2/1991","5/4/1991"))

     Dates
1 5/1/1991
2 5/2/1991
3 5/4/1991

Use lubridate to convert df2$Dates into the same format as the column names, then build a vector of the column names you want to keep (thesedates), including lon and lat. Then use select_at to keep those columns.

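library(dplyr)      # provides %>% and select_at()
library(lubridate)  # provides mdy()
thesedates <- c("lon","lat",as.character(mdy(df2$Dates)))
new.df <- Daily.df %>% select_at(thesedates)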

Output

  lon lat 1991-05-01 1991-05-02 1991-05-04
1   1   1          1          1          1
2   2   2          2          2          2
3   3   3          3          3          3
4   4   4          4          4          4
5   5   5          5          5          5
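
A possible variation, in case some dates in df2 have no matching column in the gridded data (in the question, df2 is built separately from the grid, so this may well happen). The sketch below uses select() with any_of(), which skips names that are absent instead of raising an error. The extra date 5/9/1991 and the names wanted and missing_dates are made up for illustration.

```r
library(dplyr)
library(lubridate)

# Toy gridded data in the same layout as the example above
Daily.df <- data.frame(lon = 1:5, lat = 1:5, A = 1:5, B = 1:5, C = 1:5, D = 1:5)
colnames(Daily.df) <- c("lon", "lat", "1991-05-01", "1991-05-02", "1991-05-03", "1991-05-04")

# Assumption for illustration: 5/9/1991 has no column in Daily.df
df2 <- data.frame(Dates = c("5/1/1991", "5/2/1991", "5/4/1991", "5/9/1991"))

# Convert m/d/Y strings to the ISO strings used as column names
wanted <- as.character(mdy(df2$Dates))

# any_of() keeps only the names that actually exist in Daily.df
new.df <- Daily.df %>% select(lon, lat, any_of(wanted))

# Dates requested in df2 that the grid does not contain
missing_dates <- setdiff(wanted, names(Daily.df))
```

Checking missing_dates afterwards is a cheap way to notice when the two data sets cover different periods.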

If you want a long data set to match against, I think you first need to convert the dates in df2 into the proper format and then reshape the data into wide format.

Step 1 - Convert dates into correct format

df2$Dates <- as.Date(df2$Dates, format = "%m/%d/%Y")
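
Once the dates are Date objects, formatting them back as ISO strings gives exactly the column names used in the gridded data, so the matching columns can also be picked with base R indexing. This is only a sketch on the toy data from the other answer; the name keep is made up for illustration.

```r
# Toy gridded data and date table, as in the example above
Daily.df <- data.frame(lon = 1:5, lat = 1:5, A = 1:5, B = 1:5, C = 1:5, D = 1:5)
colnames(Daily.df) <- c("lon", "lat", "1991-05-01", "1991-05-02", "1991-05-03", "1991-05-04")
df2 <- data.frame(Dates = c("5/1/1991", "5/2/1991", "5/4/1991"))

# Step 1 from above: parse m/d/Y strings into Date objects
df2$Dates <- as.Date(df2$Dates, format = "%m/%d/%Y")

# Format the dates as "YYYY-MM-DD" so they match the column names
keep <- c("lon", "lat", format(df2$Dates, "%Y-%m-%d"))

# Drop any requested name that is not a column, then subset
keep <- intersect(keep, names(Daily.df))
new.df <- Daily.df[, keep, drop = FALSE]
```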

Step 2 - Convert to wide format

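library(tidyr)  # provides spread()
spread(df2, Dates, data)  # `data` is a placeholder: the example df2 has no such column, so substitute the value column to spread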
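
For comparison, here is a sketch of the opposite direction: reshape the gridded data to long format, match it against df2 by date, and go back to wide only if needed. It uses pivot_longer() and pivot_wider() from tidyr rather than spread(), so it is not the method described in this answer. The Temp values are copied from the question's df2; the column name grid_temp and the object names long.df and wide.df are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Toy gridded data, as in the example above
Daily.df <- data.frame(lon = 1:5, lat = 1:5, A = 1:5, B = 1:5, C = 1:5, D = 1:5)
colnames(Daily.df) <- c("lon", "lat", "1991-05-01", "1991-05-02", "1991-05-03", "1991-05-04")

# Date table with one station variable
df2 <- data.frame(Dates = c("5/1/1991", "5/2/1991", "5/4/1991"), Temp = c(18, 18, 22))
df2$Dates <- as.Date(df2$Dates, format = "%m/%d/%Y")

# One row per grid cell and day; inner_join() keeps only the days present in df2
long.df <- Daily.df %>%
  pivot_longer(cols = -c(lon, lat), names_to = "Dates", values_to = "grid_temp") %>%
  mutate(Dates = as.Date(Dates)) %>%
  inner_join(df2, by = "Dates")

# Optional: back to one column per retained day
wide.df <- long.df %>%
  select(lon, lat, Dates, grid_temp) %>%
  pivot_wider(names_from = Dates, values_from = grid_temp)
```

The long form keeps the station variables (here Temp) next to each grid value, which may be convenient if the two are going to be compared day by day.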
