
Merging two dataframes in R with date

I have the following two data frames:

> bvg1
                         Parameters X18.Oct.14 X19.Oct.14 X20.Oct.14 X21.Oct.14 X22.Oct.14 X23.Oct.14 X24.Oct.14
1               24K Equivalent Plan      29.00      29.60      33.80      36.60      35.30      31.90      29.00
2                24K Equivalent Act      28.80      31.00      35.40      35.90      34.70      33.40      31.90
3                       Plan Rep WS    2463.00    2513.00    2869.00    3115.00    2999.00    2714.00    2468.00
4                        Act Rep WS    2447.00    2633.00    3013.00    3054.00    2953.00    2842.00    2714.00
5                        Rep WS Var     -16.00     120.00     144.00     -61.00     -46.00     128.00     246.00
6                  Plan Rep Intakes     568.00     461.00    1159.00    1146.00    1126.00    1124.00    1106.00
7                   Act Rep Intakes     707.00     494.00    1106.00    1096.00    1274.00    1087.00    1101.00
8                   Rep Intakes Var     139.00      33.00     -53.00     -50.00     148.00     -37.00      -5.00
9                 Plan Rep Comps_DL     468.00      54.00     836.00    1190.00    1327.00    1286.00    1108.00
10                 Act Rep Comps_DL     471.00      70.00     995.00    1137.00    1323.00    1150.00    1073.00
11                 Rep Comps Var_DL       3.00      16.00     159.00     -53.00      -4.00    -136.00     -35.00
12              Plan Rep Mandays_DL     148.00      19.00     260.00     368.00     412.00     398.00     345.00
13               Act Rep Mandays_DL     147.00      19.00     303.00     359.00     423.00     374.00     348.00
14               Rep Mandays Var_DL      -1.00       1.00      43.00      -9.00      12.00     -24.00       3.00
15              Plan FVR Mandays_DL       0.00       0.00       4.00      18.00      18.00      18.00      18.00
16               Act FVR Mandays_DL       0.00       0.00       4.00       7.00       8.00       8.00       7.00
17               FVR Mandays Var_DL       0.00       0.00       0.00     -11.00     -10.00     -10.00     -11.00
18                 Plan Rep Prod_DL       3.16       2.88       3.21       3.23       3.22       3.23       3.21
19                  Act Rep Prod_DL       3.21       3.62       3.28       3.16       3.12       3.07       3.08
20                  Rep Prod Var_DL       0.05       0.74       0.07      -0.07      -0.10      -0.16      -0.13


> bvg2
                         Parameters  X18.Oct  X19.Oct  X20.Oct  X21.Oct  X22.Oct  X23.Oct  X24.Oct
1               24K Equivalent Plan    30.50    31.30    35.10    36.10    33.60    28.80    25.50
2                24K Equivalent Act    31.40    33.40    36.60    38.10    36.80    34.40    32.10
3                       Plan Rep WS  3419.00  3509.00  3933.00  4041.00  3764.00  3220.00  2859.00
4                        Act Rep WS  3514.00  3734.00  4098.00  4271.00  4122.00  3852.00  3591.00
5                        Rep WS Var    95.00   225.00   165.00   230.00   358.00   632.00   732.00
6                  Plan Rep Intakes   813.00   613.00  1559.00  1560.00  1506.00  1454.00  1410.00
7                   Act Rep Intakes   964.00   602.00  1629.00  1532.00  1657.00  1507.00  1439.00
8                   Rep Intakes Var   151.00   -11.00    70.00   -28.00   151.00    53.00    29.00
9                 Plan Rep Comps_DL   675.00   175.00  1331.00  1732.00  1938.00  1706.00  1493.00
10                 Act Rep Comps_DL   718.00   224.00  1389.00  1609.00  1848.00  1698.00  1537.00
11                 Rep Comps Var_DL    43.00    49.00    58.00  -123.00   -90.00    -8.00    44.00
12              Plan Rep Mandays_DL   203.00    58.00   428.00   541.00   605.00   536.00   475.00
13               Act Rep Mandays_DL   215.00    63.00   472.00   542.00   608.00   556.00   523.00
14               Rep Mandays Var_DL    12.00     5.00    44.00     2.00     3.00    20.00    48.00
15              Plan FVR Mandays_DL     0.00     0.00     1.00    12.00     2.00    32.00    57.00
16               Act FVR Mandays_DL     0.00     0.00     2.00     2.00     5.00     5.00     5.00
17               FVR Mandays Var_DL     0.00     0.00     1.00   -10.00     3.00   -27.00   -52.00
18                 Plan Rep Prod_DL     3.33     3.03     3.11     3.20     3.20     3.18     3.14
19                  Act Rep Prod_DL     3.34     3.56     2.94     2.97     3.04     3.05     2.94
20                  Rep Prod Var_DL     0.01     0.53    -0.17    -0.23    -0.16    -0.13    -0.20

This is time series data: for example, 24K Equivalent Plan was 29.00 on 18 Oct, 29.60 on 19 Oct, and 33.80 on 20 Oct. The first data frame holds the data for one business unit, and the second holds the data for a different business unit.

I want to merge the two data frames into one and analyse the variance, i.e. where their values differ. I would then like to draw ggplot charts, such as two histograms showing the difference, time series plots, etc.

I have tried the following. I can merge the two data frames with:

joined = rbind(bvg1, bvg2)

However, I then can't tell whether a record came from bvg1 or bvg2.

If I add an additional column, i.e.

bvg1$id = "bvg1"
bvg2$id = "bvg2"

then the rbind call no longer works and gives the following error:

Error in match.names(clabs, names(xi)) : 
  names do not match previous names

Any sample code would be highly appreciated.

You can make the column names of the two datasets match by stripping the trailing . and digits from the column names of bvg1. This can be done with a regex. The code below uses a lookbehind: (?<=[A-Za-z]) requires a letter immediately before the match, \\. matches a literal dot (the backslash is doubled inside an R string), and .*$ matches everything from there to the end of the string. The matched text is replaced with "", so X18.Oct.14 becomes X18.Oct. The first dot in X18.Oct.14 is left alone because it follows a digit, not a letter.
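As a quick check of what the pattern does, here is a small sketch applied to a few names of the kind shown in the question. The vector names nms and cleaned are made up for illustration.

```r
# Sample column names: one with no dot, two with a ".14" suffix, and the id column.
nms <- c("Parameters", "X18.Oct.14", "X19.Oct.14", "id")

# Remove a dot that follows a letter, together with everything after it.
# The dot after "X18" is kept because it follows a digit.
cleaned <- gsub("(?<=[A-Za-z])\\..*$", "", nms, perl = TRUE)
print(cleaned)
# [1] "Parameters" "X18.Oct"    "X19.Oct"    "id"
```

Names that contain no letter-then-dot sequence, such as Parameters and id, pass through unchanged.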

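# Assumes the id columns from the question (bvg1$id, bvg2$id) have already been added.
# Drop the trailing ".14" so the names in bvg1 match those in bvg2 (X18.Oct.14 -> X18.Oct).
colnames(bvg1) <- gsub("(?<=[A-Za-z])\\..*$", "", colnames(bvg1), perl = TRUE)
res <- rbind(bvg1, bvg2)
dim(res)
#[1] 40  9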

 head(res,3)
 #           Parameters X18.Oct X19.Oct X20.Oct X21.Oct X22.Oct X23.Oct X24.Oct
 #1 24K Equivalent Plan    29.0    29.6    33.8    36.6    35.3    31.9    29.0
 #2  24K Equivalent Act    28.8    31.0    35.4    35.9    34.7    33.4    31.9
 #3         Plan Rep WS  2463.0  2513.0  2869.0  3115.0  2999.0  2714.0  2468.0
 #   id
 #1 bvg1
 #2 bvg1
 #3 bvg1
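For the comparison the question asks about, one possible next step is to reshape the stacked data to long format and compute the difference between the two business units. The sketch below uses base R only and a two-parameter, two-date subset of the values shown in the question. The names date_cols, long, wide and diff are assumptions introduced for illustration, not part of the original answer.

```r
# Toy subset of the question's data (first two parameters, first two dates).
bvg1 <- data.frame(
  Parameters = c("24K Equivalent Plan", "24K Equivalent Act"),
  X18.Oct.14 = c(29.0, 28.8),
  X19.Oct.14 = c(29.6, 31.0)
)
bvg2 <- data.frame(
  Parameters = c("24K Equivalent Plan", "24K Equivalent Act"),
  X18.Oct = c(30.5, 31.4),
  X19.Oct = c(31.3, 33.4)
)

# Harmonise the column names, tag each row with its source, then stack.
colnames(bvg1) <- gsub("(?<=[A-Za-z])\\..*$", "", colnames(bvg1), perl = TRUE)
bvg1$id <- "bvg1"
bvg2$id <- "bvg2"
res <- rbind(bvg1, bvg2)

# Long format: one row per parameter, business unit and date column.
date_cols <- setdiff(names(res), c("Parameters", "id"))
long <- data.frame(
  Parameters = rep(res$Parameters, times = length(date_cols)),
  id         = rep(res$id, times = length(date_cols)),
  date       = rep(date_cols, each = nrow(res)),
  value      = unlist(res[date_cols], use.names = FALSE)
)

# Put the two units side by side and take bvg2 minus bvg1.
wide <- merge(long[long$id == "bvg1", ], long[long$id == "bvg2", ],
              by = c("Parameters", "date"), suffixes = c(".bvg1", ".bvg2"))
wide$diff <- wide$value.bvg2 - wide$value.bvg1
print(wide[, c("Parameters", "date", "value.bvg1", "value.bvg2", "diff")])
```

A long data frame like this, with id as a grouping column, is the shape ggplot2 generally expects for per-unit histograms or time series lines. The wide frame gives the per-date differences directly.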
