
R: Calculate the speed-up factor from data frame

I have the following data frame stored:

Source: local data frame [18 x 3]
Groups: instance [?]

   instance          V2             wtime
     (fctr)      (fctr)             (dbl)
1    CCRG10  BranchDBMS         2.1845122
2    CCRG10  CacheDBMS          0.8619093
3    CCRG20  BranchDBMS         7.3522605
4    CCRG20  CacheDBMS          2.5523066
5    CCRG30  BranchDBMS        15.7318869
6    CCRG30  CacheDBMS          5.1411876
7    CCRG40  BranchDBMS        31.7315724
8    CCRG40  CacheDBMS          7.6714212
9    CCRG50  BranchDBMS        58.0909133
10   CCRG50  CacheDBMS         11.3979914
11   CCRG60  BranchDBMS        78.5095645
12   CCRG60  CacheDBMS         15.5988044
13   CCRG70  BranchDBMS        94.0637485
14   CCRG70  CacheDBMS         20.2977642
15   CCRG80  BranchDBMS       102.8716548
16   CCRG80  CacheDBMS         25.0142898
17   CCRG90  BranchDBMS       100.5247555
18   CCRG90  CacheDBMS         28.3753977

I want to transform this table into a new one, e.g.:

Source: local data frame [9 x 2]
Groups: instance [?]

   instance           speedup
     (fctr)             (dbl)
1    CCRG10         2.5345035
...

That is, for each instance I want to divide the wtime for BranchDBMS by the wtime for CacheDBMS; for CCRG10 this is 2.18 / 0.86 ≈ 2.53.

How do I automate this process?

Judging by the posted output, you are managing your table with dplyr, so a tidyr approach would be a natural choice.

Code

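# Load the packages (magrittr supplies the %<>% assignment pipe)
library(dplyr)
library(magrittr)
library(tidyr)

# dta has columns V2 (instance), V3 (DBMS type) and V4 (wtime); see "Data import"
dta %<>%
    spread(key = V3, value = V4) %>%              # one column per DBMS type
    mutate(wtimRes = BranchDBMS / CacheDBMS) %>%  # speed-up factor
    rename(instance = V2)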

Results

> head(dta, 5)
  instance BranchDBMS  CacheDBMS  wtimRes
1   CCRG10   2.184512  0.8619093 2.534504
2   CCRG20   7.352260  2.5523066 2.880634
3   CCRG30  15.731887  5.1411876 3.059971
4   CCRG40  31.731572  7.6714212 4.136336
5   CCRG50  58.090913 11.3979914 5.096592
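
In tidyr 1.0.0 and later, `spread()` is superseded by `pivot_wider()`. The following is a minimal sketch of the same idea with the newer function, assuming a recent tidyr is installed. It uses the question's own column names (`instance`, `V2`, `wtime`) and only the first four rows of the data; the object names `df` and `speedup` are chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# First four rows of the question's data, with the question's column names
df <- data.frame(
  instance = c("CCRG10", "CCRG10", "CCRG20", "CCRG20"),
  V2       = c("BranchDBMS", "CacheDBMS", "BranchDBMS", "CacheDBMS"),
  wtime    = c(2.1845122, 0.8619093, 7.3522605, 2.5523066)
)

speedup <- df %>%
  # One row per instance, one column per DBMS type
  pivot_wider(names_from = V2, values_from = wtime) %>%
  # Keep only the instance and the ratio the question asks for
  transmute(instance, speedup = BranchDBMS / CacheDBMS)

print(speedup)
```

Using `transmute()` instead of `mutate()` drops the intermediate BranchDBMS and CacheDBMS columns, which gives the two-column layout shown in the question.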

Gather

If desired, you can then gather the results back into a single column:

dta %<>%
    gather(key = key, value = value, -instance)

which would produce:

> head(dta,6)
  instance        key     value
1   CCRG10 BranchDBMS  2.184512
2   CCRG20 BranchDBMS  7.352260
3   CCRG30 BranchDBMS 15.731887
4   CCRG40 BranchDBMS 31.731572
5   CCRG50 BranchDBMS 58.090913
6   CCRG60 BranchDBMS 78.509564
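
For newer tidyr versions (1.0.0 and later), `pivot_longer()` plays the role of `gather()`. The sketch below assumes such a version and rebuilds a two-row wide table from the question's values; `wide` and `long` are illustrative names. One difference worth knowing: `gather()` stacks the table column by column, as in the output above, whereas `pivot_longer()` emits all measurements of one row before moving to the next.

```r
library(tidyr)

# Wide table for the first two instances, values taken from the question
wide <- data.frame(
  instance   = c("CCRG10", "CCRG20"),
  BranchDBMS = c(2.1845122, 7.3522605),
  CacheDBMS  = c(0.8619093, 2.5523066)
)
wide$wtimRes <- wide$BranchDBMS / wide$CacheDBMS

# Collapse every column except `instance` into key/value pairs
long <- pivot_longer(wide, cols = -instance,
                     names_to = "key", values_to = "value")

print(long)
```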

Data import

dtaTxt <- "   instance          V2             wtime
     (fctr)      (fctr)             (dbl)
1    CCRG10  BranchDBMS         2.1845122
2    CCRG10  CacheDBMS          0.8619093
3    CCRG20  BranchDBMS         7.3522605
4    CCRG20  CacheDBMS          2.5523066
5    CCRG30  BranchDBMS        15.7318869
6    CCRG30  CacheDBMS          5.1411876
7    CCRG40  BranchDBMS        31.7315724
8    CCRG40  CacheDBMS          7.6714212
9    CCRG50  BranchDBMS        58.0909133
10   CCRG50  CacheDBMS         11.3979914
11   CCRG60  BranchDBMS        78.5095645
12   CCRG60  CacheDBMS         15.5988044
13   CCRG70  BranchDBMS        94.0637485
14   CCRG70  CacheDBMS         20.2977642
15   CCRG80  BranchDBMS       102.8716548
16   CCRG80  CacheDBMS         25.0142898
17   CCRG90  BranchDBMS       100.5247555
18   CCRG90  CacheDBMS         28.3753977"

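# Skip the two header lines; colClasses = "NULL" drops the row-number column,
# leaving columns named V2 (instance), V3 (DBMS type) and V4 (wtime)
dta <- read.table(textConnection(dtaTxt), header = FALSE,
                  colClasses = c("NULL", NA, NA, NA), skip = 2)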
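
If the extra packages are not available, the same ratio can be obtained in base R by splitting the imported table by DBMS type and merging the two halves on the instance column. This is an alternative sketch, not the approach used above; it imports only the first four rows so that it runs on its own, and the names `branch`, `cache`, `res` and `speedup` are made up for illustration.

```r
# Import a four-row excerpt the same way as above: the row-number column is
# dropped, so the remaining columns are named V2, V3 and V4
dta <- read.table(text = "
1 CCRG10 BranchDBMS 2.1845122
2 CCRG10 CacheDBMS  0.8619093
3 CCRG20 BranchDBMS 7.3522605
4 CCRG20 CacheDBMS  2.5523066",
                  header = FALSE, colClasses = c("NULL", NA, NA, NA))

# Split by DBMS type
branch <- dta[dta$V3 == "BranchDBMS", ]
cache  <- dta[dta$V3 == "CacheDBMS", ]

# Join the halves on the instance column; the suffixes keep the two V4 columns apart
res <- merge(branch, cache, by = "V2", suffixes = c(".branch", ".cache"))

speedup <- data.frame(instance = res$V2,
                      speedup  = res$V4.branch / res$V4.cache)
print(speedup)
```

Because `merge()` matches rows by instance, the result does not depend on the BranchDBMS and CacheDBMS rows appearing in any particular order.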
