How can I use the plyr package in R to create new data frame columns corresponding to the levels of a given column?

Question

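I'm looking for an "elegant" way to split a data frame by the levels of one column variable and then build a reshaped output data frame in which the factor variable is removed and a new column is added for each of its levels. I can do this with functions like split(), but that seems messy to me. I've been trying to do it with the melt() and cast() functions in the plyr package, but I have not managed to get the exact output I need.

Here is my data: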

> jumbo.df = read.csv(...)
> head(jumbo.df)
         PricingDate  Name     Rate 
    186  2012-03-05   Type A   2.875  
    187  2012-03-05   Type B   3.250  
    188  2012-03-05   Type C   3.750  
    189  2012-03-05   Type D   3.750  
    190  2012-03-05   Type E   4.500  
    191  2012-03-06   Type A   2.875

What I would like to do is split by the Name variable, remove Name and Rate, and then output columns for Type A, Type B, Type C, Type D, and Type E holding the corresponding Rate series, with the date as ID:

> head(output.df)
         PricingDate  Type A   Type B    Type C    Type D    Type E 
         2012-03-05    2.875    3.250     3.750     3.750     4.500  
         2012-03-06    2.875    ...

Thanks!

Answer 1

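I'm not sure I understood you correctly, but could it be that you just want to reshape your data to wide format? If so, you need the melt and cast functions from the reshape (!) package, not plyr. reshape2 is basically the same. Since your data is already in molten format (i.e., long format), a one-liner gives you what you want: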

df <- read.table(textConnection("PricingDate  Name     Rate 
                                 2012-03-05   TypeA   2.875  
                                 2012-03-05   TypeB   3.250  
                                 2012-03-05   TypeC   3.750  
                                 2012-03-05   TypeD   3.750  
                                 2012-03-05   TypeE   4.500  
                                 2012-03-06   TypeA   2.875"), header=TRUE, row.names=NULL)
library(reshape2)
dcast(df, PricingDate ~ Name)
Using Rate as value column: use value.var to override.
  PricingDate TypeA TypeB TypeC TypeD TypeE
1  2012-03-05 2.875  3.25  3.75  3.75   4.5
2  2012-03-06 2.875    NA    NA    NA    NA
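The message in the output above appears because dcast had to guess which column holds the values. As a minimal sketch, assuming the same PricingDate, Name, and Rate columns as in the question (and keeping the original level names with spaces, which the example above dropped), the value column can be named explicitly through value.var:

```r
library(reshape2)

# Long-format data mirroring the question; the level names contain spaces.
df <- data.frame(
  PricingDate = c(rep("2012-03-05", 5), "2012-03-06"),
  Name = c("Type A", "Type B", "Type C", "Type D", "Type E", "Type A"),
  Rate = c(2.875, 3.25, 3.75, 3.75, 4.5, 2.875),
  stringsAsFactors = FALSE
)

# One row per PricingDate, one column per level of Name.
# value.var names the column that fills the cells, so dcast does not guess.
wide <- dcast(df, PricingDate ~ Name, value.var = "Rate")
print(wide)
```

Date and type combinations that are missing from the long data come out as NA in the wide result, as in the second row of the output shown above.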

Answer 2

library(plyr)
library(reshape2)

    data <- structure(list(PricingDate = c("2012-03-05", "2012-03-05", "2012-03-05", 
    "2012-03-05", "2012-03-05", "2012-03-06", "2012-03-06", "2012-03-06", 
    "2012-03-06", "2012-03-06"), Name = c("Type A", "Type B", "Type C", 
    "Type D", "Type E", "Type A", "Type B", "Type C", "Type D", "Type E"
    ), Rate = c(2.875, 3.25, 3.75, 3.75, 4.5, 4.875, 5.25, 6.75, 
    7.75, 8.5)), .Names = c("PricingDate", "Name", "Rate"), class = "data.frame", row.names = c("186", 
    "187", "188", "189", "190", "191", "192", "193", "194", "195"
    ))


    > data
        PricingDate   Name  Rate
    186  2012-03-05 Type A 2.875
    187  2012-03-05 Type B 3.250
    188  2012-03-05 Type C 3.750
    189  2012-03-05 Type D 3.750
    190  2012-03-05 Type E 4.500
    191  2012-03-06 Type A 4.875
    192  2012-03-06 Type B 5.250
    193  2012-03-06 Type C 6.750
    194  2012-03-06 Type D 7.750
    195  2012-03-06 Type E 8.500


    ddply(data, .(PricingDate), function(x) reshape(x, idvar="PricingDate", timevar="Name", direction="wide"))



      PricingDate Rate.Type A Rate.Type B Rate.Type C Rate.Type D Rate.Type E
    1  2012-03-05       2.875        3.25        3.75        3.75         4.5
    2  2012-03-06       4.875        5.25        6.75        7.75         8.5
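The per-date ddply split does not appear to be strictly needed for this result, since base R's reshape() can widen all dates in one call. The following is only a sketch on the same data (row names omitted for brevity). Stripping the "Rate." prefix is an extra step added here so that the column names match the ones asked for in the question:

```r
# Same long-format data as in the answer above.
data <- data.frame(
  PricingDate = rep(c("2012-03-05", "2012-03-06"), each = 5),
  Name = rep(c("Type A", "Type B", "Type C", "Type D", "Type E"), times = 2),
  Rate = c(2.875, 3.25, 3.75, 3.75, 4.5, 4.875, 5.25, 6.75, 7.75, 8.5),
  stringsAsFactors = FALSE
)

# reshape() from base R: idvar identifies a row, timevar supplies the new columns.
wide <- reshape(data, idvar = "PricingDate", timevar = "Name", direction = "wide")

# Drop the "Rate." prefix that reshape() adds, and reset the row names.
names(wide) <- sub("^Rate\\.", "", names(wide))
rownames(wide) <- NULL
print(wide)
```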


Answer 1: score 4, accepted, 2012-06-01 15:27:26

Answer 2: score 1, 2012-06-01 15:34:27
