How to reshape a data frame from wide to long format in R?

Question

I am new to R. I am trying to read data from Excel in the following format:

x1  x2  x3  y1  y2  y3  Result
1   2   3   7   8   9    
4   5   6   10  11  12

The data.frame in R should then take the data for the 1st row in the mentioned format.

Then I want to use lm() and export the result to the Result column.

I want to automate this for n rows, i.e. once the result for the 1st row has been exported to Excel, I want to import the data for the second row.

Please help.

Answer 1

library(gdata)
# this spreadsheet is exactly as in your question

df.original <- read.xls("test.xlsx", sheet="Sheet1", perl="C:/strawberry/perl/bin/perl.exe")
#
#
> df.original
  x1 x2 x3 y1 y2 y3
1  1  2  3  7  8  9
2  4  5  6 10 11 12
#
# for the above code you'll just need to change the argument 'perl' to the
# path of your own Perl installation
#
# now the example for the first row
#
library(reshape2)

df <- melt(df.original[1,])

df$variable <- substr(df$variable, 1, 1)

df <- as.data.frame(lapply(split(df, df$variable), `[[`, 2))

> df
  x y
1 1 7
2 2 8
3 3 9

At this stage we have automated the import/transformation process (for one row).

First question: how do you want the data to look once every row has been processed? Second question: what exactly do you want to put in Result: residuals, fitted values? What do you need from lm()?
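To make the second question concrete, here is a minimal sketch of the pieces lm() returns for the first row. It assumes the reshaped first-row data shown above (typed in by hand rather than read from Excel); the object name `fit` is just an illustration.

```r
# First-row data in long form, entered by hand (assumption: same values as above)
df <- data.frame(x = c(1, 2, 3), y = c(7, 8, 9))

# Fit y as a linear function of x
fit <- lm(y ~ x, data = df)

coef(fit)       # intercept and slope
fitted(fit)     # fitted values, one per observation
residuals(fit)  # residuals, one per observation
```

Any of these could go into the Result column, but only a single number (for example the slope) fits naturally into one cell per row.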

EDIT:

OK, @kapil, tell me if the final shape of df is what you had in mind:

library(reshape2)
library(plyr)

df <- adply(df.original, 1, melt, .expand = FALSE)
names(df)[1] <- "rowID"

df$variable <- substr(df$variable, 1, 1)

rows <- df$rowID[df$variable == "x"] # using "y" would give the same result (x and y are expected to have the same length)
df <- as.data.frame(lapply(split(df, df$variable), `[[`, c("value")))
df$rowID <- rows

df <- df[c("rowID", "x", "y")]

> df
  rowID x  y
1     1 1  7
2     1 2  8
3     1 3  9
4     2 4 10
5     2 5 11
6     2 6 12
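For readers without reshape2 and plyr, the same long shape can be sketched in base R. This is an alternative, not the approach used above; it assumes the columns are named x1..xk and y1..yk in matching order, and it builds `df.original` by hand instead of reading it from Excel. The names `x_cols`, `y_cols` and `long` are illustrative.

```r
# Wide data typed in by hand (assumption: same values as the spreadsheet)
df.original <- data.frame(x1 = c(1, 4), x2 = c(2, 5), x3 = c(3, 6),
                          y1 = c(7, 10), y2 = c(8, 11), y3 = c(9, 12))

# Pick the x and y columns by name prefix
x_cols <- grep("^x", names(df.original), value = TRUE)
y_cols <- grep("^y", names(df.original), value = TRUE)

# Transposing and flattening walks through the data one spreadsheet row at a time
long <- data.frame(
  rowID = rep(seq_len(nrow(df.original)), each = length(x_cols)),
  x     = as.vector(t(df.original[x_cols])),
  y     = as.vector(t(df.original[y_cols]))
)

long
```

The pairing of x and y values relies purely on column order, so it only holds if the spreadsheet lists x1..xk and y1..yk in the same sequence.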

Regarding the coefficients, you can calculate them for each rowID (which refers to the actual row in the xls file) in this way:

model <- dlply(df, .(rowID), function(z) lm(y ~ x, data = z))  # fit on the subset z for each rowID, not on the whole df

> sapply(model, `[`, "coefficients")
$`1.coefficients`
(Intercept)           x 
          6           1 

$`2.coefficients`
(Intercept)           x 
          6           1

So, for each group (i.e. each row in the original spreadsheet) you get, as expected, two coefficients, intercept and slope. I therefore can't figure out how you want the coefficients to fit inside the data.frame, especially in the 'long' form shown just above. But if you want the data.frame to stay in 'wide' form, you can try this:

# once you have the model object, you can put the coefficients into the df.original data.frame
#
> ldply(model, `[[`, "coefficients")
  rowID (Intercept) x
1     1           6 1
2     2           6 1

df.modified <- cbind(df.original, ldply(model, `[[`, "coefficients"))

> df.modified
  x1 x2 x3 y1 y2 y3 rowID (Intercept) x
1  1  2  3  7  8  9     1           6 1
2  4  5  6 10 11 12     2           6 1

# of course, if you don't like it, you can remove rowID with df.modified$rowID <- NULL
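As a cross-check, the wide result can also be sketched with base R only. This is an assumption-laden alternative to the plyr code above: `df.original` is typed in by hand, and the helper `fit_row` and the column names `intercept` and `slope` are made up for illustration.

```r
# Wide data typed in by hand (assumption: same values as the spreadsheet)
df.original <- data.frame(x1 = c(1, 4), x2 = c(2, 5), x3 = c(3, 6),
                          y1 = c(7, 10), y2 = c(8, 11), y3 = c(9, 12))

x_cols <- grep("^x", names(df.original), value = TRUE)
y_cols <- grep("^y", names(df.original), value = TRUE)

# Hypothetical helper: fit y ~ x using only the values of spreadsheet row i
fit_row <- function(i) {
  x <- unlist(df.original[i, x_cols])
  y <- unlist(df.original[i, y_cols])
  coef(lm(y ~ x))
}

# One row of coefficients per spreadsheet row
coefs <- t(sapply(seq_len(nrow(df.original)), fit_row))
colnames(coefs) <- c("intercept", "slope")

df.result <- cbind(df.original, coefs)
df.result
```

Both rows should come out with an intercept of about 6 and a slope of about 1, matching the plyr output, because each row's y values are exactly x + 6.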

Hope this helps, and let me know if you wanted the 'long' version of df.

Answer 1
3 2013-06-12 11:00:38
