
Reshape data wide to long for multiple variables in R

I have a dataset that shows each bank's investments and the dollar value associated with each investment. Currently the data looks like this. I have inv and amt variables running from 1 to 43.

bankid year location inv1 amt1 inv2 amt2 ... inv43 amt43
1      1990 NYC      AIG  2000 GM   4000 ... Ford  6000

But I want the data to look like this:

bankid year location inv  number amt
1      1990 NYC      AIG  1      2000
1      1990 NYC      GM   2      4000
...
1      1990 NYC      Ford 43     6000

In Stata, I would use this code:

reshape long inv amt, i(bankid location year) j(number)

What would be the equivalent code in R?

reshape can do this. Here I am using the posted subset of your data, where you have time variables 1, 2, and 43:

x <- read.table(header=TRUE, text='bankid year location inv1   amt1 inv2 amt2  inv43 amt43 
1          1990 NYC      AIG    2000 GM   4000     Ford  6000 ')
x
##   bankid year location inv1 amt1 inv2 amt2 inv43 amt43
## 1      1 1990      NYC  AIG 2000   GM 4000  Ford  6000

v <- outer(c('inv', 'amt'), c(1,2,43), FUN=paste0)
v
##      [,1]   [,2]   [,3]   
## [1,] "inv1" "inv2" "inv43"
## [2,] "amt1" "amt2" "amt43"

reshape(x, direction='long', varying=c(v), sep='')
##      bankid year location time  inv  amt id
## 1.1       1 1990      NYC    1  AIG 2000  1
## 1.2       1 1990      NYC    2   GM 4000  1
## 1.43      1 1990      NYC   43 Ford 6000  1

For your full table, the varying argument would be c(outer(c('inv', 'amt'), 1:43, FUN=paste0)) (but that won't work for the small example, because the columns for 3 through 42 are missing there).
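As a rough sketch of the full-table case, the block below builds a made-up one-row stand-in with all 43 inv/amt pairs and reshapes it. The data frame wide, the placeholder values (firm1, firm2, ... and multiples of 1000), and the choice of timevar = "number" (to mirror the Stata j(number)) are assumptions for illustration, not part of the original data.

```r
# Hypothetical stand-in for the full table: one bank, 43 inv/amt pairs.
n <- 43
wide <- data.frame(bankid = 1, year = 1990, location = "NYC")
for (k in seq_len(n)) {
  wide[[paste0("inv", k)]] <- paste0("firm", k)  # made-up investment names
  wide[[paste0("amt", k)]] <- 1000 * k           # made-up dollar amounts
}

# Same call as for the small example, but with the full 1:43 range.
# timevar names the new index column; idvar lists the columns that
# identify a row in the wide data, so no extra 'id' column is added.
long <- reshape(wide, direction = "long",
                varying = c(outer(c("inv", "amt"), 1:n, FUN = paste0)),
                sep = "",
                timevar = "number",
                idvar = c("bankid", "year", "location"))

dim(long)    # one row per bank and investment slot
head(long, 3)
```

Passing idvar here plays the role of i(bankid location year) in the Stata command; leaving it out, as in the small example, makes reshape add a generated id column instead.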

Here, reshape infers the 'time' variable by inspecting the varying argument and finding common elements (inv and amt) on the left, and other elements on the right (1, 2, and 43). The sep='' argument says that there is no separator character between the two parts (the default separator is '.').
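If you would rather not rely on that name-guessing, the same result can be spelled out by hand. This is a minimal sketch on the small example, assuming you want the index column called number (as in the Stata call); varying is given as a list with one vector of column names per long-format variable, and v.names and times state explicitly what reshape would otherwise infer.

```r
x <- read.table(header = TRUE, text = "bankid year location inv1 amt1 inv2 amt2 inv43 amt43
1 1990 NYC AIG 2000 GM 4000 Ford 6000")

# One vector per output variable; positions line up with 'times'.
long <- reshape(x, direction = "long",
                varying = list(c("inv1", "inv2", "inv43"),
                               c("amt1", "amt2", "amt43")),
                v.names = c("inv", "amt"),
                times   = c(1, 2, 43),
                timevar = "number",
                idvar   = c("bankid", "year", "location"))
long
```

The explicit form is more verbose, but it does not depend on the column names following a prefix-then-digits pattern.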
