
How to convert column values to rows for each unique value in a dataframe in R?

I have a large data frame that contains 12 monthly columns for each of two types of values, Rested and Active. I want to convert the monthly columns into rows, bringing all the month columns (Jan, Feb, Mar, ...) under a single 'Month' column.

My data is as follows:

ID      L1  L2  Year    JR  FR  MR  AR  MYR JR  JLR AGR SR  OR  NR  DR  JA  FA  MA  AA  MYA JA  JLA AGA SA  OA  NA  DA
1234    89  65  2003    11  34  6   7   8   90  65  54  3   22  55  66  76  86  30  76  43  67  13  98  67  0   127 74
1234    45  76  2004    67  87  98  5   4   3   77  8   99  76  56  4   3   2   65  78  44  53  67  98  79  53  23  65
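
To experiment with this, the two sample rows can be read into a data frame roughly as follows. This is only a sketch: the name `df` is an assumption, and `read.table` with its default `check.names = TRUE` will alter the header names that are duplicated (`JR`, `JA`) or not syntactically valid, so it is safer to refer to the monthly columns by position (5 to 16 for Rested, 17 to 28 for Active) than by name.

```r
# Sample data copied from the question; `df` is an assumed name.
txt <- "ID L1 L2 Year JR FR MR AR MYR JR JLR AGR SR OR NR DR JA FA MA AA MYA JA JLA AGA SA OA NA DA
1234 89 65 2003 11 34 6 7 8 90 65 54 3 22 55 66 76 86 30 76 43 67 13 98 67 0 127 74
1234 45 76 2004 67 87 98 5 4 3 77 8 99 76 56 4 3 2 65 78 44 53 67 98 79 53 23 65"

# Whitespace-separated values with a header line
df <- read.table(text = txt, header = TRUE)

# 2 rows; 4 id columns + 12 Rested + 12 Active = 28 columns
dim(df)
```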

I'm trying to convert each of the monthly columns to rows, keeping the R and A values side by side, by creating a new Month column. Column R holds the Rested values and column A the Active values: JR, FR, MR, ... mean Jan Rested, Feb Rested, Mar Rested, and JA, FA, MA, ... mean Jan Active, Feb Active, Mar Active, and so on.

The result should look like this:

 ID     L1  L2  Year    Month   R   A
1234    89  65  2003    Jan     11  76
1234    89  65  2003    Feb     34  86
1234    89  65  2003    Mar     6   30
1234    89  65  2003    Apr     7   76
1234    89  65  2003    May     8   43
1234    89  65  2003    Jun     90  67
1234    89  65  2003    Jul     65  13
1234    89  65  2003    Aug     54  98
1234    89  65  2003    Sep     3   67
1234    89  65  2003    Oct     22  0
1234    89  65  2003    Nov     55  127
1234    89  65  2003    Dec     66  74
1234    45  76  2004    Jan     67  3
1234    45  76  2004    Feb     87  2
1234    45  76  2004    Mar     98  65
1234    45  76  2004    Apr     5   78
1234    45  76  2004    May     4   44
1234    45  76  2004    Jun     3   53
1234    45  76  2004    Jul     77  67
1234    45  76  2004    Aug     8   98
1234    45  76  2004    Sep     99  79
1234    45  76  2004    Oct     76  53
1234    45  76  2004    Nov     56  23
1234    45  76  2004    Dec     4   65

I've tried various things such as stack, melt, unlist and reshape, for example:

data_reshape <- reshape(df,direction="long", varying=list(c("JR", "FR", "MR", "AR", "MYR", "JR", "JLR", "AGR", "SR", "OR", "NR", "DR", "JA", "FA","MA", "AA", "MYA", "JA", "JLA","AGA", "SA", "OA","NA", "DA")), v.names="Precipitation", timevar="Month")

data_stacked <- stack(data, select = c("JR", "FR", "MR", "AR", "MYR", "JR", "JLR", "AGR", "SR", "OR", "NR", "DR", "JA", "FA","MA", "AA", "MYA", "JA", "JLA","AGA", "SA", "OA","NA", "DA"))

But the results are not what I expected: they give the Jan values of all years, then the Feb values of all years, then the March values of all years, and so on. I want the rows arranged month by month within each Year, for each ID in the entire dataset.

How can I achieve this in R?

Here's a possible solution using the devel version of data.table:

library(data.table) ## v >= 1.9.5

res <- melt(setDT(df),
            id.vars = 1:4,                                # id variables
            measure.vars = list(5:16, 17:ncol(df)),       # a list of two groups of measure variables
            variable.name = "Month",                      # the name of the additional variable
            value.name = c("R", "A"))                     # the names of the grouped variables

setorder(res, ID, -L1, L2, Year) ## Reordering the data to match the desired output
res[, Month := month.abb[Month]] ## You don't really need this part as you already have the months numbers

#       ID L1 L2 Year Month  R   A
#  1: 1234 89 65 2003   Jan 11  76
#  2: 1234 89 65 2003   Feb 34  86
#  3: 1234 89 65 2003   Mar  6  30
#  4: 1234 89 65 2003   Apr  7  76
#  5: 1234 89 65 2003   May  8  43
#  6: 1234 89 65 2003   Jun 90  67
#  7: 1234 89 65 2003   Jul 65  13
#  8: 1234 89 65 2003   Aug 54  98
#  9: 1234 89 65 2003   Sep  3  67
# 10: 1234 89 65 2003   Oct 22   0
# 11: 1234 89 65 2003   Nov 55 127
# 12: 1234 89 65 2003   Dec 66  74
# 13: 1234 45 76 2004   Jan 67   3
# 14: 1234 45 76 2004   Feb 87   2
# 15: 1234 45 76 2004   Mar 98  65
# 16: 1234 45 76 2004   Apr  5  78
# 17: 1234 45 76 2004   May  4  44
# 18: 1234 45 76 2004   Jun  3  53
# 19: 1234 45 76 2004   Jul 77  67
# 20: 1234 45 76 2004   Aug  8  98
# 21: 1234 45 76 2004   Sep 99  79
# 22: 1234 45 76 2004   Oct 76  53
# 23: 1234 45 76 2004   Nov 56  23
# 24: 1234 45 76 2004   Dec  4  65

Installation instructions:

library(devtools)
install_github("Rdatatable/data.table", build_vignettes = FALSE)

Here's a base reshape approach:

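# mydf is the wide data frame from the question
res <- reshape(mydf, direction = "long",
               varying = list(5:16, 17:28),   # Rested columns, then Active columns
               v.names = c("R", "A"),
               times = month.abb,             # abbreviated month names, as in the desired output
               timevar = "Month")
# sort by record instead of by month, and drop the "id" column (column 8) that reshape adds
res[with(res, order(ID, -L1, L2, Year)), -8]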
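
As a self-contained check of the reshape call on the two sample rows, here is a minimal sketch. It assumes the data is read with `read.table` (the name `mydf` follows this answer), uses `month.abb` so the labels match the abbreviated months in the desired output, and drops the helper column that `reshape` adds by name rather than by position. `reshape` stacks the output one month at a time (all Jan rows, then all Feb rows, and so on), which is why the final sort is needed to get the rows grouped by record.

```r
# Sample data copied from the question; `mydf` is the name used in this answer.
txt <- "ID L1 L2 Year JR FR MR AR MYR JR JLR AGR SR OR NR DR JA FA MA AA MYA JA JLA AGA SA OA NA DA
1234 89 65 2003 11 34 6 7 8 90 65 54 3 22 55 66 76 86 30 76 43 67 13 98 67 0 127 74
1234 45 76 2004 67 87 98 5 4 3 77 8 99 76 56 4 3 2 65 78 44 53 67 98 79 53 23 65"
mydf <- read.table(text = txt, header = TRUE)

# Columns 5:16 become R, columns 17:28 become A, one row per month
long <- reshape(mydf, direction = "long",
                varying = list(5:16, 17:28),
                v.names = c("R", "A"),
                times = month.abb, timevar = "Month")

# Sort by record (same keys as above); ties keep their Jan..Dec order
ord <- with(long, order(ID, -L1, L2, Year))
out <- long[ord, names(long) != "id"]  # drop the "id" column added by reshape
rownames(out) <- NULL

head(out, 3)
```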

This is an inelegant solution, but I'm posting it to show how the problem can be solved with basic tools, without relying on high-level functions when the task doesn't strictly require them. I think that the more tools you have, the better you can approach a problem. Here it is:

# extract the data part
data <- t(as.matrix(df[, 5:28]))
# build the data.frame by cbinding the needed columns
res <- cbind(df[rep(1:nrow(df), each = 12), 1:4],  # this repeats the first 4 columns 12 times each
             Month = month.abb,                    # the month column
             R = as.vector(data[1:12, ]),          # the R column, obtained from the first 12 rows of data
             A = as.vector(data[13:24, ]))         # as above
rownames(res) <- NULL  # just to remove the row names
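
The reason the transpose is needed may be easier to see on a toy example. `as.vector` flattens a matrix column by column, so after `t()` each original row becomes a column and its values come out contiguously, in period order. The sketch below uses made-up names (`wide`, `p1` to `p3`) purely for illustration:

```r
# Toy wide data: 2 records (rows) x 3 periods (columns); all names are illustrative.
wide <- data.frame(p1 = c(1, 4), p2 = c(2, 5), p3 = c(3, 6))

m <- t(as.matrix(wide))  # 3 x 2 matrix: one column per original row
flat <- as.vector(m)     # column-major flattening: record 1's periods, then record 2's

# Row index that lines up with `flat`: each record repeated once per period
idx <- rep(seq_len(nrow(wide)), each = ncol(wide))

flat  # 1 2 3 4 5 6
idx   # 1 1 1 2 2 2
```

Without the transpose, `as.vector(as.matrix(wide))` would instead give all of period 1, then all of period 2, which is the month-by-month ordering the question is trying to avoid.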
