Create new columns in an R data frame based on existing columns and grouping

Question

I have a data frame df of soccer team information by game (MATCHID) with these initial values:

 TEAMID Venue LEAGUEPOS MATCHID
 WHU     A         5       1
 COV     H        12       1
 EVE     H        15       2
 MNU     A         2       2
 ARS     A         3       3
 LEI     H         4       3

I wish to create just one row for each game, so that it ends up looking like

MATCHID HomeTeam AwayTeam HomePos AwayPos
   1       COV      WHU     12      5      etc.

so I want to create some new columns, delete others, and remove duplicated rows.

I am having trouble with the first stage. I tried

df$HomeTeam <- df$TEAMID[df$Venue == "H"]

but this produces

 TEAMID Venue LEAGUEPOS MATCHID HomeTeam
   WHU     A         5       1      COV
   COV     H        12       1      EVE
   EVE     H        15       2      LEI
   MNU     A         2       2      STH
   ARS     A         3       3      TOT
   LEI     H         4       3      WIM

Here HomeTeam simply lists, in sequence, the TEAMID of each record with Venue = H, rather than the home team of that row's match.

Answer 1

This can be easily achieved using the function reshape, which is part of base R.

# READ DATA
mydf = read.table(textConnection("
TEAMID Venue LEAGUEPOS MATCHID
 WHU     A         5       1
 COV     H        12       1
 EVE     H        15       2
 MNU     A         2       2
 ARS     A         3       3
 LEI     H         4       3"), 
 sep = "", header = TRUE, colClasses = rep('character', 4))

# RESHAPE DATA
reshape(mydf, idvar = 'MATCHID', timevar = 'Venue', direction = 'wide')

Here is the output produced:

  MATCHID TEAMID.A LEAGUEPOS.A TEAMID.H LEAGUEPOS.H
1       1      WHU           5      COV          12
3       2      MNU           2      EVE          15
5       3      ARS           3      LEI           4

NOTE: An alternative way to do this is to use the cast and melt functions from the reshape package.

require(reshape)
mydf_m = melt(mydf, id = c('MATCHID', 'Venue'))
cast(mydf_m, MATCHID ~ Venue + variable)
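As an aside on the attempt in the question: df$TEAMID[df$Venue == "H"] returns only one value per home row, so assigning it to a column fills the rows by position (recycling the values when the lengths divide evenly) instead of pairing each row with its own match. A minimal sketch of a row-aligned version follows, assuming each MATCHID has exactly one home row; the data frame is rebuilt here from the six sample rows, and the helper name home is only illustrative.

```r
# Sample data from the question (the six rows shown there)
df <- data.frame(
  TEAMID    = c("WHU", "COV", "EVE", "MNU", "ARS", "LEI"),
  Venue     = c("A", "H", "H", "A", "A", "H"),
  LEAGUEPOS = c(5, 12, 15, 2, 3, 4),
  MATCHID   = c(1, 1, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Keep only the home rows (assumed: exactly one per MATCHID)
home <- df[df$Venue == "H", ]

# For every row of df, look up the home row that shares its MATCHID
df$HomeTeam <- home$TEAMID[match(df$MATCHID, home$MATCHID)]
df
```

This only addresses the first stage (adding HomeTeam to every row); collapsing to one row per match still needs a reshape step such as the one above.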

Answer 2

reshape() in base R does what you want, if a little clunkily. Here is your data:

con <- textConnection(" TEAMID Venue LEAGUEPOS MATCHID
 WHU     A         5       1
 COV     H        12       1
 EVE     H        15       2
 MNU     A         2       2
 ARS     A         3       3
 LEI     H         4       3
")
dat <- read.table(con, header = TRUE, stringsAsFactors = FALSE)
close(con)

We reshape() this, put the columns in the requested order, and update the column names:

newdat <- reshape(dat, direction = "wide", timevar = "Venue", idvar = "MATCHID")
## reorder
newdat <- newdat[, c(1,4,2,5,3)]
names(newdat) <- c("MatchID","HomeTeam","AwayTeam","HomePos","AwayPos")

This gives us:

> newdat
  MatchID HomeTeam AwayTeam HomePos AwayPos
1       1      COV      WHU      12       5
3       2      EVE      MNU      15       2
5       3      LEI      ARS       4       3
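If the positional reordering with c(1,4,2,5,3) feels fragile, the same table can probably be built with merge() instead of reshape(). This is a different technique from the one used above, shown only as a sketch: it assumes every MATCHID has exactly one home and one away row, and the names games, home, away and wide are illustrative.

```r
# Sample data from the question, rebuilt so the example is self-contained
games <- data.frame(
  TEAMID    = c("WHU", "COV", "EVE", "MNU", "ARS", "LEI"),
  Venue     = c("A", "H", "H", "A", "A", "H"),
  LEAGUEPOS = c(5, 12, 15, 2, 3, 4),
  MATCHID   = c(1, 1, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Split into home and away rows, keeping only the columns needed
keep <- c("MATCHID", "TEAMID", "LEAGUEPOS")
home <- games[games$Venue == "H", keep]
away <- games[games$Venue == "A", keep]

# Rename by name rather than by position
names(home) <- c("MATCHID", "HomeTeam", "HomePos")
names(away) <- c("MATCHID", "AwayTeam", "AwayPos")

# Join the two halves on MATCHID, then put the columns in the requested order
wide <- merge(home, away, by = "MATCHID")
wide <- wide[, c("MATCHID", "HomeTeam", "AwayTeam", "HomePos", "AwayPos")]
wide
```

Because columns are selected and renamed by name, the result does not depend on the order in which reshape() happens to emit the .A and .H columns.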

Answer 1
5 Accepted 2011-08-03 17:24:30

Answer 2
1 2011-08-03 17:30:29
