Using a loop to create multiple data frames in R

I have this function that returns a data frame of JSON data from the NBA stats website. The function takes in the game ID of a certain game and returns a data frame of the halftime box score for that game.

getstats<- function(game=x){
  for(i in game){
    url<- paste("http://stats.nba.com/stats/boxscoretraditionalv2?EndPeriod=10&
                EndRange=14400&GameID=",i,"&RangeType=2&Season=2015-16&SeasonType=
                Regular+Season&StartPeriod=1&StartRange=0000",sep = "")
    json_data<- fromJSON(paste(readLines(url), collapse=""))
    df<- data.frame(json_data$resultSets[1, "rowSet"])
    names(df)<-unlist(json_data$resultSets[1,"headers"])
  }
  return(df)
}

What I would like to do with this function is take a vector of several game IDs and create a separate data frame for each one. For example:

gameids<- as.character(c(0021500580:0021500593))

I would want to take the vector "gameids" and create fourteen data frames. If anyone knows how I would go about doing this, it would be greatly appreciated! Thanks!

You can save your data frames into a list by setting up the function as follows:

getstats <- function(games) {

  listofdfs <- list() # Create a list in which you intend to save your df's.

  for (i in seq_along(games)) { # Loop through the indices of the IDs instead of the IDs

    # You are going to use games[i] instead of i to get the ID.
    # The URL is built from single-line pieces so no line breaks or spaces end up in it.
    url <- paste0("http://stats.nba.com/stats/boxscoretraditionalv2?EndPeriod=10",
                  "&EndRange=14400&GameID=", games[i],
                  "&RangeType=2&Season=2015-16&SeasonType=Regular+Season",
                  "&StartPeriod=1&StartRange=0000")
    json_data <- fromJSON(paste(readLines(url), collapse = ""))
    df <- data.frame(json_data$resultSets[1, "rowSet"])
    names(df) <- unlist(json_data$resultSets[1, "headers"])
    listofdfs[[i]] <- df # Save your data frame into the list
  }

  return(listofdfs) # Return the list of data frames.
}

# Numeric literals drop leading zeros, so format the IDs as 10-character strings.
gameids <- sprintf("%010d", 21500580:21500593)
getstats(games = gameids)

Please note that I could not test this because the URLs do not seem to be working properly. I get the connection error below:

Error in file(con, "r") : cannot open the connection
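
Since a single failed request like this one stops the whole loop, one option is to wrap each download in tryCatch so that a failure is reported and the remaining games are still processed. The following is only a minimal sketch: fetch_one is a hypothetical stand-in for the download-and-parse step (it fails on purpose for one ID so the example runs offline), and safe_fetch_all is a name invented for illustration.

```r
# Hypothetical stand-in for the real download/parse step, so this runs offline.
# It fails for one ID to imitate the connection error.
fetch_one <- function(game_id) {
  if (game_id == "0021500581") stop("cannot open the connection")
  data.frame(GAME_ID = game_id, PTS = 0L, stringsAsFactors = FALSE)
}

# Collect one result per game; a failed request leaves NULL in its slot
# instead of stopping the loop.
safe_fetch_all <- function(games, fetch = fetch_one) {
  results <- vector("list", length(games))
  names(results) <- games
  for (i in seq_along(games)) {
    res <- tryCatch(
      fetch(games[i]),
      error = function(e) {
        message("Skipping game ", games[i], ": ", conditionMessage(e))
        NULL
      }
    )
    if (!is.null(res)) results[[i]] <- res
  }
  results
}

ids <- sprintf("%010d", 21500580:21500582)
out <- safe_fetch_all(ids)

# IDs whose request failed
failed <- names(out)[vapply(out, is.null, logical(1))]
```

The list is named by game ID, so a successful result can be looked up with out[["0021500580"]], and failed tells you which IDs to retry.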

Adding to Abdou's answer, you could create data frames dynamically to hold the results for each game ID using the assign() function:

for (i in seq_along(games)) { # Loop through the indices of the IDs instead of the IDs

  # You are going to use games[i] instead of i to get the ID.
  # The URL is built from single-line pieces so no line breaks or spaces end up in it.
  url <- paste0("http://stats.nba.com/stats/boxscoretraditionalv2?EndPeriod=10",
                "&EndRange=14400&GameID=", games[i],
                "&RangeType=2&Season=2015-16&SeasonType=Regular+Season",
                "&StartPeriod=1&StartRange=0000")
  json_data <- fromJSON(paste(readLines(url), collapse = ""))
  df <- data.frame(json_data$resultSets[1, "rowSet"])
  names(df) <- unlist(json_data$resultSets[1, "headers"])

  # Create a data frame named X<i> to hold the results
  assign(paste0("X", i), df)
}

The assign function will create as many data frames as there are game IDs. They will be named X1, X2, X3, ..., Xn. Hope this helps.
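
Variables created with assign() can later be fetched by a name built at run time with get(), or gathered back into a list with mget(). Here is a small sketch of that round trip; the toy data frames and their values are made up and simply stand in for the per-game results.

```r
# Toy data frames standing in for the per-game results (made-up values).
dfs <- list(data.frame(PTS = 10), data.frame(PTS = 20), data.frame(PTS = 30))

# Create X1, X2, X3 in the current environment, as in the loop above.
for (i in seq_along(dfs)) {
  assign(paste0("X", i), dfs[[i]])
}

# Fetch one of them by a name built at run time.
second <- get(paste0("X", 2))

# Gather all of them back into a named list.
collected <- mget(paste0("X", seq_along(dfs)))
```

Collecting the variables with mget() ends up with the same kind of structure as the list-based answer, which may be easier to loop over than many separate variables.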

Use lapply to apply a function to each element of a vector or list and get the results as a list (sapply does the same but tries to simplify the result). So if you have a vector of several game IDs and a function that does what you want, you can use lapply to get a list of data frames (since your function returns df).

I haven't been able to test your code (I got an error with the function you provided), but something like this should work:

library(RJSONIO)

# Define the function before it is used; it handles a single game ID.
getstats <- function(game) {
  url <- paste0("http://stats.nba.com/stats/boxscoretraditionalv2?EndPeriod=10&EndRange=14400&GameID=",
                game,
                "&RangeType=2&Season=2015-16&SeasonType=Regular+Season&StartPeriod=1&StartRange=0000")
  json_data <- fromJSON(paste(readLines(url), collapse = ""))
  df <- data.frame(json_data$resultSets[1, "rowSet"])
  names(df) <- unlist(json_data$resultSets[1, "headers"])
  return(df)
}

# Numeric literals drop leading zeros, so format the IDs as 10-character strings.
gameids <- sprintf("%010d", 21500580:21500593)
df_list <- lapply(gameids, getstats)

df_list will contain one data frame per ID you provided in gameids.

Just use lapply again for additional data processing, including saving the data frames to disk.
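
As a rough illustration of that second pass, the sketch below runs lapply once to add a column to every data frame and once more to write each one to a CSV file. The data frames, the TOP_SCORER column, and the game_<id>.csv file names are all made up for the example; a temporary directory is used so nothing is written to your working directory.

```r
# Made-up data frames standing in for df_list; the names are the game IDs.
df_list <- list(
  "0021500580" = data.frame(PLAYER = c("A", "B"), PTS = c(12, 7)),
  "0021500581" = data.frame(PLAYER = c("C", "D"), PTS = c(9, 15))
)

# Further processing: flag the top scorer in every data frame.
df_list <- lapply(df_list, function(df) {
  df$TOP_SCORER <- df$PTS == max(df$PTS)
  df
})

# Save each data frame to its own CSV file in a temporary directory.
out_dir <- tempdir()
paths <- unlist(lapply(names(df_list), function(id) {
  path <- file.path(out_dir, paste0("game_", id, ".csv"))
  write.csv(df_list[[id]], path, row.names = FALSE)
  path
}))
```

Looping over names(df_list) rather than over the data frames themselves keeps the game ID available for building the file name.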

data.table is a nice package if you have to deal with a ton of data. In particular, rbindlist lets you rbind all the data tables (or data frames) contained in a list into a single one if needed (split will do the reverse).
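
A short sketch of that round trip, assuming the data.table package is installed. The data frames and the GAME_ID column name are made up for illustration; idcol tells rbindlist to record which list element each row came from.

```r
library(data.table)

# Made-up data frames standing in for df_list; the names are the game IDs.
df_list <- list(
  "0021500580" = data.frame(PLAYER = c("A", "B"), PTS = c(12, 7)),
  "0021500581" = data.frame(PLAYER = c("C", "D"), PTS = c(9, 15))
)

# Stack everything into one data.table; GAME_ID is filled from the list names.
all_games <- rbindlist(df_list, idcol = "GAME_ID")

# Split back into a list with one data.table per game.
per_game <- split(all_games, by = "GAME_ID")
```

With an unnamed list, the id column holds the position of each element instead of a name, so naming the list by game ID first keeps the combined table easier to read.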
