How do I convert a JSON file to a data frame in R?

Link to data.

For my purposes, I downloaded the data from the above link and saved it as a JSON file.

json_convert <- do.call(rbind, lapply(paste(readLines("Myfile.json", warn=TRUE),
                         collapse=""), 
                   jsonlite::fromJSON))

So far, I have managed to write the code above. However, I am confused about how to convert the result into a data frame. All help is appreciated.

Let's start by examining the data structure:

library(purrr)
library(tibble)
library(jsonlite)

my_json <- fromJSON("Myfile.json")
str(my_json)

List of 3
 $ resource  : chr "shotchartdetail"
 $ parameters:List of 30
  ..$ LeagueID      : chr "00"
  ..$ Season        : chr "2017-18"
  ..$ SeasonType    : chr "Regular Season"
  ..$ TeamID        : int 1610612750
  ..$ PlayerID      : int 0
  ..$ GameID        : NULL
  ..$ Outcome       : NULL
  ..$ Location      : NULL
  ..$ Month         : int 0
  ..$ SeasonSegment : NULL
  ..$ DateFrom      : NULL
  ..$ DateTo        : NULL
  ..$ OpponentTeamID: int 0
  ..$ VsConference  : NULL
  ..$ VsDivision    : NULL
  ..$ Position      : NULL
  ..$ RookieYear    : NULL
  ..$ GameSegment   : NULL
  ..$ Period        : int 0
  ..$ LastNGames    : int 0
  ..$ ClutchTime    : NULL
  ..$ AheadBehind   : NULL
  ..$ PointDiff     : NULL
  ..$ RangeType     : int 0
  ..$ StartPeriod   : int 1
  ..$ EndPeriod     : int 10
  ..$ StartRange    : int 0
  ..$ EndRange      : int 28800
  ..$ ContextFilter : chr "SEASON_YEAR='2017-18'"
  ..$ ContextMeasure: chr "FGA"
 $ resultSets:'data.frame': 2 obs. of  3 variables:
  ..$ name   : chr [1:2] "Shot_Chart_Detail" "LeagueAverages"
  ..$ headers:List of 2
  .. ..$ : chr [1:24] "GRID_TYPE" "GAME_ID" "GAME_EVENT_ID" "PLAYER_ID" ...
  .. ..$ : chr [1:7] "GRID_TYPE" "SHOT_ZONE_BASIC" "SHOT_ZONE_AREA" "SHOT_ZONE_RANGE" ...
  ..$ rowSet :List of 2
  .. ..$ : chr [1:7063, 1:24] "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" ...
  .. ..$ : chr [1:20, 1:7] "League Averages" "League Averages" "League Averages" "League Averages" ...

Now you have to decide what you want in your data frame.

I would assume that the player statistics are in the first element of $resultSets$rowSet (a character matrix with 7063 rows and 24 columns) and that the headers for those columns are in the first element of $resultSets$headers (24 names).
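The pairing described above (each rowSet matrix with the headers element at the same position) can be sketched on a toy object. Everything in this sketch is an assumption made for illustration: `result_sets` is a hand-built stand-in for `my_json$resultSets`, the values and the `FG_PCT` column are made up, and `to_df` is a hypothetical helper. It uses base R's `Map()`; `purrr::map2()` would play the same role.

```r
# Hand-built stand-in for my_json$resultSets (illustrative values only).
result_sets <- list(
  name    = c("Shot_Chart_Detail", "LeagueAverages"),
  headers = list(
    c("PLAYER_NAME", "SHOT_DISTANCE", "SHOT_MADE_FLAG"),
    c("SHOT_ZONE_BASIC", "FG_PCT")
  ),
  rowSet  = list(
    matrix(c("Jimmy Butler", "Taj Gibson", "25", "22", "1", "0"), nrow = 2),
    matrix(c("Mid-Range", "Restricted Area", "0.4", "0.63"), nrow = 2)
  )
)

# Hypothetical helper: turn one character matrix into a data frame
# and label its columns with the matching header vector.
to_df <- function(rows, header) {
  df <- as.data.frame(rows, stringsAsFactors = FALSE)
  names(df) <- header
  df
}

# Walk over rowSet and headers in parallel, one data frame per result set.
tables <- Map(to_df, result_sets$rowSet, result_sets$headers)
names(tables) <- result_sets$name

str(tables)
```

With the real file, the same call would take `my_json$resultSets$rowSet` and `my_json$resultSets$headers` as inputs, giving one data frame per entry in `$resultSets$name`.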

I'm sure there's a very elegant way to do this with the map functions in purrr. This isn't it, but it works:

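# Index into resultSets directly. An unqualified flatten() is ambiguous here,
# because jsonlite (loaded last) masks purrr::flatten() with its own
# data-frame-only version.
shots <- my_json$resultSets$rowSet[[1]]

# Name the matrix columns first so the tibble gets proper column names.
colnames(shots) <- my_json$resultSets$headers[[1]]

# as_tibble() replaces the deprecated as.tibble().
my_df <- as_tibble(shots)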

str(my_df)

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   7063 obs. of  24 variables:
 $ GRID_TYPE          : chr  "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" "Shot Chart Detail" ...
 $ GAME_ID            : chr  "0021700011" "0021700011" "0021700011" "0021700011" ...
 $ GAME_EVENT_ID      : chr  "10" "12" "16" "21" ...
 $ PLAYER_ID          : chr  "1626157" "202710" "202710" "201959" ...
 $ PLAYER_NAME        : chr  "Karl-Anthony Towns" "Jimmy Butler" "Jimmy Butler" "Taj Gibson" ...
 $ TEAM_ID            : chr  "1610612750" "1610612750" "1610612750" "1610612750" ...
 $ TEAM_NAME          : chr  "Minnesota Timberwolves" "Minnesota Timberwolves" "Minnesota Timberwolves" "Minnesota Timberwolves" ...
 $ PERIOD             : chr  "1" "1" "1" "1" ...
 $ MINUTES_REMAINING  : chr  "11" "11" "10" "10" ...
 $ SECONDS_REMAINING  : chr  "14" "9" "32" "21" ...
 $ EVENT_TYPE         : chr  "Missed Shot" "Made Shot" "Missed Shot" "Missed Shot" ...
 $ ACTION_TYPE        : chr  "Jump Shot" "Jump Shot" "Driving Reverse Layup Shot" "Jump Shot" ...
 $ SHOT_TYPE          : chr  "2PT Field Goal" "3PT Field Goal" "2PT Field Goal" "3PT Field Goal" ...
 $ SHOT_ZONE_BASIC    : chr  "Mid-Range" "Above the Break 3" "Restricted Area" "Left Corner 3" ...
 $ SHOT_ZONE_AREA     : chr  "Left Side Center(LC)" "Right Side Center(RC)" "Center(C)" "Left Side(L)" ...
 $ SHOT_ZONE_RANGE    : chr  "16-24 ft." "24+ ft." "Less Than 8 ft." "24+ ft." ...
 $ SHOT_DISTANCE      : chr  "20" "25" "1" "22" ...
 $ LOC_X              : chr  "-113" "199" "-11" "-225" ...
 $ LOC_Y              : chr  "169" "152" "6" "16" ...
 $ SHOT_ATTEMPTED_FLAG: chr  "1" "1" "1" "1" ...
 $ SHOT_MADE_FLAG     : chr  "0" "1" "0" "0" ...
 $ GAME_DATE          : chr  "20171018" "20171018" "20171018" "20171018" ...
 $ HTM                : chr  "SAS" "SAS" "SAS" "SAS" ...
 $ VTM                : chr  "MIN" "MIN" "MIN" "MIN" ...
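
Every column in the output above is character, so a likely next step is converting the numeric ones. The following is only a sketch under stated assumptions: it rebuilds a two-row stand-in for `my_df` from values shown in the output, and the choice of which columns to convert (and to which types) is a guess about what is useful, not something the data source documents. `GAME_ID` is left as character because converting it to a number would drop its leading zeros.

```r
# Two-row stand-in for my_df, using values from the str() output above.
my_df <- data.frame(
  GAME_ID        = c("0021700011", "0021700011"),
  PLAYER_NAME    = c("Karl-Anthony Towns", "Jimmy Butler"),
  SHOT_DISTANCE  = c("20", "25"),
  LOC_X          = c("-113", "199"),
  SHOT_MADE_FLAG = c("0", "1"),
  GAME_DATE      = c("20171018", "20171018"),
  stringsAsFactors = FALSE
)

# Assumed integer-valued columns; extend this vector as needed.
int_cols <- c("SHOT_DISTANCE", "LOC_X")
my_df[int_cols] <- lapply(my_df[int_cols], as.integer)

# Treat the 0/1 flag as logical and parse the yyyymmdd date string.
my_df$SHOT_MADE_FLAG <- my_df$SHOT_MADE_FLAG == "1"
my_df$GAME_DATE <- as.Date(my_df$GAME_DATE, format = "%Y%m%d")

str(my_df)
```

The same assignments work on a tibble, since they only replace columns by name.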
