
Convert json character vector to dataframe in R

I used a Wikitable API to download the table of Nobel laureates with the following code:

# assumed packages: content() is from httr, fromJSON() is from jsonlite
library(httr)
library(jsonlite)

json_2 <- content(response_2, "text")
json_new <- fromJSON(json_2)
wiki_nobel <- as.data.frame(json_new)

When I convert it into a data frame, I get the following output, and I am unsure how to turn it into rows and columns. [image: enter image description here] The value at [1,1] should be a column name, followed by the row values.

I've tried using

wiki_nobel <- json_new %>% as_tibble()
wiki_nobel <- bind_rows(as.data.frame(json_new))

But they provide the same output.

Any help is appreciated. Thanks

There are several Wikitable API services.
JSON from https://wikitable2json.vercel.app/ can be rectangled with just jsonlite::read_json() :

api_req <- "https://wikitable2json.vercel.app/api/List_of_Nobel_laureates?table=0"
nobel_1 <- jsonlite::read_json(api_req, simplifyVector = TRUE)
tibble::as_tibble(nobel_1)
#> # A tibble: 122 × 7
#>    Year  Physics                           Chemi…¹ Physi…² Liter…³ Peace Econo…⁴
#>    <chr> <chr>                             <chr>   <chr>   <chr>   <chr> <chr>  
#>  1 1901  Wilhelm Röntgen                   Jacobu… Emil A… Sully … Henr… —      
#>  2 1902  Hendrik Lorentz;Pieter Zeeman     Herman… Ronald… Theodo… Élie… —      
#>  3 1903  Henri Becquerel;Pierre Curie;Mar… Svante… Niels … Bjørns… Rand… —      
#>  4 1904  Lord Rayleigh                     Willia… Ivan P… Frédér… Inst… —      
#>  5 1905  Philipp Lenard                    Adolf … Robert… Henryk… Bert… —      
#>  6 1906  J. J. Thomson                     Henri … Camill… Giosuè… Theo… —      
#>  7 1907  Albert Abraham Michelson          Eduard… Charle… Rudyar… Erne… —      
#>  8 1908  Gabriel Lippmann                  Ernest… Élie M… Rudolf… Klas… —      
#>  9 1909  Karl Ferdinand Braun;Guglielmo M… Wilhel… Emil T… Selma … Augu… —      
#> 10 1910  Johannes Diderik van der Waals    Otto W… Albrec… Paul H… Inte… —      
#> # … with 112 more rows, and abbreviated variable names ¹​Chemistry,
#> #   ²​`Physiologyor Medicine`, ³​Literature, ⁴​Economics

The response from https://www.wikitable2json.com/ needs just a bit more work:

library(purrr)

nobel_2 <- jsonlite::read_json("https://www.wikitable2json.com/api/List_of_Nobel_laureates")
# response includes a single (nested) list
nobel_2 <- nobel_2[[1]]
# 1st list holds column names
col_names <- unlist(nobel_2[[1]])
# name all other lists, map_dfr turns named lists into single data frame
map_dfr(nobel_2[-1], ~ set_names(.x, col_names))

#> # A tibble: 123 × 7
#>    Year  Physics                           Chemi…¹ Physi…² Liter…³ Peace Econo…⁴
#>    <chr> <chr>                             <chr>   <chr>   <chr>   <chr> <chr>  
#>  1 1901  Wilhelm Röntgen                   Jacobu… Emil A… Sully … Henr… —      
#>  2 1902  Hendrik Lorentz;Pieter Zeeman     Herman… Ronald… Theodo… Élie… —      
#>  3 1903  Henri Becquerel;Pierre Curie;Mar… Svante… Niels … Bjørns… Rand… —      
#>  4 1904  Lord Rayleigh                     Willia… Ivan P… Frédér… Inst… —      
#>  5 1905  Philipp Lenard                    Adolf … Robert… Henryk… Bert… —      
#>  6 1906  J. J. Thomson                     Henri … Camill… Giosuè… Theo… —      
#>  7 1907  Albert Abraham Michelson          Eduard… Charle… Rudyar… Erne… —      
#>  8 1908  Gabriel Lippmann                  Ernest… Élie M… Rudolf… Klas… —      
#>  9 1909  Karl Ferdinand Braun;Guglielmo M… Wilhel… Emil T… Selma … Augu… —      
#> 10 1910  Johannes Diderik van der Waals    Otto W… Albrec… Paul H… Inte… —      
#> # … with 113 more rows, and abbreviated variable names ¹​Chemistry,
#> #   ²​`Physiologyor Medicine`, ³​Literature,
#> #   ⁴​`Economics(The Sveriges Riksbank Prize)[13][lower-alpha 1]`

Created on 2023-01-17 with reprex v2.0.2
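
For readers who would rather avoid purrr, the same "first list is the header, the rest are rows" reshaping can probably be done in base R as well. The sketch below is only an illustration: `toy` is a made-up, two-column stand-in for `nobel_2[[1]]`, and it assumes every row has the same number of non-NULL cells (`unlist()` silently drops NULL entries, so ragged rows would need extra handling).

```r
# Hypothetical stand-in for nobel_2[[1]]: a list of rows, header row first
toy <- list(
  list("Year", "Physics"),
  list("1904", "Lord Rayleigh"),
  list("1905", "Philipp Lenard"),
  list("1906", "J. J. Thomson")
)

# The first list holds the column names
col_names <- unlist(toy[[1]])

# Flatten each remaining row to a character vector and stack them row-wise
body <- do.call(rbind, lapply(toy[-1], unlist))

# Convert the character matrix to a data frame and apply the header
toy_df <- as.data.frame(body, stringsAsFactors = FALSE)
names(toy_df) <- col_names
toy_df
```

The `map_dfr()` version above returns a tibble, whereas this one returns a plain data frame; wrap the result in `tibble::as_tibble()` if a tibble is preferred.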

The table from https://www.wikitable2json.com/ is one row longer (123 rows versus 122) because it includes a footer row that repeats the column names.
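
If that extra footer row is unwanted, one way to drop it is to filter out rows whose first cell equals the column name. This is a minimal sketch on a made-up miniature table (`toy_tbl` is hypothetical), and it assumes the footer's first cell is literally "Year"; the real footer may carry footnote markers or other differences, so check it before relying on this.

```r
# Hypothetical miniature of the result: the last row repeats the column names
toy_tbl <- data.frame(
  Year    = c("1904", "1905", "Year"),
  Physics = c("Lord Rayleigh", "Philipp Lenard", "Physics"),
  stringsAsFactors = FALSE
)

# Keep only rows whose first cell differs from the first column's name
cleaned <- toy_tbl[toy_tbl$Year != names(toy_tbl)[1], , drop = FALSE]
cleaned
```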

For guidelines on approaching rectangling problems with the tidyverse, see https://tidyr.tidyverse.org/articles/rectangle.html

