
Creating a dataset in R or Python using a JSON API

How can I create a dataset with proper column names in Python or R from this JSON API:

https://api.covid19india.org/data.json

An R-based answer: you can use the jsonlite package:

library(jsonlite)
data <- fromJSON("./data/data.json", flatten = FALSE)

I saved the JSON file from your question to ./data/data.json. Reading it produces a list of three data frames:

List of 3
 $ cases_time_series:'data.frame':  104 obs. of  7 variables:
  ..$ dailyconfirmed: chr [1:104] "1" "0" "0" "1" ...
  ..$ dailydeceased : chr [1:104] "0" "0" "0" "0" ...
  ..$ dailyrecovered: chr [1:104] "0" "0" "0" "0" ...
  ..$ date          : chr [1:104] "30 January " "31 January " "01 February " "02 February " ...
  ..$ totalconfirmed: chr [1:104] "1" "1" "1" "2" ...
  ..$ totaldeceased : chr [1:104] "0" "0" "0" "0" ...
  ..$ totalrecovered: chr [1:104] "0" "0" "0" "0" ...
 $ statewise        :'data.frame':  38 obs. of  11 variables:
  ..$ active         : chr [1:38] "47598" "18381" "5121" "6523" ...
  ..$ confirmed      : chr [1:38] "74925" "24427" "8904" "8718" ...
  ..$ deaths         : chr [1:38] "2436" "921" "537" "61" ...
  ..$ deltaconfirmed : chr [1:38] "595" "0" "0" "0" ...
  ..$ deltadeaths    : chr [1:38] "21" "0" "0" "0" ...
  ..$ deltarecovered : chr [1:38] "434" "0" "0" "0" ...
  ..$ lastupdatedtime: chr [1:38] "13/05/2020 11:54:23" "12/05/2020 22:13:24" "12/05/2020 20:16:23" "12/05/2020 22:48:24" ...
  ..$ recovered      : chr [1:38] "24887" "5125" "3246" "2134" ...
  ..$ state          : chr [1:38] "Total" "Maharashtra" "Gujarat" "Tamil Nadu" ...
  ..$ statecode      : chr [1:38] "TT" "MH" "GJ" "TN" ...
  ..$ statenotes     : chr [1:38] "" "[10-May]<br>\n- Total numbers are updated to the final figure reported for 10th May. <br>\n- 665 cases added by"| __truncated__ "" "" ...
 $ tested           :'data.frame':  65 obs. of  11 variables:
  ..$ individualstestedperconfirmedcase: chr [1:65] "75.64102564" "81.56666667" "73.96428571" "72.99450549" ...
  ..$ positivecasesfromsamplesreported : chr [1:65] "" "" "" "" ...
  ..$ samplereportedtoday              : chr [1:65] "" "" "" "" ...
  ..$ source                           : chr [1:65] "Press_Release_ICMR_13March2020.pdf" "ICMR_website_update_18March_6PM_IST.pdf" "ICMR_website_update_19March_10AM_IST_V2.pdf" "ICMR_website_update_19March_6PM_IST.pdf" ...
  ..$ testpositivityrate               : chr [1:65] "1.32%" "1.23%" "1.35%" "1.37%" ...
  ..$ testsconductedbyprivatelabs      : chr [1:65] "" "" "" "" ...
  ..$ testsperconfirmedcase            : chr [1:65] "83.33333333" "87.5" "79.26190476" "77.88461538" ...
  ..$ totalindividualstested           : chr [1:65] "5900" "12235" "12426" "13285" ...
  ..$ totalpositivecases               : chr [1:65] "78" "150" "168" "182" ...
  ..$ totalsamplestested               : chr [1:65] "6500" "13125" "13316" "14175" ...
  ..$ updatetimestamp                  : chr [1:65] "13/03/2020 00:00:00" "18/03/2020 18:00:00" "19/03/2020 10:00:00" "19/03/2020 18:00:00" ...
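
Since the question also mentions Python, here is a minimal standard-library sketch of the same loading step. It assumes the response has the three top-level keys shown above, each holding a list of records. The inline `SAMPLE` is a cut-down stand-in built from a few of the values above, and `fetch_json` and `column_names` are hypothetical helper names, not part of any library.

```python
import json
from urllib.request import urlopen

# Cut-down stand-in for the API response, shaped like the output above.
# The real payload has many more rows and columns.
SAMPLE = """
{
  "cases_time_series": [
    {"dailyconfirmed": "1", "date": "30 January ", "totalconfirmed": "1"},
    {"dailyconfirmed": "0", "date": "31 January ", "totalconfirmed": "1"}
  ],
  "statewise": [
    {"state": "Total", "statecode": "TT", "confirmed": "74925"},
    {"state": "Maharashtra", "statecode": "MH", "confirmed": "24427"}
  ],
  "tested": [
    {"totalsamplestested": "6500", "updatetimestamp": "13/03/2020 00:00:00"}
  ]
}
"""


def fetch_json(url):
    """Download and parse a JSON document (hypothetical helper, not called here)."""
    with urlopen(url) as response:
        return json.load(response)


def column_names(rows):
    """Return the keys found in a list of dict rows, in first-seen order."""
    names = []
    for row in rows:
        for key in row:
            if key not in names:
                names.append(key)
    return names


# Each top-level key becomes one table: a list of rows keyed by column name.
tables = json.loads(SAMPLE)
for name, rows in tables.items():
    print(name, len(rows), column_names(rows))
```

To work with live data, `tables = fetch_json(url)` would replace the `json.loads(SAMPLE)` line, assuming the endpoint is still reachable.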

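You can convert this list into one or more data frames. Stacking the elements with the dplyr function bind_rows is not useful here, because the list elements are three different tables with different columns and different numbers of rows. If they have fields in common, you can use the join functions to merge the data frames.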
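
To illustrate merging on a common field, here is a rough Python sketch. The tables above do not share a column directly, so it derives a calendar-day key from `date` and `updatetimestamp`. The `left_join` helper is a hypothetical name, the year 2020 is an assumption taken from the timestamps in this snapshot, and the `dailyconfirmed` values are placeholders rather than real figures.

```python
from datetime import datetime


def left_join(left, right, key):
    """Attach to each left row the right row with the same key value, if any.

    If several right rows share a key value, the last one wins.
    """
    index = {row[key]: row for row in right}
    return [{**index.get(row[key], {}), **row} for row in left]


# `tested` values are copied from the output above; the `cases` values are
# made-up placeholders for illustration only.
cases = [
    {"date": "13 March ", "dailyconfirmed": "0"},
    {"date": "14 March ", "dailyconfirmed": "0"},
]
tested = [
    {"updatetimestamp": "13/03/2020 00:00:00", "totalsamplestested": "6500"},
    {"updatetimestamp": "18/03/2020 18:00:00", "totalsamplestested": "13125"},
]

# Derive a shared calendar-day key. The year is assumed to be 2020 because
# the cases_time_series dates omit it.
for row in cases:
    row["day"] = datetime.strptime(row["date"].strip() + " 2020", "%d %B %Y").date()
for row in tested:
    row["day"] = datetime.strptime(row["updatetimestamp"], "%d/%m/%Y %H:%M:%S").date()

joined = left_join(cases, tested, "day")
for row in joined:
    print(row["day"], row["dailyconfirmed"], row.get("totalsamplestested"))
```

Rows without a match on the right simply keep their own columns, which mirrors what a left join does.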

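To expand on this: the first list element, cases_time_series (stored below as cases), can easily be extracted and processed into a plot: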

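library(jsonlite)
library(ggplot2)
library(dplyr)

# Read the saved JSON file; the result is a list of three data frames
data <- fromJSON("./data/data.json", flatten = FALSE)

# The dates carry no year, so append 2020 (the year of this snapshot);
# %B expects English month names, so this assumes an English locale
cases <- data[[1]] %>%
  mutate(date = as.Date(paste(trimws(date), "2020"), format = "%d %B %Y")) %>%
  mutate_if(is.character, as.numeric)  # remaining columns are numeric strings

# Plot daily confirmed cases over time
ggplot(data = cases, aes(x = date, y = dailyconfirmed)) +
  geom_line()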

This gives the following result:

[Figure: line plot of dailyconfirmed against date]
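
The same type conversion can be sketched in Python with the standard library. This is only an illustration: `to_number` and `convert_case_row` are hypothetical names, the year 2020 is an assumption because the source dates omit it, and stripping a trailing `%` is a choice made here so that values such as those in `testpositivityrate` can also be parsed.

```python
from datetime import datetime


def to_number(text):
    """Convert a numeric string to int or float; return None for blank strings.

    A trailing percent sign is dropped. Non-numeric text raises ValueError.
    """
    cleaned = text.strip().rstrip("%")
    if not cleaned:
        return None
    try:
        return int(cleaned)
    except ValueError:
        return float(cleaned)


def convert_case_row(row, year=2020):
    """Parse the date column and turn every other column into a number."""
    converted = {}
    for name, value in row.items():
        if name == "date":
            # The source dates look like "30 January " with no year.
            converted[name] = datetime.strptime(
                f"{value.strip()} {year}", "%d %B %Y"
            ).date()
        else:
            converted[name] = to_number(value)
    return converted


# Two rows with values taken from the cases_time_series output above.
rows = [
    {"dailyconfirmed": "1", "date": "30 January ", "totalconfirmed": "1"},
    {"dailyconfirmed": "0", "date": "31 January ", "totalconfirmed": "1"},
]
cases = [convert_case_row(row) for row in rows]
print(cases[0])
```

Returning None for blank strings plays the same role as the NA values that as.numeric produces in R.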

