I am trying to pull JSON lunar cycle data from the USNO API. The problem is that there are two arrays of JSON data in what I get back. I don't see a way to specify what I get back from the Observatory, so I think I need to clean it up in R. Here is my code:
library(sqldf)
library(jsonlite)
library(plyr)  # provides rbind.fill(), which is used below
curr_date <- Sys.Date()
Q_date <- format.Date(curr_date, "%m/%d/%Y")
moon_call <- paste0("http://api.usno.navy.mil/moon/phase?date=", Q_date, "&nump=4")
moon_json <- fromJSON(moon_call, simplifyDataFrame = TRUE)
moon_phases <- do.call("rbind.fill", lapply(moon_json$phasedata, as.data.frame))
The data I get back looks like this:
"","error","apiversion","year","month","day","numphases","datechanged","phasedata.phase","phasedata.date","phasedata.time"
"1",FALSE,"2.1.0",2018,8,29,4,FALSE,"Last Quarter","2018 Sep 03","02:37"
"2",FALSE,"2.1.0",2018,8,29,4,FALSE,"New Moon","2018 Sep 09","18:01"
"3",FALSE,"2.1.0",2018,8,29,4,FALSE,"First Quarter","2018 Sep 16","23:15"
"4",FALSE,"2.1.0",2018,8,29,4,FALSE,"Full Moon","2018 Sep 25","02:52"
When I convert it to a data frame I get this:
"","X[[i]]"
"1","Last Quarter"
"2","New Moon"
"3","First Quarter"
"4","Full Moon"
"5","2018 Sep 03"
"6","2018 Sep 09"
"7","2018 Sep 16"
"8","2018 Sep 25"
"9","02:37"
"10","18:01"
"11","23:15"
"12","02:52"
But what I want is a data frame with only the phasedata.phase, phasedata.date and phasedata.time columns selected:
"","phase","date","time"
"1","Last Quarter","2018 Sep 03","02:37"
"2","New Moon","2018 Sep 09","18:01"
"3","First Quarter","2018 Sep 16","23:15"
"4","Full Moon","2018 Sep 25","02:52"
R lets you extract the three columns directly from the data frame moon_json, like you want (this assumes moon_json is the flattened data frame shown in your output, with the phasedata.* columns):
moon_phases <- moon_json[, c('phasedata.phase', 'phasedata.date', 'phasedata.time')]
(No need whatsoever for the do.call("rbind.fill", lapply(..., as.data.frame)); that's just an inefficient and tortured way of slicing and then concatenating.)
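To see why the do.call approach in the question ends up as one 12-row column, here is a small sketch. The phasedata frame is built by hand from the output shown in the question (an assumption about what moon_json$phasedata holds), and base rbind stands in for plyr's rbind.fill:

```r
# Hand-built stand-in for moon_json$phasedata (assumed shape, taken from the question).
phasedata <- data.frame(
  phase = c("Last Quarter", "New Moon", "First Quarter", "Full Moon"),
  date  = c("2018 Sep 03", "2018 Sep 09", "2018 Sep 16", "2018 Sep 25"),
  time  = c("02:37", "18:01", "23:15", "02:52"),
  stringsAsFactors = FALSE
)

# lapply() over a data frame visits its columns, not its rows,
# so each piece is a one-column data frame with four rows.
pieces <- lapply(phasedata, as.data.frame)
length(pieces)

# Stacking those pieces puts the three columns end to end.
stacked <- do.call(rbind, pieces)
dim(stacked)
```

So the phases, dates and times are all there, just stacked on top of each other instead of sitting side by side.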
Then you want to rename your df columns to drop the phasedata. prefix:
names(moon_phases) <- c('phase', 'date', 'time')
or: names(moon_phases) <- gsub('^phasedata\\.', '', names(moon_phases))
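The select-and-rename steps can be tried without calling the API. In this sketch the moon_json list is hand-built to mimic the fields in the question's output (that shape is an assumption, not something checked against the live service), and flat is a hypothetical name for the flattened data frame:

```r
# Hand-built stand-in for the parsed response (assumed shape, from the question's output).
moon_json <- list(
  error = FALSE, apiversion = "2.1.0",
  year = 2018L, month = 8L, day = 29L,
  numphases = 4L, datechanged = FALSE,
  phasedata = data.frame(
    phase = c("Last Quarter", "New Moon", "First Quarter", "Full Moon"),
    date  = c("2018 Sep 03", "2018 Sep 09", "2018 Sep 16", "2018 Sep 25"),
    time  = c("02:37", "18:01", "23:15", "02:52"),
    stringsAsFactors = FALSE
  )
)

# Flattening the list gives the phasedata.* column names seen in the question.
flat <- as.data.frame(moon_json, stringsAsFactors = FALSE)

# Keep only the three phase columns, then strip the prefix.
moon_phases <- flat[, c("phasedata.phase", "phasedata.date", "phasedata.time")]
names(moon_phases) <- gsub("^phasedata\\.", "", names(moon_phases))
moon_phases
```

If moon_json really is a list like this one, moon_json$phasedata already holds the same three columns, so the flattening step may not be needed at all.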
As for the row names 1,2,3... on your data frames like moon_json, you can reset them with row.names(moon_json) <- NULL, or build the data frame with data.frame(..., row.names=NULL) or as.data.frame(..., row.names=NULL).
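A short sketch of what resetting row names does, using a hand-built data frame (df and sub are illustrative names). Note that a data frame always has some row names, so the reset just puts back the automatic 1, 2, ... sequence. If the goal is to keep that leading column out of a CSV, write.csv(..., row.names = FALSE) is the usual way:

```r
df <- data.frame(
  phase = c("Last Quarter", "New Moon", "First Quarter", "Full Moon"),
  time  = c("02:37", "18:01", "23:15", "02:52"),
  stringsAsFactors = FALSE
)

# Subsetting keeps the original row labels.
sub <- df[c(2, 4), ]
row.names(sub)          # "2" "4"

# Resetting restores the automatic sequence.
row.names(sub) <- NULL
row.names(sub)          # "1" "2"

# Omit the row-name column entirely when writing CSV.
csv_lines <- capture.output(write.csv(sub, row.names = FALSE))
csv_lines
```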
(jsonlite, or else one of the other R JSON packages, should have options to do this cleanup and renaming automatically; I don't know, I don't use them much. Check them out and pick a package that makes scraping less painful.)
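For what it's worth, jsonlite's fromJSON() normally simplifies a JSON array of objects into a data frame, so the nested phasedata element may already be the table you want. A minimal sketch, assuming the response looks like the hand-written sample_json below (reconstructed from the question's output, not fetched from the API):

```r
library(jsonlite)

# Hand-written sample; the real response shape is an assumption.
sample_json <- '{
  "error": false, "apiversion": "2.1.0",
  "year": 2018, "month": 8, "day": 29,
  "numphases": 4, "datechanged": false,
  "phasedata": [
    {"phase": "Last Quarter",  "date": "2018 Sep 03", "time": "02:37"},
    {"phase": "New Moon",      "date": "2018 Sep 09", "time": "18:01"},
    {"phase": "First Quarter", "date": "2018 Sep 16", "time": "23:15"},
    {"phase": "Full Moon",     "date": "2018 Sep 25", "time": "02:52"}
  ]
}'

# fromJSON() accepts a JSON string as well as a URL.
parsed <- fromJSON(sample_json)

# The array of objects comes back as a data frame with phase/date/time columns.
moon_phases <- parsed$phasedata
str(moon_phases)
```

With this shape there is nothing to rename, since the columns never pick up the phasedata. prefix.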