
JSON with additional content at top of file

I am trying to read this url into R as a JSON: https://comtrade.un.org/Data/cache/reporterAreas.json

I see that there is additional content at the top of the file, wrapping the content I am after. A sample of the file looks as follows:

{
  "more": false,
  "results": [
    {
      "id": "all",
      "text": "All"
    },
    {
      "id": "4",
      "text": "Afghanistan"
    },
    {
      "id": "8",
      "text": "Albania"
    }
  ]
}

Trying to read using:

library(httr)
library(jsonlite)
x <- GET(url)
fromJSON(rawToChar(x$content))

doesn't work, throwing the error: unexpected character '<ef>'. I assume this comes from the [ character.

I also tried download.file(url, file) followed by fromJSON(file), but that threw the error unexpected character 'r', which I am guessing comes from "results".

I assume this is just some header formatting for the JSON (apologies, I don't do much with JSON files), and that there is an option for dealing with it via either GET() or fromJSON(), but I can't see anything in the docs. None of the examples that I have seen describing how to pull JSON from a URL have this format.

When I call class(rawToChar(x$content)) it shows as a character vector, so I could clean it by removing the {"more": false,"results": [ and ]}, but that seems wonky for what looks like a standard format.

If someone can show me how to import this correctly, I would welcome it. I would also welcome a more useful question title that describes this issue more effectively.

The <ef> character is the first byte of a byte-order mark (BOM) encoded in UTF-8. The other two bytes are <bb><bf>. So the error comes from the BOM at the very start of the file, not from the [.
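To confirm that a response really starts with a BOM, the first three bytes can be compared against EF BB BF. The following is a minimal base R sketch; has_bom() is a hypothetical helper name, and sample_bytes is a made-up stand-in for x$content, not the real response.

```r
# The UTF-8 byte-order mark is the three bytes EF BB BF.
utf8_bom <- as.raw(c(0xEF, 0xBB, 0xBF))

# has_bom() is a hypothetical helper: TRUE when a raw vector starts with the BOM.
has_bom <- function(bytes) {
  length(bytes) >= 3 && identical(bytes[1:3], utf8_bom)
}

# Made-up stand-in for x$content: a BOM followed by a small JSON payload.
sample_bytes <- c(utf8_bom, charToRaw('{"more": false}'))

has_bom(sample_bytes)                  # the BOM is present
has_bom(charToRaw('{"more": false}'))  # plain JSON, no BOM
```

With the real response, has_bom(x$content) should tell you whether the BOM is what fromJSON() is tripping over.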

When I download the file using download.file() and then read it with jsonlite::read_json(), I get a warning about the BOM, but the rest of the file appears to be read without an error. You should try that.
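If you would rather keep the GET() approach, another option is to drop the three BOM bytes before converting the content to text. This is only a sketch under assumptions: strip_bom() is a hypothetical helper, and payload is a shortened, made-up stand-in for the real response body.

```r
# strip_bom() is a hypothetical helper: it drops a leading UTF-8 BOM
# (bytes EF BB BF) from a raw vector and returns anything else unchanged.
strip_bom <- function(bytes) {
  bom <- as.raw(c(0xEF, 0xBB, 0xBF))
  if (length(bytes) >= 3 && identical(bytes[1:3], bom)) {
    bytes <- bytes[-(1:3)]
  }
  bytes
}

# Made-up stand-in for x$content: a BOM followed by a shortened payload.
payload <- '{"more": false, "results": [{"id": "4", "text": "Afghanistan"}]}'
bytes <- c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw(payload))

# Text with the BOM removed, ready for a JSON parser.
clean <- rawToChar(strip_bom(bytes))

# Parse only if jsonlite is installed.
if (requireNamespace("jsonlite", quietly = TRUE)) {
  parsed <- jsonlite::fromJSON(clean)
  print(parsed$results)  # the records from the "results" array
}
```

Applied to the real response, this would be fromJSON(rawToChar(strip_bom(x$content))), and the countries are then in the results element of the parsed list, so there is no need to cut the {"more": false,"results": [ wrapper out by hand.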


 