I have the following code to send an HTTP request, receive the response (which is XML) and parse it:
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public Document getDocumentElementFromDatabase() {
    // this URL is actually built dynamically from a query, but for this example I just use one of the possible resulting URLs
    String url = "http://musicbrainz.org/ws/2/recording?query=%22Thunderstruck%22+AND+artistname%3A%222Cellos%22";
    // declared outside the try block so that it is still in scope in the finally block
    HttpURLConnection connection = null;
    try {
        // sleep between successive requests to avoid flooding the server
        Thread.sleep(1000);
        connection = runQuery(url);
        InputStream stream = connection.getInputStream();
        if (stream != null) {
            BufferedInputStream buff = new BufferedInputStream(stream);
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(buff);
        }
    }
    // I've grouped exception handling for this example
    catch (ParserConfigurationException | InterruptedException | SAXException | IOException e) {
        e.printStackTrace();
    }
    finally {
        if (connection != null) connection.disconnect();
    }
    return null;
}

// opens the connection and sets the request headers; the caller is responsible for disconnecting
private HttpURLConnection runQuery(String url) throws MalformedURLException, IOException {
    HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
    connection.setRequestProperty("User-Agent", "MyAppName/1.0 ( myemail@email.email )");
    return connection;
}
This code gets called multiple times and sometimes I get the following error:
[Fatal Error] :1:1: Content is not allowed in prolog.
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
...
If I open the URL in, say, Chrome, I get a valid XML response every time, no matter how many times I reload. What's more, the issue does not seem to appear when I run the exact same code on my laptop.
After a bit of tinkering, I tried printing the InputStreams directly as strings (using method 4 from this link), rather than parsing them, and I noticed that sometimes the response in fact did not have the expected XML header (<?xml version="1.0" encoding="UTF-8" standalone="yes"?>), but other times it did.
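For reference, a minimal sketch of that kind of check, which buffers the body so it can be inspected first and still parsed afterwards. The class and method names (ResponsePeek, readAll, startsWithXmlDeclaration) are made up for illustration, and UTF-8 decoding is an assumption:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the original code.
class ResponsePeek {

    // Reads the whole stream into memory so the body can be printed and parsed later.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    // True if the body, decoded as UTF-8 (an assumption), begins with an XML
    // declaration, ignoring an optional leading byte order mark.
    static boolean startsWithXmlDeclaration(byte[] body) {
        String text = new String(body, StandardCharsets.UTF_8);
        if (text.startsWith("\uFEFF")) {
            text = text.substring(1);
        }
        return text.startsWith("<?xml");
    }
}
```

The buffered bytes can then be handed to the parser through a new ByteArrayInputStream(body), so the same response is both printed and parsed.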
My guess is I'm doing something wrong with the streams, but I can't figure out what.
I have found the problem. The site sometimes returned a JSON response instead of XML, so the parser failed on the very first character of the body (hence line 1, column 1 in the error). I've added the following line to runQuery:
connection.setRequestProperty("Accept", "application/xml");
and I can now successfully run the code without errors.
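As an extra safeguard, the response's Content-Type could be checked before the body is handed to the XML parser, so that a non-XML response produces a clear message instead of a SAXParseException. This is only a sketch: XmlResponseGuard and isXmlContentType are hypothetical names, and it assumes the server labels XML responses with application/xml, text/xml, or a media type ending in +xml:

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of the original code.
class XmlResponseGuard {

    // Returns true if the Content-Type header value names an XML media type.
    // A null header (no Content-Type sent) is treated as "not XML".
    static boolean isXmlContentType(String contentType) {
        if (contentType == null) {
            return false;
        }
        // Drop parameters such as "; charset=UTF-8" and normalise the case.
        int semicolon = contentType.indexOf(';');
        String mediaType = (semicolon >= 0 ? contentType.substring(0, semicolon) : contentType)
                .trim()
                .toLowerCase(Locale.ROOT);
        return mediaType.equals("application/xml")
                || mediaType.equals("text/xml")
                || mediaType.endsWith("+xml");
    }
}
```

In the method above this would be called with connection.getContentType() before parsing; that accessor is the standard URLConnection one and may return null.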