How to check if XML is well-formed in Elixir

I'm receiving XML files which may not be well-formed, in which case I need to ignore them.

I'm using SweetXml which wraps xmerl.

I have example badly formed XML which doesn't have a space between two attributes.

There is no is_well_formed function - one with a simple boolean response would be great.

Xmerl attempts to parse the file, doesn't like it, and so sends an exit.

I haven't yet learnt about supervisors, but this looks to me like a case for them.

Is there a rookie or simple way of handling that exit signal?

defmodule XmlIsWellFormed.WellFormed do
  def is_well_formed(xml) do
    import SweetXml
    xml_string = to_string xml
    result = xml_string |> parse # parse sends exit.

    # FYI - SweetXml.parse :
    # def parse(doc) do
    #     {parsed_doc, _} = :xmerl_scan.string(doc)
    #     parsed_doc
    # end

    # Note:     inspecting result is no use because xmerl sends an exit with:
    #           "whitespace_required_between_attributes"

    # Something like this would be handy:
    # try do
    #     result = :xmerl_scan.string(xml)
    # rescue
    #     :exit, _ -> nil
    # end
  end
end

rubbish_xml = '<rubbishml><html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-US"xmlns:og="http://ogp.me/ns#" xmlns:fb="http://www.facebook.com/2008/fbml"></rubbishml>'
XmlIsWellFormed.WellFormed.is_well_formed rubbish_xml

Your commented-out attempt uses try/rescue, which only handles exceptions. Exits, on the other hand, can be intercepted with the try/catch construct:

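def is_well_formed(xml) do
  import SweetXml

  try do
    # parse/1 exits (it does not raise) when xmerl hits malformed XML
    xml |> to_string |> parse
    true
  catch
    # match on the :exit kind; the reason is ignored here
    :exit, _ -> false
  end
end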
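
To see the difference in isolation, here is a minimal sketch that does not involve xmerl at all. The module name ExitDemo and both function names are made up for illustration; exit/1 stands in for the exit that xmerl produces.

```elixir
defmodule ExitDemo do
  # Hypothetical demo module, not part of SweetXml or xmerl.

  # `rescue` only matches raised exceptions, so the exit is NOT handled here
  # and propagates to the caller.
  def with_rescue do
    try do
      exit(:boom)
    rescue
      _ -> :rescued
    end
  end

  # `catch :exit, reason` matches exits, so this returns a normal value.
  def with_catch do
    try do
      exit(:boom)
    catch
      :exit, reason -> {:caught, reason}
    end
  end
end

# The exit escapes with_rescue/0, so it has to be caught one level up.
escaped =
  try do
    ExitDemo.with_rescue()
  catch
    :exit, reason -> {:escaped, reason}
  end

IO.inspect(escaped)                # {:escaped, :boom}
IO.inspect(ExitDemo.with_catch())  # {:caught, :boom}
```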

When is_well_formed is called in IEx, xmerl still prints its error message to the console, but the call returns normally and the program continues to execute:

iex> XmlIsWellFormed.WellFormed.is_well_formed ~s(<a b=""c=""/>)
3437- fatal: {whitespace_required_between_attributes}
false

iex> XmlIsWellFormed.WellFormed.is_well_formed ~s(<a b="" c=""/>)
true
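
If the console message is unwanted, one option is to call :xmerl_scan.string/2 directly with xmerl's quiet option. The following is only a sketch: the module name WellFormedSketch and its functions are hypothetical, it accepts binaries only, and it does not address encoding questions for non-ASCII input. In a Mix project you may also need to list :xmerl under extra_applications.

```elixir
defmodule WellFormedSketch do
  # Hypothetical helper, not part of SweetXml or xmerl.

  # Returns {:ok, document} for well-formed input, {:error, reason} otherwise.
  def check(xml) when is_binary(xml) do
    try do
      # xmerl works on lists, so the binary is converted to a list of bytes.
      # quiet: true asks xmerl not to print its own error output.
      {document, _rest} = :xmerl_scan.string(:erlang.binary_to_list(xml), quiet: true)
      {:ok, document}
    catch
      # xmerl signals malformed XML with an exit, not an exception
      :exit, reason -> {:error, reason}
    end
  end

  # Boolean convenience wrapper around check/1.
  def well_formed?(xml), do: match?({:ok, _}, check(xml))
end

IO.inspect(WellFormedSketch.well_formed?(~s(<a b=""c=""/>)))
IO.inspect(WellFormedSketch.well_formed?(~s(<a b="" c=""/>)))
```

Returning a tagged tuple from check/1 keeps the exit reason available for logging, while well_formed?/1 gives the simple boolean asked for in the question.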

However, catching exits and rescuing exceptions like this is uncommon in Elixir. It is usually better to design your application with a supervision tree, so that it knows how to restart the failed part properly. Then you can just let it crash, and the supervisor will take care of the rest.
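
As a rough illustration of that idea, the sketch below runs the parse in a separate task under a Task.Supervisor, so a malformed file only crashes that task and the caller just observes the outcome. The module name IsolatedParse, the function well_formed?/3 and the 5 second default timeout are assumptions made for this example. In a real application the Task.Supervisor would normally be started as a child in your supervision tree rather than with start_link in a script, and the crashed task will usually show up in the log.

```elixir
defmodule IsolatedParse do
  # Hypothetical helper, not part of SweetXml or xmerl.
  # Runs the xmerl scan in an unlinked task so that an exit inside the
  # scan terminates only the task, never the calling process.
  def well_formed?(supervisor, xml, timeout \\ 5_000) when is_binary(xml) do
    task =
      Task.Supervisor.async_nolink(supervisor, fn ->
        :xmerl_scan.string(:erlang.binary_to_list(xml), quiet: true)
      end)

    # Wait for the task; if it is still running after the timeout, shut it down.
    case Task.yield(task, timeout) || Task.shutdown(task) do
      {:ok, _parsed} -> true
      # {:exit, reason} when the scan exited, nil when it timed out
      _ -> false
    end
  end
end

{:ok, sup} = Task.Supervisor.start_link()

IO.inspect(IsolatedParse.well_formed?(sup, ~s(<a b="" c=""/>)))
IO.inspect(IsolatedParse.well_formed?(sup, ~s(<a b=""c=""/>)))
```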
