Using regex to remove special characters from retrieved XML data

I retrieved some XML data from a web service, and some elements contain HTML tags such as <br> and <p>, which makes the data impossible to process. How do I remove these special characters using a regex? Also, can anybody suggest why my description is not printed on the UI when I run it? Here is my code for retrieving the data:

static final String URL = "http://api.eventful.com/rest/events/search?app_key=42t54cX7RbrDFczc&location=singapore";
    // XML node keys

    static final String KEY_DESC = "description";

    String description = KEY_DESC.replaceAll("<.*?/>", "");

for (int i = 0; i < nl.getLength(); i++) {
            // creating new HashMap
            HashMap<String, String> map = new HashMap<String, String>();
            Element e = (Element) nl.item(i);
            map.put(description, "Description: " + parser.getValue(e, description));            
            // adding HashList to ArrayList
            menuItems.add(map);

        }
ListAdapter adapter = new SimpleAdapter(this, menuItems,
                R.layout.list_item, new String[] {description}, new int[] {R.id.description});

String description1 = ((TextView) view
                        .findViewById(R.id.description)).getText().toString();

                // Starting new intent
                Intent in = new Intent(getApplicationContext(),
                        SingleMenuItemActivity.class);
                in.putExtra(KEY_TITLE, title);
                in.putExtra(KEY_DESC, description1);

                startActivity(in);

Here is an example of a description containing the special characters:

<p><strong>All Real Estate Agents!</strong><p><strong>INCREASE Your Sales
Selling Commercial  & Industrial</strong><p><strong>even with the LATEST
MAS MEASURES!</strong><p><strong>by The Comm/Ind Consultant - David
Poh</strong><p><strong> </strong><p>Do you know that:<ul><li>Comm/Ind Real
Estate will be the star performer for the year 2013/2014.<li>Comm/Ind Real
Estate investment enjoys both High Yield and Capital Gain.<li>Comm/Ind Real
Estate investment is not affected by cooling
measures.</li></li></li></ul><br><strong>How You Will
Benefit</strong><ul><li>Break through your income<li>Work-life balance -
office hours only<li>Serve and advice investors better<li>Investor
retention<li>Hot Spots to advice investors to invest<li>Learn more about
technical aspects<li>Common pitfalls to avoid<li>Updates on real estate
market</li></li></li></li></li></li></li></li></ul><br><p> <strong>THE
SPEAKER: DAVID POH</strong><p> <p>David Poh has spent many years studying
and understanding the commercial & industrial real estate market in
Singapore. He manages companies that specialize in commercial & industrial
investments, investments training, and real estate funds. He has more than
15 years of real estate experience and has trained thousands of
practitioners in real estate. s achievement is prominent as he is the first
and Life Long Champion  winner in PropNex, s largest real estate company.
Today he is leading the biggest team in  David Poh & Associates. He is also
a sought-after trainer in this arena. He has trained many top-notch real
estate agents, who remain top producers in the industry today. David is
often interviewed on TV (Channel 5/8/U/NewsAsia), radio, and newspaper for
his views and analysis in real estate market trends and related
issues.<p><strong>Come hear for yourselves on how you could make money from
Comm/Ind Investments.</strong><p> <p><strong>Free 4 Hours Intensive
Workshop Dates </strong><p>Date: 26 July 2013 (Friday), 7 August 2013
(Wednesday)<p>Time:  6.30pm<p>Venue: 10 Anson Road International Plaza
#12-12, S(079903)<p><br><p><a href="http://www.asiawisdom.com.sg/"
rel="nofollow">Visit us at
asiawisdom.com.sg</a></p></p></p></p></p></p></p></p></p></p></p></p></p></p></p></p></p>2.30pm
PropNex SingaporeAwardonly David

This is the edited code:

public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);

        ArrayList<HashMap<String, String>> menuItems = new ArrayList<HashMap<String, String>>();

        XMLParser parser = new XMLParser();
        String xml = parser.getXmlFromUrl(URL); // getting XML
        Document doc = parser.getDomElement(xml); // getting DOM element

        NodeList nl = doc.getElementsByTagName(KEY_EVENT);
        // looping through all item nodes <item>
        for (int i = 0; i < nl.getLength(); i++) {
            // creating new HashMap
            HashMap<String, String> map = new HashMap<String, String>();
            Element e = (Element) nl.item(i);
            // adding each child node to HashMap key => value
            map.put(KEY_TITLE, parser.getValue(e, KEY_TITLE));
            map.put(KEY_URL, parser.getValue(e, KEY_URL));
            map.put(KEY_DESC, "Description: " + parser.getValue(e, KEY_DESC));
            map.put(KEY_START_TIME, parser.getValue(e, KEY_START_TIME));
            map.put(KEY_STOP_TIME, parser.getValue(e, KEY_STOP_TIME));
            map.put(KEY_VENUE_NAME, parser.getValue(e, KEY_VENUE_NAME));
            map.put(KEY_COUNTRY_NAME, parser.getValue(e, KEY_COUNTRY_NAME));
            // adding HashList to ArrayList
            KEY_DESC.replaceAll("\n","").replaceAll("</{0,1}.+?>", "");

            menuItems.add(map);




        }

If I understand you correctly, the problem is that the HTML tags are causing errors in the XML you are generating. This answer will help with that: How to encode XML on Android?
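If the goal is to keep the markup but make it safe to embed in XML, the reserved characters can be escaped instead of removed. The following is only a minimal sketch of that idea, not the code from the linked answer; the class name XmlEscaper and the method escapeXml are hypothetical names chosen for illustration.

```java
public class XmlEscaper {

    // Replaces the five characters that XML reserves with their entities.
    // The ampersand must be handled first, otherwise the '&' in the
    // entities produced by the later replacements would be escaped again.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;")
                   .replace("'", "&apos;");
    }

    public static void main(String[] args) {
        System.out.println(escapeXml("<p>Commercial & Industrial</p>"));
    }
}
```

This uses String.replace, which works on literal text, so no regex escaping is involved.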

If you are just trying to remove the angle brackets, it's probably just as easy to do it without a regex. At worst, you could just loop over the string characters, test which ones are '<' or '>', and change or delete them.
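The character loop described above could look roughly like this. It is a sketch under the assumption that only the '<' and '>' characters themselves should be dropped; the class and method names (BracketRemover, removeAngleBrackets) are made up for the example.

```java
public class BracketRemover {

    // Copies every character except '<' and '>' into a new string.
    static String removeAngleBrackets(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c != '<' && c != '>') {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeAngleBrackets("<p>Do you know that:</p>"));
    }
}
```

Note that this leaves the tag names behind (the p and /p in the sample call), so it only helps when the brackets alone are the problem.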

Use the regex </{0,1}.+?> instead of <.*?/>. The original pattern only matches text that ends in />, so it never matches tags such as <p>, <br> or </li>.

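It is also best to replace all the \n characters inside the text first, because . does not match a line break by default and a tag that is split across two lines would otherwise be missed. Keep the quantifier lazy (.+?), since a greedy .+ would delete everything between the first < and the last >. Strings are immutable, so store the result:

String cleaned = someStrings.replaceAll("\n", " ").replaceAll("</{0,1}.+?>", "");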
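Putting this together for the question's code: the replacement has to be applied to the description text (the value returned by parser.getValue), not to the KEY_DESC constant, and the returned string has to be used. The sketch below assumes a hypothetical helper class TagStripper with a stripTags method; only the regex comes from this answer.

```java
public class TagStripper {

    // Strips anything that looks like an HTML tag from the given text.
    // Newlines are turned into spaces first so that a tag split across
    // two lines is still matched by the lazy pattern.
    static String stripTags(String html) {
        if (html == null) {
            return "";
        }
        return html.replaceAll("\n", " ")
                   .replaceAll("</{0,1}.+?>", "")
                   .trim();
    }

    public static void main(String[] args) {
        String raw = "<p><strong>All Real Estate Agents!</strong>"
                   + "<ul><li>Comm/Ind Real\nEstate</li></ul><br>";
        System.out.println(stripTags(raw));
    }
}
```

In the question's loop this would be used along the lines of map.put(KEY_DESC, "Description: " + TagStripper.stripTags(parser.getValue(e, KEY_DESC)));. Be aware that the pattern is a blunt tool: plain text that happens to contain both < and > (for example "a < b and c > d") is removed as well, so a real HTML parser may be the safer choice for untrusted input.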
