How to split parsed String data without special characters?

I parsed this data from Wikipedia and am trying to extract only the names from it, but each entry in the result comes with \\n* in front of it.

" ": "=== 고양이의 종류 ===\\n [[시암고양이]]\\n* [[페르시안 네브스카야]]\\n* [[페르시안]]\\n* [[노르웨이지언 포레스트]]\\n* [[터키시 앙고라]]\\n* [[아메리칸 숏헤어]]\\n* [[브리티시 숏헤어]]\\n* [[러시안블루]]\\n* [[뱅갈]]\\n* [[메인쿤]]\\n* [[랙돌]]\\n* [[히말라얀]]\\n* [[재패니즈 밥테일]]\\n* [[오리엔탈 숏헤어]]\\n* [[피터볼드]]\\n* [[스코티시 폴드]]\\n* 스코티시 스트레이트\\n* [[하일랜드 폴드]]\\n* [[시베리안 포레스트]]\\n* [[터키시 반]]\\n* [[코리안 쇼트헤어]]\\n* [[올블랙]]\\n* [[사바나캣]]\\n* [[쿠나]]\\n* [[아비시니안]]\\n* 먼치킨"

This is my code.

try {
        URL url = new URL("https://ko.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&rvsection=20&titles=%EA%B3%A0%EC%96%91%EC%9D%B4&format=json");
        URLConnection con = url.openConnection();
        InputStream is = con.getInputStream();
        InputStreamReader isr = new InputStreamReader(is);
        BufferedReader reader = new BufferedReader(isr);

        while(true){
            String data = reader.readLine();
            if(data == null) break;
            result += data;
        }
        JSONObject obj = new JSONObject(result);
        JSONObject query = (JSONObject) obj.get("query");
        JSONObject pages = (JSONObject) query.get("pages");
        JSONObject pageid = (JSONObject) pages.get("93349");
        JSONArray revisions = (JSONArray) pageid.get("revisions");
        String catcat = String.valueOf(revisions);
        String star = "\n*";
        catcat = catcat.replaceAll("\\[\\[","").replaceAll("\\]\\]",",").replaceAll("\\r|\\n", "").replaceAll(star,"");
        String[] catcategory = catcat.split(",");


        for (int i = 0; i < catcategory.length; i++) {
            list.add(catcategory[i]);
        }
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (JSONException e) {
        e.printStackTrace();
    }

The result looks like this:

\\n 시암고양이
\\n 페르시안

and I want to remove the \\n*.

Everything is correct except one line, where you need to escape both the asterisk and the backslash:

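// Regex for a literal backslash, the letter n, and a literal asterisk.
String star = "\\\\n\\*";
// Strings are immutable, so the result of replaceAll must be assigned back.
catcat = catcat.replaceAll(star, "");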
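
To see why the escaping matters, here is a minimal sketch. It assumes the text contains a literal backslash followed by n (as in the serialized JSON), not a real line break. The class and method names are made up for illustration, and the sample string is invented. Used as a regex, "\n*" means "zero or more newline characters", so it never touches the literal marker.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Hypothetical helper: removes every literal backslash-n-asterisk sequence
    // using the hand-escaped pattern from the answer above.
    static String stripMarker(String input) {
        return input.replaceAll("\\\\n\\*", "");
    }

    // Same idea, but Pattern.quote builds the literal pattern for us.
    static String stripMarkerQuoted(String input) {
        return input.replaceAll(Pattern.quote("\\n*"), "");
    }

    public static void main(String[] args) {
        // Invented sample: backslash + n + asterisk appear as plain characters.
        String sample = "[[A]]\\n* [[B]]\\n* C";

        // "\n*" as a regex matches runs of real newlines (or nothing),
        // so the literal marker is left untouched.
        System.out.println(sample.replaceAll("\n*", ""));

        System.out.println(stripMarker(sample));
        System.out.println(stripMarkerQuoted(sample));
    }
}
```

Pattern.quote can be easier to read than counting backslashes when the text to remove is fixed.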


Try this piece of code. It removes the \\n*, and then you can add _result_word to your list.

    for (int i = 0; i < catcategory.length; i++) {
        try {
            // replaceFirst takes a regex: "\\\\n" matches a literal backslash followed by n.
            // replace takes plain text, so the asterisk needs no escaping there.
            String _result_word = catcategory[i].replaceFirst("\\\\n", "").replace("*", "");
            //String _result_word=catcategory[i].replaceFirst("\\\\n", "").replace("*", "").replaceFirst("\\\\n", "").replace("*", "");
            System.out.println("" + _result_word);
            list.add(_result_word);
        } catch (Exception ex) {
            System.out.println("Special Exception occurred at index : i = " + i);
            ex.printStackTrace();
        }
    }
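
As an alternative sketch (not taken from either answer), the list can also be split directly on the marker instead of on commas. This picks up entries that have no [[ ]] brackets as well. The class name, the method name, and the rule that skips the heading (anything starting with "=") are assumptions for illustration, and the input is assumed to contain a literal backslash followed by n.

```java
import java.util.ArrayList;
import java.util.List;

class BreedListDemo {
    // Hypothetical helper: splits the raw wikitext on a literal backslash-n
    // (optionally followed by an asterisk) and strips the [[ ]] link brackets.
    static List<String> extractNames(String raw) {
        List<String> names = new ArrayList<>();
        for (String part : raw.split("\\\\n\\*?")) {
            String name = part.replace("[[", "").replace("]]", "").trim();
            // Assumption: lines starting with '=' are section headings, not entries.
            if (!name.isEmpty() && !name.startsWith("=")) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Shortened sample in the same shape as the data in the question.
        String raw = "=== 고양이의 종류 ===\\n [[시암고양이]]\\n* [[페르시안]]\\n* 먼치킨";
        for (String name : extractNames(raw)) {
            System.out.println(name);
        }
    }
}
```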
