Hebrew text parsing using regex in Java

I am trying to match Hebrew text with a regex in Java, but the match never succeeds. Can anyone help?

    String hebrewSearhString  = "חן";

    //String regexHebrewPattern = "([\\u0591-\\u05F4\\s]+)"; // Tried this too, with the same result
    String regexHebrewPattern = "([\\p{InHebrew}]+)"; 

    Pattern patternHebrew = Pattern.compile(regexHebrewPattern, Pattern.UNICODE_CASE);
    Matcher matcherHebrew = pattern.matcher(hebrewSearhString);

    if(matcherHebrew.matches()) {
        System.out.println("Whole -"+ matcherHebrew.group(0));
        //System.out.println("Group 1 -"+ matcherHebrew.group(1));
        //System.out.println("Group 2 -"+ matcherHebrew.group(2));
    }

Result: the "if" condition never evaluates to true.

Thanks

This line uses pattern rather than the patternHebrew you just compiled:

    Matcher matcherHebrew = pattern.matcher(hebrewSearhString);

It should be

    Matcher matcherHebrew = patternHebrew.matcher(hebrewSearhString);

With that change, I get the output

    Whole -חן

because the "if" condition does evaluate to true.
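For reference, here is a minimal self-contained sketch of the same idea. The class and method names (HebrewRegexDemo, isAllHebrew, firstHebrewRun) are made up for illustration and are not from the question. It also shows the difference between matches(), which requires the whole input to be Hebrew, and find(), which may be what you want if the Hebrew text is embedded in a longer string. The UNICODE_CASE flag is dropped because, as far as I know, it only has an effect together with CASE_INSENSITIVE.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HebrewRegexDemo {
    // \p{InHebrew} matches any character in the Unicode Hebrew block.
    private static final Pattern HEBREW = Pattern.compile("\\p{InHebrew}+");

    // matches() succeeds only if the entire input consists of Hebrew-block characters.
    static boolean isAllHebrew(String text) {
        return HEBREW.matcher(text).matches();
    }

    // find() locates the first run of Hebrew characters in mixed text; returns null if there is none.
    static String firstHebrewRun(String text) {
        Matcher m = HEBREW.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(isAllHebrew("חן"));            // true
        System.out.println(isAllHebrew("abc חן"));        // false: Latin letters and a space are outside the block
        System.out.println(firstHebrewRun("abc חן def")); // חן
    }
}
```

Note that a space is not part of the Hebrew block, so a multi-word phrase will fail matches() unless the character class also allows whitespace, as in the commented-out pattern in the question.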
