
Java regex to extract a specific weight from a larger string that also contains a date

I have a string:

"Today 31.12.2014g we receive goods. These weight is 31.12g (23.03.2014)"

The 31.12.2014g is not a mistake: some date labels in the text are followed by the letter g (without a space).

I need to extract only the weight value from the string (not the date value), but my regex:

[0-9]+\\.[0-9]+g

extracts part of the date too :(

My results (two matches):

12.2014g

31.12g <- I need only this one!
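
For reference, the behaviour described above can be reproduced with a short sketch. The class name NaiveWeightRegex and the helper findAll are made up for illustration; only the regex and the sample string come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name and helper are illustrative only.
class NaiveWeightRegex {
    // Collects every substring that the given regex finds in the text.
    static List<String> findAll(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String data = "Today 31.12.2014g we receive goods. These weight is 31.12g (23.03.2014)";
        // The tail of the date, "12.2014g", also has the shape digits.digits followed by 'g'.
        System.out.println(findAll("[0-9]+\\.[0-9]+g", data)); // [12.2014g, 31.12g]
    }
}
```

The date tail is matched because nothing in the pattern restricts what may come before the first run of digits.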

You can add negative lookbehinds to make sure that the part you are interested in is not preceded by something you don't want. In your case that seems to be:

  • a run of digits (say between 1 and 10 of them) followed by a dot, as in

     31.12.2014g
     ^^^

  • any digit at all, so that we match the entire value and not just part of it. Otherwise, in

     31.12.2014g
         ^^^^^^^

    the part 2.2014g would pass the first negative lookbehind (it is preceded by "1", not by digits and a dot), so we also need to require that the matched part has no digit directly before it.

So try something like

(?<!\\d{1,10}\\.)(?<!\\d)\\d+\\.\\d+g

BTW, \d (which is written as "\\d" in a Java string literal) represents [0-9]. You can change it back if you want.

Demo:

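import java.util.regex.Matcher;
import java.util.regex.Pattern;

String data = "Today 31.12.2014g we receive goods. These weight is 31.12g (23.03.2014)";
// Two negative lookbehinds: no "digits." and no digit directly before the match.
Pattern p = Pattern.compile("(?<!\\d{1,10}\\.)(?<!\\d)\\d+\\.\\d+g");
Matcher m = p.matcher(data);
while (m.find()) {
    System.out.println(m.group());
}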

Output: 31.12g

You could require whitespace directly before the number:

\\s[0-9]+\\.[0-9]+g // -> " 31.12g"
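
Since that match includes the leading space, one possible variation is to wrap the number in a capturing group and read group 1 instead. This is only a sketch: the class name WhitespaceWeightRegex and the method firstWeight are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class WhitespaceWeightRegex {
    // Group 1 holds the weight without the whitespace that precedes it.
    private static final Pattern WEIGHT = Pattern.compile("\\s([0-9]+\\.[0-9]+g)");

    // Returns the first weight found, or null if there is none.
    static String firstWeight(String text) {
        Matcher m = WEIGHT.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String data = "Today 31.12.2014g we receive goods. These weight is 31.12g (23.03.2014)";
        System.out.println(firstWeight(data)); // 31.12g
    }
}
```

Note that a weight at the very start of the string would be missed, because the pattern insists on a whitespace character before it.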

… or assume there are always exactly two decimal places:

[^\\.][0-9]+\\.[0-9]{2}g // -> " 31.12g" (the match includes the character before the number; this will also fail if the date is written as DD.MM.YYg)
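
A quick way to check both the normal case and the DD.MM.YYg caveat is a small sketch like the following. The class name TwoDecimalWeightRegex and the second sample string are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and the second sample string are illustrative only.
class TwoDecimalWeightRegex {
    private static final Pattern WEIGHT = Pattern.compile("[^\\.][0-9]+\\.[0-9]{2}g");

    // Collects every match, including the single non-dot character in front of the digits.
    static List<String> findAll(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = WEIGHT.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // One match: " 31.12g" (note the leading space consumed by [^\.]).
        System.out.println(findAll("Today 31.12.2014g we receive goods. These weight is 31.12g (23.03.2014)"));
        // With a two-digit year the date tail "12.14g" is matched as well.
        System.out.println(findAll("Today 31.12.14g we receive 31.12g"));
    }
}
```

Calling trim() on each result, or adding a capturing group around the digits, would drop the extra leading character.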

… or work with the date:

[0-9]?[0-9]\\.[0-9]?[0-9]\\.[0-9]?[0-9]?[0-9][0-9]g?.+(\\b[0-9]+\\.[0-9]+g) // -> full match: "31.12.2014g we receive goods. These weight is 31.12g", group 1: "31.12g"
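
With the date-based pattern, the weight has to be read from capturing group 1 rather than from the whole match. A minimal sketch, assuming the weight always follows a date on the same line (the class name DateAnchoredWeightRegex and the method weightAfterDate are invented for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative only.
class DateAnchoredWeightRegex {
    // Date (optionally followed by 'g'), then any text, then the weight in group 1.
    private static final Pattern DATE_THEN_WEIGHT = Pattern.compile(
            "[0-9]?[0-9]\\.[0-9]?[0-9]\\.[0-9]?[0-9]?[0-9][0-9]g?.+(\\b[0-9]+\\.[0-9]+g)");

    // Returns the weight captured after a date, or null if the pattern does not match.
    static String weightAfterDate(String text) {
        Matcher m = DATE_THEN_WEIGHT.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String data = "Today 31.12.2014g we receive goods. These weight is 31.12g (23.03.2014)";
        System.out.println(weightAfterDate(data)); // 31.12g
    }
}
```

Because .+ is greedy, this picks up the last weight that follows the date, and it returns null when no date precedes the weight.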
