Extract numeric data from a string in Groovy

I am given a string that can include both text and numeric data:

Examples:

"100 pounds", "I think 173 lbs", "73 lbs."

I am looking for a clean way to extract only the numeric data from these strings.

Here is what I'm currently doing to strip the response:

def stripResponse(String response) {
    if(response) {
        def toRemove = ["lbs.", "lbs", "pounds.", "pounds", " "]
        def toMod = response
        for(remove in toRemove) {
            toMod = toMod?.replaceAll(remove, "")
        }
        return toMod
    }
}

You could use findAll then convert the results into Integers:

def extractInts( String input ) {
  input.findAll( /\d+/ )*.toInteger()
}

assert extractInts( "100 pounds is 23"  ) == [ 100, 23 ]
assert extractInts( "I think 173 lbs"   ) == [ 173 ]
assert extractInts( "73 lbs."           ) == [ 73 ]
assert extractInts( "No numbers here"   ) == []
assert extractInts( "23.5 only ints"    ) == [ 23, 5 ]
assert extractInts( "positive only -13" ) == [ 13 ]

If you need decimals and negative numbers, you might use a more complex regex:

def extractInts( String input ) {
  input.findAll( /-?\d+\.\d*|-?\d*\.\d+|-?\d+/ )*.toDouble()
}

assert extractInts( "100 pounds is 23"   ) == [ 100, 23 ]
assert extractInts( "I think 173 lbs"    ) == [ 173 ]
assert extractInts( "73 lbs."            ) == [ 73 ]
assert extractInts( "No numbers here"    ) == []
assert extractInts( "23.5 handles float" ) == [ 23.5 ]
assert extractInts( "and negatives -13"  ) == [ -13 ]
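For the original question, where each response carries a single weight, a variant that returns only the first number may be enough. The following is a minimal sketch, not part of the answer above: the name firstNumber and the slightly different pattern (which requires digits on both sides of the decimal point) are assumptions made for illustration. It relies on String.find, which returns the first match or null.

```groovy
// Hypothetical helper: returns the first number in the input as a BigDecimal,
// or null when the input contains no digits.
BigDecimal firstNumber(String input) {
    // Optional minus sign, digits, then an optional fractional part.
    // This pattern is an assumption; adjust it if inputs like ".5" must match.
    input.find(/-?\d+(?:\.\d+)?/)?.toBigDecimal()
}

assert firstNumber("100 pounds") == 100
assert firstNumber("I think 173 lbs") == 173
assert firstNumber("73 lbs.") == 73
assert firstNumber("about -2.5 degrees") == -2.5
assert firstNumber("No numbers here") == null
```

Returning null for input without digits lets the caller decide how to handle a missing value.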

I'm adding this for anyone else who needs it.

Rather than open a new question: all I needed was a single number from a string, and I got it with a regex replacement.

def extractInt( String input ) {
  return input.replaceAll("[^0-9]", "")
}

Here input could be this.may.have.number4.com, and the method returns the string "4". Note that this removes every non-digit character, so if the input contains several separate numbers their digits are joined together.

I was getting an error with the answer above (probably due to my Jenkins version): java.lang.UnsupportedOperationException: spread not yet supported in input.findAll(\\d+)*.toInteger(). The Jenkins tracker says this is resolved.

Hope this helps.
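If the string can contain more than one group of digits, a variant that takes only the first group and avoids the spread operator might look like the sketch below. The name firstInt is hypothetical; the code relies on String.find, which returns the first regex match or null.

```groovy
// Hypothetical helper: returns the first run of digits as an Integer,
// or null when the input contains no digits. No spread operator is used.
Integer firstInt(String input) {
    input.find(/\d+/)?.toInteger()
}

assert firstInt("this.may.have.number4.com") == 4
assert firstInt("build12.rev7") == 12
assert firstInt("no digits") == null

// For comparison, stripping non-digits joins the separate numbers together.
assert "build12.rev7".replaceAll("[^0-9]", "") == "127"
```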

After adding the numbersFilter method below to String via metaClass, you can call it as follows:

assert " i am a positive number 14".numbersFilter() == [ 14 ]
assert " we 12 are 20.3propaged 10.7".numbersFilter() == [ 12,20.3,10.7 ]
assert " we 12 a20.3p 10.7 ,but you can select one".numbersFilter(0) == 12
assert " we 12 a 20.3 pr 10.7 ,select one by index".numbersFilter(1) == 20.3

Add this code to your BootStrap so that it runs at startup:

String.metaClass.numbersFilter = { index = -1 ->
    def tmp = delegate.findAll( /-?\d+\.\d*|-?\d*\.\d+|-?\d+/ )*.toDouble()
    if (index <= -1) {
        // No index given: return every number found
        return tmp
    }
    // An out-of-range index falls back to the last number found
    return tmp.size() > index ? tmp[index] : tmp.last()
}

Since input.findAll( /\d+/ )*.toInteger() doesn't work in Jenkins, you can use collect instead of the spread operator:

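def packageVersion = "The package number is 9.2.5847.1275"
def nextversion = packageVersion.findAll( /\d+/ ).collect{ "$it".toInteger() }
// Increment the last segment by index: pop() removes the last element in Groovy 2.4 but the first in Groovy 2.5+
nextversion[-1] = nextversion[-1] + 1
nextversion = nextversion.join('.')

Result: 9.2.5847.1276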

Here is an alternative without regex. It splits the string into tokens and converts each one into a number, or into null if it is not numeric. The null values are then removed and only the first remaining entry is returned, as the question requires.

def extractNumericData(String response) {
    response.split(' ')
        .collect { it.isFloat() ? Float.parseFloat(it) : null }
        .findAll { it != null }   // compare with null so that a value of 0 is kept
        .first()
}

assert 100 == extractNumericData("100 pounds")
assert 173 == extractNumericData("I think 173 lbs")
assert 73 == extractNumericData("73 lbs.")
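A variation on the same token-based idea can return every numeric token instead of only the first, which also means an input without numbers yields an empty list rather than an exception from first(). This is a rough sketch: the name numericTokens is hypothetical, and it assumes each number is a whitespace-separated token of its own.

```groovy
// Hypothetical helper: returns every whitespace-separated token that parses
// as a number. Tokens with attached text such as "73lbs" are skipped.
List<BigDecimal> numericTokens(String response) {
    response.tokenize()
        .findAll { it.isBigDecimal() }
        .collect { it.toBigDecimal() }
}

assert numericTokens("100 pounds") == [ 100 ]
assert numericTokens("0 lbs or 2.5 kg") == [ 0, 2.5 ]
assert numericTokens("73lbs") == []
assert numericTokens("No numbers here") == []
```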

When parsing a file line by line, it can be useful to combine String.contains, String.replaceAll (to replace every run of non-digit characters with a space), and String.split(), like this:

if (line.contains("RESULT:")) {
    l = line.replaceAll("[^0-9]+", " ")
    a = l.split()
    pCount1 = Integer.parseInt(a[0])
    pCount2 = Integer.parseInt(a[1])
}

That said, the String.findAll solutions are better. The equivalent is:

if (line.contains("RESULT:")) {
    a = line.findAll( /\d+/ )*.toInteger()
    pCount1 = a[0]
    pCount2 = a[1]
}
