
Search and replace text using Groovy and regex

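I need a Groovy method that finds every occurrence of a number in a text and increments its value by 1.

Given this multi-line, comma-separated text file: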

AT,3,15,"Company Name","1 High Street","LONDON"," "," "," ","SE5 9AA"
TH,6,118316128,01,118316128,"HSYUD8292",19063,20220707,"4133339"," "," ","1800070",1,20220622,"SDD1880842M102580"
RE,6,13915,"10628","Retail Group US ","T/A Retail Group Illinois","Long Bridge Retail Park"

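I have to match a number, in this case 118316128, and increment it by 1 before writing it back to the file. The number will be different every time.

My method (currently using hard-coded test data) matches the first instance and successfully replaces it with 99999: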

import java.util.regex.Pattern

// Matches the digits that directly follow "TH,6,"
Pattern IdPattern = Pattern.compile("(?<=TH,6,)[0-9]+")

def replaceIDs(sourcePath, IdPattern) {
    def source = new File(sourcePath)
    def text = source.text
    source.withWriter { w ->
        w << text.replaceAll(IdPattern, "99999") // "99999" is dummy text for now
    }
}

Is there a neat way to match both instances and increment each by one, so that 118316128 becomes 118316129 in both places?

I'm learning Groovy, so be gentle :)

Good news: you don't need a regex. Depending on your use case, if it is a one-off replacement of one string with another, you can use

yourString.replaceAll("118316128,", "118316129,")
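For the one-off case, a minimal sketch of doing the replacement and writing the result back to the file could look like the following. The name replaceInFile is made up for illustration. It uses String.replace rather than replaceAll, because replace treats both arguments as literal text, whereas replaceAll interprets the first argument as a regex.

```groovy
// Hypothetical helper: replaces literal text in a file and writes the result back.
// Assumes the file is small enough to hold in memory and uses the default charset.
File replaceInFile(File file, String oldText, String newText) {
    // String.replace does a literal (non-regex) replacement of every occurrence
    file.text = file.text.replace(oldText, newText)
    file
}
```

Including the trailing comma in both arguments, as in the one-liner above, helps avoid touching a longer number that merely starts with the same digits.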

If you do this often, I would read each line into a list and then replace the list elements with the new values.
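A rough sketch of that line-by-line idea, under some assumptions of mine: the ID to increment is always the third field of a line starting with "TH,", that field is always numeric, and the hypothetical helper incrementIds is just an illustrative name. Splitting on every comma is naive about quoted fields, but because the pieces are joined back with commas the rest of the line comes out unchanged.

```groovy
// Hypothetical helper: on every "TH" line, increments the ID in the third field
// and every other field on that line that holds the same value.
String incrementIds(String text) {
    text.readLines().collect { String line ->
        if (!line.startsWith('TH,')) {
            return line                      // other record types pass through untouched
        }
        // limit -1 keeps trailing empty fields so the line round-trips exactly
        List<String> fields = line.split(',', -1) as List<String>
        String oldId = fields[2]
        String newId = (oldId.toBigInteger() + 1).toString()
        fields.collect { it == oldId ? newId : it }.join(',')
    }.join('\n')
}
```

Note that readLines drops the original line terminators, so this sketch always joins with '\n' and does not keep a trailing newline.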

Option three is to use a CSV reader and work with the data as rows and columns.

The last option is to use a regex.

I would use the full power of Java's Pattern/Matcher API and write it like this:

String csv = '''\
AT,3,15,"Company Name","1 High Street","LONDON"," "," "," ","SE5 9AA"
TH,6,118316128,01,118316128,"HSYUD8292",19063,20220707,"4133339"," "," ","1800070",1,20220622,"SDD1880842M102580"
TH,6,118317000,01,118317000,"HSYUD8292",19063,20220707,"4133339"," "," ","1800070",1,20220622,"SDD1880842M102580"
TH,6,118318000,01,118319000,"HSYUD8292",19063,20220707,"4133339"," "," ","1800070",1,20220622,"SDD1880842M102580"
RE,6,13915,"10628","Retail Group US ","T/A Retail Group Illinois","Long Bridge Retail Park"'''

def matcher = csv =~ /,(\d{9,20}),/       

StringBuffer sb = new StringBuffer()

while( matcher.find() ){
  def incNum = matcher.group(1).toBigInteger() + 1  // BigInteger: \d{9,20} can exceed the int range
  matcher.appendReplacement sb, ',' + incNum + ','
}
matcher.appendTail sb

sb

sb then contains:

AT,3,15,"Company Name","1 High Street","LONDON"," "," "," ","SE5 9AA"
TH,6,118316129,01,118316129,"HSYUD8292",19063,20220707,"4133339"," "," ","1800070",1,20220622,"SDD1880842M102580"
TH,6,118317001,01,118317001,"HSYUD8292",19063,20220707,"4133339"," "," ","1800070",1,20220622,"SDD1880842M102580"
TH,6,118318001,01,118319001,"HSYUD8292",19063,20220707,"4133339"," "," ","1800070",1,20220622,"SDD1880842M102580"
RE,6,13915,"10628","Retail Group US ","T/A Retail Group Illinois","Long Bridge Retail Park"

You may need to tweak the regex to include or exclude numbers.
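One possible way to narrow the match, sketched here with Groovy's replaceAll(regex) { closure } rather than the Matcher loop above: take the ID from the question's "TH,6," prefix and replace it only where it fills a whole comma-delimited field on that same line. The helper name bumpIds and the assumption that every ID line starts with "TH,6," are mine, not part of the original answer.

```groovy
// Hypothetical helper: for each line starting with "TH,6,<digits>,", increments
// that ID wherever it appears as a complete field on the same line.
String bumpIds(String text) {
    // (?m) makes ^ match at the start of every line; '.' stops at the line end
    text.replaceAll(/(?m)^TH,6,(\d+),.*/) { String line, String id ->
        String next = (id.toBigInteger() + 1).toString()
        // (?<![^,]) and (?![^,]) mean "preceded/followed by a comma or the line edge"
        line.replaceAll(/(?<![^,])${id}(?![^,])/, next)
    }
}
```

With one capture group and a two-parameter closure, Groovy passes the whole match first and the group second, and the closure's return value is inserted literally.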

An example using commons-csv. You can use any stream for input and output.

import org.apache.commons.csv.CSVFormat
import org.apache.commons.csv.CSVPrinter
import org.apache.commons.csv.CSVRecord
import spock.lang.Specification

class CSVTest extends Specification {
    void processCSV(Reader inputStream, Appendable outputStream, String fromNumber, String toNumber) {
        CSVPrinter csvPrinter = new CSVPrinter(outputStream, CSVFormat.DEFAULT)

        Iterable<CSVRecord> records = CSVFormat.DEFAULT.parse(inputStream)

        for (CSVRecord record: records) {
            List<String> values = record.toList()

            for (int i = 0; i < values.size(); i++) {
                if (values.get(i) == fromNumber) {
                    values.set(i, toNumber)
                }
            }

            csvPrinter.printRecord(values)
        }
    }

    def 'replace CSV number'() {
        given:
        String input = """\
111,"2",2,222,333\r
444,222,2,"2,2,",555\r
"""
        String expectedOutput = """\
111,3,3,222,333\r
444,222,3,"2,2,",555\r
"""

        String fromNumber = '2'
        String toNumber = '3'

        Reader inputStream = new StringReader(input)
        Appendable outputStream = new StringBuffer()

        when:
        processCSV(inputStream, outputStream, fromNumber, toNumber)

        then:
        expectedOutput == outputStream.toString()
    }
}
