Java regex: match the word before whitespace and the number after it to generate CSV

At the moment I'm using this simple regex:

[^\s]

I cobbled it together with the help of these docs.

It can grab the following information:

[image: enter image description here]

However the full dataset looks like this:

#### LOGS ####
CONSOLE:
makePush            2196
makePush            638
makePush            470
opAdd           8342
opAdd           288
opStop          133
0x
DEBUG:
#### TRACE ####
PUSH32          pc=00000000 gas=10000000000 cost=3

PUSH32          pc=00000033 gas=9999999997 cost=3
Stack:
00000000  0000000000000000000000000000000000000000000000000000000000000005

PUSH32          pc=00000066 gas=9999999994 cost=3
Stack:
00000000  0000000000000000000000000000000000000000000000000000000000000005
00000001  0000000000000000000000000000000000000000000000000000000000000005

ADD             pc=00000099 gas=9999999991 cost=3
Stack:
00000000  0000000000000000000000000000000000000000000000000000000000000005
00000001  0000000000000000000000000000000000000000000000000000000000000005
00000002  0000000000000000000000000000000000000000000000000000000000000005

ADD             pc=00000100 gas=9999999988 cost=3
Stack:
00000000  000000000000000000000000000000000000000000000000000000000000000a
00000001  0000000000000000000000000000000000000000000000000000000000000005

STOP            pc=00000101 gas=9999999985 cost=0
Stack:
00000000  000000000000000000000000000000000000000000000000000000000000000f

Finally I need my result to look like this:

makePush, 2196
makePush, 638
makePush, 470
opAdd, 8342
opAdd, 288
opStop, 133

And the regex I've provided is certainly not robust enough to capture that.

What I'm trying to do is:

  • Ignore any string in the input that doesn't have the form makePush 2196

  • For lines that are of the form depicted above...

    • Split it into three groups:

      first word , whitespace , second word

  • Finally I want to save a csv of the form:

    first word , second word

Try this?

/([a-zA-Z]+)[\t ]+(\d+)/g

where

  • ([a-zA-Z]+) captures a single word made of letters only
  • [\t ]+ matches one or more tabs or spaces (horizontal whitespace only, so a match never spans a line break)
  • (\d+) captures the number
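
In Java the same pattern is written as a string literal, so every backslash is doubled, and there is no /g flag: calling Matcher.find() in a loop plays that role. A minimal sketch, assuming the log text is already held in a String (the class name LogPairs and the method extractPairs are hypothetical, chosen only for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogPairs {
    // Same pattern as above; backslashes are doubled inside a Java string literal.
    private static final Pattern PAIR = Pattern.compile("([a-zA-Z]+)[\\t ]+(\\d+)");

    // Returns one "name, number" string per match found anywhere in the input.
    public static List<String> extractPairs(String input) {
        List<String> rows = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) { // repeated find() calls act like the /g flag
            rows.add(m.group(1) + ", " + m.group(2));
        }
        return rows;
    }

    public static void main(String[] args) {
        String sample = "CONSOLE:\n"
                + "makePush            2196\n"
                + "opAdd           8342\n"
                + "0x\n"
                + "PUSH32          pc=00000000 gas=10000000000 cost=3\n";
        extractPairs(sample).forEach(System.out::println);
    }
}
```

Because the pattern is not anchored, it matches anywhere inside a line. That works for the sample data, but a line such as "foo bar 12" would yield "bar, 12", so anchoring to whole lines is safer if the log format can vary.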

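Try this (idea from Pshemo, but with \w for the name). Pattern.MULTILINE is needed so that ^ and $ match at every line of str rather than only at the start and end of the whole input:

        // requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
        // str holds the whole log text.
        // The name must start with a letter; a bare \\w+ would also accept the
        // all-digit stack rows such as "00000000  0000...0005".
        Pattern pattern = Pattern.compile("^([a-zA-Z]\\w*)[ \\t]+(\\d+)$", Pattern.MULTILINE);
        Matcher matcher = pattern.matcher(str);
        while (matcher.find())
        {
            System.out.println(matcher.group(1)+", "+matcher.group(2));
        }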
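
The question also asks for the result to be saved as CSV, which the loop above only prints. One way to sketch that step is to read the log line by line and require each line to match in full, which avoids the need for ^, $ and MULTILINE. The class name LogToCsv, the method toCsvLines, and the default file names log.txt and out.csv are all hypothetical:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LogToCsv {
    // Whole-line pattern: a name starting with a letter, spaces or tabs, then digits.
    private static final Pattern LINE = Pattern.compile("([a-zA-Z]\\w*)[ \\t]+(\\d+)");

    // Keeps only lines that match entirely and turns each into "name, number".
    public static List<String> toCsvLines(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line.trim());
            if (m.matches()) { // matches() requires the whole line to fit the pattern
                out.add(m.group(1) + ", " + m.group(2));
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Input and output paths are assumptions; pass real ones as arguments.
        Path in = Paths.get(args.length > 0 ? args[0] : "log.txt");
        Path out = Paths.get(args.length > 1 ? args[1] : "out.csv");
        Files.write(out, toCsvLines(Files.readAllLines(in)));
    }
}
```

Lines such as "PUSH32          pc=00000000 ..." and the stack rows are dropped because they do not consist of exactly one name followed by one number.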
