Parse text Java regex match before whitespace, and numbers after whitespace to generate CSV
At the moment I'm using this simple regex:
[^\s]
I cobbled it together with the help of these docs.
It can grab the following information:
However, the full dataset looks like this:
#### LOGS ####
CONSOLE:
makePush 2196
makePush 638
makePush 470
opAdd 8342
opAdd 288
opStop 133
0x
DEBUG:
#### TRACE ####
PUSH32 pc=00000000 gas=10000000000 cost=3
PUSH32 pc=00000033 gas=9999999997 cost=3
Stack:
00000000 0000000000000000000000000000000000000000000000000000000000000005
PUSH32 pc=00000066 gas=9999999994 cost=3
Stack:
00000000 0000000000000000000000000000000000000000000000000000000000000005
00000001 0000000000000000000000000000000000000000000000000000000000000005
ADD pc=00000099 gas=9999999991 cost=3
Stack:
00000000 0000000000000000000000000000000000000000000000000000000000000005
00000001 0000000000000000000000000000000000000000000000000000000000000005
00000002 0000000000000000000000000000000000000000000000000000000000000005
ADD pc=00000100 gas=9999999988 cost=3
Stack:
00000000 000000000000000000000000000000000000000000000000000000000000000a
00000001 0000000000000000000000000000000000000000000000000000000000000005
STOP pc=00000101 gas=9999999985 cost=0
Stack:
00000000 000000000000000000000000000000000000000000000000000000000000000f
Finally, I need my result to look like this:
makePush, 2196
makePush, 638
makePush, 470
opAdd, 8342
opAdd, 288
opStop, 133
The regex I've provided is certainly not robust enough to capture that.
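To illustrate the problem, here is a quick sketch (the class name SingleCharDemo and the method matches are made up for this example). The pattern [^\s] matches exactly one non-whitespace character per match, so find() hands back single characters rather than whole words or numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleCharDemo {
    // Collects every match of [^\s] in the input, in order.
    static List<String> matches(String input) {
        Matcher m = Pattern.compile("[^\\s]").matcher(input);
        List<String> out = new ArrayList<>();
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [o, p, S, t, o, p, 1, 3, 3]
        System.out.println(matches("opStop 133"));
    }
}
```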
What I'm trying to do is:
Ignore any string in the input that doesn't have the form makePush 2196
For lines that are of the form depicted above, split each into three groups:
first word, whitespace, second word
Finally, I want to save a CSV of the form:
first word, second word
Try this?
/([a-zA-Z]+)[\t ]+(\d+)/g
where
([a-zA-Z]+) matches a single word made of letters
[\t ]+ matches horizontal whitespace
(\d+) matches the number

Try this (idea from Pshemo, but using \w+):
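import java.util.regex.Matcher;
import java.util.regex.Pattern;
// MULTILINE lets ^ and $ match at each line boundary when str holds the whole log.
// Caveat: \w also matches digits, so all-digit stack lines match too; [a-zA-Z]+ avoids that.
Pattern pattern = Pattern.compile("^(\\w+)\\s+(\\d+)$", Pattern.MULTILINE);
Matcher matcher = pattern.matcher(str);
while (matcher.find())
{
    System.out.println(matcher.group(1) + ", " + matcher.group(2));
}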
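
For completeness, here is a minimal sketch that processes the log line by line, assuming the whole log is available as one string. It uses the letters-only first group from the first answer instead of \w+, since \w also matches digits and would let all-digit stack lines through. The class name LogToCsv and the method toCsvRows are hypothetical, chosen only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogToCsv {
    // Letters, then spaces or tabs, then digits. matches() below requires
    // the whole line to fit, so no ^ or $ anchors are needed.
    private static final Pattern ROW = Pattern.compile("([a-zA-Z]+)[ \\t]+(\\d+)");

    /** Returns one "word, number" row for each line that fits ROW. */
    static List<String> toCsvRows(String log) {
        List<String> rows = new ArrayList<>();
        // \R matches any line break sequence (Java 8+).
        for (String line : log.split("\\R")) {
            Matcher m = ROW.matcher(line.trim());
            if (m.matches()) {
                rows.add(m.group(1) + ", " + m.group(2));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        String log = String.join("\n",
                "CONSOLE:",
                "makePush 2196",
                "opStop 133",
                "0x",
                "PUSH32 pc=00000000 gas=10000000000 cost=3",
                "00000000 0000000000000000000000000000000000000000000000000000000000000005");
        // Prints the two rows "makePush, 2196" and "opStop, 133".
        toCsvRows(log).forEach(System.out::println);
    }
}
```

To save the result, the returned list could be passed to java.nio.file.Files.write together with a Path such as Paths.get("out.csv"), where out.csv is only a placeholder file name.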