I'm working on lexical analysis in Java and want to break a given string into tokens, discarding whitespace. I use the regex below to match tokens such as alphanumeric words, numbers, and the most common operators and separators:
"[a-zA-Z0-9_]+|[\\[\\](){}.;,!<>+^%]"
However, multi-character operators like ++, --, ==, <=, >=, ^=, *=, += are difficult to handle. How can I improve my regex to handle them? Many thanks.
Try this one (written as a Java string literal, so each backslash is doubled):
"[a-zA-Z0-9_]+|\\+\\+|--|<<|>>|[=+<>^*]=|[\\[\\](){}.;,!<>+^%]"
Explanation:
\\+\\+ catches ++
-- catches --
<< catches <<
>> catches >>
[=+<>^*]= catches ==, <=, >=, ^=, *=, +=
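As a rough illustration of how the pattern could be used, here is a minimal sketch that collects every match with Matcher.find(). The class name TokenizerSketch and the method tokenize are made up for this example. The sketch assumes the pattern has the + quantifier on the word class and that the longer operators are listed before the single-character class, so that ++ or <= is matched whole rather than as two tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this illustration.
class TokenizerSketch {
    // Alternatives are tried left to right, so multi-character operators
    // must come before the single-character class.
    static final Pattern TOKEN = Pattern.compile(
        "[a-zA-Z0-9_]+"                 // words and numbers
        + "|\\+\\+|--|<<|>>"            // ++, --, <<, >>
        + "|[=+<>^*]="                  // ==, +=, <=, >=, ^=, *=
        + "|[\\[\\](){}.;,!<>+^%]");    // single-character operators and separators

    // Returns every token found in the input; characters that match no
    // alternative (such as spaces) are skipped by find().
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("i++ <= count")); // prints [i, ++, <=, count]
    }
}
```

Note that find() silently skips anything the pattern does not cover, not just spaces. With the character class as given, a lone =, - or * is dropped, so you would probably want to add those to the last class (placing - at the end of the class so it is not read as a range).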