
How to match regex with starting index in the middle of a string?

I'm writing a parser and I'd like to avoid chopping up the input string for performance reasons. Thus, I've created a Stream object that represents the string with a cursor:

const Stream = (string, cursor) => Object.freeze({
  string,
  cursor,
  length: string.length - cursor,
  // start and end are offsets relative to the cursor, as in String.prototype.slice
  slice: (start, end) => string.slice(start + cursor, end === undefined ? undefined : end + cursor),
  move: distance => Stream(string, cursor + distance),
})

I want to be able to use regular expressions to match against this string. However, I don't care about anything before the cursor. So suppose I have the following string and cursor:

> string = 'hello ABCD'
'hello ABCD'
> cursor = 6
6

So we don't care about anything before the A, but we want to be able to use a regex to match all those uppercase letters:

> re = /^[A-Z]+/
/^[A-Z]+/

I'm not sure how to get this to work. I noticed that with the g flag, RegExp.exec keeps track of a lastIndex property and starts searching from there. But then ^ still does not match at lastIndex...
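
To illustrate the behaviour described above, here is a minimal sketch using only the native RegExp. The second input string is a made-up variant, added only to show that a g-flag search is not anchored to the cursor:

```javascript
const string = 'hello ABCD';
const cursor = 6;

// With ^ and the g flag: exec starts at lastIndex, but ^ still only
// matches at the very start of the input, so nothing is found.
const anchored = /^[A-Z]+/g;
anchored.lastIndex = cursor;
const anchoredResult = anchored.exec(string); // null

// Without ^: exec starts at lastIndex and finds the letters at the cursor.
const unanchored = /[A-Z]+/g;
unanchored.lastIndex = cursor;
const unanchoredResult = unanchored.exec(string); // 'ABCD' at index 6

// But the search scans forward from lastIndex, so it can also skip ahead
// instead of failing when the cursor is not on an uppercase letter.
unanchored.lastIndex = cursor;
const skipped = unanchored.exec('hello abc ABCD'); // 'ABCD' at index 10
```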

Any ideas how I can get this to work efficiently? If I have to use a 3rd party regex library, I'm fine with that, but ideally this could be done with the native RegExp...

I would do it with sed:

sed -rn 's/^.{'$cursor'}([A-Z]+)$/\1/p'

where $cursor is a shell variable containing the number of ignored chars at the beginning.

Option -r enables extended regular expressions, -n suppresses the automatic printing of every line, and the p flag prints the line only if the substitution matched.

Now the question is how to port that to your language. Here you have some hints on how to use variables in regular expressions in JavaScript.
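
A rough sketch of what such a port might look like, assuming the cursor is a non-negative integer. The helper names matchAtCursor and matchSticky are hypothetical, not part of any library. The first builds the pattern from the cursor the same way the sed command does. The second is a different technique, not a port of the sed approach: it relies on the native sticky (y) flag, which only matches exactly at lastIndex.

```javascript
// Port of the sed idea: skip `cursor` characters, then capture the letters.
// [\s\S] is used instead of . so that newlines before the cursor are skipped too.
// As in the sed command, the trailing $ requires the letters to run to the end.
const matchAtCursor = (string, cursor) => {
  const re = new RegExp('^[\\s\\S]{' + cursor + '}([A-Z]+)$');
  const m = re.exec(string);
  return m ? m[1] : null;
};

// Alternative (assumes an ES2015+ engine): the sticky flag anchors the match
// at lastIndex. The pattern passed in must not start with ^.
const matchSticky = (re, string, cursor) => {
  const flags = re.flags.includes('y') ? re.flags : re.flags + 'y';
  const sticky = new RegExp(re.source, flags);
  sticky.lastIndex = cursor;
  const m = sticky.exec(string);
  return m ? m[0] : null;
};

const ported = matchAtCursor('hello ABCD', 6);           // 'ABCD'
const stuck = matchSticky(/[A-Z]+/, 'hello ABCD', 6);    // 'ABCD'
const offByOne = matchSticky(/[A-Z]+/, 'hello ABCD', 5); // null, index 5 is a space
```

The sticky variant does not need to consume the skipped prefix on every call, which may matter for a parser that matches many times against a long input.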
