
Regular expression javascript split

I'm trying to write a regex for a JavaScript split(), but I'm totally stuck. Here's my input:

9:30 pm
The user did action A.

10:30 pm
Welcome, user John Doe.

***This is a comment

11:30 am
This is some more input.

I want the output array after the split() to be (I've removed the \n for readability):

["9:30 pm The user did action A.", "10:30 pm Welcome, user John Doe.", "***This is a comment", "11:30 am This is some more input." ];

My current regular expression is:

var split = text.split(/\s*(?=(\b\d+:\d+|\*\*\*))/);

This works, but there is one problem: the timestamps get repeated in extra elements. So I get:

["9:30", "9:30 pm The user did action A.", "10:30",  "10:30 pm Welcome, user John Doe.", "***This is a comment", "11:30", "11:30 am This is some more input." ];

I can't split on the newlines (\n) because they aren't consistent, and sometimes there may be no newlines at all.

Could you help me out with a Regex for this?

Thanks so much!!

EDIT: in reply to phleet

It could look like this:

9:30 pm
The user did action A.

He also did action B

10:30 pm Welcome, user John Doe.

Basically, there may or may not be a newline after the timestamp, and there may be multiple newlines for the event description.

I believe the issue is how JavaScript's split treats capturing groups: when the separator regex contains a capturing group, the captured text is also added to the resulting array. The solution may just be to use a non-capturing group in your pattern. That is, instead of:

/\s*(?=(\b\d+:\d+|\*\*\*))/

Use

/\s*(?=(?:\b\d+:\d+|\*\*\*))/
        ^^

The (?:___) is what is called a non-capturing group.
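
To see the difference in isolation, here is a minimal sketch on a toy string (the string "a1b2c" and the variable names are made up for illustration, they are not from the question):

```javascript
// Capturing group: the text matched by the group is spliced into the result.
const withCapture = "a1b2c".split(/(\d)/);

// Non-capturing group: only the pieces between the separators are returned.
const withoutCapture = "a1b2c".split(/(?:\d)/);

console.log(withCapture);    // [ 'a', '1', 'b', '2', 'c' ]
console.log(withoutCapture); // [ 'a', 'b', 'c' ]
```

The same thing happens when the capturing group sits inside a lookahead, which is why the timestamps show up as extra elements.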

Looking at the overall pattern, however, the grouping is not actually needed. You should be able to just use:

/\s*(?=\b\d+:\d+|\*\*\*)/
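
As a rough check, here is that pattern applied to the sample input from the question. I'm assuming the lines are separated by plain "\n"; the whitespace-collapsing step at the end is only there to make the output easier to compare and is not part of the split itself:

```javascript
// Sample input from the question, joined with "\n" (an assumption about the real data).
const text = [
  "9:30 pm",
  "The user did action A.",
  "",
  "10:30 pm",
  "Welcome, user John Doe.",
  "",
  "***This is a comment",
  "",
  "11:30 am",
  "This is some more input.",
].join("\n");

// The lookahead contains no capturing group, so split() returns only the entries.
const entries = text.split(/\s*(?=\b\d+:\d+|\*\*\*)/);

// Collapse inner whitespace for display only (illustrative post-processing).
const normalized = entries.map((entry) => entry.replace(/\s+/g, " "));

console.log(normalized);
// [ '9:30 pm The user did action A.',
//   '10:30 pm Welcome, user John Doe.',
//   '***This is a comment',
//   '11:30 am This is some more input.' ]
```

Because the split only happens right before a timestamp or a ***, an event description that spans several lines stays in one element.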


Minor point

Instead of \*\*\*, you could use [*]{3}. This may be more readable. The * is not a meta-character inside a character class definition, so it doesn't have to be escaped. The {3} is how you denote "exactly 3 repetitions of".
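
A quick sketch to confirm the two spellings behave the same in a split (the sample string and variable names here are invented for the comparison):

```javascript
// Two spellings of the same separator: escaped asterisks vs. a character class.
const escapedForm = /\s*(?=\b\d+:\d+|\*\*\*)/;
const charClassForm = /\s*(?=\b\d+:\d+|[*]{3})/;

// Made-up sample containing both a timestamp and a *** comment.
const sample = "9:30 pm first\n***a comment\n10:30 pm second";

const fromEscaped = sample.split(escapedForm);
const fromCharClass = sample.split(charClassForm);

console.log(fromEscaped);   // [ '9:30 pm first', '***a comment', '10:30 pm second' ]
console.log(fromCharClass); // [ '9:30 pm first', '***a comment', '10:30 pm second' ]
```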
