
special string splitting in Ruby

I am trying to figure out the best way to do this...

Given a string

s = "if someBool || x==1 && y!=22314" 

I'd like to use Ruby to separate the statements from the boolean operators, so that the string is split into

["if","someBool","||","x","==","1","&&","y","!=","22314"]

I could use s.split(), but that only splits on whitespace. I'd like x!=y to be split too (such expressions are valid boolean expressions, they just lack the spaces that would make them more readable). Of course, the easiest way is to require the user to put spaces between boolean operators and variables, but is there any other way to do this?

Split on whitespace or a word boundary:

s = "if someBool || x==1 && y!=22314"
a = s.split(/\s+|\b/)
p a

Output:

["if", "someBool", "||", "x", "==", "1", "&&", "y", "!=", "22314"]
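
As a rough sketch of how this approach could be packaged and tried on other inputs, the snippet below wraps the same regex in a helper. The method name split_tokens is made up for illustration and is not part of the answer above. It also shows one caveat: \b only falls between a word character and a non-word character, so two operators written back to back are not separated from each other.

```ruby
# Hypothetical helper (the name is an assumption): split on runs of
# whitespace or on any word boundary, as in the answer above.
def split_tokens(expr)
  expr.split(/\s+|\b/)
end

p split_tokens("if someBool || x==1 && y!=22314")
p split_tokens("x!=y")    # no spaces needed around the operator
p split_tokens("x==-1")   # "==" and "-" stay glued together as "==-"
```

If inputs such as x==-1 matter, an approach that names the operators explicitly is probably a better fit than a bare word boundary.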

My rule of thumb: use split if you know what to throw away (the delimiters), use scan with a regex if you know what to keep. In this case you know what to keep (the tokens), so:

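# The \s in the guard groups also consumes the spaces around "||" and "&&",
# so strip each match to get the bare token.
s.scan(/ \w+ | (?: \s|\b )(?: \|\| | && | [=!]= )(?: \s|\b ) /x).map(&:strip)
# => ["if", "someBool", "||", "x", "==", "1", "&&", "y", "!=", "22314"]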

The (?: \s|\b ) "delimiters" are there to prevent your tokens (e.g. ==) from matching inside something you don't want (e.g. !==).
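
A variation on the same "keep the tokens" idea, sketched under the assumption that zero-width lookarounds are acceptable in place of guards that consume characters. The constant TOKEN and the method tokenize are hypothetical names, not part of the answer above. Because lookarounds match without consuming anything, no surrounding whitespace ends up in the tokens.

```ruby
# Hypothetical tokenizer: keep words and numbers plus the four operators.
TOKEN = /
    \w+                          # identifier, keyword, or number
  | (?<![=!|&])                  # not preceded by another operator character
    (?: \|\| | && | [=!]= )      # one of the known operators
    (?![=!|&])                   # not followed by another operator character
/x

def tokenize(expr)
  expr.scan(TOKEN)
end

p tokenize("if someBool || x==1 && y!=22314")
p tokenize("x!==y")   # the malformed "!==" is skipped, leaving only x and y
```

Note that an unrecognized operator is dropped silently here; a stricter tokenizer would probably want to raise an error instead.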

You can get split to split on anything you want, including a regex. Something like:

s.split( /\s|==|!=/ )

...might be a start.

Disclaimer: regexen make my head hurt. I've tested it now, and it works against your example.


UPDATE: No it doesn't. split always skips what it splits on, so the above code loses the == and != from your example. (Monoceres' code works fine.)

But if you enclose the split term in parentheses, making it a capture group, split keeps the matched text in the result array instead of just splitting on it. I first wondered whether this was a bug or a feature; it is documented behavior of String#split: when the pattern contains groups, their matches are returned in the array as well.

So in fact you need:

s.split( /\s|(==)|(!=)/ )

But this is hardly code that explains itself. And for all I know it doesn't work in 1.9.
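
To see the capture-group behavior in isolation, here is a minimal sketch. The strings and the OPERATORS constant are made up for illustration; only String#split itself is used.

```ruby
# Without a group the delimiter is thrown away; with a group it is kept.
plain   = "x==1".split(/==/)     # ["x", "1"]
grouped = "x==1".split(/(==)/)   # ["x", "==", "1"]
p plain, grouped

# A named pattern (hypothetical constant) reads a little better than an
# inline regex. When the \s branch matches, the group captures nothing
# and contributes no entry to the result.
OPERATORS = /\s|(==|!=)/
p "if x==1".split(OPERATORS)     # ["if", "x", "==", "1"]
```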

Something like this works:

s = "12&&32 || 90==12 !=67"
a = s.split(/ |(\|\|)|(&&)|(!=)|(==)/)
a.delete("")
p a

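Empty strings remain in the array wherever two delimiters sit next to each other (for example the space right before ||), because split returns the empty text between them; the delete line removes them.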
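
One way to avoid most of those empty strings in the first place, offered as a sketch rather than a definitive fix, is to let the operator branch absorb optional whitespace on both sides, so that a space next to an operator is part of the same match. The method name split_keeping_ops is hypothetical.

```ruby
# Hypothetical helper: an operator with optional surrounding whitespace is
# one delimiter (the operator itself is captured and kept); a run of plain
# whitespace is the other.
def split_keeping_ops(expr)
  expr.split(/\s*(\|\||&&|!=|==)\s*|\s+/)
      .reject(&:empty?)   # still needed if the string starts with a delimiter
end

p split_keeping_ops("12&&32 || 90==12 !=67")
p split_keeping_ops("if someBool || x==1 && y!=22314")
```

reject(&:empty?) returns a new array, whereas delete("") modifies the array in place; either works here.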
