Split on different newlines
Right now I'm doing a split on a string and assuming that the newline from the user is \r\n, like so:
string.split(/\r\n/)
What I'd like to do is split on either \r\n or just \n.
So what would the regex be to split on either of those?
Did you try /\r?\n/? The ? makes the \r optional.
Example usage: http://rubular.com/r/1ZuihD0YfF
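To make the suggestion concrete, here is a minimal sketch of the optional-\r pattern applied to a string with mixed line endings. The sample text and variable names are made up for illustration.

```ruby
# Sample input mixing Windows (\r\n) and Unix (\n) line endings.
text = "first\r\nsecond\nthird"

# \r? matches zero or one carriage return before the mandatory \n.
parts = text.split(/\r?\n/)

p parts
```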
Ruby has the methods String#each_line and String#lines:
each_line returns an enum: http://www.ruby-doc.org/core-1.9.3/String.html#method-i-each_line
lines returns an array: http://www.ruby-doc.org/core-2.1.2/String.html#method-i-lines
I didn't test it against your scenario but I bet it will work better than manually choosing the newline chars.
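A small sketch of what these two methods return, using a made-up sample string. One thing worth checking against your scenario: with the default separator, each piece keeps its line terminator, so a further step is needed if you want the bare text.

```ruby
text = "one\r\ntwo\nthree"

# String#lines returns an Array; each element keeps its line terminator.
with_terminators = text.lines

# String#each_line without a block returns an Enumerator over the same pieces.
enum = text.each_line

p with_terminators
p enum.class
```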
# Split on \r\n or just \n
string.split( /\r?\n/ )
Although it doesn't help with this question (where you do need a regex), note that String#split does not require a regex argument. Your original code could also have been string.split( "\r\n" ).
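As a quick illustration of the string form, and of why it is not enough for this question, the sketch below splits two made-up inputs on the literal "\r\n". A bare \n is not treated as a separator in that case.

```ruby
crlf_only = "a\r\nb\r\nc"
mixed     = "a\r\nb\nc"

# A String separator is matched literally, so only "\r\n" splits.
crlf_parts  = crlf_only.split("\r\n")
mixed_parts = mixed.split("\r\n")

p crlf_parts
p mixed_parts
```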
\n is for unix
\r is for classic Mac OS (before OS X)
\r\n is for windows format
To be safe across operating systems, I would use /\r?\n|\r\n?/
"1\r2\n3\r\n4\n\n5\r\r6\r\n\r\n7".split(/\r?\n|\r\n?/)
=> ["1", "2", "3", "4", "", "5", "", "6", "", "7"]
The alternation operator in Ruby Regexp is the same as in standard regular expressions: |
So, the obvious solution would be
/\r\n|\n/
which is the same as
/\r?\n/
i.e. an optional \r followed by a mandatory \n.
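The equivalence of the two patterns can be spot-checked with a few inputs. This is only a sanity check on hand-picked samples (all hypothetical), not a proof.

```ruby
# Hand-picked inputs: mixed endings, blank lines, a lone \r, and no newline.
samples = ["a\r\nb\nc", "a\n\nb", "a\rb\r\nc", "no newline"]

# Both patterns should produce identical pieces for every sample.
same = samples.all? do |s|
  s.split(/\r\n|\n/) == s.split(/\r?\n/)
end

p same
```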
Perhaps do a split on only '\n' and remove the '\r' if it exists?
Are you reading from a file, or from standard in?
If you're reading from a file, and the file is in text mode, rather than binary mode, or you're reading from standard in, you won't have to deal with \r\n - it'll just look like \n.
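A minimal sketch of that idea, assuming the input is already in a string (the sample text is made up): split on "\n" and strip one trailing "\r" from each piece.

```ruby
text = "a\r\nb\nc\r\n"

# Split on the line feed only, then drop a trailing carriage return if present.
lines = text.split("\n").map { |line| line.chomp("\r") }

p lines
```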
C:\Documents and Settings\username>irb
irb(main):001:0> gets
foo
=> "foo\n"
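Outside of Windows the conversion is not automatic, but it can be requested. The sketch below assumes that opening a file with the "t" (text) mode flag enables Ruby's universal newline conversion on read; the temporary file and its contents are made up for the example, so verify the behaviour on your own platform and Ruby version.

```ruby
require "tempfile"

content = nil

Tempfile.create("newlines") do |file|
  # Write raw bytes so the \r\n is stored exactly as given.
  file.binmode
  file.write("foo\r\nbar\n")
  file.flush

  # Assumption: "rt" asks Ruby to translate \r\n (and lone \r) to \n on read.
  content = File.open(file.path, "rt") { |f| f.read }
end

p content
```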
Another option is to use String#chomp, which also handles newlines intelligently by itself.
You can accomplish what you are after with something like:
lines = string.lines.map(&:chomp)
Or if you are dealing with something large enough that memory use is a concern:
# source may be a String or an IO object
source.each_line do |line|
  line.chomp!
  # do work..
end
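Here is a small runnable sketch of both variants on a made-up input with mixed endings. StringIO stands in for a real IO object, and the non-bang chomp is used in the loop so the result can be collected directly.

```ruby
require "stringio"

text = "alpha\r\nbeta\ngamma"

# All at once: chomp with no argument removes a trailing \r\n, \n, or \r.
all_at_once = text.lines.map(&:chomp)

# Streaming: process one line at a time instead of building the full array first.
streamed = []
StringIO.new(text).each_line do |line|
  streamed << line.chomp
end

p all_at_once
p streamed
```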
Performance isn't always the most important thing when solving this kind of problem, but it is worth noting the chomp solution is also a bit faster than using a regex.
On my machine (i7, ruby 2.1.9):
Warming up --------------------------------------
map/chomp 14.715k i/100ms
split custom regex 12.383k i/100ms
Calculating -------------------------------------
map/chomp 158.590k (± 4.4%) i/s - 794.610k in 5.020908s
split custom regex 128.722k (± 5.1%) i/s - 643.916k in 5.016150s
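The script behind these numbers is not shown. If you want to compare the two approaches yourself, a rough timing sketch could look like the following; the input text, the iteration count, and the time_it helper are all assumptions made for illustration, and results will vary by machine and Ruby version.

```ruby
# Hypothetical helper: run the block `iterations` times and return elapsed seconds.
def time_it(iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Made-up input: 20 short lines joined with Windows line endings.
text = (1..20).map { |i| "line #{i}" }.join("\r\n")
iterations = 10_000

chomp_seconds = time_it(iterations) { text.lines.map(&:chomp) }
regex_seconds = time_it(iterations) { text.split(/\r?\n/) }

puts format("map/chomp:          %.4fs", chomp_seconds)
puts format("split custom regex: %.4fs", regex_seconds)
```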