
ruby gsub new line characters

I have a string containing newline characters that I want to replace with whitespace using gsub.

"hello I\r\nam a test\r\n\r\nstring".gsub(/[\\r\\n]/, ' ')

Something like this, only my regex seems to be replacing the letters 'r' and 'n' as well. The other constraint is that the pattern sometimes repeats twice in a row and would then be replaced with two consecutive spaces. That is not ideal, but it is better than having the text cut apart.

Is there a way to select only the newline characters? Or, even better, is there a more Rubyish way of approaching this without resorting to a regex?

If you have mixed consecutive line breaks that you want to replace with a single space, you may use the following regex solution:

s.gsub(/\R+/, ' ')

See the Ruby demo.

The \R matches any type of line break, and + matches one or more occurrences of the quantified subpattern.
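A minimal sketch of what \R+ does, using the string from the question plus a second, made-up string (`mixed`) that combines \n, \r\n and a lone \r. The variable names are only for illustration.

```ruby
# Sample string from the question, plus a hypothetical one mixing three line-break styles.
s = "hello I\r\nam a test\r\n\r\nstring"
mixed = "a\nb\r\nc\rd"

# \R+ consumes a whole run of line breaks, so each run becomes one space.
collapsed = s.gsub(/\R+/, ' ')
mixed_collapsed = mixed.gsub(/\R+/, ' ')

puts collapsed        # hello I am a test string
puts mixed_collapsed  # a b c d
```

Note that the doubled "\r\n\r\n" ends up as a single space, which covers the repeated-pattern case from the question.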

Note that if you have to deal with an older version of Ruby that does not support \R, you will need to use the character class [\r\n], which matches either \r or \n:

s.gsub(/[\r\n]+/, ' ')

or add all possible line breaks explicitly:

s.gsub(/(?:\u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029])+/, ' ')
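To see where the explicit list differs from the plain [\r\n] class, here is a rough sketch with a made-up input containing Unicode line separators. The names `line_breaks`, `explicit` and `crlf_only` are illustrative, and the example assumes the default UTF-8 source encoding.

```ruby
# Explicit list of line breaks, spelled out as in the pattern above.
line_breaks = /(?:\u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029])+/

# Hypothetical input: LINE SEPARATOR (U+2028), then CRLF followed by NEXT LINE (U+0085).
s = "a\u2028b\r\n\u0085c"

explicit  = s.gsub(line_breaks, ' ')  # every kind of break is replaced
crlf_only = s.gsub(/[\r\n]+/, ' ')    # only \r and \n are replaced

puts explicit           # a b c
puts crlf_only.inspect  # U+2028 and U+0085 are still in the string
```

If the input only ever contains \r and \n, the shorter character class is enough.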

This should work for your test case:

"hello I\r\nam a test\r\n\r\nstring".gsub(/[\r\n]/, ' ')

If you don't want successive \r\n characters to result in duplicate spaces, you can use this instead:

"hello I\r\nam a test\r\n\r\nstring".gsub(/[\r\n]+/, ' ')

(Note the addition of the + after the character class.)
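A quick side-by-side of the two patterns on the string from the question, as a sketch. The last line is an extra that is not part of the answer above: String#tr_s is one regex-free option, since the question asked about that. Variable names are made up for illustration.

```ruby
s = "hello I\r\nam a test\r\n\r\nstring"

one_per_char = s.gsub(/[\r\n]/, ' ')   # every \r and every \n becomes its own space
one_per_run  = s.gsub(/[\r\n]+/, ' ')  # each run of \r/\n becomes a single space
no_regex     = s.tr_s("\r\n", ' ')     # maps \r and \n to a space and squeezes each translated run

p one_per_char  # "hello I  am a test    string"
p one_per_run   # "hello I am a test string"
p no_regex      # "hello I am a test string"
```

Without the +, "\r\n" yields two spaces and "\r\n\r\n" yields four, which is the duplicate-space behaviour described in the question.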

As Wiktor mentioned, you're using \\ in your regex, which inside the regex literal /.../ is an escaped backslash, meaning your character class matches a literal backslash \, r, or n. Escaping works differently in regex literals than you might expect from building patterns out of regular strings: inside /.../, \r and \n already stand for carriage return and line feed, so no extra backslash is needed.
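To make the difference concrete, here is a small sketch running both character classes against the string from the question. The variable names are illustrative only.

```ruby
s = "hello I\r\nam a test\r\n\r\nstring"

# Doubled backslash inside /.../ is a literal backslash,
# so this class matches a backslash, the letter "r", or the letter "n".
escaped_twice = s.gsub(/[\\r\\n]/, ' ')

# Single backslash: \r and \n are the carriage return and line feed characters.
escaped_once = s.gsub(/[\r\n]/, ' ')

p escaped_twice  # "hello I\r\nam a test\r\n\r\nst i g"
p escaped_once   # "hello I  am a test    string"
```

The first result still contains every line break and has lost the "r" and "n" from "string", which matches the symptom described in the question.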
