Remove empty lines from a string in Ruby

I've gone through other similar questions, and they don't seem to explain my problem.

My output right now looks like this, and I would like to remove the empty lines from the string in Ruby:

#    

CIRRUS LADIES NIGHT with DJ ROHIT

4th of JULY Party ft. DJ JASMEET @ I-Bar

Submerge Deep @ Pebble | Brute Force (Tuhin Mehta) | DJ Arpan (Opening)

Champagne Showers - DJs Panic & Nyth @ Blue Waves

THURSDAY PAST AND PRESENT @ Hint

and I want my output to look like this:

CIRRUS LADIES NIGHT with DJ ROHIT
4th of JULY Party ft. DJ JASMEET @ I-Bar
Submerge Deep @ Pebble | Brute Force (Tuhin Mehta) | DJ Arpan (Opening)
Champagne Showers - DJs Panic & Nyth @ Blue Waves
THURSDAY PAST AND PRESENT @ Hint

I've tried gsub /^$\n/,'' , gsub(/\n/,'') , squeeze("\n") and delete! "\n" , all to no avail.

Also, I forgot to mention that my string starts with a blank line; the # denotes a blank line before the first line, in case that changes anything.

My String.inspect, as requested. The content of the string has changed, though the issue is still the same.

string.inspect:

"\n\n\t\t\t\t\t\t\t\t\t"
"Tricky Tuesdays with DJ John @ Blend"
"\n\n\t\t\t\t\t\t\t\t\t"
"Bladder Buster Challenge with DJ Sean @ Star Rock"
"\n\n\t\t\t\t\t\t\t\t\t"
"Classic Rock Tuesday @ 10D - Chennai"
"\n\n\t\t\t\t\t\t\t\t\t"
"Vodka Night with DJ John @ Blend"
"\n\n\t\t\t\t\t\t\t\t\t"
"\"BOLLYWOOD WEDNESDAYS\" with DJ D Nash @ Candy Club"
"\n\n\t\t\t\t\t\t\t\t\t"
"RE - LAUNCH WEDNESDAY LADIES NIGHT @ ZODIAC"
"\n\n\t\t\t\t\t\t\t\t\t"
"Ladies Night @ 10 D - Chennai"
"\n\n\t\t\t\t\t\t\t\t\t"
"Wednesday Mayhem @ Dublin"
"\n\n\t\t\t\t\t\t\t\t\t"

Here is my solution:

text.gsub(/\n+|\r+/, "\n").squeeze("\n").strip
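As a quick check of the one-liner above, here is a sketch run against a shortened, made-up stand-in for the string in the question (the sample text and the variable names are assumptions for illustration). Note that strip only trims the two ends of the whole string, so the tab indentation on later lines survives. The second variant is not part of the answer; it assumes the indentation should go as well.

```ruby
# Shortened stand-in for the string shown in the question.
text = "\n\n\t\tTricky Tuesdays with DJ John @ Blend" \
       "\n\n\t\tVodka Night with DJ John @ Blend" \
       "\n\n\t\t"

# The one-liner from the answer: collapse newline runs, then trim both ends.
collapsed = text.gsub(/\n+|\r+/, "\n").squeeze("\n").strip
p collapsed

# Variant (assumption about the desired result): also drop per-line indentation.
trimmed = text.lines.map(&:strip).reject(&:empty?).join("\n")
p trimmed
```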

This removes all consecutive empty lines:

result = s.squeeze("\r\n").gsub(/(\r\n)+/, "\r\n")

Or a command-line option without Ruby:

grep -v "^$" <file>
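For comparison, a rough Ruby counterpart of that grep filter might look like the sketch below. The helper name non_empty_lines is hypothetical, and a StringIO stands in for a real file so the snippet runs on its own. One difference to keep in mind: chomp also removes a trailing \r, so a line holding only \r\n counts as empty here.

```ruby
require 'stringio'

# Hypothetical helper: returns every non-empty line of an IO-like object,
# roughly what `grep -v "^$"` would print.
def non_empty_lines(io)
  io.each_line.reject { |line| line.chomp.empty? }
end

# StringIO is used here in place of File.open(...) to keep the example self-contained.
io = StringIO.new("\nfoo\n\nbar\n")
kept = non_empty_lines(io)
print kept.join
```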

First of all, your code removes all newlines, not just the blank lines; that doesn't sound like what you want.

Second, operating systems have historically disagreed on how to represent newlines: old Macs used \r, Linux and OS X use \n, and Windows uses the combination \r\n. So you really want to replace consecutive \r's and \n's (which indicate a blank line in there) with a single \n.
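A minimal sketch of that idea, assuming it is acceptable for every line break to end up as a plain \n (the method name collapse_newlines is made up for this example):

```ruby
# Replace every run of \r and/or \n characters with a single \n.
# Side effect: a lone \r\n or \r is also normalized to \n.
def collapse_newlines(str)
  str.gsub(/[\r\n]+/, "\n")
end

mixed = "a\r\n\r\nb\r\rc\n\nd"
p collapse_newlines(mixed)   # prints "a\nb\nc\nd"
```

Lines that contain only spaces or tabs are not treated as blank by this version.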

.split(/\n/).reject{ |l| l.chomp.empty? }.join("\n")

For Unix-style line endings only:

.split(/\n/).reject(&:empty?).join("\n")

This one removes whitespace-only lines too (Unix, Rails method):

.split(/\n/).reject(&:blank?).join("\n")
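To see how the three variants differ, here is a small comparison on a made-up sample string. Since blank? comes from Rails (ActiveSupport), this sketch substitutes strip.empty? as a plain-Ruby approximation; that substitution is an assumption, not part of the answer.

```ruby
s = "\nfoo\n\n \t\nbar\r\n\r\nbaz\n"

# chomp drops a trailing \r before the check, so a bare "\r" line is rejected.
with_chomp = s.split(/\n/).reject { |l| l.chomp.empty? }.join("\n")

# Unix only: a line holding just "\r" is not empty and is kept.
with_empty = s.split(/\n/).reject(&:empty?).join("\n")

# Stand-in for blank?: whitespace-only lines are rejected as well.
with_strip = s.split(/\n/).reject { |l| l.strip.empty? }.join("\n")

p with_chomp   # "foo\n \t\nbar\r\nbaz"
p with_empty   # "foo\n \t\nbar\r\n\r\nbaz"
p with_strip   # "foo\nbar\r\nbaz"
```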

Here's a single regex that removes all blank lines, including those at the start or end of the file and lines that contain only spaces or tabs. It works with any of the three line ending markers (\r\n, \n, and \r), passed in as line_ending:

def remove_blank_lines( str, line_ending="\n" )
  str.gsub(/(?<=\A|#{line_ending})[ \t]*(?:#{line_ending}|\z)/,'')
end

Tested:

[ "\r\n", "\n", "\r" ].each do |marker|
  puts '='*70, "Lines ending with: #{marker.inspect}", '='*70
  [ "", " ", "\t", " \t", "\t " ].each do |whitespace|
    0.upto(2) do |lines|
      blank_lines = "#{whitespace}#{marker*lines}"
      s = "#{marker*lines}a#{marker*lines}b#{blank_lines}c#{blank_lines}"
      tight = remove_blank_lines(s, marker)
      puts "%43s -> %s" % [s.inspect, tight.inspect]
    end
  end
end

#=> ======================================================================
#=> Lines ending with: "\r\n"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                       "\r\na\r\nb\r\nc\r\n" -> "a\r\nb\r\nc\r\n"
#=>       "\r\n\r\na\r\n\r\nb\r\n\r\nc\r\n\r\n" -> "a\r\nb\r\nc\r\n"
#=>                                     "ab c " -> "ab c "
#=>                     "\r\na\r\nb \r\nc \r\n" -> "a\r\nb \r\nc \r\n"
#=>     "\r\n\r\na\r\n\r\nb \r\n\r\nc \r\n\r\n" -> "a\r\nb \r\nc \r\n"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                   "\r\na\r\nb\t\r\nc\t\r\n" -> "a\r\nb\t\r\nc\t\r\n"
#=>   "\r\n\r\na\r\n\r\nb\t\r\n\r\nc\t\r\n\r\n" -> "a\r\nb\t\r\nc\t\r\n"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                 "\r\na\r\nb \t\r\nc \t\r\n" -> "a\r\nb \t\r\nc \t\r\n"
#=> "\r\n\r\na\r\n\r\nb \t\r\n\r\nc \t\r\n\r\n" -> "a\r\nb \t\r\nc \t\r\n"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                 "\r\na\r\nb\t \r\nc\t \r\n" -> "a\r\nb\t \r\nc\t \r\n"
#=> "\r\n\r\na\r\n\r\nb\t \r\n\r\nc\t \r\n\r\n" -> "a\r\nb\t \r\nc\t \r\n"
#=> ======================================================================
#=> Lines ending with: "\n"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                               "\na\nb\nc\n" -> "a\nb\nc\n"
#=>                       "\n\na\n\nb\n\nc\n\n" -> "a\nb\nc\n"
#=>                                     "ab c " -> "ab c "
#=>                             "\na\nb \nc \n" -> "a\nb \nc \n"
#=>                     "\n\na\n\nb \n\nc \n\n" -> "a\nb \nc \n"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                           "\na\nb\t\nc\t\n" -> "a\nb\t\nc\t\n"
#=>                   "\n\na\n\nb\t\n\nc\t\n\n" -> "a\nb\t\nc\t\n"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                         "\na\nb \t\nc \t\n" -> "a\nb \t\nc \t\n"
#=>                 "\n\na\n\nb \t\n\nc \t\n\n" -> "a\nb \t\nc \t\n"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                         "\na\nb\t \nc\t \n" -> "a\nb\t \nc\t \n"
#=>                 "\n\na\n\nb\t \n\nc\t \n\n" -> "a\nb\t \nc\t \n"
#=> ======================================================================
#=> Lines ending with: "\r"
#=> ======================================================================
#=>                                       "abc" -> "abc"
#=>                               "\ra\rb\rc\r" -> "a\rb\rc\r"
#=>                       "\r\ra\r\rb\r\rc\r\r" -> "a\rb\rc\r"
#=>                                     "ab c " -> "ab c "
#=>                             "\ra\rb \rc \r" -> "a\rb \rc \r"
#=>                     "\r\ra\r\rb \r\rc \r\r" -> "a\rb \rc \r"
#=>                                   "ab\tc\t" -> "ab\tc\t"
#=>                           "\ra\rb\t\rc\t\r" -> "a\rb\t\rc\t\r"
#=>                   "\r\ra\r\rb\t\r\rc\t\r\r" -> "a\rb\t\rc\t\r"
#=>                                 "ab \tc \t" -> "ab \tc \t"
#=>                         "\ra\rb \t\rc \t\r" -> "a\rb \t\rc \t\r"
#=>                 "\r\ra\r\rb \t\r\rc \t\r\r" -> "a\rb \t\rc \t\r"
#=>                                 "ab\t c\t " -> "ab\t c\t "
#=>                         "\ra\rb\t \rc\t \r" -> "a\rb\t \rc\t \r"
#=>                 "\r\ra\r\rb\t \r\rc\t \r\r" -> "a\rb\t \rc\t \r"
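As a further illustration, here is the same method applied to a shortened, made-up stand-in for the string in the question. The method body is copied unchanged from above so the snippet runs on its own; the sample text is an assumption. The blank lines and the whitespace-only last line disappear, while the tab indentation in front of each title is kept, so a per-line strip is added as an optional extra step.

```ruby
# Copied unchanged from the answer above.
def remove_blank_lines( str, line_ending="\n" )
  str.gsub(/(?<=\A|#{line_ending})[ \t]*(?:#{line_ending}|\z)/,'')
end

# Shortened stand-in for the string shown in the question.
sample = "\n\n\t\tTricky Tuesdays\n\n\t\tVodka Night\n\n\t\t"

tight = remove_blank_lines(sample)
p tight      # "\t\tTricky Tuesdays\n\t\tVodka Night\n"

# Optional extra step (not part of the answer): drop the indentation too.
flush = tight.lines.map(&:strip).join("\n")
p flush      # "Tricky Tuesdays\nVodka Night"
```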

Try

/^\n/

and replace it with the empty string.

Are you sure your newline character is only \n? If not, try

/^\r?\n/

to also allow the line break sequence \r\n.
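A short sketch of that suggestion with gsub, using made-up sample strings. In Ruby, ^ matches at the start of every line, so the pattern removes each line that consists of nothing but its line break. The last pattern, which also swallows spaces and tabs, is an extension added here as an assumption and is not part of the answer.

```ruby
s = "\nfoo\n\nbar\r\n\r\nbaz\n"

# Remove lines that are empty apart from an optional \r before the \n.
no_blank = s.gsub(/^\r?\n/, '')
p no_blank      # "foo\nbar\r\nbaz\n"

# Extension (assumption): also treat lines of only spaces/tabs as blank.
with_ws = "a\n \t\nb\n"
no_ws_lines = with_ws.gsub(/^[ \t]*\r?\n/, '')
p no_ws_lines   # "a\nb\n"
```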

Here's an ugly hack based on @Tom's answer:

result = s.squeeze("\r\n").tap{ |s2| :go while s2.gsub!("\r\n\r\n","\r\n") }

It supports DOS (\r\n), Unix (\n), and Mac OS 9 and earlier (\r) line breaks. Tested:

[ "\r\n", "\n", "\r" ].each do |marker|
  1.upto(5) do |lines|
    s = "a#{marker*lines}b"
    tight = s.squeeze("\r\n").tap{ |s2| :go while s2.gsub!("\r\n\r\n","\r\n") }
    puts "%24s -> %s" % [s.inspect, tight.inspect]
  end
end
#=>                 "a\r\nb" -> "a\r\nb"
#=>             "a\r\n\r\nb" -> "a\r\nb"
#=>         "a\r\n\r\n\r\nb" -> "a\r\nb"
#=>     "a\r\n\r\n\r\n\r\nb" -> "a\r\nb"
#=> "a\r\n\r\n\r\n\r\n\r\nb" -> "a\r\nb"
#=>                   "a\nb" -> "a\nb"
#=>                 "a\n\nb" -> "a\nb"
#=>               "a\n\n\nb" -> "a\nb"
#=>             "a\n\n\n\nb" -> "a\nb"
#=>           "a\n\n\n\n\nb" -> "a\nb"
#=>                   "a\rb" -> "a\rb"
#=>                 "a\r\rb" -> "a\rb"
#=>               "a\r\r\rb" -> "a\rb"
#=>             "a\r\r\r\rb" -> "a\rb"
#=>           "a\r\r\r\r\rb" -> "a\rb"

Note that this assumes that your blank lines are truly blank and do not have any whitespace on them. If they do contain whitespace, you could first do a pre-pass of s.gsub(/^[ \t]+$/,'')
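A sketch of how the whitespace pre-pass and the squeeze-based hack could be combined, assuming Unix (\n) line endings, because Ruby's ^ and $ anchor on \n only and the pre-pass pattern therefore does not fire on lines that end in \r\n or \r. The method name tighten is hypothetical.

```ruby
# Hypothetical wrapper: empty out whitespace-only lines, then collapse line breaks.
def tighten(s)
  cleaned  = s.gsub(/^[ \t]+$/, '')          # pre-pass: whitespace-only lines become empty
  squeezed = cleaned.squeeze("\r\n")         # collapse runs of identical \r or \n
  nil while squeezed.gsub!("\r\n\r\n", "\r\n") # collapse repeated \r\n pairs
  squeezed
end

sample = "a\n \t\n\nb"
p sample.squeeze("\r\n")   # without the pre-pass: "a\n \t\nb"
p tighten(sample)          # with the pre-pass:    "a\nb"
```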

This will do it: .gsub(/(\n\s*\n)+/, "\n")

and replace \n in the regex with [\r\n] if needed.
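A brief sketch of this last approach on made-up sample strings. The line-ending variant uses the character class [\r\n] in place of \n; with that class a lone \r\n also matches, so every line break in the result ends up as \n.

```ruby
# Unix line endings: a blank line and a whitespace-only line between foo and bar.
s = "foo\n\n  \n\nbar\nbaz"
unix_result = s.gsub(/(\n\s*\n)+/, "\n")
p unix_result   # "foo\nbar\nbaz"

# Variant with [\r\n]: also handles \r\n input, and normalizes it to \n.
crlf = "foo\r\n\r\nbar\r\nbaz"
crlf_result = crlf.gsub(/([\r\n]\s*[\r\n])+/, "\n")
p crlf_result   # "foo\nbar\nbaz"
```

A blank line at the very start of the string still leaves one leading \n behind, so a final lstrip or strip may be needed.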
