
Ruby remove parts of a string

I have a problem with some regular expressions in Ruby. This is the situation. Input text:

"NU POSTA aşa ceva pe Facebook! „Prostia se plăteşte”
Publicat la: 10.02.2015 10:20 Ultima actualizare: 10.02.2015 10:35
Adresa de e-mail la care vrei sa primesti STIREA atunci cand se intampla
Abonează-te
---- Here is some usefull text --- 
Abonează-te
× Citeşte mai mult »
Adauga un comentariu"

I need a regular expression which can extract only the useful text between the two occurrences of the word "Abonează-te".

I tried result = result.gsub(/^[.]{*}\nAbonează-te/, '') to remove the text from the start of the string up to the word 'Abonează-te', but this does not work. I have no idea how to solve this. Can you help me?

You could use the string.scan function. You don't need string.gsub when you want to extract a particular piece of text.

> s = "NU POSTA aşa ceva pe Facebook! „Prostia se plăteşte”
" Publicat la: 10.02.2015 10:20 Ultima actualizare: 10.02.2015 10:35
" Adresa de e-mail la care vrei sa primesti STIREA atunci cand se intampla
" Abonează-te
" ---- Here is some usefull text --- 
" Abonează-te
" × Citeşte mai mult »
" Adauga un comentariu"
=> "NU POSTA aşa ceva pe Facebook! „Prostia se plăteşte”\nPublicat la: 10.02.2015 10:20 Ultima actualizare: 10.02.2015 10:35\nAdresa de e-mail la care vrei sa primesti STIREA atunci cand se intampla\nAbonează-te\n---- Here is some usefull text --- \nAbonează-te\n× Citeşte mai mult »\nAdauga un comentariu"
irb(main):010:0> s.scan(/(?<=Abonează-te\n)[\s\S]*?(?=\nAbonează-te)/)
=> ["---- Here is some usefull text --- "]

Remove the newline character \n inside the lookarounds if necessary. [\s\S]*? does a non-greedy match of any whitespace or non-whitespace character, zero or more times.
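To illustrate the note about the lookarounds, here is a minimal sketch comparing the pattern with and without \n. The shortened sample string and the variable names are assumptions made for illustration; they are not part of the original answer.

```ruby
# Shortened sample input modeled on the question (illustrative only).
text = "header line\nAbonează-te\n---- Here is some usefull text --- \nAbonează-te\nfooter line"

# The lookbehind and lookahead assert the marker lines without consuming them;
# the lazy [\s\S]*? stops at the first closing marker.
with_newlines = text.scan(/(?<=Abonează-te\n)[\s\S]*?(?=\nAbonează-te)/)

# Without \n in the lookarounds, the surrounding newlines end up inside
# the match, so strip is used to trim them afterwards.
without_newlines = text.scan(/(?<=Abonează-te)[\s\S]*?(?=Abonează-te)/).map(&:strip)

p with_newlines
p without_newlines
```

Note that the first variant keeps the trailing space of the captured line, while the second removes it because of the strip call.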


Instead of using a regular expression, you can use String#split, then take the second part:

s = "NU POSTA aşa ceva pe Facebook! „Prostia se plăteşte”
Publicat la: 10.02.2015 10:20 Ultima actualizare: 10.02.2015 10:35
Adresa de e-mail la care vrei sa primesti STIREA atunci cand se intampla
Abonează-te
---- Here is some usefull text --- 
Abonează-te
× Citeşte mai mult »
Adauga un comentariu"
s.split('Abonează-te', 3)[1].strip  # 3: at most 3 parts
# => "---- Here is some usefull text ---"
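The limit argument matters when the marker may appear more than twice or fewer than twice. The sketch below wraps the same split call in a helper; text_between is a hypothetical name introduced here, and returning nil when fewer than two markers are found is an assumption about the desired behaviour.

```ruby
# Hypothetical helper: returns the text between the first two occurrences
# of marker, or nil when the marker occurs fewer than two times.
def text_between(text, marker)
  # With a limit of 3, everything after the second marker stays in one
  # trailing element, even if it contains further markers.
  parts = text.split(marker, 3)
  parts.size == 3 ? parts[1].strip : nil
end

sample = "intro\nAbonează-te\nuseful\nAbonează-te\nrest\nAbonează-te\nmore"
p sample.split('Abonează-te', 3)
p text_between(sample, 'Abonează-te')
p text_between("only one\nAbonează-te\nmarker", 'Abonează-te')
```

Checking the number of parts avoids returning the trailing text by mistake when only one marker is present.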

UPDATE

If you want to get multiple matches:

s = "NU
Abonează-te
-- Here's some
Abonează-te
text --
Abonează-te
comentariu"
s.split('Abonează-te')[1..-2].map(&:strip)
# => ["-- Here's some", "text --"]

Your regex syntax is incorrect. A . inside a character class matches a literal dot, and {*} matches an opening curly brace zero or more times, followed by a closing curly brace.

You can match instead of replacing here.
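For comparison, here is a sketch of how the removal attempted in the question might look with valid syntax. It is one possible rewrite, not the asker's or the answerer's code; the shortened sample string and the two-step approach with String#sub are assumptions.

```ruby
# Shortened sample input modeled on the question (illustrative only).
result = "title\ndate line\nAbonează-te\n---- Here is some usefull text --- \nAbonează-te\nfooter"

# Inside a character class the dot is literal, so [.] only matches ".".
dot_in_dotted = "a.b".match?(/[.]/)
dot_in_plain  = "abc".match?(/[.]/)

# Step 1: drop everything from the start through the first marker line.
# The /m flag lets . match newlines; .*? is lazy so it stops at the first marker.
tail = result.sub(/\A.*?Abonează-te\n/m, '')

# Step 2: drop everything from the second marker line to the end.
useful = tail.sub(/\nAbonează-te.*\z/m, '')

p dot_in_dotted
p dot_in_plain
p useful
```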

s.match(/Abonează-te(.*?)Abonează-te/m)[1].strip()
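One caveat with the line above: match returns nil when the markers are missing, so [1] would raise NoMethodError. A hedged variant using String#[] with a capture index and the safe-navigation operator could look like this; extract_between is a hypothetical helper name.

```ruby
# Hypothetical helper: returns the stripped text between the first two
# markers, or nil when the pattern does not match.
def extract_between(text)
  # String#[] with a regex and a capture index returns nil on no match,
  # and &. skips strip in that case.
  text[/Abonează-te(.*?)Abonează-te/m, 1]&.strip
end

p extract_between("x\nAbonează-te\n useful \nAbonează-te\ny")
p extract_between("no markers here")
```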
