Remove all a tags from string

Question

I have a string that is input by the user. They can add as many links as they like, but we only want some users to be able to click a link. What I am trying to do is replace every a tag with just the text inside it. I have managed to do it when there is one link, but can't figure out how to do it when there are multiple.

This is what I currently have and have tried many variations to get to this:

url_text = text.split("<a").last.split("</a>").first.split('>').last
text.gsub! /<a.+a>/m, url_text

But it only works for the first instance of a tag.

The string I am receiving looks like this:

text = "<div>blah blah blah.<br /><br /></div>\r\n<div><a href=\"http://www.google.com\">Google</a><br />Another link: <br /> <a href=\"http://www.test.com\">Test Link</a><br /><br /></div>"

I want it to say: blah blah blah. Google Another Link: Test Link

Any help will be appreciated. Let me know if you need more code or info.

Answer 1

You can use strip_tags (to strip all tags) or strip_links (to strip just the links, keeping their text).

In Rails console:

> text = '<div>blah blah blah.<br /><br /></div>\r\n<div><a href=\"http://www.google.com\">Google</a><br />Another link: <br /> <a href=\"http://www.test.com\">Test Link</a><br /><br /></div>'
=> "<div>blah blah blah.<br /><br /></div>\\r\\n<div><a href=\\\"http://www.google.com\\\">Google</a><br />Another link: <br /> <a href=\\\"http://www.test.com\\\">Test Link</a><br /><br /></div>"
> helper.strip_tags(text)
=> "blah blah blah.\\r\\nGoogleAnother link:  Test Link"

Answer 2

Use the Rails sanitizer helper:

ActionView::Base.full_sanitizer.sanitize('text = <div>blah blah blah.<br /><br /></div>\r\n<div><a href=\"http://www.google.com\">Google</a><br />Another link: <br /> <a href=\"http://www.test.com\">Test Link</a><br /><br /></div>"
')

"text = blah blah blah.\\r\\nGoogleAnother link:  Test Link\"\n"

Answer 3

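@mrzasa seems to have cracked it, though if you're wondering why the regex didn't work, it's because the pattern is too greedy: .+ matches as much as it can, so it runs from the first <a to the last a> in the string.

Adding ? after a quantifier (.*? or .+?) makes it lazy, so it matches as few characters as possible while still letting the rest of the pattern match.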

The following adds lazy operators to the search, and I believe works as you intended:

text = "<div>blah blah blah.<br /><br /></div>\r\n<div><a href=\"http://www.google.com\">Google</a><br />Another link: <br /> <a href=\"http://www.test.com\">Test Link</a><br /><br /></div><div>blah blah blah.<br /><br /></div>\r\n<div><a href=\"http://www.google.com\">Google</a><br />Another link: <br /> <a href=\"http://www.test.com\">Test Link</a><br /><br /></div>"
text.gsub(/<a.*?>(.+?)<\/a>/, '\1')

# => "<div>blah blah blah.<br /><br /></div>\r\n<div>Google<br />Another link: <br /> Test Link<br /><br /></div><div>blah blah blah.<br /><br /></div>\r\n<div>Google<br />Another link: <br /> Test Link<br /><br /></div>"

'\1' as the second argument of gsub replaces each match with the contents of the first capture group, which here is the link text.
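To make the greedy versus lazy difference concrete, here is a small sketch in plain Ruby. The shortened sample string is made up for illustration; the two patterns are the one from the question and the one from this answer.

```ruby
# Short sample with two links, mirroring the structure in the question.
html = '<a href="http://www.google.com">Google</a><br /><a href="http://www.test.com">Test Link</a>'

# Greedy: .+ runs to the LAST "a>" in the string, so both links end up in one match.
greedy_matches = html.scan(/<a.+a>/m)

# Lazy: .*? and .+? stop as early as they can, giving one match per link.
lazy_matches = html.scan(/<a.*?>(.+?)<\/a>/).flatten

# '\1' inserts whatever the first capture group (the link text) matched.
stripped = html.gsub(/<a.*?>(.+?)<\/a>/, '\1')

puts greedy_matches.length   # 1
puts lazy_matches.inspect    # ["Google", "Test Link"]
puts stripped                # Google<br />Test Link
```

With the greedy pattern there is only one match to replace, which is why the original attempt could never handle more than one link.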

Hope that's in some way useful, and gives a flexible option if you'd rather use regex.
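If you do go the regex route, a slightly stricter pattern may be worth considering. The sketch below is only an illustration: unwrap_links is a hypothetical helper name, not something from Rails or from the answers above, and a regex is still not a real HTML parser (for example, it will break on attribute values that contain ">").

```ruby
# Hypothetical helper: replaces each <a ...>...</a> with its inner content.
#   \b      stops the pattern from matching other tags starting with "a", such as <abbr>
#   [^>]*   consumes the attributes without running past the end of the opening tag
#   m flag  lets . match newlines, so link text spanning several lines is handled
#   i flag  also matches upper-case <A> tags
def unwrap_links(html)
  html.gsub(/<a\b[^>]*>(.*?)<\/a>/mi, '\1')
end

text = "<div><a href=\"http://www.google.com\">Google</a><br />Another link: <br /> " \
       "<a href=\"http://www.test.com\">Test\nLink</a> <abbr>etc.</abbr></div>"

puts unwrap_links(text)
```

Here the abbr element is left alone and the second link is unwrapped even though its text contains a newline. For user-supplied HTML, the Rails helpers mentioned in the other answers are likely the safer choice.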

Answer 4

According to the documentation, strip_tags is a method of the ActionView::Helpers::SanitizeHelper module. What worked for me was simply including this module in my class; you can then call its methods like this:

strip_tags(your_text_with_html)

4 answers

Answer 1: score 5, accepted, 2019-01-28 15:45:45
Answer 2: score 3, 2019-01-28 15:55:32
Answer 3: score 2, 2019-01-28 15:52:08
Answer 4: score 0, 2021-02-01 10:37:31
