
Splitting a string at multiple points using Ruby on Rails

I have a string in my DB that represents notes for a user. I want to split this string up so I can separate each note into the content, user, and date.

Here is the format of the String:

"Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>" 

I need to break this into an array of

["Example Note", "Josh Test", "12:53 PM 8/14/12", "Another example note", "John Doe", "12:00 PM 9/15/12", "Last Example Note", "Joe Smoe", "1:00 AM 10/12/12"]

I am still experimenting with this. Any ideas are very welcome. Thank you! :)

You could use regex for a simpler approach.

s = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>" 
s.split(/\s+<i>|<\/i><br><br>\s?|(?<!on) (?=\d)/)
=> ["Example Note", "Josh Test", "12:53 PM on 8/14/12", "Another example note", "John Doe", "12:00 PM on 9/15/12", " Last Example Note", "Joe Smoe", "1:00 AM on 10/12/12"]

The date/time elements still contain the word "on", so they do not match the requested format exactly, and the third note keeps a leading space. It may be acceptable to clean these up in a separate step.
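One possible clean-up step is sketched below. It assumes the split always yields exactly three pieces per note (text, user, timestamp), which holds for the sample string but would break if a note's text contained a space followed by a digit. The names parts and cleaned are illustrative only.

```ruby
s = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"

# Same split as above.
parts = s.split(/\s+<i>|<\/i><br><br>\s?|(?<!on) (?=\d)/)

# Assumption: pieces arrive in groups of three (note, user, timestamp).
# Strip stray whitespace and drop " on " from the timestamp only,
# so a note that happens to contain " on " is left untouched.
cleaned = parts.each_slice(3).flat_map do |note, user, stamp|
  [note.strip, user.strip, stamp.sub(' on ', ' ')]
end

p cleaned
```

Restricting the substitution to the third piece of each group keeps the note text itself unchanged.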

Edit: Removed unnecessary + character.

You can use Nokogiri to parse out the required text using XPath or CSS selectors. As a simple, bare-bones example to get you started, the following collects the content of every <i> tag into an array:

require 'nokogiri'

html = Nokogiri::HTML("Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>")

my_array = html.css('i').map {|text| text.content}
#=> ["Josh Test 12:53 PM on 8/14/12", "John Doe 12:00 PM on 9/15/12", "Joe Smoe 1:00 AM on 10/12/12"]
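
Each element still combines the user's name and the timestamp. A minimal sketch for separating them with plain Ruby follows; it assumes the name never contains a digit, so the timestamp starts at the first digit. The array literal stands in for the result of the Nokogiri call above, and pairs is an illustrative name.

```ruby
# Stand-in for the array returned by html.css('i').map above.
my_array = ["Josh Test 12:53 PM on 8/14/12", "John Doe 12:00 PM on 9/15/12", "Joe Smoe 1:00 AM on 10/12/12"]

pairs = my_array.map do |entry|
  # Split once, at the whitespace right before the first digit.
  name, stamp = entry.split(/\s+(?=\d)/, 2)
  # Drop the word "on" to match the format requested in the question.
  [name, stamp.sub(' on ', ' ')]
end

p pairs
```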

With the CSS selector you could just as easily do something like:

require 'nokogiri'

html = Nokogiri::HTML("<h1>My Message</h1><p>Hi today's date is: <time>Friday, May 31st</time></p>")
message_header = html.css('h1').first.content #=> "My Message"
# content includes the text of nested elements, so the <time> text appears here too
message_body = html.css('p').first.content #=> "Hi today's date is: Friday, May 31st"
message_sent_at = html.css('p > time').first.content #=> "Friday, May 31st"

Maybe this could be useful:

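require 'date'

text = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"

# Each note ends with "<br><br>"; split drops the trailing empty string.
notes = text.split('<br><br>')

pro_notes = []

notes.each do |note_e|
  # notes_temp[0] is the note text, notes_temp[1] is "Josh Test 12:53 PM on 8/14/12</i>"
  notes_temp = note_e.split('<i>')
  # words => ["Josh", "Test", "12:53", "PM", "on", "8/14/12</i>"]
  # This assumes the name is always exactly two words.
  words = notes_temp[1].split(' ')

  date = words[5].gsub('</i>', '')

  full_name = words[0] + ' ' + words[1]
  nn = notes_temp[0].strip
  # Keep the AM/PM marker (words[3]) so afternoon times parse correctly.
  dt = DateTime.strptime("#{date} #{words[2]} #{words[3]}", '%m/%d/%y %I:%M %p')

  pro_notes << [full_name, nn, dt]
end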
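
If names can have more or fewer than two words, the positional indexing above will misread the fields. A hedged alternative, not taken from any of the answers, is to capture the fields in one pass with String#scan. The pattern below assumes every note ends with "</i><br><br>" and that the timestamp always looks like "H:MM AM/PM on M/D/YY"; note_pattern and records are illustrative names.

```ruby
require 'date'

text = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"

# Captures, in order: note text, user name, time with AM/PM, date.
note_pattern = /\s*(.+?)\s*<i>(.+?)\s+(\d{1,2}:\d{2} [AP]M) on (\d{1,2}\/\d{1,2}\/\d{2})<\/i><br><br>/

records = text.scan(note_pattern).map do |note, user, time, date|
  {
    note: note,
    user: user,
    # %I and %p handle the 12-hour clock; %y reads the two-digit year.
    at: DateTime.strptime("#{date} #{time}", '%m/%d/%y %I:%M %p')
  }
end

records.each { |r| puts "#{r[:user]} | #{r[:note]} | #{r[:at]}" }
```

Returning hashes rather than a flat array keeps each note's fields together, which may be easier to work with than indexing in groups of three.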


 