Splitting a String at Multiple Points Using Ruby on Rails
I have a string in my DB that represents notes for a user. I want to split this string up so I can separate each note into the content, user, and date.
Here is the format of the string:
"Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"
I need to break this into an array like this:
["Example Note", "Josh Test", "12:53 8/14/12", "Another example note", "John Doe", "12:00 PM 9/15/12", "Last Example Note", "Joe Smoe", "1:00 AM 10/12/12"]
I am still experimenting with this. Any ideas are very welcome, thank you! :)
You could use a regex for a simpler approach.
s = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"
s.split(/\s+<i>|<\/i><br><br>\s?|(?<!on) (?=\d)/)
=> ["Example Note", "Josh Test", "12:53 PM on 8/14/12", "Another example note", "John Doe", "12:00 PM on 9/15/12", "Last Example Note", "Joe Smoe", "1:00 AM on 10/12/12"]
The datetime elements do not match the requested format, but it may be acceptable to reformat them separately.
Edit: Removed an unnecessary + character.
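As a rough sketch of the "format them separately" step, the flat array from the split can be grouped into triples and the timestamp parsed with DateTime.strptime. The variable names (parts, notes) and the format string are my own assumptions based on the sample data, not something from the question:

```ruby
require 'date'

s = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"

# Same split as above; strip guards against stray whitespace around a piece
parts = s.split(/\s+<i>|<\/i><br><br>\s?|(?<!on) (?=\d)/).map(&:strip)

# Group the flat array into [content, user, timestamp] triples.
# Assumes every note yields exactly three pieces, as in the sample string.
notes = parts.each_slice(3).map do |content, user, stamp|
  [content, user, DateTime.strptime(stamp, '%I:%M %p on %m/%d/%y')]
end

# Reformat a parsed timestamp however you like
notes.first[2].strftime('%H:%M %-m/%-d/%y') # => "12:53 8/14/12"
```

Once the timestamps are real DateTime objects, any output format is one strftime call away, so the split itself does not need to produce the final string.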
You can use Nokogiri to parse out the required text using XPath/CSS selectors. Just to give you a simple example with bare-bones parsing to get you started, the following maps the content of every i tag to an element of an array:
require 'nokogiri'
html = Nokogiri::HTML("Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>")
my_array = html.css('i').map {|text| text.content}
#=> ["Josh Test 12:53 PM on 8/14/12", "John Doe 12:00 PM on 9/15/12", "Joe Smoe 1:00 AM on 10/12/12"]
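Each of those strings still mixes the user name with the timestamp. One possible way to pull them apart, sketched here with plain Ruby (no Nokogiri needed for this step), is a regex with named captures. The pattern and the names pattern, parsed, user, and at are hypothetical and assume the "name H:MM AM/PM on M/D/YY" layout seen in the sample:

```ruby
require 'date'

# Values as returned by html.css('i').map { |text| text.content }
my_array = ["Josh Test 12:53 PM on 8/14/12", "John Doe 12:00 PM on 9/15/12", "Joe Smoe 1:00 AM on 10/12/12"]

# Assumed layout: "<name> <H:MM> <AM|PM> on <M/D/YY>"
pattern = /\A(?<user>.+?)\s+(?<time>\d{1,2}:\d{2} [AP]M) on (?<date>\d{1,2}\/\d{1,2}\/\d{2})\z/

parsed = my_array.map do |entry|
  m = pattern.match(entry)
  next nil unless m # skip entries that do not fit the assumed layout
  { user: m[:user], at: DateTime.strptime("#{m[:time]} #{m[:date]}", '%I:%M %p %m/%d/%y') }
end.compact
```

Entries that do not match are dropped rather than raising, which may or may not be what you want for real data.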
With CSS selectors you could just as easily do something like:
require 'nokogiri'
html = Nokogiri::HTML("<h1>My Message</h1><p>Hi today's date is: <time>Friday, May 31st</time></p>")
message_header = html.css('h1').first.content #=> "My Message"
message_body = html.css('p').first.content #=> "Hi today's date is: Friday, May 31st" (content includes the text of child tags)
message_sent_at = html.css('p > time').first.content #=> "Friday, May 31st"
Maybe this could be useful:
require 'date'
require 'time'
text = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"
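notes = text.split('<br><br>')
pro_notes = []
notes.each do |note_e|
  # notes_temp[0] is the note content, notes_temp[1] is "<name> <time> <AM/PM> on <date></i>"
  notes_temp = note_e.split('<i>')
  words = notes_temp[1].split(' ')
  date_str = words[5].gsub('</i>', '')   # e.g. "8/14/12"
  full_name = words[0] + ' ' + words[1]  # assumes a two-word name
  nn = notes_temp[0].strip
  # Include the AM/PM marker (words[3]) so afternoon times are parsed correctly
  dt = DateTime.strptime("#{date_str} #{words[2]} #{words[3]}", '%m/%d/%y %I:%M %p')
  pro_notes << [full_name, nn, dt]
end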
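The loop above relies on fixed word positions, so it assumes every name has exactly two words. A possible variation, sketched under the assumption that the last four words inside the i tag are always "time AM/PM on date", counts from the end instead. The method name parse_notes and the hash keys are made up for illustration:

```ruby
require 'date'

# Hypothetical helper: returns one hash per note
def parse_notes(text)
  text.split('<br><br>').map do |chunk|
    content, meta = chunk.split('<i>', 2)
    next if meta.nil? # skip empty or malformed chunks

    words = meta.sub('</i>', '').split(' ')
    # The last four words are "<time> <AM/PM> on <date>"; the rest is the name
    time, meridiem, _on, date = words.last(4)
    user = words[0...-4].join(' ')

    {
      content: content.strip,
      user: user,
      created_at: DateTime.strptime("#{date} #{time} #{meridiem}", '%m/%d/%y %I:%M %p')
    }
  end.compact
end

text = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br> Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"
notes = parse_notes(text)
notes.map { |n| n[:user] } # => ["Josh Test", "John Doe", "Joe Smoe"]
```

Returning hashes keeps the content, user, and date labelled, which is a little easier to work with than positional arrays.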