
Splitting a string at multiple points using Ruby on Rails

I have a string in my DB that represents notes for a user. I want to split this string up so I can separate each note into the content, user, and date.

Here is the format of the string:

"Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>" 

I need to break this into an array like:

["Example Note", "Josh Test", "12:53 PM 8/14/12", "Another example note", "John Doe", "12:00 PM 9/15/12", "Last Example Note", "Joe Smoe", "1:00 AM 10/12/12"]

I am still experimenting with this. Any ideas are very welcome, thank you! :)

You could use a regex for a simpler approach.

s = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>" 
s.split(/\s+<i>|<\/i><br><br>\s?|(?<!on) (?=\d)/)
=> ["Example Note", "Josh Test", "12:53 PM on 8/14/12", "Another example note", "John Doe", "12:00 PM on 9/15/12", " Last Example Note", "Joe Smoe", "1:00 AM on 10/12/12"]

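The datetime elements still contain "on", so they do not match the requested format, but it may be acceptable to reformat them separately. Note also that the third note keeps a leading space, because the regex consumes only one of the two spaces before it.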
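One possible way to do that separate formatting step is sketched below. It reuses the split from above, then trims stray whitespace and drops the "on" that precedes each date. The names `parts` and `cleaned` are made up for illustration, and the cleanup assumes "on" only appears directly before a date.

```ruby
s = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"

# Same split as in the answer above.
parts = s.split(/\s+<i>|<\/i><br><br>\s?|(?<!on) (?=\d)/)

# Trim leftover whitespace and remove the " on " that sits before a date.
# Assumption: " on " followed by a digit only occurs in the datetime part.
cleaned = parts.map { |part| part.strip.sub(/ on (?=\d)/, ' ') }

puts cleaned.inspect
```

Elements without a matching " on " (the note text and the names) pass through `sub` unchanged, so the same `map` can be applied to every element.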

Edit: Removed unnecessary + character.

You can use Nokogiri to parse out the required text using XPath/CSS selectors. Just to give you a simple example with bare-bones parsing to get you started, the following maps every i tag to an element of an array:

require 'nokogiri'

html = Nokogiri::HTML("Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>")

my_array = html.css('i').map {|text| text.content}
#=> ["Josh Test 12:53 PM on 8/14/12", "John Doe 12:00 PM on 9/15/12", "Joe Smoe 1:00 AM on 10/12/12"]

With the CSS selector you could just as easily do something like:

require 'nokogiri'

html = Nokogiri::HTML("<h1>My Message</h1><p>Hi today's date is: <time>Friday, May 31st</time></p>")
message_header = html.css('h1').first.content #=> "My Message"
# content includes the text of nested elements, so the <time> text is part of the paragraph
message_body = html.css('p').first.content #=> "Hi today's date is: Friday, May 31st"
message_sent_at = html.css('p > time').first.content #=> "Friday, May 31st"

Maybe this could be useful:

require 'date'
require 'time'

text = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"

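# Each note ends with "<br><br>", so splitting there yields one chunk per note.
notes = text.split('<br><br>')

pro_notes = []

notes.each do |note_e|
  # "Example Note <i>Josh Test 12:53 PM on 8/14/12</i>" -> note text, then the italic part
  notes_temp = note_e.split('<i>')
  words = notes_temp[1].sub('</i>', '').split(' ')

  # Assumes a two-word name followed by "<time> <AM/PM> on <m/d/yy>".
  full_name = words[0] + ' ' + words[1]
  nn = notes_temp[0].strip
  # Include the AM/PM marker (words[3]) so afternoon times are not parsed as morning.
  dt = DateTime.strptime("#{words[2]} #{words[3]} #{words[5]}", '%I:%M %p %m/%d/%y')

  pro_notes << [full_name, nn, dt]
end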
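The index-based parsing above depends on every name being exactly two words. As an alternative sketch, not the approach from the answer above, a single regex with `String#scan` can capture the four pieces of each note directly, and `strftime` can rebuild the date string in the form the question asked for. The names `note_pattern`, `notes`, and `flat` are hypothetical, and the pattern assumes every note follows the "text <i>name time AM/PM on m/d/yy</i><br><br>" layout shown in the question.

```ruby
require 'date'

text = "Example Note <i>Josh Test 12:53 PM on 8/14/12</i><br><br> Another example note <i>John Doe 12:00 PM on 9/15/12</i><br><br>  Last Example Note <i>Joe Smoe 1:00 AM on 10/12/12</i><br><br>"

# Captures: note text, user name, "h:mm AM/PM", "m/d/yy".
note_pattern = /\s*(.+?)\s*<i>(.+?) (\d{1,2}:\d{2} [AP]M) on (\d{1,2}\/\d{1,2}\/\d{2})<\/i><br><br>/

# One hash per note, with the timestamp parsed into a real DateTime.
notes = text.scan(note_pattern).map do |content, user, time, date|
  {
    content: content,
    user: user,
    written_at: DateTime.strptime("#{time} #{date}", '%I:%M %p %m/%d/%y')
  }
end

# Flatten back into the array shape from the question, without zero padding.
flat = notes.flat_map do |n|
  [n[:content], n[:user], n[:written_at].strftime('%-l:%M %p %-m/%-d/%y')]
end

puts flat.inspect
```

Keeping the parsed `DateTime` in the intermediate hashes makes it possible to sort or filter notes by date before formatting them for display.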
