Splitting this string using javascript or ruby

I have the following string:

Upper and lower ranch milk 125ML (3 * 8)

and 1000 other similar ones that are not all in an identical format. I want to separate the product (the text portion), the volume (125ML), and the collation ((3 * 8)) into separate variables.

I tried with Excel and with MATLAB to come up with a function, but have not managed to achieve the desired result. I would like a cleverer way to do it than manually screening each one. All input appreciated.

You can use a regular expression, for example ^(.*)( \d+ML) +\((.*)\)

Explanation

^(.*) Group 1: any characters from the start of the string

( \d+ML) Group 2: a space followed by a volume written as digits and ML

 +\((.*)\) Group 3: anything between parentheses, after at least one space

Applied to your sample string

Full match: Upper and lower ranch milk 125ML (3 * 8)

Group 1: Upper and lower ranch milk

Group 2: 125ML

Group 3: 3 * 8

Sample snippet in JavaScript (the results are printed to the console):

function extractInformation(from) {
  // Groups: 1 = title, 2 = volume (digits + ML), 3 = collation inside the parentheses
  var re = /^(.*)( \d+ML) +\((.*)\)/;
  var matches = re.exec(from);
  if (matches) {
    return {
      "title": matches[1].trim(),
      "volume": matches[2].trim(),
      "collation": matches[3].trim()
    };
  }
  return {}; // no match: return an empty object
}

console.log(extractInformation("Upper and lower ranch milk 125ML (3 * 8)"));
console.log(extractInformation("Upper and lower 123 ranch milk 125ML (3 * 8)"));
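
Since the other strings are not all in the same format, a somewhat more tolerant variant may be useful. The following is only a sketch: the unit list (ML, L, G, KG), the optional space between number and unit, and the names UNITS, PRODUCT_RE and parseProduct are my assumptions, not something taken from your data. It returns null for lines it cannot parse, so those can be reviewed by hand.

```javascript
// Assumed unit list; extend it to match the units that occur in your data.
const UNITS = ["ML", "L", "G", "KG"];

// Named groups: title, volume (number + unit), collation (text inside the
// trailing parentheses). The "i" flag also accepts lower-case units.
const PRODUCT_RE = new RegExp(
  String.raw`^(?<title>.*?)\s+(?<volume>\d+(?:\.\d+)?\s*(?:${UNITS.join("|")}))\s*\((?<collation>[^)]*)\)\s*$`,
  "i"
);

// Hypothetical helper: returns the three parts, or null when the line does not fit.
function parseProduct(line) {
  const m = PRODUCT_RE.exec(line);
  if (!m) return null;
  const { title, volume, collation } = m.groups;
  return {
    title: title.trim(),
    volume: volume.replace(/\s+/g, ""), // "1.5 L" becomes "1.5L"
    collation: collation.trim(),
  };
}

console.log(parseProduct("Upper and lower ranch milk 125ML (3 * 8)"));
// { title: 'Upper and lower ranch milk', volume: '125ML', collation: '3 * 8' }
console.log(parseProduct("Upper and lower 123 ranch milk 125ML (3 * 8)"));
// { title: 'Upper and lower 123 ranch milk', volume: '125ML', collation: '3 * 8' }
```

Because the pattern is anchored on the trailing parentheses, digits inside the product name (such as 123 above) are not mistaken for the volume.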

Not an elegant solution, but it might save the day (JavaScript):

var str = "Upper and lower ranch milk 125ML (3 * 8)";
var f = str.match(/\d+ML/g)[0]; // throws if the string contains no volume
//"125ML"
var [x, y] = str.split(f);
//Array [ "Upper and lower ranch milk ", " (3 * 8)" ]
x;
//"Upper and lower ranch milk "
y;
//" (3 * 8)"

In Ruby, you'd just need to split around some digits followed by ML:

text = "Upper and lower ranch milk 125ML (3 * 8)"
p text.split(/\s+(\d+ML)\s+/)
# ["Upper and lower ranch milk", "125ML", "(3 * 8)"]

The text matched by the split pattern normally isn't returned in the list, unless you define a capture group (with () inside the regex); the captured text is then included between the pieces.
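
JavaScript's String.prototype.split treats capture groups the same way, so the same trick works there. A small sketch to show the difference (the variable names are only for illustration):

```javascript
const text = "Upper and lower ranch milk 125ML (3 * 8)";

// Without a capture group, the matched separator is dropped.
const withoutGroup = text.split(/\s+\d+ML\s+/);
// With a capture group, the captured text is kept between the pieces.
const withGroup = text.split(/\s+(\d+ML)\s+/);

console.log(withoutGroup);
// [ 'Upper and lower ranch milk', '(3 * 8)' ]
console.log(withGroup);
// [ 'Upper and lower ranch milk', '125ML', '(3 * 8)' ]
```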

To parse your Excel file, it might be much easier to export the spreadsheet as a CSV file and parse it with Ruby's CSV class.
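
If you take the CSV route but prefer to stay in JavaScript, here is a rough sketch of the batch step. It assumes the export has exactly one description per line, with no header, no quoted fields and no extra columns; for anything more complex a real CSV parser would be the safer choice. parseExport and LINE_RE are hypothetical names, and the pattern is the ML-only one from the first answer. Lines that do not match are collected separately so they can be checked manually.

```javascript
// Pattern from the answer above: title, volume in ML, collation in parentheses.
const LINE_RE = /^(.*)( \d+ML) +\((.*)\)/;

// Assumes one description per line, with no quoted or multi-column fields.
function parseExport(csvText) {
  const parsed = [];
  const unmatched = [];
  for (const raw of csvText.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "") continue; // skip blank lines
    const m = LINE_RE.exec(line);
    if (m) {
      parsed.push({ title: m[1].trim(), volume: m[2].trim(), collation: m[3].trim() });
    } else {
      unmatched.push(line); // keep for manual review
    }
  }
  return { parsed, unmatched };
}

// Made-up two-line input; the second line has no volume on purpose.
const sample = "Upper and lower ranch milk 125ML (3 * 8)\nSome product without a volume\n";
const result = parseExport(sample);
console.log(result.parsed);
console.log(result.unmatched);
```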

"Upper and lower ranch milk 125ML (3 * 8)".partition(/\d+ML/)
# => ["Upper and lower ranch milk ", "125ML", " (3 * 8)"]

"Upper and lower ranch milk 125ML (3 * 8)".partition(/\d+ML/).map(&:strip)
# => ["Upper and lower ranch milk", "125ML", "(3 * 8)"]
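
JavaScript has no built-in partition, but a similar helper can be sketched with RegExp exec and the index of the match. The function below is a hypothetical stand-in written for this answer; it expects a regex without the g flag and, like Ruby, returns the whole string plus two empty strings when nothing matches.

```javascript
// Hypothetical helper mimicking Ruby's String#partition for a regex.
// Pass a regex WITHOUT the g flag, so that exec always searches from the start.
function partition(str, re) {
  const m = re.exec(str);
  if (!m) return [str, "", ""]; // no match: whole string, then two empty strings
  const head = str.slice(0, m.index);
  const tail = str.slice(m.index + m[0].length);
  return [head, m[0], tail];
}

const parts = partition("Upper and lower ranch milk 125ML (3 * 8)", /\d+ML/);
console.log(parts);
// [ 'Upper and lower ranch milk ', '125ML', ' (3 * 8)' ]
console.log(parts.map((s) => s.trim()));
// [ 'Upper and lower ranch milk', '125ML', '(3 * 8)' ]
```

Unlike splitting on the matched text, this only cuts at the first occurrence and does not throw when the string has no volume.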
