
Splitting this string using JavaScript or Ruby

I have the following string:

Upper and lower ranch milk 125ML (3 * 8)

and 1000 other similar ones that do not all share an identical format. I want to separate the product (the text portion), the volume (125ML), and the collation ((3 * 8)) into separate variables.

I tried to come up with a function in Excel and in MATLAB, but have not managed to achieve the desired result. I would like a smarter approach than manually screening each string. All input appreciated.

You can use a regular expression, for example ^(.*)( \d+ML) +\((.*)\)

Explanation

^(.*) Group 1: any characters from the start of the string

( \d+ML) Group 2: a space followed by a volume written as digits and ML

 +\((.*)\) Group 3: anything between parentheses, after at least one space

Applied to a variant of your sample string (with a digit in the title)

Full match: Upper and lower 2 ranch milk 125ML (3 * 8)

Group 1: Upper and lower 2 ranch milk

Group 2: 125ML (preceded by a space, which is part of the group)

Group 3: 3 * 8


Sample snippet in JavaScript (check the console for the output):

function extractInformation(from) {
  // Group 1: title, group 2: volume, group 3: collation.
  // Single backslashes: in a regex literal, \\d would match a literal backslash.
  var re = /^(.*)( \d+ML) +\((.*)\)/;
  var matches = re.exec(from);
  if (matches) {
    return {
      "title": matches[1].trim(),
      "volume": matches[2].trim(),
      "collation": matches[3].trim()
    };
  }
  // No match: return an empty object
  return {};
}

console.log(extractInformation("Upper and lower ranch milk 125ML (3 * 8)"));
console.log(extractInformation("Upper and lower 123 ranch milk 125ML (3 * 8)"));
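Since the question mentions about 1000 strings that do not all share one format, a somewhat more tolerant pattern may be worth trying. The following is only a sketch under assumptions: the unit list (ML, L, G, KG), the case-insensitive match, and the optional collation are guesses about what the other strings look like, and the names `PRODUCT_RE`, `parseProduct`, and `parseAll` are hypothetical. Rows that do not match are collected for manual review instead of being guessed at.

```javascript
// Assumed variations: other units, lower-case units, decimal volumes,
// and a missing "(...)" collation. Adjust to the real data.
var PRODUCT_RE = /^(.*?)\s+(\d+(?:\.\d+)?\s*(?:ML|L|G|KG))(?:\s*\((.*)\))?\s*$/i;

function parseProduct(line) {
  var m = PRODUCT_RE.exec(line);
  if (!m) {
    return null; // the caller decides what to do with unparsed rows
  }
  return {
    title: m[1].trim(),
    volume: m[2].trim(),
    // Group 3 is undefined when the string has no "(...)" part.
    collation: m[3] === undefined ? null : m[3].trim()
  };
}

function parseAll(lines) {
  var parsed = [];
  var unparsed = [];
  lines.forEach(function (line) {
    var result = parseProduct(line);
    if (result) {
      parsed.push(result);
    } else {
      unparsed.push(line);
    }
  });
  return { parsed: parsed, unparsed: unparsed };
}
```

Calling `parseAll` on the full list and inspecting `unparsed` should show which formats the pattern still misses, so the pattern can be extended step by step.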

Not a good solution, but it might save the day (JavaScript).

var str = "Upper and lower ranch milk 125ML (3 * 8)";
// Throws if the string contains no volume, because match() returns null
var f = str.match(/\d+ML/g)[0];
// "125ML"
var [x, y] = str.split(f);
// Array [ "Upper and lower ranch milk ", " (3 * 8)" ]
x
// "Upper and lower ranch milk "
y
// " (3 * 8)"

In Ruby, you'd just need to split around some digits followed by ML:

text = "Upper and lower ranch milk 125ML (3 * 8)"
p text.split(/\s+(\d+ML)\s+/)
# ["Upper and lower ranch milk", "125ML", "(3 * 8)"]

The separator matched by split normally isn't returned in the list, unless you define a capture group (with () inside the regex).
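JavaScript's `String.prototype.split` treats capture groups the same way, so the same one-liner carries over. A small sketch comparing both variants (the variable names are only illustrative):

```javascript
var text = "Upper and lower ranch milk 125ML (3 * 8)";

// Without a capture group, the whole separator (volume included) is dropped.
var withoutGroup = text.split(/\s+\d+ML\s+/);
// With a capture group, the captured volume is spliced into the result.
var withGroup = text.split(/\s+(\d+ML)\s+/);

console.log(withoutGroup); // ["Upper and lower ranch milk", "(3 * 8)"]
console.log(withGroup);    // ["Upper and lower ranch milk", "125ML", "(3 * 8)"]
```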

To parse your Excel file, it might be much easier to export the spreadsheet as a CSV file and parse it with Ruby's CSV class.

"Upper and lower ranch milk 125ML (3 * 8)".partition(/\d+ML/)
# => ["Upper and lower ranch milk ", "125ML", " (3 * 8)"]

"Upper and lower ranch milk 125ML (3 * 8)".partition(/\d+ML/).map(&:strip)
# => ["Upper and lower ranch milk", "125ML", "(3 * 8)"]
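JavaScript has no built-in counterpart to Ruby's `partition`, but one can be approximated with `RegExp.prototype.exec` and the match index. The `partition` function below is a hypothetical helper, not a standard API, and it assumes a regex without the `g` flag (a global regex would carry `lastIndex` state between calls).

```javascript
// Hypothetical helper mimicking Ruby's String#partition for a regex.
function partition(str, re) {
  var m = re.exec(str);
  if (!m) {
    // Ruby returns [str, "", ""] when nothing matches; mirrored here.
    return [str, "", ""];
  }
  return [
    str.slice(0, m.index),            // text before the match
    m[0],                             // the match itself
    str.slice(m.index + m[0].length)  // text after the match
  ];
}

var parts = partition("Upper and lower ranch milk 125ML (3 * 8)", /\d+ML/)
  .map(function (s) { return s.trim(); });
console.log(parts); // ["Upper and lower ranch milk", "125ML", "(3 * 8)"]
```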
