Splitting this string using JavaScript or Ruby
I have the following string:
Upper and lower ranch milk 125ML (3 * 8)
and 1000 other similar ones that are not in an identical format. I want to separate the product (the text portion), the volume (125ML), and the collation ((3 * 8)) into separate variables.
I tried with Excel and with MATLAB to come up with a function, but have not managed to achieve the desired result. I want a smarter way to do this than manually screening each one. All input appreciated.
You can use a regular expression, for example ^(.*)( \d+ML) +\((.*)\)
Explanation
^(.*)
Group 1: any characters from the start of the string
( \d+ML)
Group 2: a space followed by a volume written as digits and ML
 +\((.*)\)
Group 3: anything between parentheses, after at least one space
Applied to a string like your sample
Full match: Upper and lower 2 ranch milk 125ML (3 * 8)
Group 1: Upper and lower 2 ranch milk
Group 2: 125ML (captured with its leading space)
Group 3: 3 * 8
Sample snippet in JavaScript (look at the console):
function extractInformation(from) {
  // Group 1: title, group 2: volume, group 3: collation
  var re = /^(.*)( \d+ML) +\((.*)\)/;
  var matches = re.exec(from);
  if (matches) {
    return {
      "title": matches[1].trim(),
      "volume": matches[2].trim(),
      "collation": matches[3].trim(),
    };
  }
  // No match: return an empty object
  return {};
}

console.log(extractInformation("Upper and lower ranch milk 125ML (3 * 8)"));
console.log(extractInformation("Upper and lower 123 ranch milk 125ML (3 * 8)"));
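Since the question also allows Ruby, the same pattern can be sketched there with named groups. This is only a port of the idea above: the constant name PRODUCT_RE, the method name extract_information, and the group names are made up for illustration, and it still assumes every volume is written as digits followed by ML.

```ruby
# Ruby port of the pattern above, using named groups.
# PRODUCT_RE, extract_information and the group names are illustrative only.
PRODUCT_RE = /\A(?<title>.*) (?<volume>\d+ML) +\((?<collation>.*)\)/

def extract_information(text)
  m = PRODUCT_RE.match(text)
  # No match: return an empty hash, like the JavaScript version
  return {} unless m

  {
    title: m[:title].strip,
    volume: m[:volume],
    collation: m[:collation].strip
  }
end

p extract_information("Upper and lower ranch milk 125ML (3 * 8)")
p extract_information("Upper and lower 123 ranch milk 125ML (3 * 8)")
```

Strings that do not fit the pattern come back as an empty hash, which makes it easy to list the rows that still need manual attention.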
Not a good solution, but it might save the day (JavaScript).
var str = "Upper and lower ranch milk 125ML (3 * 8)";
// First run of digits followed by ML
var f = str.match(/\d+ML/g)[0];
//"125ML"
// Split the string around the volume
var [x, y] = str.split(f);
//Array [ "Upper and lower ranch milk ", " (3 * 8)" ]
x
//"Upper and lower ranch milk "
y
//" (3 * 8)"
In Ruby, you just need to split around some digits followed by ML:
text = "Upper and lower ranch milk 125ML (3 * 8)"
p text.split(/\s+(\d+ML)\s+/)
# ["Upper and lower ranch milk", "125ML", "(3 * 8)"]
The text matched by the split pattern usually isn't returned in the list, unless you define a capture group (with () inside the regex).
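To make the effect of the group visible, here is a small comparison of the same split with and without the parentheses. The variable names are only for illustration.

```ruby
text = "Upper and lower ranch milk 125ML (3 * 8)"

# Without a capture group, the matched delimiter is dropped from the result.
without_group = text.split(/\s+\d+ML\s+/)
p without_group

# With a capture group, the captured part of the delimiter is kept.
with_group = text.split(/\s+(\d+ML)\s+/)
p with_group

# Destructure the three parts into separate variables.
product, volume, collation = with_group
puts product, volume, collation
```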
To parse your Excel file, it might be much easier to export the spreadsheet as a CSV file and parse it with the CSV class.
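A minimal sketch of that approach, assuming the exported file has a single column with a header named description (both the header name and the inline sample data are assumptions made for this example):

```ruby
require "csv"

# Stand-in for the exported spreadsheet; the "description" header is assumed.
csv_data = <<~CSV
  description
  Upper and lower ranch milk 125ML (3 * 8)
  Upper and lower 123 ranch milk 125ML (3 * 8)
CSV

# Split every row around the volume and collect the parts as hashes.
rows = CSV.parse(csv_data, headers: true).map do |row|
  product, volume, collation = row["description"].split(/\s+(\d+ML)\s+/)
  { product: product, volume: volume, collation: collation }
end

rows.each { |r| p r }
```

For a real export, CSV.foreach with headers: true and the path to your own file would replace CSV.parse, so the whole file does not have to be held in a string.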
"Upper and lower ranch milk 125ML (3 * 8)".partition(/\d+ML/)
# => ["Upper and lower ranch milk ", "125ML", " (3 * 8)"]
"Upper and lower ranch milk 125ML (3 * 8)".partition(/\d+ML/).map(&:strip)
# => ["Upper and lower ranch milk", "125ML", "(3 * 8)"]
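Because the other strings are not all in the same format, it may help to detect the ones where no volume is found. When the pattern does not match, partition returns the whole string followed by two empty strings, so a small wrapper can flag those cases. The method name split_product and the second sample string are hypothetical.

```ruby
# Returns [product, volume, collation], or nil when no volume is found.
# split_product is an illustrative name, not part of any library.
def split_product(text)
  product, volume, collation = text.partition(/\d+ML/).map(&:strip)
  # partition yields [text, "", ""] when the pattern is not found
  return nil if volume.empty?

  [product, volume, collation]
end

p split_product("Upper and lower ranch milk 125ML (3 * 8)")
p split_product("Plain butter (2 * 6)")
```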