
Multiple RegExp in single function

This is not exactly a problem, but more a question of method. I am working on a project where people can type shorthand dates into an input field. For example, if you simply type "20", the input automatically displays the full date for the 20th of the current month. Many shorthand forms are possible, so I had to write multiple RegExps and then check each one in turn.

My question is: is there a better way to deal with this? I am no JavaScript expert, but I have a feeling that this is not exactly "best practice".

Here is the function:

    function dateParser(date) {
            // Split on every separator the patterns below accept (/ - , . : and whitespace)
            var splitDate = date.split(/[\/\-,.:\s]/);
            var day = new RegExp(/\b\d{1,2}\b/);
            var dateHour = new RegExp(/\b\d{1,2}\s\d{1,2}\b/);
            var dateHourMin = new RegExp(/\b\d{1,2}\s\d{1,2}[:]\d{1,2}\b/);
            var dateMonth = new RegExp(/\b\d{1,2}[\/\-\,\.]\d{1,2}\b/);
            var dateMonthHour = new RegExp(/\b\d{1,2}[\/\-\,\.]\d{1,2}\s\d{1,2}\b/);
            var dateMonthHourMin = new RegExp(/\b\d{1,2}[\/\-\,\.]\d{1,2}\s\d{1,2}[:]\d{1,2}\b/);
            var dateMonthYear = new RegExp(/\b\d{1,2}[\/\-\,\.]\d{1,2}[\/\-\,\.]\d{1,4}\b/);
            var dateMonthYearHour = new RegExp(/\b\d{1,2}[\/\-\,\.]\d{1,2}[\/\-\,\.]\d{1,4}\s\d{1,2}\b/);
            var dateMonthYearHourMin = new RegExp(/\b\d{1,2}[\/\-\,\.]\d{1,2}[\/\-\,\.]\d{1,4}\s\d{1,2}[:]\d{1,2}\b/);
            var month = new Date().getMonth() + 1;
            var year = new Date().getFullYear();

            var newDate;
            if(dateMonthYearHourMin.test(date)) {
                newDate = splitDate[0]+"."+splitDate[1]+"."+splitDate[2]+" "+splitDate[3]+":"+splitDate[4];
            }
            else if(dateMonthYearHour.test(date)) {
                newDate = splitDate[0]+"."+splitDate[1]+"."+splitDate[2]+" "+splitDate[3]+":00";
            }
            else if(dateMonthYear.test(date)) {
                newDate = splitDate[0]+"."+splitDate[1]+"."+splitDate[2]+" 12:00";
            }
            else if(dateMonthHourMin.test(date)) {
                newDate = splitDate[0]+"."+splitDate[1]+"."+year+" "+splitDate[2]+":"+splitDate[3];
            }
            else if(dateMonthHour.test(date)) {
                newDate = splitDate[0]+"."+splitDate[1]+"."+year+" "+splitDate[2]+":00";
            }
            else if(dateMonth.test(date)) {
                newDate = splitDate[0]+"."+splitDate[1]+"."+year+" 12:00";
            }
            else if(dateHourMin.test(date)) {
                newDate = splitDate[0]+"."+month+"."+year+" "+splitDate[1]+":"+splitDate[2];
            }
            else if(dateHour.test(date)) {
                newDate = splitDate[0]+"."+month+"."+year+" "+splitDate[1]+":00";
            }
            else if(day.test(date)) {
                newDate = splitDate[0]+"."+month+"."+year+" 12:00";
            }
            return newDate;
        }

The problem you describe is the kind tackled in Natural Language Processing and in programming/query language design.

One approach to this kind of problem is a hand-written scanner that uses RegExp (or other string scanning) and works with the result the way you did. It works for simple languages, doesn't require much knowledge of language design, and is usually intuitive to modify.
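
As a rough illustration of keeping such a hand-written scanner easy to modify, the patterns can live in an ordered table instead of an if/else chain. This is only a sketch: `scanDate`, `rules`, and the optional `now` parameter are hypothetical names, the output format and the 12:00 default are copied from the question, and no range validation is attempted.

```javascript
// Building blocks for the patterns (hypothetical helpers, not from the question).
var D = "(\\d{1,2})";                        // day, month, hour, or minute
var Y = "(\\d{1,4})";                        // year
var SEP = "[\\/\\-,.]";                      // date separators the question accepts
var TIME = "(?:\\s" + D + "(?::" + D + ")?)?"; // optional " hour" or " hour:minute"

// Ordered from most to least specific; the first rule that matches wins.
var rules = [
  { re: new RegExp("^" + D + SEP + D + SEP + Y + TIME + "$"),
    fields: ["day", "month", "year", "hour", "minute"] },
  { re: new RegExp("^" + D + SEP + D + TIME + "$"),
    fields: ["day", "month", "hour", "minute"] },
  { re: new RegExp("^" + D + TIME + "$"),
    fields: ["day", "hour", "minute"] }
];

function scanDate(input, now) {
  now = now || new Date();
  var text = input.trim();
  for (var i = 0; i < rules.length; i++) {
    var m = rules[i].re.exec(text);
    if (!m) continue;
    // Defaults mirror the question: current month and year, noon.
    var f = { month: now.getMonth() + 1, year: now.getFullYear(),
              hour: "12", minute: "00" };
    rules[i].fields.forEach(function (name, k) {
      if (m[k + 1] !== undefined) f[name] = m[k + 1];
    });
    return f.day + "." + f.month + "." + f.year + " " + f.hour + ":" + f.minute;
  }
  return undefined; // nothing matched
}
```

Supporting a new shorthand then means adding one entry to `rules` rather than another branch.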

If, however, you feel that the input is going to grow into something more complicated, I recommend replacing the RegExp scanning with a full-fledged lexer/parser, using for example Jison ("Your friendly JavaScript parser generator!") or anything else that suits you.

I believe you can consolidate some of your regular expressions and make this more compact.

The first thing I notice is that whitespace appears to be an important separator in your input strings. Specifically, the date (or day) is always separated from the hour/minute by a space. So the first thing I would do is split your input on whitespace:

    var parts = date.split( /\s/ );
    var datePart = parts[0];
    var timePart = parts[1];     // could be undefined

Now we can process the date part and the time part (if it exists) separately. The components of the date part are always separated by a slash, a dash, a comma, or a period, so again we can just split it:

    parts = datePart.split( /[\/\-\,\.]/ );
    var day = parts[0];
    var month = parts[1]; // could be undefined
    var year = parts[2];  // could be undefined

You can split the time similarly, since hours and minutes are always separated by a colon:

    if( timePart ) {
        parts = timePart.split( /:/ );
        var hour = parts[0];
        var minute = parts[1];   // could be undefined
    }
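
Putting the three splits together might look roughly like the sketch below. The function name `parseShorthandDate` and the optional `now` parameter are assumptions made for illustration; the defaults (current month and year, 12:00) and the output format are taken from the question's code, and the parts are not validated.

```javascript
// Hypothetical assembly of the split-based steps above.
function parseShorthandDate(input, now) {
  now = now || new Date();                       // injectable for testing

  var parts = input.trim().split(/\s+/);         // date part, optional time part
  var dateBits = parts[0].split(/[\/\-,.]/);     // day[, month[, year]]
  var timeBits = parts[1] ? parts[1].split(":") : []; // hour[, minute]

  var day = dateBits[0];
  var month = dateBits[1] || now.getMonth() + 1; // default: current month
  var year = dateBits[2] || now.getFullYear();   // default: current year
  var hour = timeBits[0] || "12";                // default time is 12:00
  var minute = timeBits[1] || "00";

  return day + "." + month + "." + year + " " + hour + ":" + minute;
}
```

For example, with `now` set to `new Date(2024, 4, 15)`, `parseShorthandDate("20/6 9", now)` gives `"20.6.2024 9:00"`.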

This should make the code a little more compact and easier to read and maintain. You could go even more compact with a single regular expression with groups, but I feel that this approach would be better.
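
For comparison, the single-regular-expression alternative mentioned above could be sketched as follows. The names `SHORTHAND` and `parseWithGroups` and the `now` parameter are hypothetical; the pattern assumes the same separators and digit counts as the question's expressions, with each optional piece wrapped in a non-capturing group.

```javascript
// One anchored pattern: day, then optional month and year, then optional time.
// Capture groups: 1 = day, 2 = month, 3 = year, 4 = hour, 5 = minute.
var SHORTHAND =
  /^(\d{1,2})(?:[\/\-,.](\d{1,2})(?:[\/\-,.](\d{1,4}))?)?(?:\s+(\d{1,2})(?::(\d{1,2}))?)?$/;

function parseWithGroups(input, now) {
  now = now || new Date();
  var m = SHORTHAND.exec(input.trim());
  if (!m) return undefined;                 // input is not a known shorthand

  // Groups that did not take part in the match are undefined, so fall back
  // to the question's defaults: current month and year, 12:00.
  var month = m[2] || now.getMonth() + 1;
  var year = m[3] || now.getFullYear();
  var hour = m[4] || "12";
  var minute = m[5] || "00";

  return m[1] + "." + month + "." + year + " " + hour + ":" + minute;
}
```

Unlike the split-based version, this rejects input that does not fit the pattern at all, at the cost of a denser expression.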
