
I need a regex to split thousands in a CSV file

I have two regexes that I've tested so far, but each only does part of what I want:

  • ((?:[^,"']|"[^"]*"|'[^']*')+)
  • ,(?=([^"]*"[^"]*")*[^"]*$)
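To make the second pattern's behavior concrete, here is a minimal Java sketch; the class and method names are illustrative and not from the original post, and the group is written as non-capturing, which does not change what matches. The lookahead accepts a comma only when the rest of the line contains an even number of double quotes, so the comma must sit outside any quoted field. It assumes quotes are balanced and never escaped.

```java
import java.util.regex.Pattern;

// Hypothetical helper: splits one CSV line on commas outside double quotes.
class QuoteAwareSplit {
    // A comma followed by zero or more complete "..." pairs and then
    // no further quote up to the end of the line.
    private static final Pattern OUTSIDE_QUOTES =
            Pattern.compile(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    static String[] split(String line) {
        return OUTSIDE_QUOTES.split(line);
    }

    public static void main(String[] args) {
        for (String cell : split("27254,\"1,144.16\"")) {
            System.out.println(cell);
        }
    }
}
```

The quoted cell keeps its surrounding quotes after the split, so they still have to be stripped before the value is converted to a number.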

Here's an example of the data I would like to split:

  • 27230,419.37
  • 27232,688.95
  • 27238,409.4
  • 27240,861.92
  • 27250,176.4
  • 27254,"1,144.16"

Since the data is uploaded from a CSV, a number of 1000 or greater will most likely have a comma inside quotes. The problem I'm running into is that value.split(',') also splits on the comma between the quotes. I would like a regular expression to do this instead of a bunch of for loops and if statements. Any help would be greatly appreciated.
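The problem described above can be reproduced in a few lines. This is a small Java sketch rather than Apex, and the class and method names are made up for illustration:

```java
// Hypothetical demo of why a plain split on "," breaks quoted amounts.
class NaiveSplitDemo {
    static String[] naiveSplit(String line) {
        // Splits on every comma, including the thousands separator in quotes.
        return line.split(",");
    }

    public static void main(String[] args) {
        String[] cells = naiveSplit("27254,\"1,144.16\"");
        // Three cells instead of two: the quoted amount is cut in half.
        System.out.println(cells.length);
    }
}
```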

(I'm using Apex, which is why the string literals use ' and not ".)

Don't use a regular expression; use a CSV parser.

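import java.util.ArrayList;
import java.util.List;

String input = "27254,\"1,144.16\"";
List<String> data = new ArrayList<String>();
boolean inQuotes = false;   // true while inside a "..." field
boolean escaped = false;    // true right after a backslash
StringBuilder buf = new StringBuilder();
for (int i = 0; i < input.length(); i++){
    char c = input.charAt(i);
    if (escaped){
        // Previous character was a backslash: take this one literally.
        buf.append(c);
        escaped = false;
    } else if (c == '\\') {
        escaped = true;
    } else if (c == '"') {
        // Toggle quoted mode; the quote itself is not kept.
        inQuotes = !inQuotes;
    } else if (c == ',' && !inQuotes){
        // A comma outside quotes ends the current field.
        data.add(buf.toString());
        buf = new StringBuilder();
    } else {
        buf.append(c);
    }
}
// Add the last field, which has no trailing comma.
data.add(buf.toString());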
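The loop above treats a backslash as the escape character. Many CSV exports instead follow the RFC 4180 convention, where a quote inside a quoted field is written as two quotes. A variant for that case might look like the sketch below; the class and method names are hypothetical, and it handles a single line only (no embedded newlines).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of the parser above using doubled-quote escaping.
class SimpleCsvLine {
    static List<String> parseLine(String input) {
        List<String> fields = new ArrayList<>();
        StringBuilder buf = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < input.length() && input.charAt(i + 1) == '"') {
                        buf.append('"'); // "" inside quotes is a literal quote
                        i++;             // skip the second quote
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    buf.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;          // opening quote
            } else if (c == ',') {
                fields.add(buf.toString());
                buf.setLength(0);
            } else {
                buf.append(c);
            }
        }
        fields.add(buf.toString());       // last field
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parseLine("27254,\"1,144.16\""));
    }
}
```

Unlike a regex split, this returns the amount without its surrounding quotes, so only the thousands separator remains to be removed before numeric conversion.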

    // Split on commas that are followed by an even number of double quotes,
    // i.e. commas outside a quoted field. Compiled once, before the loop.
    Pattern regex = Pattern.compile(',(?=([^"]*"[^"]*")*[^"]*$)');

    for(String line : lines){
        i++;

        if(skipFirst && i <= 1) continue;
        if(isBlank(line)) return error('Line ' + i + ' blank');

        cells = regex.split(line);

        // Check the cell count before reading cells.get(1).
        if(cells == null || cells.size() < 2) return error('Line ' + i + ' is either blank or contains only one cell');
        code = cells.get(0);
        if(code != null) code = code.trim();
        try{
            // If the amount is empty or null, assume it is 0
            if(cells.get(1) == null || cells.get(1) == ''){
                amount = 0;
            }
            else{
                if(!cells.get(1).contains('"')){
                    amount = Decimal.valueOf(cells.get(1));
                }else{
                    // Quoted amount such as "1,144.16": drop the quotes,
                    // then the thousands separators.
                    String tmp = cells.get(1).replace('"','');
                    amount = Decimal.valueOf(tmp.replace(',',''));
                }
            }
        }catch(System.TypeException e){
            return error('Line ' + i + ' contains invalid amount');
        }
        values.put(code,amount);
    }

This post is for posterity: I did figure out a solution inside Salesforce using a regular expression. It is, however, long and probably not necessary.
