SimpleDateFormat.parse() ignores the number of characters in the pattern

Question

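I am trying to parse a date string that can have one of three different formats. Even though the String should not match the second pattern, it somehow does, and so a wrong date is returned.

This is my code: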

import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class Start {

    public static void main(String[] args) {
        SimpleDateFormat sdf = new SimpleDateFormat("dd.MM.yyyy");
        try{
            System.out.println(sdf.format(parseDate("2013-01-31")));
        } catch(ParseException ex){
            System.out.println("Unable to parse");
        }
    }

    public static Date parseDate(String dateString) throws ParseException{
        SimpleDateFormat sdf = new SimpleDateFormat("dd.MM.yyyy");
        SimpleDateFormat sdf2 = new SimpleDateFormat("dd-MM-yyyy");
        SimpleDateFormat sdf3 = new SimpleDateFormat("yyyy-MM-dd");

        Date parsedDate;
        try {
            parsedDate = sdf.parse(dateString);
        } catch (ParseException ex) {
            try{
                parsedDate = sdf2.parse(dateString);
            } catch (ParseException ex2){
                parsedDate = sdf3.parse(dateString);    
            }
        }
        return parsedDate;
    }
}

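With the input 2013-01-31 I get the output 05.07.0036.

If I try to parse 31-01-2013 or 31.01.2013, I get 31.01.2013 as expected.

I noticed that the program gives me exactly the same output if I set the patterns like this: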

SimpleDateFormat sdf = new SimpleDateFormat("d.M.y");
SimpleDateFormat sdf2 = new SimpleDateFormat("d-M-y");
SimpleDateFormat sdf3 = new SimpleDateFormat("y-M-d");

Why does it ignore the number of characters in my pattern?

Answer 1

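SimpleDateFormat has serious problems. The default lenient setting can produce garbage answers, and I cannot think of a case where lenient is of any benefit. The lenient setting is not a reliable way to produce reasonable interpretations of variations in human-entered dates. It should not be the default.

If you can, use DateTimeFormatter instead; see the answer by Ole VV. This newer approach is superior and produces thread-safe, immutable instances. If you share a SimpleDateFormat instance between threads, it can produce garbage results without any error or exception. Sadly, the implementation I suggest here inherits this bad behavior.

Disabling lenient is only part of the solution. You can still end up with garbage results that are hard to catch in testing. See the comments in the code below for examples.

Here is an extension of SimpleDateFormat that forces strict pattern matching. This should have been the default behavior of that class.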

import java.text.DateFormatSymbols;
import java.text.ParseException;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

/**
 * Extension of SimpleDateFormat that implements strict matching.
 * parse(text) will only return a Date if text exactly matches the
 * pattern. 
 * 
 * This is needed because SimpleDateFormat does not enforce strict 
 * matching. First there is the lenient setting, which is true
 * by default. This allows text that does not match the pattern and
 * garbage to be interpreted as valid date/time information. For example,
 * parsing "2010-09-01" using the format "yyyyMMdd" yields the date 
 * 2009/12/09! Is this bizarre interpretation the ninth day of the  
 * zeroth month of 2010? If you are dealing with inputs that are not 
 * strictly formatted, you WILL get bad results. You can override lenient  
 * with setLenient(false), but this strangeness should not be the default. 
 *
 * Second, setLenient(false) still does not strictly interpret the pattern. 
 * For example "2010/01/5" will match "yyyy/MM/dd". And data disagreement like 
 * "1999/2011" for the pattern "yyyy/yyyy" is tolerated (yielding 2011). 
 *
 * Third, setLenient(false) still allows garbage after the pattern match. 
 * For example: "20100901" and "20100901andGarbage" will both match "yyyyMMdd". 
 * 
 * This class restricts this undesirable behavior, and makes parse() and 
 * format() functional inverses, which is what you would expect. Thus
 * text.equals(format(parse(text))) when parse returns a non-null result.
 * 
 * @author zobell
 *
 */
public class StrictSimpleDateFormat extends SimpleDateFormat {

    protected boolean strict = true;

    public StrictSimpleDateFormat() {
        super();
        setStrict(true);
    }

    public StrictSimpleDateFormat(String pattern) {
        super(pattern);
        setStrict(true);
    }

    public StrictSimpleDateFormat(String pattern, DateFormatSymbols formatSymbols) {
        super(pattern, formatSymbols);
        setStrict(true);
    }

    public StrictSimpleDateFormat(String pattern, Locale locale) {
        super(pattern, locale);
        setStrict(true);
    }

    /**
     * Set the strict setting. If strict == true (the default)
     * then parsing requires an exact match to the pattern. Setting
     * strict = false will tolerate text after the pattern match. 
     * @param strict
     */
    public void setStrict(boolean strict) {
        this.strict = strict;
        // strict with lenient does not make sense. Really lenient does
        // not make sense in any case.
        if (strict)
            setLenient(false); 
    }

    public boolean getStrict() {
        return strict;
    }

    /**
     * Parse text to a Date. Exact match of the pattern is required.
     * Parse and format are now inverse functions, so this is
     * required to be true for valid text date information:
     * text.equals(format(parse(text)))
     * @param text
     * @param pos
     * @return
     */
    @Override
    public Date parse(String text, ParsePosition pos) {
        // Remember where parsing starts; super.parse() advances pos on success.
        int startIndex = pos.getIndex();
        Date d = super.parse(text, pos);
        if (strict && d != null) {
           String format = this.format(d);
           if (startIndex + format.length() != text.length() ||
                 !text.endsWith(format)) {
              d = null; // Not exact match
           }
        }
        return d;
    }
}
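The third point in the class comment above (text left over after the pattern match) can also be checked without subclassing. The following is only a minimal sketch, not part of the original answer: the class name WholeInputCheck and the helper parseWhole are hypothetical, and the check covers trailing text only, so input such as "2010/01/5" for "yyyy/MM/dd" would still be accepted.

```java
import java.text.ParseException;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;

class WholeInputCheck {

    /**
     * Hypothetical helper: returns the parsed date only if the parser
     * consumed the entire text, otherwise null.
     */
    static Date parseWhole(SimpleDateFormat format, String text) {
        ParsePosition pos = new ParsePosition(0);
        Date date = format.parse(text, pos);
        if (date == null || pos.getIndex() != text.length()) {
            return null; // parse failure, or unparsed text left over
        }
        return date;
    }

    public static void main(String[] args) throws ParseException {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyyMMdd");
        sdf.setLenient(false);

        // parse(String) stops after the date and silently ignores the rest.
        System.out.println(sdf.parse("20100901andGarbage") != null);      // true

        // The position check rejects the leftover text.
        System.out.println(parseWhole(sdf, "20100901andGarbage") == null); // true
        System.out.println(parseWhole(sdf, "20100901") != null);           // true
    }
}
```

The sketch relies on the fact that parse(String, ParsePosition) reports how far it read, which parse(String) discards.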

Answer 2

java.time

java.time is the modern Java date and time API, and it behaves the way you expected. So it is just a matter of a simple translation of your code:

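import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// The fields and the method below belong inside your class.
private static final DateTimeFormatter formatter1 = DateTimeFormatter.ofPattern("dd.MM.yyyy");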
private static final DateTimeFormatter formatter2 = DateTimeFormatter.ofPattern("dd-MM-yyyy");
private static final DateTimeFormatter formatter3 = DateTimeFormatter.ofPattern("yyyy-MM-dd");

public static LocalDate parseDate(String dateString) {
    LocalDate parsedDate;
    try {
        parsedDate = LocalDate.parse(dateString, formatter1);
    } catch (DateTimeParseException dtpe1) {
        try {
            parsedDate = LocalDate.parse(dateString, formatter2);
        } catch (DateTimeParseException dtpe2) {
            parsedDate = LocalDate.parse(dateString, formatter3);
        }
    }
    return parsedDate;
}

(I put the formatters outside your method so that they are not created anew for each call. You can put them inside if you prefer.)

Let's try it out:

    LocalDate date = parseDate("2013-01-31");
    System.out.println(date);

Output is:

2013-01-31

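For numbers, DateTimeFormatter.ofPattern takes the number of pattern letters as the minimum field width. It furthermore assumes that the day of month is never more than two digits. So when trying the format dd-MM-yyyy, it successfully parses 20 as a day of month and then throws a DateTimeParseException because there is no hyphen (dash) after 20. The method then goes on to try the next formatter.

What went wrong in your code

The SimpleDateFormat class that you tried to use is notoriously troublesome and fortunately long outdated. You ran into one of its many problems. To repeat the important sentence, quoted in the answer by Teetoo, from the documentation of how it handles numbers:

For parsing, the number of pattern letters is ignored unless it's needed to separate two adjacent fields.

So new SimpleDateFormat("dd-MM-yyyy") happily parses 2013 as the day of month, 01 as the month and 31 as the year. Next we should have expected it to throw an exception, because January of year 31 did not have 2013 days. But a SimpleDateFormat with default settings does not do that. It just keeps counting days through the following months and years and ends up at July 5 five and a half years later, which is the result you observed.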
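The nested try/catch above can also be written as a loop over the formatters. This is only a sketch under the same assumptions as the method above (the three patterns from the question); the class name MultiFormatParser and the FORMATTERS array are hypothetical names introduced for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class MultiFormatParser {

    // Tried in this order; the first formatter that accepts the text wins.
    private static final DateTimeFormatter[] FORMATTERS = {
        DateTimeFormatter.ofPattern("dd.MM.yyyy"),
        DateTimeFormatter.ofPattern("dd-MM-yyyy"),
        DateTimeFormatter.ofPattern("yyyy-MM-dd")
    };

    static LocalDate parseDate(String dateString) {
        DateTimeParseException lastFailure = null;
        for (DateTimeFormatter formatter : FORMATTERS) {
            try {
                return LocalDate.parse(dateString, formatter);
            } catch (DateTimeParseException dtpe) {
                lastFailure = dtpe; // remember it and try the next formatter
            }
        }
        // No formatter matched: rethrow the last failure.
        throw lastFailure;
    }

    public static void main(String[] args) {
        System.out.println(parseDate("2013-01-31")); // 2013-01-31
        System.out.println(parseDate("31-01-2013")); // 2013-01-31
        System.out.println(parseDate("31.01.2013")); // 2013-01-31
    }
}
```

Adding a fourth format then means adding one array element rather than another level of nesting.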
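The day counting can be checked by hand. The sketch below is only an illustration with hypothetical names (LenientRolloverDemo and its two methods): it reproduces the lenient parse from the question and then redoes the arithmetic with java.time, treating "day 2013 of January" as January 1 of year 31 plus 2012 days. java.time uses the proleptic ISO calendar while SimpleDateFormat uses the Julian calendar for such early years, but the leap years in this range (32 and 36) are the same in both.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDate;

class LenientRolloverDemo {

    /** Parses with the lenient default and prints in the question's output format. */
    static String lenientResult(String text) {
        try {
            SimpleDateFormat parser = new SimpleDateFormat("dd-MM-yyyy"); // lenient by default
            SimpleDateFormat printer = new SimpleDateFormat("dd.MM.yyyy");
            return printer.format(parser.parse(text));
        } catch (ParseException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** "Day 2013 of January, year 31" counted forward from January 1. */
    static LocalDate day2013OfJanuaryYear31() {
        return LocalDate.of(31, 1, 1).plusDays(2013 - 1);
    }

    public static void main(String[] args) {
        // The question reports 05.07.0036 for this input.
        System.out.println(lenientResult("2013-01-31"));
        System.out.println(day2013OfJanuaryYear31()); // 0036-07-05
    }
}
```

Years 31 through 35 contribute 1826 days, which leaves 186 days into leap year 36, and that lands on July 5.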

Link

Oracle tutorial: Date Time, explaining how to use java.time.

Answer 3

A workaround could be to test for the yyyy-MM-dd format with a regular expression:

public static Date parseDate(String dateString) throws ParseException {
    SimpleDateFormat sdf = new SimpleDateFormat("dd.MM.yyyy");
    SimpleDateFormat sdf2 = new SimpleDateFormat("dd-MM-yyyy");
    SimpleDateFormat sdf3 = new SimpleDateFormat("yyyy-MM-dd");

    Date parsedDate;
    try {
        if (dateString.matches("\\d{4}-\\d{2}-\\d{2}")) {
            parsedDate = sdf3.parse(dateString);
        } else {
            throw new ParseException("", 0);
        }
    } catch (ParseException ex) {
        try {
            parsedDate = sdf2.parse(dateString);
        } catch (ParseException ex2) {
            parsedDate = sdf.parse(dateString);
        }
    }
    return parsedDate;
}

Answer 4

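It is documented in the SimpleDateFormat javadoc:

For formatting, the number of pattern letters is the minimum number of digits, and shorter numbers are zero-padded to this amount. For parsing, the number of pattern letters is ignored unless it's needed to separate two adjacent fields.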
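Both halves of that rule can be seen in a small round trip. This is a minimal sketch with hypothetical names (PatternLetterCountDemo, reformat); it parses a text and formats the result again with the same non-lenient SimpleDateFormat.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

class PatternLetterCountDemo {

    /** Parses text with the pattern and formats it back; null if unparseable. */
    static String reformat(String pattern, String text) {
        SimpleDateFormat sdf = new SimpleDateFormat(pattern);
        sdf.setLenient(false);
        try {
            return sdf.format(sdf.parse(text));
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Parsing accepts one-digit day and month although the pattern says dd and MM;
        // formatting pads them back to two digits.
        System.out.println(reformat("dd.MM.yyyy", "5.1.2013")); // 05.01.2013

        // With adjacent fields the letter counts are needed to split the digits.
        System.out.println(reformat("yyyyMMdd", "20130105"));   // 20130105
    }
}
```

Note that the first call succeeds even with lenient switched off, so the letter count is not a strict width check.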

Answer 5

Thanks @Teetoo. That helped me find the solution to my problem:

If I want the parse function to match the pattern exactly, I have to set "lenient" (SimpleDateFormat.setLenient) of my SimpleDateFormat to false:

SimpleDateFormat sdf = new SimpleDateFormat("d.M.y");
sdf.setLenient(false);
SimpleDateFormat sdf2 = new SimpleDateFormat("d-M-y");
sdf2.setLenient(false);
SimpleDateFormat sdf3 = new SimpleDateFormat("y-M-d");
sdf3.setLenient(false);

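This will still parse the date if I use only one pattern letter for each segment, but it recognizes that 2013 cannot be the day, so the input does not match the second pattern. In combination with a length check, I get exactly what I want.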
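The answer does not show the length check, so the following is only a guess at what the combination could look like. All names (LenientOffParser, PATTERNS, normalize) are hypothetical, and the check assumes the full-width patterns from the question, where a valid input has the same length as its pattern.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

class LenientOffParser {

    // Assumption: full-width patterns, so the input length must equal the pattern length.
    private static final String[] PATTERNS = {"dd.MM.yyyy", "dd-MM-yyyy", "yyyy-MM-dd"};

    static Date parseDate(String text) throws ParseException {
        for (String pattern : PATTERNS) {
            if (text.length() != pattern.length()) {
                continue; // length check: wrong number of characters for this pattern
            }
            SimpleDateFormat sdf = new SimpleDateFormat(pattern);
            sdf.setLenient(false); // reject out-of-range values such as day 2013
            try {
                return sdf.parse(text);
            } catch (ParseException e) {
                // fall through and try the next pattern
            }
        }
        throw new ParseException("Unparseable date: \"" + text + "\"", 0);
    }

    /** Convenience wrapper: the date as dd.MM.yyyy, or null if no pattern matched. */
    static String normalize(String text) {
        try {
            return new SimpleDateFormat("dd.MM.yyyy").format(parseDate(text));
        } catch (ParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize("2013-01-31")); // 31.01.2013
        System.out.println(normalize("31-01-2013")); // 31.01.2013
        System.out.println(normalize("1.1.13"));     // null
    }
}
```

A new SimpleDateFormat is created per call here because the class is not thread-safe, as Answer 1 points out.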
