
I need to use the max() method on a date column, but it doesn't work because the column is read as strings. Can someone help me?

I'm using pandas to read my data into a DataFrame...

After converting to datetime64, I need to use the max() method on that datetime column. The problem is that I can't get the conversion to work :(...

When I read my Excel data into a DataFrame, the dates appear as 2020-03-10T00:00:00.000Z. I need the column to be of type datetime64, but it is read as strings. Here is an example:

df['Date'].dtype

Output: dtype('O')

df['Date'] = pd.to_datetime(df['Date'] )

Output (error):

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
c:\Users\XXXXXX\Anaconda3\lib\site-packages\pandas\core\arrays\datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
   2084         try:
-> 2085             values, tz_parsed = conversion.datetime_to_datetime64(data)
   2086             # If tzaware, these values represent unix timestamps, so we

pandas\_libs\tslibs\conversion.pyx in pandas._libs.tslibs.conversion.datetime_to_datetime64()

TypeError: Unrecognized value type: <class 'str'>

During handling of the above exception, another exception occurred:

ParserError                               Traceback (most recent call last)
<ipython-input-66-917cb86929c6> in <module>
----> 1 df['Date'] = pd.to_datetime(df['Date'] )

c:\Users\XXXXXX\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, format, exact, unit, infer_datetime_format, origin, cache)
    799                 result = result.tz_localize(tz)
    800     elif isinstance(arg, ABCSeries):
--> 801         cache_array = _maybe_cache(arg, format, cache, convert_listlike)
    802         if not cache_array.empty:
    803             result = arg.map(cache_array)

c:\Users\XXXXXX\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in _maybe_cache(arg, format, cache, convert_listlike)
    176         unique_dates = unique(arg)
    177         if len(unique_dates) < len(arg):
--> 178             cache_dates = convert_listlike(unique_dates, format)
    179             cache_array = Series(cache_dates, index=unique_dates)
    180     return cache_array

c:\Users\XXXXXX\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike_datetimes(arg, format, name, tz, unit, errors, infer_datetime_format, dayfirst, yearfirst, exact)
    463         assert format is None or infer_datetime_format
    464         utc = tz == "utc"
--> 465         result, tz_parsed = objects_to_datetime64ns(
    466             arg,
    467             dayfirst=dayfirst,

c:\Users\XXXXXX\Anaconda3\lib\site-packages\pandas\core\arrays\datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
   2088             return values.view("i8"), tz_parsed
   2089         except (ValueError, TypeError):
-> 2090             raise e
   2091 
   2092     if tz_parsed is not None:

c:\Users\XXXXXX\Anaconda3\lib\site-packages\pandas\core\arrays\datetimes.py in objects_to_datetime64ns(data, dayfirst, yearfirst, utc, errors, require_iso8601, allow_object)
   2073 
   2074     try:
-> 2075         result, tz_parsed = tslib.array_to_datetime(
   2076             data,
   2077             errors=errors,

pandas\_libs\tslib.pyx in pandas._libs.tslib.array_to_datetime()

pandas\_libs\tslib.pyx in pandas._libs.tslib.array_to_datetime()

pandas\_libs\tslib.pyx in pandas._libs.tslib.array_to_datetime_object()

pandas\_libs\tslib.pyx in pandas._libs.tslib.array_to_datetime_object()

pandas\_libs\tslibs\parsing.pyx in pandas._libs.tslibs.parsing.parse_datetime_string()

c:\Users\XXXXXX\Anaconda3\lib\site-packages\dateutil\parser\_parser.py in parse(timestr, parserinfo, **kwargs)
   1372         return parser(parserinfo).parse(timestr, **kwargs)
   1373     else:
-> 1374         return DEFAULTPARSER.parse(timestr, **kwargs)
   1375 
   1376 

c:\Users\XXXXXX\Anaconda3\lib\site-packages\dateutil\parser\_parser.py in parse(self, timestr, default, ignoretz, tzinfos, **kwargs)
    650 
    651         if len(res) == 0:
--> 652             raise ParserError("String does not contain a date: %s", timestr)
    653 
    654         try:
    

ParserError: String does not contain a date: -
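
Reading the traceback from the bottom up, the last line names the value that could not be parsed: a lone "-", presumably a placeholder for a missing date. A minimal sketch for listing every such value in a column follows. The sample DataFrame is hypothetical and only stands in for the Excel data, and utc=True is an assumption that fits the trailing "Z" in the timestamps.

```python
import pandas as pd

# Hypothetical sample standing in for the Excel column; "-" marks a missing date
df = pd.DataFrame(
    {"Date": ["2020-03-10T00:00:00.000Z", "-", "2021-07-01T00:00:00.000Z", None]}
)

# Parse leniently: anything that is not a date becomes NaT instead of raising
parsed = pd.to_datetime(df["Date"], errors="coerce", utc=True)

# Rows that held a value but still failed to parse are the offending ones
bad_values = df.loc[parsed.isna() & df["Date"].notna(), "Date"]
print(bad_values.unique())
```

Once the offending values are known, they can be cleaned at the source or simply left as NaT, which max() skips.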

See whether this snippet helps. I worked with some sample data and added a few values to reproduce the error:

import numpy as np
import pandas as pd
df = pd.DataFrame({'Date': ['2023-03-15 00:00:00', '2020-03-10T00:00:00.000Z', '-', '23', 33, np.nan, '', None, '2020-03-10T00:00:00.000Z', '2026-01-15 00:00:00']})
df.head(10)

                           Date
    0       2023-03-15 00:00:00
    1  2020-03-10T00:00:00.000Z
    2                         -
    3                        23
    4                        33
    5                       NaN
    6                          
    7                      None
    8  2020-03-10T00:00:00.000Z
    9       2026-01-15 00:00:00
    
df['Date_new'] = pd.to_datetime(df['Date'], errors='coerce')
df.head(10)
    
                           Date                      Date_new
    0       2023-03-15 00:00:00 2023-03-15 00:00:00.000000000
    1  2020-03-10T00:00:00.000Z 2020-03-10 00:00:00.000000000
    2                         -                           NaT
    3                        23                           NaT
    4                        33 1970-01-01 00:00:00.000000033
    5                       NaN                           NaT
    6                                                     NaT
    7                      None                           NaT
    8  2020-03-10T00:00:00.000Z 2020-03-10 00:00:00.000000000
    9       2026-01-15 00:00:00 2026-01-15 00:00:00.000000000
    
df.Date_new.max()
    
    Timestamp('2026-01-15 00:00:00')
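
Note that in the output above the integer 33 was not rejected but read as 33 nanoseconds after the epoch. If stray numbers should count as missing instead, one possible variation is to keep only the string entries before parsing. This is a sketch on hypothetical sample data; the name `raw` and the choice of utc=True are assumptions, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample: ISO strings mixed with a placeholder, a number and None
raw = pd.Series(
    ["2020-03-10T00:00:00.000Z", "-", 33, None, "2021-07-01T00:00:00.000Z"]
)

# Keep only string entries; where() turns everything else into NaN
only_strings = raw.where(raw.map(lambda v: isinstance(v, str)))

# Unparseable strings such as "-" become NaT; utc=True keeps the "Z" offset
dates = pd.to_datetime(only_strings, errors="coerce", utc=True)

# max() ignores NaT, so only real dates compete
latest = dates.max()
print(latest)
```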

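Maybe you should write your own date function? Apparently, according to this, Python has no built-in parser that accepts every ISO-8601 string (before Python 3.11, datetime.fromisoformat() rejects the trailing 'Z'), so...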

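from datetime import datetime

def fix_iso_date_str(dstr):
    # Spell out the UTC offset, since fromisoformat() rejects 'Z' before Python 3.11
    return datetime.fromisoformat(dstr.replace('Z', '+00:00'))

# The function takes one string, so apply it element by element.
# Note: it still raises on non-date values such as '-' or None.
df['Date'] = df['Date'].apply(fix_iso_date_str)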
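
The helper above handles one well-formed string at a time, so placeholder values like "-" would still raise. A more tolerant variant, sketched here with the standard library only, returns None for anything that is not an ISO date. The name parse_iso_or_none is hypothetical, and treating naive timestamps as UTC is an assumption made so that all results can be compared.

```python
from datetime import datetime, timezone

def parse_iso_or_none(value):
    """Return an aware datetime for an ISO-8601 string, or None if it is not one."""
    if not isinstance(value, str):
        return None  # numbers, None, NaN and so on
    try:
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None  # placeholders such as "-" or ""
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are meant as UTC
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed

# Hypothetical sample values mirroring the kinds of entries seen in the column
values = ["2023-03-15 00:00:00", "2020-03-10T00:00:00.000Z", "-", 33, None, ""]
dates = [d for d in map(parse_iso_or_none, values) if d is not None]
latest = max(dates)
print(latest)
```

With pandas, the same function could be passed to df['Date'].apply(...), though pd.to_datetime with errors='coerce' is usually the shorter route.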

