
Extracting dates from text in Python

This is the data I scraped from a website.

[
 'Archive\nUpdated',
 'Sep 20,\n2021',
 'Data Tables',
 'Excel',
 'Sep\n03, 2019',
 'Nov 05, 2021',
 'Sep\n03, 2019',
 'Excel',
]

I want to extract the dates, months, and years from this list.

Assuming you know the format in which your dates appear, you can do something like this:

import datetime as dt

data = [
    'Archive\nUpdated',
    'Sep 20,\n2021',
    'Data Tables',
    'Excel',
    'Sep\n03, 2019',
    'Nov 05, 2021',
    'Sep\n03, 2019',
    'Excel',
]
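# Rejoin dates that the scraper split across lines
data = [item.replace('\n', ' ') for item in data]

for item in data:
    try:
        # strptime raises ValueError for strings that are not dates
        data_date = dt.datetime.strptime(item, '%b %d, %Y')
    except ValueError:
        continue
    print(data_date.date())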

# 2021-09-20
# 2019-09-03
# 2021-11-05
# 2019-09-03
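
Since the question also asks for the months and years, here is a minimal sketch of one way to get them from the parsed dates. The helper name extract_dates and its fmt parameter are made up for illustration, and it still assumes every date looks like 'Sep 20, 2021'. Note that %b matches abbreviated month names in the current locale, so this relies on an English (or default C) locale.

```python
import datetime as dt


def extract_dates(items, fmt='%b %d, %Y'):
    """Return a datetime.date for every item that matches fmt (hypothetical helper)."""
    dates = []
    for item in items:
        # Collapse newlines and repeated spaces left over from scraping
        text = ' '.join(item.split())
        try:
            dates.append(dt.datetime.strptime(text, fmt).date())
        except ValueError:
            continue  # not a date in the expected format
    return dates


data = ['Archive\nUpdated', 'Sep 20,\n2021', 'Data Tables', 'Excel',
        'Sep\n03, 2019', 'Nov 05, 2021', 'Sep\n03, 2019', 'Excel']

dates = extract_dates(data)

# Each date object exposes its month and year as integers
months_and_years = [(d.month, d.year) for d in dates]
print(months_and_years)
# [(9, 2021), (9, 2019), (11, 2021), (9, 2019)]
```

If the dates can appear in more than one format, the same loop could try several format strings in turn, but that goes beyond what the sample data needs.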


 