
Extract dates using BeautifulSoup 4

How can I extract the date from this HTML using BeautifulSoup?

<div class="month">Dec</div>
<div class="edate">31</div>
<div class="day">Mon</div>

Take the parent element of those divs, then get its three strings and join them into one string:

date = ' '.join(str(t) for t in parent.stripped_strings)  # on Python 2, use unicode(t) instead of str(t)

which would result in Dec 31 Mon.
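Putting the steps above together, here is a minimal sketch. The question shows only the three inner divs, so the enclosing `<div class="event">` and the way `parent` is located are assumptions made for illustration; adapt the lookup to the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical wrapper: the original markup shows only the three inner divs,
# so the surrounding <div class="event"> is an assumption for this sketch.
html = """
<div class="event">
  <div class="month">   Dec   </div>
  <div class="edate">   31   </div>
  <div class="day">   Mon   </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Find one of the three divs and step up to the element that contains them all.
month_div = soup.find("div", class_="month")
parent = month_div.parent

# stripped_strings yields each text node with surrounding whitespace removed
# and skips nodes that contain only whitespace.
date = " ".join(parent.stripped_strings)
print(date)  # Dec 31 Mon
```

If the page contains several such groups, you would probably loop over every parent element rather than taking the first match.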

If you need to manipulate the date, you'll need to parse it into a datetime.date object; I strongly suggest you use the external dateutil library to do that. However, the year is missing from this date, so the parser has to fill it in from a default (the current year unless you supply one), and the result may not be the date you expect.
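As a rough illustration of the missing-year problem, the sketch below supplies the year explicitly. It uses the standard library's `datetime.strptime` rather than dateutil so that it runs without extra packages; the helper name `parse_partial_date` and the year 2012 are assumptions, not something taken from the page being scraped.

```python
from datetime import datetime


def parse_partial_date(text, year):
    """Parse a string such as 'Dec 31 Mon' into a datetime.date.

    The year is not present in the text, so the caller has to supply it.
    """
    month, day, _weekday = text.split()
    # %b matches the abbreviated month name in the current locale
    # (English names such as 'Dec' in the default C locale).
    return datetime.strptime(f"{month} {day} {year}", "%b %d %Y").date()


# 2012 is an assumed year, chosen because 31 December 2012 fell on a Monday.
parsed = parse_partial_date("Dec 31 Mon", 2012)
print(parsed.isoformat())  # 2012-12-31
```

The weekday in the string is ignored here, but it could be compared against `parsed.weekday()` to check that the assumed year is plausible.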

