Pandas Dataframe - Issue when sorting Dates

I have read many questions about this subject but I can't seem to find the reason why my code does not work.

After scraping some data from a website, I'm trying to sort a Pandas DataFrame by its 'Date' column. I have also tried converting the dates to datetime objects, but that didn't work either. I would appreciate it if someone could tell me what I am doing wrong here.

import pandas as pd

def find_data(soup):
    l = []
    for b in soup.find_all('div', class_= 'jobInfo'):
        d = {}
        company = b.find('h2').find('a')
        d["Role"] = company['title'].split(':')[0]
        d["URL"] = 'https://www.computerjobs.ie' + company['href']
        company_name = b.find('ul', class_= 'jobDetails').find('li', class_= 'jobCompanyName').get_text()
        d["Company"] = company_name.split(':')[1].strip()
        date = b.find('ul', class_= 'jobDetails').find('li', class_= 'jobLiveDate').get_text()
        d["Date"] = date.split(':')[1].strip()
        l.append(d)
    df = pd.DataFrame(l)
    #Rearranging the order of the columns
    df = df[['Date', 'Company', 'Role', 'URL']]
    #Dropping null rows
    df=df.dropna()
    #df['Date'] = pd.to_datetime(df.Date)
    #df.sort_values(by=["Date"], ascending=True)
    df.sort_values(by = ['Date'])
    df.to_csv("csv_files/pandas_data.csv")

Output:

,Date,Employer,Title,URL
0,11/04/2018,nineDots - Technology Recruitment,Senior Python Developer,https://www.computerjobs.ie/jobs/7175653/senior-python-developer.asp
1,10/04/2018,Allen Recruitment,Lead Python Developer,https://www.computerjobs.ie/jobs/7158984/lead-python-developer.asp
2,10/04/2018,Allen Recruitment,Python Developer,https://www.computerjobs.ie/jobs/7158996/python-developer.asp
3,11/04/2018,Solas Consulting Group,Python Developer,https://www.computerjobs.ie/jobs/7231476/python-developer.asp
4,11/04/2018,nineDots - Technology Recruitment,Senior Python Developer,https://www.computerjobs.ie/jobs/7181828/senior-python-developer.asp
5,09/04/2018,realTime Recruitment Ltd.,Senior DevOps Engineer,https://www.computerjobs.ie/jobs/7240215/senior-devops-engineer.asp
6,11/04/2018,FRS Recruitment,Software Engineer/Cloud Engineer,https://www.computerjobs.ie/jobs/7140213/software-engineer-cloud-engineer.asp
7,11/04/2018,Solas Consulting Group,Junior .NET Developer,https://www.computerjobs.ie/jobs/7232494/junior-net-developer.asp
8,11/04/2018,Evolve Adviser Ltd,Data Architect,https://www.computerjobs.ie/jobs/7247685/data-architect.asp
9,11/04/2018,nineDots - Technology Recruitment,Senior DevOps Engineer,https://www.computerjobs.ie/jobs/7191814/senior-devops-engineer.asp

Reassign the result of the sort statement:

df = df.sort_values(by = ['Date'])

df.sort_values does not sort in place by default; it returns a new, sorted DataFrame and leaves df unchanged. To keep the sorted order, either assign the result back to df or pass inplace=True to sort_values.
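As a minimal sketch of the difference, the snippet below uses a small hand-made frame rather than the scraped data. The first three dates come from the question's output; the "02/05/2018" row and the variable names (unchanged, sorted_df, df_inplace, chrono) are assumptions added for illustration. It also shows one possible way to get chronological order, assuming the dates are always dd/mm/yyyy strings: parse them with pd.to_datetime and an explicit format before sorting, since plain strings are compared character by character.

```python
import pandas as pd

# Hypothetical sample: three dates from the question plus one made-up
# date in a later month to show string vs. chronological ordering.
df = pd.DataFrame({
    "Date": ["11/04/2018", "02/05/2018", "10/04/2018", "09/04/2018"],
    "Company": ["A", "B", "C", "D"],  # placeholder names
})

# Discarding the return value leaves df in its original order.
df.sort_values(by=["Date"])
unchanged = df["Date"].tolist()

# Option 1: assign the sorted frame back to a name.
sorted_df = df.sort_values(by=["Date"])

# Option 2: sort in place (the call itself returns None).
df_inplace = df.copy()
df_inplace.sort_values(by=["Date"], inplace=True)

# Both options above still compare strings, so "02/05/2018" sorts first.
# Assumption: dates are dd/mm/yyyy, so parse them with an explicit format.
chrono = df.copy()
chrono["Date"] = pd.to_datetime(chrono["Date"], format="%d/%m/%Y")
chrono = chrono.sort_values(by=["Date"]).reset_index(drop=True)

print(sorted_df["Date"].tolist())
print(chrono["Date"].dt.strftime("%d/%m/%Y").tolist())
```

With string dates the May row ends up ahead of the April rows, while the parsed version puts it last. If the CSV should keep the dd/mm/yyyy text, the column can be formatted back with dt.strftime before calling to_csv.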
