I tried several ways to access the table rows, but none of them worked.
import pandas as pd
url = "https://programsandcourses.anu.edu.au/catalogue"
d = pd.read_html(url, header=0, flavor='bs4')
print(d)
The row data is not shown; the output is just the following:
[ Code ... Delivery
0 Show all results... ... Show all results...
[1 rows x 7 columns], Code ... Delivery
0 Show all results... ... Show all results...
[1 rows x 6 columns], Code Title ... Career Units
0 Show all results... Show all results... ... Show all results... NaN
[1 rows x 5 columns], Code Title ... Career Units
0 Show all results... Show all results... ... Show all results... NaN
[1 rows x 5 columns], Code Title ... Career Units
0 Show all results... Show all results... ... Show all results... NaN
[1 rows x 5 columns]]
How can I access the data so that I can store it in a CSV file? Does it need any permissions?
The content may be dynamic, which makes it hard to fetch with pandas or BeautifulSoup.
Here is an approach you can follow:
1. Open Chrome developer tools, refresh the page, go to the Network tab, and click XHR. The requests are listed under the Name column.
2. Click the requests; the first one returns only the first 20 records.
3. Since you want all 416 records, click "Show all results" on the web page. A new request then appears under XHR; it is the URL used in the code below, and its response is JSON.
4. Click that request and copy its link address. You can then extract whatever data you want from the JSON.
Code:
import requests
res=requests.get("https://programsandcourses.anu.edu.au/data/ProgramSearch/GetPrograms?q=&client=anu_frontend&proxystylesheet=anu_frontend&site=default_collection&btnG=Search&filter=0&q=&client=anu_frontend&proxystylesheet=anu_frontend&site=default_collection&btnG=Search&filter=0&AppliedFilter=FilterByPrograms&Source=&ShowAll=true&PageIndex=0&MaxPageSize=20&PageSize=Infinity&SortColumn=&SortDirection=&InitailSearchRequestedFromExternalPage=true&SearchText=&SelectedYear=2021&Careers%5B0%5D=&Careers%5B1%5D=&Careers%5B2%5D=&Careers%5B3%5D=&Sessions%5B0%5D=&Sessions%5B1%5D=&Sessions%5B2%5D=&Sessions%5B3%5D=&Sessions%5B4%5D=&Sessions%5B5%5D=&DegreeIdentifiers%5B0%5D=&DegreeIdentifiers%5B1%5D=&DegreeIdentifiers%5B2%5D=&FilterByMajors=&FilterByMinors=&FilterBySpecialisations=&CollegeName=All+Colleges&ModeOfDelivery=All+Modes")
main_json = res.json()  # parse the JSON response body
print(len(main_json['Items']))  # number of records returned
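Since the question asks about storing the data in a CSV file, here is a minimal sketch of one way to do it with the standard library. It assumes `main_json['Items']` is a list of flat dictionaries; the helper name `items_to_csv`, the output file name, and the sample records (including their field names) are hypothetical stand-ins, not the real fields returned by the site.

```python
import csv

def items_to_csv(items, path):
    """Write a list of dicts to a CSV file and return the header used."""
    # Collect every key that appears in any record, keeping first-seen order,
    # so records with missing fields still fit under one header.
    fieldnames = []
    for item in items:
        for key in item:
            if key not in fieldnames:
                fieldnames.append(key)
    with open(path, "w", newline="", encoding="utf-8") as f:
        # restval fills in an empty cell where a record lacks a field.
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(items)
    return fieldnames

# Hypothetical sample records standing in for main_json['Items'].
sample_items = [
    {"Code": "X1", "Title": "Example Program A"},
    {"Code": "X2", "Title": "Example Program B", "Units": 48},
]
header = items_to_csv(sample_items, "programs_sample.csv")
```

With the real response you would probably call `items_to_csv(main_json['Items'], "programs.csv")`. If some values turn out to be nested lists or dictionaries, they would be written as their string representation, so you may want to flatten them first.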
[Image: the approach of point number 3]