
How do you scrape multiple pages from the same website in RStudio?

I want to download data from multiple pages of the same website using RStudio: https://www.irishjobs.ie/ShowResults.aspx?Keywords=Data&autosuggestEndpoint=%2fautosuggest&Location=0&Category=&Recruiter=Company&btnSubmit=Search&Page=2

The only difference between page 2 and page 3 is the number at the end of the URL: a 3 instead of a 2. I have no problem getting what I need from the 25 jobs on one page, but I want to get 100 jobs from 4 pages. I am using the SelectorGadget Chrome extension.

I tried this for loop:

for (page_result in seq(from = 1, to = 101, by = 25)) {
  link = paste0("https://www.irishjobs.ie/ShowResults.aspx?Keywords=Data&autosuggestEndpoint=%2fautosuggest&Location=0&Category=&Recruiter=Company&btnSubmit=Search&Page=2")
  page = read_html(link)
}

I can't figure out how to make it work.

I think I need to fit page_result into the link, but I don't know where. I have the rvest and dplyr packages loaded, and I want the for loop to go through each page. Any ideas on how best to do this are welcome. Thanks.


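The 4 page links can easily be generated in a for loop, because only the page number at the end of the URL changes. Copy the CSS selector of a job title from the DOM and iterate its nth-child index over 5 to 30 to get all 25 jobs on each page. Positions that do not hold a job come back as NA.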

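library(rvest)  # read_html(), html_node(), html_text() and the %>% pipe

AllJOBS <- vector()
for (i in 1:4) {
  # The page number is the last query parameter, so append it to the base URL
  url <- paste0("https://www.irishjobs.ie/ShowResults.aspx?Keywords=Data&autosuggestEndpoint=%2fautosuggest&Location=0&Category=&Recruiter=Company&btnSubmit=Search&Page=", i)
  page <- read_html(url)  # download each results page once, not once per job
  for (k in 5:30) {
    # Select the title link inside the k-th child of the results column
    jobs <- page %>% html_node(css = paste0("#page > div.container > div.column-wrap.order-one-two > div.two-thirds > div:nth-child(", k, ") > div > div.job-result-logo-title > div.job-result-title > h2 > a")) %>% html_text()
    AllJOBS <- append(AllJOBS, jobs)
  }
  Sys.sleep(runif(1, 1, 2))  # pause 1 to 2 seconds between page requests
  print(paste0("Page", i))
}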
  

Output:

> AllJOBS
 [1] "Senior Consultant - Fund Static Data"                                                                
 [2] "Data Warehouse Engineer"                                                                             
 [3] "Senior Software Engineer - Big Data DevOps"                                                          
 [4] "HR Data Analyst"                                                                                     
 [5] "Data Insights Engineer - Dublin - Permanent/Contract - SQL Server"                                   
 [6] NA                                                                                                    
 [7] "Data Engineer - Master Data Services - SQL Server - Permanent/Contract"                              
 [8] "Senior Data Protection Officer (DPO) - Contract"                                                     
 [9] "QC Data Analyst (Trending)"                                                                          
[10] "Senior Data Warehouse Developer"                                                                     
[11] "Senior Data Analyst FTC"                                                                             
[12] "Compliance Advisory and Data Protection Relationship Manager"                                        
[13] "Contracts Manager-Data Center"                                                                       
[14] "Payments Product Data Analyst"                                                                       
[15] "Data Center Product Hardware Platform Engineer"                                                      
[16] "People Data Privacy Program Lead"                                                                    
[17] "Head of Data Science"                                                                                
[18] "Data Protection Counsel (Product or Compliance)"                                                     
[19] "Data Engineer, GMS"                                                                                  
[20] "Data Protection Associate General Counsel"                                                           
[21] "Senior Data Engineer"                                                                                
[22] "Geospatial Data Scientist"                                                                           
[23] "Data Solutions Manager"                                                                              
[24] "Data Protection Solicitor"                                                                           
[25] "Junior Data Scientist"                                                                               
[26] "Master Data Specialist"                                                                              
[27] "Temp QC Electronic Data Management Analyst"                                                          
[28] "20725 -Data Scientist - Limerick"                                                                    
[29] "Technical Support Specialist - Data Centre"                                                          
[30] "Lead QC Micro Analyst (data review and compliance)"                                                  
[31] "Temp QC Data  Analyst"                                                                               
[32] "#Abbvie Compliance Engineer (Data Integrity)"                                                        
[33] "People Data Analyst"                                                                                 
[34] "Senior Electrical Design Engineer - Data Centre Ex"                                                  
[35] "Laboratory Data Entry Assistant, UCD NVRL"                                                           
[36] "Data Migrations Specialist"                                                                          
[37] "Data Protection Officer"                                                                             
[38] "Data Center Operations Engineer (Linux)"                                                             
[39] "Senior Electrical Engineer | Data Centre LV Design"                                                  
[40] "Data Scientist - (Process Sciences)"                                                                 
[41] "Mgr Supply Logistics Global Materials Data"                                                          
[42] "Data Protection / Privacy Delivery Consultant"                                                       
[43] "Global Supply Chain Data Analyst"                                                                    
[44] "QC Data Analyst"                                                                                     
[45] "0582GradeVIIFOIOLOL1120 - Grade VII Data Protection / Freedom of Information & Compliance Officer"   
[46] "DPO001 - Deputy Data Protection Officer (General Manager) Office of the Head of Data Protection, HSE"
[47] "Senior Campaign Data Analyst"                                                                        
[48] "Data & Reporting Analyst II"                                                                         
[49] "Azure Data Analytics Solution Architect"                                                             
[50] "Head of Risk Assurance for IT, Data, Projects and Outsourcing"                                       
[51] "Trainee Data Technician, Ireland"                                                                    
[52] NA 

You can handle the NAs separately. Does this answer your question, or have I misinterpreted it?
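As a rough sketch of one way to handle those NA entries, the snippet below filters them out with base R. The vector all_jobs is a hypothetical stand-in for AllJOBS (two titles are copied from the output above), so nothing is downloaded here.

```r
# Toy vector standing in for AllJOBS; all_jobs is a hypothetical name.
all_jobs <- c("HR Data Analyst", NA, "Junior Data Scientist", NA)

# is.na() flags the missing entries; negating it keeps only real titles.
clean_jobs <- all_jobs[!is.na(all_jobs)]

print(clean_jobs)
```

Applied to the real result, the same idea would be AllJOBS <- AllJOBS[!is.na(AllJOBS)]. Another option, assuming the page structure matches the selector used in the loop, might be to call html_nodes() with only the "div.job-result-title > h2 > a" part of the selector, since it returns just the matching nodes and so should not produce NA placeholders.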

