[英]How can I get the name of a seller on a specific website using golang?
我正在 go 中制作 web 刮刀。 给定一个特定的 web 页面,我试图获取位于右上角的卖家名称(在此 olx 网站的示例中,您可以看到卖家的名称是 Ionut)。 当我运行下面的代码时,它应该在 index.csv 文件中写入名称,但该文件是空的。 我认为问题出在 HTML 解析器上,尽管对我来说它看起来不错。
package main
import (
"encoding/csv"
"fmt"
"log"
"os"
"path/filepath"
"github.com/gocolly/colly"
)
func main() {
//setting up the file where we store collected data
fName := filepath.Join("D:\\", "go projects", "cwst go", "CWST-GO", "target folder", "index.csv")
file, err := os.Create(fName)
if err != nil {
log.Fatalf("Could not create file, error :%q", err)
}
defer file.Close()
//writer that writes the collected data into our file
writer := csv.NewWriter(file)
//after the file is written, what it is in the buffer goes in writer and then passed to file
defer writer.Flush()
//collector
c := colly.NewCollector(
colly.AllowedDomains("https://www.olx.ro/"),
)
//HTML parser
c.OnHTML(".css-1fp4ipz", func(e *colly.HTMLElement) { //div class that contains wanted info
writer.Write([]string{
e.ChildText("h4"), //specific tag of the info
})
})
fmt.Printf("Scraping page : ")
c.Visit("https://www.olx.ro/d/oferta/bmw-xdrixe-seria-7-2020-71000-tva-IDgp7iN.html")
log.Printf("\n\nScraping Complete\n\n")
log.Println(c)
}
您不需要在允许的域中添加https
或/
。
c := colly.NewCollector(
colly.AllowedDomains("www.olx.ro"),
)
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.