
How can I get the name of a seller on a specific website using golang?

I'm making a web scraper in Go. Given a specific web page, I'm trying to get the name of the seller, which is shown in the top right corner (in this example, on this olx site, the seller's name is Ionut). When I run the code below, it should write the name to the index.csv file, but the file is empty. I think the problem is in the HTML parser, though it looks fine to me.

package main

import (
    "encoding/csv"
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/gocolly/colly"
)

func main() {
    //setting up the file where we store collected data
    fName := filepath.Join("D:\\", "go projects", "cwst go", "CWST-GO", "target folder", "index.csv")
    file, err := os.Create(fName)
    if err != nil {
        log.Fatalf("Could not create file, error :%q", err)
    }
    defer file.Close()
    //writer that writes the collected data into our file
    writer := csv.NewWriter(file)
    //after the file is written, what it is in the buffer goes in writer and then passed to file
    defer writer.Flush()

    //collector
    c := colly.NewCollector(
        colly.AllowedDomains("https://www.olx.ro/"),
    )

    //HTML parser
    c.OnHTML(".css-1fp4ipz", func(e *colly.HTMLElement) { //div class that contains wanted info

        writer.Write([]string{
            e.ChildText("h4"), //specific tag of the info
        })
    })

    fmt.Printf("Scraping page :  ")
    c.Visit("https://www.olx.ro/d/oferta/bmw-xdrixe-seria-7-2020-71000-tva-IDgp7iN.html")

    log.Printf("\n\nScraping Complete\n\n")
    log.Println(c)

}

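You don't need to add https:// or the trailing / in the allowed domains. AllowedDomains expects bare hostnames, so with "https://www.olx.ro/" the request to www.olx.ro is rejected as a forbidden domain and the OnHTML callback never runs, which leaves the file empty.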

c := colly.NewCollector(
    colly.AllowedDomains("www.olx.ro"),
)
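
To illustrate why the original value never matches, here is a minimal standard-library sketch of a hostname allow-list check. It is not colly's actual code; hostAllowed is a hypothetical helper, written on the assumption that the allow-list entries are compared against the hostname of the visited URL.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostAllowed is a hypothetical helper (not part of colly). It reports
// whether the hostname of rawURL exactly equals one of the allowed entries.
func hostAllowed(rawURL string, allowed []string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := u.Hostname() // "www.olx.ro" for the page in the question
	for _, a := range allowed {
		if a == host {
			return true
		}
	}
	return false
}

func main() {
	page := "https://www.olx.ro/d/oferta/bmw-xdrixe-seria-7-2020-71000-tva-IDgp7iN.html"

	// The entry includes the scheme and a trailing slash, so it can never
	// equal the bare hostname.
	fmt.Println(hostAllowed(page, []string{"https://www.olx.ro/"})) // false

	// The bare hostname matches.
	fmt.Println(hostAllowed(page, []string{"www.olx.ro"})) // true
}
```

Note also that c.Visit returns an error, and the code in the question discards it. Checking that return value (for example with log.Fatal) would likely have surfaced the rejected request directly.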
