Function that returns Reader to http.Response

This is a stripped-down version of the code I want to use for a page-specific web crawler. The idea is to have a function that takes a URL, deals with the HTTP details, and returns a Reader for the body of the http.Response:

package main

import (
    "io"
    "log"
    "net/http"
    "os"
)

func main() {
    const url = "https://xkcd.com/"
    r, err := getPageContent(url)
    if err != nil {
        log.Fatal(err)
    }
    f, err := os.Create("out.html")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    io.Copy(f, r)
}

func getPageContent(url string) (io.Reader, error) {
    res, err := http.Get(url)
    if err != nil {
        return nil, err
    }
    return res.Body, nil
}

The response body is never closed, which is bad. Closing it inside getPageContent won't work, of course, because io.Copy would then have nothing left to read from.

My question is rather of general interest than for the specific use case: How can I use functions to abstract the gathering of external resources without having to store the whole resource in a temporary buffer? Or should I better avoid such abstractions?

As the user leaf bebop pointed out in the comment section, the function getPageContent should return an io.ReadCloser instead of just an io.Reader:

package main

import (
    "io"
    "log"
    "net/http"
    "os"
)

func main() {
    const url = "https://xkcd.com/"
    r, err := getPageContent(url)
    if err != nil {
        log.Fatal(err)
    }
    defer r.Close()
    f, err := os.Create("out.html")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    io.Copy(f, r)
}

func getPageContent(url string) (io.ReadCloser, error) {
    res, err := http.Get(url)
    if err != nil {
        return nil, err
    }
    return res.Body, nil
}

Another solution is to return the *http.Response directly and close its body in the main function. This also lets you check the response StatusCode and other fields if new requirements come up. Here is the updated code:

package main

import (
    "io"
    "log"
    "net/http"
    "os"
)

func main() {
    const url = "https://xkcd.com/"
    r, err := getPageContent(url)
    if err != nil {
        log.Fatal(err)
    }
    defer r.Body.Close()

    if r.StatusCode != http.StatusOK {
        // some operations
    }
    f, err := os.Create("out.html")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()
    io.Copy(f, r.Body)
}

func getPageContent(url string) (*http.Response, error) {
    res, err := http.Get(url)
    if err != nil {
        return nil, err
    }
    return res, nil
}
