
Stream HTTP content but skip downloading some lines at all in Python

Edit: This is partially solved. The exact implementation details are not figured out yet, but the answer is to use HTTP range headers, as in Ezequiel's comment.
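Concretely, a ranged GET with `requests` might look like the sketch below. The helper names and the 206 check are my own additions, and the byte offsets are placeholders that would come from the file's inventory:

```python
import requests

def range_header(start, end):
    # HTTP Range header asking for bytes start..end inclusive
    return {'Range': f'bytes={start}-{end}'}

def fetch_byte_range(url, start, end):
    """Download only bytes [start, end] of a remote file."""
    resp = requests.get(url, headers=range_header(start, end))
    resp.raise_for_status()
    # A server that honours the header replies 206 Partial Content;
    # a plain 200 means it ignored the header and sent the whole file.
    if resp.status_code != 206:
        raise RuntimeError('server ignored the Range header')
    return resp.content
```

Only the requested bytes cross the wire, which is exactly the "skip downloading" behavior the question asks for.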

In case my explanation is not clear enough, I am trying to replicate the procedure here: https://www.cpc.ncep.noaa.gov/products/wesley/fast_downloading_grib.html in Python.

Edit: From a friend's kind advice, I've figured out part of the solution. I just need to grab a specific byte range using my GET request; that's all that NOAA's Perl scripts are doing.

I'm attempting to download only a few fields from a "GRIB" file, an array-like format that the National Weather Service uses. It is at a specific HTTPS URL, e.g. https://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gfs.20201209/00/gfs.t00z.pgrb2.0p25.f000 . But very specifically, I need to download only the lines that are relevant to me, e.g. lines 5, 10, and 30. I'd like to avoid downloading the content of the other lines at all, but I'm not sure about the low-level behavior of the requests library here (or of a suitable alternative).
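Per the NOAA page linked above, the byte offsets of those records come from the `.idx` inventory published alongside each GRIB2 file (same URL with `.idx` appended), one record per line in the form `number:start_byte:date:variable:level:...`. Picking out byte ranges is plain text processing; a sketch, where the function name is mine and the sample inventory below uses fabricated offsets:

```python
def idx_byte_ranges(idx_text, wanted):
    """Map each wanted record number to its (start, end) byte range.

    The end of a record is the start of the next one minus 1; the last
    record's end is unknown, so None stands for 'to end of file'.
    """
    starts = {}
    order = []
    for line in idx_text.strip().splitlines():
        fields = line.split(':')
        num, start = int(fields[0]), int(fields[1])
        starts[num] = start
        order.append(num)
    ranges = {}
    for num in wanted:
        i = order.index(num)
        end = starts[order[i + 1]] - 1 if i + 1 < len(order) else None
        ranges[num] = (starts[num], end)
    return ranges

# Sample inventory in the real .idx layout, with made-up offsets:
idx = """1:0:d=2020120900:PRMSL:mean sea level:anl:
2:1000:d=2020120900:TMP:500 mb:anl:
3:2500:d=2020120900:UGRD:10 m above ground:anl:
"""
```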

This should be the code:

import requests

req = requests.get('https://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gfs.20201209/00/gfs.t00z.pgrb2.0p25.f000', stream=True)
lines = req.iter_lines()   # iterator over the response, line by line
next(lines)                # skip a line (its bytes are still downloaded)
x2 = next(lines)           # keep the next line