簡體   English   中英

Python中的正則表達式不起作用

[英]Regular expression in Python not working

我正在研究可以從Flickr等網站下載圖像的Python腳本。 我使用Flickr API拉出我要下載的圖像的各種尺寸,並標識原始尺寸的URL。 好吧,這就是我想要做的。 到目前為止,這是我的代碼...

URL = {a Flickr link}

flickr = re.match(r".*flickr\.com\/photos\/([^\/]+)\/([0-9^\/]+)\/", URL)
URL = "https://api.flickr.com/services/rest/?method=flickr.photos.getSizes&api_key=6002c84e96ff95c1a861eafafa4284ba&photo_id=" + flickr.group(2) + "&format=json&nojsoncallback=1"

request = requests.get(URL)
result = request.text

parsed = re.match(r".\"Original\".*\"source\"\: \"([^\"]+)", result)
URL = parsed.group(1)

在整個代碼中使用print()語句,我知道第一個正則表達式(用於解析原始Flickr URL以標識照片ID)正常運行,並且API請求正常運行,並返回以下結果(使用示例URL https) ://www.flickr.com/photos/matbellphotography/33413612735/sizes/h/ )...

{ "sizes": { "canblog": 0, "canprint": 0, "candownload": 1, 
"size": [
  { "label": "Square", "width": 75, "height": 75, "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_s.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/sq\/", "media": "photo" },
  { "label": "Large Square", "width": "150", "height": "150", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_q.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/q\/", "media": "photo" },
  { "label": "Thumbnail", "width": 100, "height": 67, "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_t.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/t\/", "media": "photo" },
  { "label": "Small", "width": "240", "height": "160", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_m.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/s\/", "media": "photo" },
  { "label": "Small 320", "width": "320", "height": "213", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_n.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/n\/", "media": "photo" },
  { "label": "Medium", "width": "500", "height": "333", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/m\/", "media": "photo" },
  { "label": "Medium 640", "width": "640", "height": "427", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_z.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/z\/", "media": "photo" },
  { "label": "Medium 800", "width": "800", "height": "534", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_c.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/c\/", "media": "photo" },
  { "label": "Large", "width": "1024", "height": "683", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_b.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/l\/", "media": "photo" },
  { "label": "Large 1600", "width": "1600", "height": "1067", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_4d92e2f70d_h.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/h\/", "media": "photo" },
  { "label": "Large 2048", "width": "2048", "height": "1365", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_81441ed1da_k.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/k\/", "media": "photo" },
  { "label": "Original", "width": "5760", "height": "3840", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_34cbc172c1_o.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/o\/", "media": "photo" }
] }, "stat": "ok" }

我的代碼顯然在那之后崩潰了。 第二個正則表達式旨在標識原始文件大小的圖像的下載URL,顯然找不到任何匹配項。 根據另一個print()聲明...

parsed.group(1) = none

我使用RegExr設置了表達式,該表達式從JSON結果中確定了我所需要的。 我做錯了什么?

也許您的requests.Response對象具有一個可以直接訪問的json屬性。 如果沒有,只需import json ,解析您的request.content並使用返回的字典。 例:

>>> import json
>>> json_response = """
... { "sizes": { "canblog": 0, "canprint": 0, "candownload": 1, 
... "size": [
...   { "label": "Square", "width": 75, "height": 75, "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_s.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/sq\/", "media": "photo" },
...   { "label": "Large Square", "width": "150", "height": "150", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_q.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/q\/", "media": "photo" },
...   { "label": "Thumbnail", "width": 100, "height": 67, "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_t.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/t\/", "media": "photo" },
...   { "label": "Small", "width": "240", "height": "160", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_m.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/s\/", "media": "photo" },
...   { "label": "Small 320", "width": "320", "height": "213", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_n.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/n\/", "media": "photo" },
...   { "label": "Medium", "width": "500", "height": "333", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/m\/", "media": "photo" },
...   { "label": "Medium 640", "width": "640", "height": "427", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_z.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/z\/", "media": "photo" },
...   { "label": "Medium 800", "width": "800", "height": "534", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_c.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/c\/", "media": "photo" },
...   { "label": "Large", "width": "1024", "height": "683", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_645397d6a5_b.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/l\/", "media": "photo" },
...   { "label": "Large 1600", "width": "1600", "height": "1067", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_4d92e2f70d_h.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/h\/", "media": "photo" },
...   { "label": "Large 2048", "width": "2048", "height": "1365", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_81441ed1da_k.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/k\/", "media": "photo" },
...   { "label": "Original", "width": "5760", "height": "3840", "source": "https:\/\/farm3.staticflickr.com\/2855\/33413612735_34cbc172c1_o.jpg", "url": "https:\/\/www.flickr.com\/photos\/matbellphotography\/33413612735\/sizes\/o\/", "media": "photo" }
... ] }, "stat": "ok" }"""
>>> 
>>> json_parsed = json.loads(json_response)
>>> for img in json_parsed["sizes"]["size"]:
...     print img.get("source")
... 
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_s.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_q.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_t.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_m.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_n.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_z.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_c.jpg
https://farm3.staticflickr.com/2855/33413612735_645397d6a5_b.jpg
https://farm3.staticflickr.com/2855/33413612735_4d92e2f70d_h.jpg
https://farm3.staticflickr.com/2855/33413612735_81441ed1da_k.jpg
https://farm3.staticflickr.com/2855/33413612735_34cbc172c1_o.jpg
>>> 

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM