
Extract URL in JSON String with Python using re.match() or split()

With my Python code, I extracted a specific part of a JSON file (a list within a list, or part of a dictionary):

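import json

# Load the JSON file; the context manager closes it afterwards
with open('json-test-file-for-insta-url-snippet.json') as f:
    data = json.load(f)

# Print the text of the first attachment, re-encoded as a JSON string
print(json.dumps(data["event"]["attachments"][0]["text"]))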

I got this result:

"\u201cUNLIMITED LIVE\u201d world tour moved to 2021!\nDue to the Covid-19 pandemic and the subsequent regulations and concert restrictions, the world tour, originally planned for the autumn of 2020, could not take place. \n\"\u201eI was very much looking forward to our tour in autumn 2020 all over the world, so I\u2019m deeply sorry that these concerts had to be rescheduled due to the Covid-19 pandemic. I\u2019m very happy that we have already found new dates for our tour in autumn 2021, because I cannot wait to return to get back on stage and to play for you guys. Take care of yourselves \u2013 I hope to see you all happy and healthy again very, very soon!\u201d \nAll your tickets remain valid for the new dates! Please find them below: \n\nKAZ Almaty - Sep 11, 2021\nRUS Yekaterinburg - Sep 14, 2021\nRUS Kazan, Sep 16, 2021\nRUS Voronezh - Sep 18, 2021\nRUS Krasnodar - Sep 20, 2021\nRUS Moscow - Sep 22, 2021\nRUS St. Petersburg - Sep 24, 2021\nUKR Kharkiv - Sep 26 2021\nUKR Odessa - Sep 28, 2021\nUKR Kiev - Sep 30, 2021\nITA Bolzano - Oct 13, 2021\nITA Bologna - Oct 15, 2021\nITA Genoa - Oct 16, 2021\nITA Milano - Oct 17, 2021\nITA Conegliano Veneto - Oct 19, 2021\nBG Sofia - Oct 24, 2021\nRO Bucharest - Oct 26, 2021\nRO Cluj - Oct 29, 2021  #davidgarrett #tour2021 #unlimited #live #postponed\n*Score* -2.57x | *Likes* 338 (-830) | *Comments* 13 (-46)\n_Posted on Tuesday, August 18 at 9:59 AM CEST <https://www.instagram.com/p/CEBew-xHwhJ/|(Instagram)>_\n_Received via Viral Alert_"

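Now I want to extract the Instagram URL at the end. How can I do that in Python? Is a regular expression the only option, or is there a smarter way? I have read a lot on Stack Overflow, but nothing has worked for me. Please help!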
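One possible approach, offered as a minimal sketch rather than a definitive answer: in the sample above the URL sits inside what looks like Slack-style link markup, `<url|label>`. Assuming that format holds for every message, either a regular expression or plain string splitting can pull it out. Note that `re.match()` only matches at the very start of a string, so `re.search()` is the call to use here. The names `SLACK_LINK`, `extract_url_regex`, and `extract_url_split` are hypothetical, and `text` is a shortened stand-in for the real attachment text.

```python
import re

# Shortened stand-in for data["event"]["attachments"][0]["text"].
# Assumption: the URL always appears as <url|label>, as in the sample output.
text = (
    "RO Cluj - Oct 29, 2021  #davidgarrett #tour2021\n"
    "_Posted on Tuesday, August 18 at 9:59 AM CEST "
    "<https://www.instagram.com/p/CEBew-xHwhJ/|(Instagram)>_\n"
    "_Received via Viral Alert_"
)

# Hypothetical pattern: "<", then an http(s) URL captured up to "|", ">" or
# whitespace, then an optional "|label" part, then ">".
SLACK_LINK = re.compile(r"<(https?://[^|>\s]+)(?:\|[^>]*)?>")


def extract_url_regex(text):
    """Return the first <url|label> URL found anywhere in text, or None."""
    match = SLACK_LINK.search(text)  # search() scans the whole string
    return match.group(1) if match else None


def extract_url_split(text):
    """Same result using only str.find() and str.split()."""
    start = text.find("<http")
    if start == -1:
        return None
    rest = text[start + 1:]  # drop the leading "<"
    # Cut at the closing ">", then drop the "|label" part if present.
    return rest.split(">", 1)[0].split("|", 1)[0]


print(extract_url_regex(text))
print(extract_url_split(text))
```

Either function can be applied directly to `data["event"]["attachments"][0]["text"]`. That value is already a Python string, so there is no need to pass it through `json.dumps()` first.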
