How do I remove part of a URL using regex in Python?
I have a list of URLs that look like this:
'https://www.superpopgadget.com/collections/best-sellers/products/sushi-roll-bazooka?Ffbclid=IwAR3WfVizYJF0RCP2AsSoulLjJK2_OUwQZ0Y1eep_b3Einm1XNJbcF_K3wYI'
I want to trim each one down to just: 'https://www.superpopgadget.com/collections/best-sellers/products/sushi-roll-bazooka'
Not sure if there is a more efficient method, but this might work fine:
(.+)\?(.+)
It matches everything before the character ? in the first group, and the second group is everything after it. What you need is the first group.
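For illustration, here is a minimal sketch of how that pattern could be applied to a list in Python. The helper name strip_query and the urls list are made up for this example, not something from the question. One thing to be aware of: because the first group is greedy, a URL containing more than one ? would be split at the last one.

```python
import re

# Pattern from the answer: group 1 is the text before the "?",
# group 2 is the text after it.
PATTERN = re.compile(r"(.+)\?(.+)")

def strip_query(url):
    """Hypothetical helper: return the part of url before the "?"."""
    url = url.strip()  # tolerate stray whitespace around the URL
    match = PATTERN.match(url)
    # If there is no "?" (or nothing after it), leave the URL unchanged.
    return match.group(1) if match else url

urls = [
    "https://www.superpopgadget.com/collections/best-sellers/products/sushi-roll-bazooka?Ffbclid=IwAR3WfVizYJF0RCP2AsSoulLjJK2_OUwQZ0Y1eep_b3Einm1XNJbcF_K3wYI",
]
cleaned = [strip_query(u) for u in urls]
print(cleaned)
```

Falling back to the original string when nothing matches means URLs that have no query part pass through untouched instead of raising an error.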