Using regular expressions in Python to find multiple things in a string

Question

My input string contains various entities, like this: conn_type://host:port/schema#login#password

I want to find all of them using regex in Python.

As of now, I am able to find them one by one, like this:

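import re

test_string = 'conn_type://host:port/schema#login#password'
# [a-zA-Z_]+ includes the underscore so that all of 'conn_type' is matched
conn_type = re.search(r'[a-zA-Z_]+', test_string)
if conn_type:
    print("conn_type:", conn_type.group())
    next_substr_len = conn_type.end()
    # continue searching from where the previous match ended
    host = re.search(r'[^:/]+', test_string[next_substr_len:])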

and so on.

Is there a way to do it without if and else? I expect there is some way, but I have not been able to find it. Please note that the regex for each entity is different.

Please help, I don't want to write boring code.

Answer 1

Why don't you use re.findall?

Here is an example:

import re

s = 'conn_type://host:port/schema#login#password asldasldasldasdasdwawwda conn_type://host:port/schema#login#email'

def get_all_matches(s):
    # findall returns every non-overlapping match of the pattern as a list of strings
    return re.findall(r'[a-zA-Z]+_[a-zA-Z]+:/+[a-zA-Z]+:+[a-zA-Z]+/+[a-zA-Z]+#+[a-zA-Z]+#[a-zA-Z]+', s)

print(get_all_matches(s))

This returns a list of every match of the regex in the string, which in this case is:

['conn_type://host:port/schema#login#password', 'conn_type://host:port/schema#login#email']

If you need help building regex patterns in Python, I would recommend the following website:

A pretty neat online regex tester

Also check the re module's documentation for more on re.findall:

Documentation for re.findall

Hope this helps!

Answer 2

If you like it DIY, consider creating a tokenizer. This is a very elegant, Pythonic solution.
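A minimal sketch of what such a tokenizer could look like, assuming only two kinds of token are needed: separators and names. The token names (SEP, NAME) and the helper tokenize are made up for illustration and are not from the answer.

```python
import re

# Hypothetical token kinds; each pair is (kind, regex for that kind).
TOKEN_SPEC = [
    ('SEP', r'://|[:/#]'),   # separators between the entities
    ('NAME', r'\w+'),        # the entities themselves
]
# Combine the pairs into one pattern with a named group per token kind.
TOKEN_RE = re.compile('|'.join('(?P<%s>%s)' % pair for pair in TOKEN_SPEC))

def tokenize(text):
    """Yield (kind, value) pairs; characters matching no token are skipped."""
    for m in TOKEN_RE.finditer(text):
        # lastgroup is the name of the group that produced this match
        yield m.lastgroup, m.group()

tokens = list(tokenize('conn_type://host:port/schema#login#password'))
names = [value for kind, value in tokens if kind == 'NAME']
print(names)
```

Because each token kind has its own regex, adding a stricter pattern for one entity (for example digits only for the port) means adding one more pair to TOKEN_SPEC rather than another if branch.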

Or use the standard library: https://docs.python.org/3/library/urllib.parse.html. Note, however, that your sample URL is not fully valid: there is no scheme 'conn_type', and the string contains two anchors (#), so urlparse wouldn't work as expected. For real-life URLs, though, I highly recommend this approach.
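For a well-formed URL, the standard-library route might look like the sketch below. The URL is a made-up example (not the asker's format), with the login and password moved into the usual user:password@host position.

```python
from urllib.parse import urlparse

# Hypothetical, well-formed URL used only for illustration.
url = 'postgres://login:password@db.example.com:5432/schema'
parts = urlparse(url)

# Each component is available as an attribute; port is converted to int.
print(parts.scheme, parts.hostname, parts.port, parts.path)
print(parts.username, parts.password)
```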

Answer 3

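>>> import re
>>> uri = "conn_type://host:port/schema#login#password"
>>> # [A-Za-z0-9] rather than [A-z0-9]: the range A-z also covers punctuation such as [ \ ] ^ _ `
>>> res = re.findall(r'(\w+)://(.*?):([A-Za-z0-9]+)/(\w+)#(\w+)#(\w+)', uri)
>>> res
[('conn_type', 'host', 'port', 'schema', 'login', 'password')]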

No need for ifs. Use findall or finditer to search through your collection of connection types. Filter the list of tuples as needed.
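As a variation on the same idea, named groups with finditer give one dictionary per connection string instead of a positional tuple. This is only a sketch: the group names and the helper parse_connections are chosen here for illustration, and the host pattern is tightened to [^:/\s]+ as an assumption about what a host may contain.

```python
import re

# Hypothetical group names mirroring the entities in the question.
URI_RE = re.compile(
    r'(?P<conn_type>\w+)://(?P<host>[^:/\s]+):(?P<port>\w+)'
    r'/(?P<schema>\w+)#(?P<login>\w+)#(?P<password>\w+)'
)

def parse_connections(text):
    """Return one dict per connection string found in text."""
    return [m.groupdict() for m in URI_RE.finditer(text)]

text = ('conn_type://host:port/schema#login#password junk '
        'conn_type://host:port/schema#login#email')
for conn in parse_connections(text):
    # Fields are looked up by name rather than by tuple position.
    print(conn['conn_type'], conn['host'], conn['password'])
```

Filtering then becomes an ordinary comprehension over the dicts, for example keeping only entries whose conn_type has a particular value.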

3 answers

Answer 1
2 2017-02-02 06:35:56

Answer 2
1 2017-02-02 06:16:01

Answer 3
1 Accepted 2017-02-02 06:59:07
