Exclude a pattern using regex in Python
I want to extract the name and number from a given string and save them into two lists.
str = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs .'
I want to achieve:
name = ['Dhoni','Kohli','Rohit','Dhawan']
values = ['100','150','50','250']
I tried to use a negative lookahead but did not succeed. My approach is to match a word, then a number, then a word again. Maybe this approach is wrong. How can this be achieved?
What I tried:
pattern = r'^[A-Za-z]+\s(?!)[a-z]'
print(re.findall(pattern,str))
You might use 2 capturing groups instead:
\b([A-Z][a-z]+)\s+scored\s+(\d+)\b
import re
pattern = r"\b([A-Z][a-z]+)\s+scored\s+(\d+)\b"
# Named text rather than str, to avoid shadowing the built-in str type
text = "Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs ."
name = []
values = []
# Group 1 is the name, group 2 is the number that follows "scored"
for match in re.finditer(pattern, text):
    name.append(match.group(1))
    values.append(match.group(2))
print(name)
print(values)
Output
['Dhoni', 'Kohli', 'Rohit', 'Dhawan']
['100', '150', '50', '250']
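On the negative lookahead mentioned in the question: the empty lookahead `(?!)` in the attempted pattern can never succeed, so that pattern cannot match anything. A lookahead can still do the job if it is used to skip known filler words. The following is only a sketch, assuming that `scored`, `runs` and `and` are the only non-name words in the input. The `FILLER` tuple and the variable names are illustrative and not part of the original answer.

```python
import re

text = "Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs ."

# Assumption: these are the only alphabetic words that are not names.
FILLER = ("scored", "runs", "and")

# (?!(?:scored|runs|and)\b) rejects a word that is exactly one of the filler words.
name_pattern = r"\b(?!(?:" + "|".join(FILLER) + r")\b)[A-Za-z]+\b"

name = re.findall(name_pattern, text)   # every word that is not filler
values = re.findall(r"\d+", text)       # every run of digits

print(name)
print(values)
```

This only works while the filler list is complete, so the capturing-group pattern above is probably the more robust choice for free-form text.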
The pattern seems to be name scored value.
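>>> import re
>>> s = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs .'
>>> res = re.findall(r'(\w+)\s*scored\s*(\d+)', s)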
>>> names, values = zip(*res)
>>> names
('Dhoni', 'Kohli', 'Rohit', 'Dhawan')
>>> values
('100', '150', '50', '250')
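Note that `zip(*res)` yields tuples, and the unpacking raises a `ValueError` when the pattern finds no matches at all. If lists are needed, as in the question, something along these lines may help. It is a minimal sketch, and the function name `extract_scores` is made up for illustration.

```python
import re

def extract_scores(text):
    """Return (names, values) as two lists; both are empty if nothing matches."""
    # Each element of pairs is a (name, number) tuple from the two capture groups.
    pairs = re.findall(r'(\w+)\s*scored\s*(\d+)', text)
    names = [name for name, _ in pairs]
    values = [value for _, value in pairs]
    return names, values

s = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs .'
names, values = extract_scores(s)
print(names)
print(values)
```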
This code extracts the **Name** and **Number** from a given string into two lists, then stores them in a dictionary as key-value pairs.
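import re
x = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs.'
# Capitalized words are taken as names, runs of digits as values
names = re.findall(r'[A-Z][a-z]*', x)
values = re.findall(r'[0-9]+', x)
# Pair each name with the value at the same position
dicts = {}
for i in range(len(names)):
    dicts[names[i]] = values[i]
print(dicts)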
#Input: Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs.
#Output: {'Dhoni': '100', 'Kohli': '150', 'Rohit': '50', 'Dhawan': '250'}
#Input: A has 5000 rupees and B has 15000 rupees.C has 85000 rupees and D has 50000 rupees .
#Output: {'A': '5000', 'B': '15000', 'C': '85000', 'D': '50000'}
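One caveat with running two independent `findall` calls: the pairing relies on names and numbers lining up by position, so any other capitalized word in the text shifts the pairs or raises an `IndexError`. A possible alternative is to capture each pair in a single pattern. The sketch below assumes every entry has the shape "Name, one word, number" (for example `Dhoni scored 100` or `A has 5000`); the function name `pairs_to_dict` is hypothetical.

```python
import re

def pairs_to_dict(text):
    # Assumption: "<Name> <one word> <number>", e.g. "Dhoni scored 100" or "A has 5000".
    # Group 1 is the capitalized name, group 2 is the number.
    pairs = re.findall(r"\b([A-Z][a-z]*)\s+\w+\s+(\d+)", text)
    # dict() accepts an iterable of (key, value) tuples directly.
    return dict(pairs)

# "Then" is capitalized but is not followed by "<word> <number>", so it is skipped.
print(pairs_to_dict("Dhoni scored 100 runs. Then Kohli scored 150 runs."))
```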