Match regex with unordered string of alphabets and numbers
I have product names for which I have to find the model numbers. For example:
KIPOR KDE38SS3 DIESEL 400V AGGREGAATTI # Result --> KDE38SS3
KIPOR KDE28SS3 DIESEL 400V AGGREGAATTI # Result --> KDE28SS3
KIPOR KDE19STA3 19 KW GENERAATTORI 400V # Result --> KDE19STA3
KRÄNZLE C895-1 KUUMAVESIPESURI KELALLA # Result --> C895-1
KRÄNZLE 1165-1 KUUMAVESIPESURI KELALLA # Result --> 1165-1
NILFISK MH 4M-200/960 FA KUUMAVESIPESURI # Result --> MH 4M-200/960 FA
WALLIUS LMP-452i MIG HITSAUSKONE # Result --> LMP-452i
KRÄNZLE C15/150 KUUMAVESIPESURI KELALLA # Result --> C15/150
My current code is simple and works in some cases, but I would like a more efficient approach.
import re

# productnames is the list of product name strings shown above
for i in range(10):
    modelnum = re.findall(r'\w+\d+\w+', productnames[i])
    print(modelnum)
Results:
['KDE38SS3', '400V']
['KDE28SS3', '400V']
['KDE19STA3', '400V']
['C895']
['1165']
['200', '960']
['452i']
['C15', '150']
Is there a way to extract only the model number? The results also include 400V, which is not a model number, and one model number is split into two elements.
If you don't mind using a capturing group, and the model number is always the first match in the line, then you could do something like this:
for i in range(10):
    modelnum = re.findall(r'^.*?(\w+\d+\w+)', productnames[i])
    print(modelnum)
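Note that `\w` does not match `-` or `/`, so a pattern built only from `\w` and `\d` still stops at those characters (for example, it yields `C895` rather than `C895-1`). One possible variant of the same "take the first match" idea is sketched below. It assumes the model number is the first whitespace-delimited token that contains a digit; the pattern `\S*\d\S*` and the helper name `first_model_token` are illustrative and not part of the original answer.

```python
import re

# Sample product names copied from the question.
productnames = [
    "KIPOR KDE38SS3 DIESEL 400V AGGREGAATTI",
    "KIPOR KDE28SS3 DIESEL 400V AGGREGAATTI",
    "KIPOR KDE19STA3 19 KW GENERAATTORI 400V",
    "KRÄNZLE C895-1 KUUMAVESIPESURI KELALLA",
    "KRÄNZLE 1165-1 KUUMAVESIPESURI KELALLA",
    "NILFISK MH 4M-200/960 FA KUUMAVESIPESURI",
    "WALLIUS LMP-452i MIG HITSAUSKONE",
    "KRÄNZLE C15/150 KUUMAVESIPESURI KELALLA",
]

# Assumption: the model number is the first run of non-whitespace
# characters that contains at least one digit.
FIRST_TOKEN_WITH_DIGIT = re.compile(r'\S*\d\S*')


def first_model_token(name):
    """Return the first whitespace-delimited token containing a digit, or None."""
    match = FIRST_TOKEN_WITH_DIGIT.search(name)
    return match.group(0) if match else None


for name in productnames:
    print(first_model_token(name))
```

Because `re.search` returns only the leftmost match, later tokens such as `400V` are ignored. This sketch cannot recover model numbers that contain spaces: for the NILFISK line it returns `4M-200/960`, not `MH 4M-200/960 FA`. Handling that case would likely need brand-specific rules or a lookup list.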