
Python matching some characters into a string

I'm trying to extract/match data from a string using a regular expression, but I can't get it to work.

I want to extract i386 (the text between the last - and .iso) from the following string:

/xubuntu/daily/current/lucid-alternate-i386.iso

This should also work for:

/xubuntu/daily/current/lucid-alternate-amd64.iso

The result should be either i386 or amd64, depending on the case.

Thanks a lot for your help.

You could also use split in this case (instead of a regex):

>>> s = "/xubuntu/daily/current/lucid-alternate-i386.iso"
>>> s.split(".iso")[0].split("-")[-1]
'i386'

split gives you a list of the pieces the string was split into. You can then use Python's indexing syntax ([0] for the first piece, [-1] for the last) to get the appropriate parts.
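The same idea can be wrapped in a small helper. This is only a sketch: the name arch_from_path is made up for illustration, and it assumes the file name ends in .iso and contains at least one dash. It uses rsplit with a maximum of one split, so only the last dash matters.

```python
def arch_from_path(path):
    """Return the text between the last '-' and a trailing '.iso'."""
    # Hypothetical helper, not part of the original answer.
    # Drop the '.iso' suffix if it is present.
    stem = path[:-len(".iso")] if path.endswith(".iso") else path
    # rsplit with maxsplit=1 splits on the last '-' only.
    return stem.rsplit("-", 1)[-1]


print(arch_from_path("/xubuntu/daily/current/lucid-alternate-i386.iso"))
print(arch_from_path("/xubuntu/daily/current/lucid-alternate-amd64.iso"))
```

Unlike split(".iso"), checking endswith avoids cutting the string at an earlier ".iso" that might appear inside a directory name.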

r"/([^-]*)\.iso/"

The bit you want will be in the first capture group.

First off, let's make our lives simpler and get only the file name.

>>> import os
>>> os.path.split("/xubuntu/daily/current/lucid-alternate-i386.iso")
('/xubuntu/daily/current', 'lucid-alternate-i386.iso')

Now it's just a matter of catching all the characters between the last dash and the '.iso'.
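One way to finish that step might look like the following. It is a sketch rather than the original answerer's code: the pattern and the variable names filename and arch are assumptions chosen for illustration.

```python
import os
import re

path = "/xubuntu/daily/current/lucid-alternate-i386.iso"
filename = os.path.basename(path)  # same as os.path.split(path)[1]

# Capture the run of non-dash characters that sits right before
# '.iso' at the very end of the file name.
match = re.search(r"-([^-]+)\.iso$", filename)
arch = match.group(1) if match else None
print(arch)
```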

If you will be matching several of these lines, it is more efficient to call re.compile() once and save the resulting regular expression object for reuse.

import re

s1 = "/xubuntu/daily/current/lucid-alternate-i386.iso"
s2 = "/xubuntu/daily/current/lucid-alternate-amd64.iso"

# Greedy '.+' runs up to the last dash; the group stops at the last dot.
pattern = re.compile(r'^.+-(.+)\..+$')

m = pattern.match(s1)
m.group(1)  # 'i386'

m = pattern.match(s2)
m.group(1)  # 'amd64'
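To illustrate the reuse, the compiled object can be applied to a whole list of paths. This is a minimal sketch: the list paths and its third entry are hypothetical, added only to show what happens when a line does not match.

```python
import re

# Compile once, then reuse the same object for every line.
pattern = re.compile(r'^.+-(.+)\..+$')

paths = [
    "/xubuntu/daily/current/lucid-alternate-i386.iso",
    "/xubuntu/daily/current/lucid-alternate-amd64.iso",
    "/xubuntu/daily/current/README",  # hypothetical line that does not match
]

archs = []
for p in paths:
    m = pattern.match(p)
    if m:  # match() returns None when the line does not fit the pattern
        archs.append(m.group(1))

print(archs)
```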

The expression should be written without the leading and trailing slashes.

import re

line = '/xubuntu/daily/current/lucid-alternate-i386.iso'
rex = re.compile(r"([^-]*)\.iso")
m = rex.search(line)
print(m.group(1))

This prints i386.
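The reason the slashes matter is that Python patterns are plain strings with no delimiters, so a / inside the pattern is matched literally. A short comparison, with variable names chosen here for illustration:

```python
import re

line = "/xubuntu/daily/current/lucid-alternate-i386.iso"

# With slashes: the pattern needs a literal '/' right after '.iso',
# which this string does not have, so there is no match.
with_slashes = re.search(r"/([^-]*)\.iso/", line)

# Without slashes: the group captures the dash-free run before '.iso'.
without_slashes = re.search(r"([^-]*)\.iso", line)

print(with_slashes)
print(without_slashes.group(1))
```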

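import re

subject = "/xubuntu/daily/current/lucid-alternate-i386.iso"

# \w+ matches letters, digits and underscores, so it stops at the dash.
reobj = re.compile(r"(\w+)\.iso$")
match = reobj.search(subject)
if match:
    result = match.group(1)
else:
    result = ""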

Here subject contains the file name and path.

>>> import os
>>> path = "/xubuntu/daily/current/lucid-alternate-i386.iso"
>>> file, ext = os.path.splitext(os.path.split(path)[1])
>>> processor = file[file.rfind("-") + 1:]
>>> processor
'i386'
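On Python 3, the standard pathlib module offers a similar route. The following is an alternative sketch, not the answer's own method; it assumes POSIX-style paths as in the question.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/xubuntu/daily/current/lucid-alternate-i386.iso")

# .stem is the final path component without its last suffix,
# here 'lucid-alternate-i386'.
# rpartition returns (head, separator, tail) split at the last '-'.
processor = path.stem.rpartition("-")[2]
print(processor)
```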
