Python: if string starts with a string from a list
I'm reading a file in which each line has a tag, followed by a colon and then the information that I want. A sample file would look like
Package: com.something.something
Section: Utilities
Name: Something
etc. (It's an apt packages index, if you're wondering.)
So I want to loop through each line and see if that line starts with an element from a list. I was thinking something like
PkgInfo={}
Tags=['Package', 'Section', 'Name']
for line in reader.readlines()
    if line.startswith(element in Tags):
        PkgInfo[element]=line.split(': ')[1]
This code doesn't work, but hopefully you understand what I am trying to do. How would I go about accomplishing this?
Working solution with slightly different logic:
PkgInfo = {}
Tags = ['Package', 'Section', 'Name']
for line in reader.readlines():
    # maxsplit=1 splits at the first ': ' only, so values may contain ': '
    entry = line.strip().split(': ', 1)
    if len(entry) != 2:
        continue
    element, value = entry
    if element in Tags:
        PkgInfo[element] = value
print(PkgInfo)
Also note that iterating over the elements was not the only problem: 'Package' in Tags was defined as 'Package: ', Tags was referenced in the loop as tags, split.line was written instead of line.split(), and the value wasn't stripped.
I would suggest you just split the line at : and then test whether the first part is one of your keywords. This can easily be done by using a set and the in operator:
tags = set(['Package', 'Section', 'Name'])
pkgInfo = {k: v.strip() for k, v in (line.split(':') for line in reader) if k in tags}
Or the longer version:
tags = set(['Package', 'Section', 'Name'])
pkgInfo = {}
for line in reader:
    k, v = line.split(':')
    if k in tags:
        pkgInfo[k] = v.strip()
But note that this will fail if there is not exactly one colon in each line.
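One way to relax the exactly-one-colon requirement is str.partition, which splits at the first separator only and never raises. The following is only a sketch: the parse_tags helper and the sample lines are made up for illustration, and any iterable of lines (such as the open file) could be passed in their place.

```python
def parse_tags(lines, tags):
    """Collect 'Tag: value' pairs whose tag is in `tags` (hypothetical helper)."""
    info = {}
    for line in lines:
        # partition splits at the FIRST colon only; sep is '' when no colon exists
        key, sep, value = line.partition(':')
        if sep and key in tags:
            info[key] = value.strip()
    return info


# Made-up sample lines standing in for the file contents
sample = [
    "Package: com.something.something\n",
    "Depiction: http://example.com/info\n",  # tag we do not want
    "Name: Some: thing\n",                   # colon inside the value
    "a line without any colon\n",
]
result = parse_tags(sample, {"Package", "Section", "Name"})
print(result)
```

Lines without a colon are skipped silently here; whether that is preferable to raising an error depends on how much you trust the input.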
Try this:
PkgInfo = {}
Tags = ['Package', 'Section', 'Name']
for line in reader.readlines():
    for element in Tags:
        if line.startswith(element):
            PkgInfo[element] = line.split(': ')[1]
            break
The problem with all solutions based on split() is that they will probably break if a colon appears more than once. This is less elegant but more robust:
PkgInfo = {}
Tags = ['Package', 'Section', 'Name']
splitter = ': '
splitLen = len(splitter)
for line in reader.readlines():
    firstColon = line.find(splitter)
    if firstColon > 0:
        key = line[:firstColon]
        if key in Tags:
            # store everything after the first ': '
            PkgInfo[key] = line[firstColon + splitLen:]
You need to iterate over Tags:
PkgInfo={}
Tags=['Package: ', 'Section', 'Name']
for line in reader.readlines():
    for tag in Tags:
        if line.startswith(tag):
            PkgInfo[tag]=line.split(': ')[1]
            break
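As a side note on the title question, str.startswith also accepts a tuple of prefixes, so the "starts with any of these" test needs no explicit inner loop. A minimal sketch, where the wanted helper and the sample lines are illustrative assumptions rather than part of the original code:

```python
tags = ('Package', 'Section', 'Name')


def wanted(line, prefixes=tags):
    """Return True if the line begins with any of the prefixes (hypothetical helper)."""
    # startswith takes a tuple (not a list) and tries each prefix in turn
    return line.startswith(prefixes)


# Made-up sample lines standing in for the file contents
lines = [
    "Package: com.something.something",
    "Depends: mobilesubstrate",
    "Name: Something",
]
kept = [line for line in lines if wanted(line)]
print(kept)
```

The tuple form only says that some prefix matched, not which one, so the key still has to be extracted separately. A bare prefix such as 'Name' would also match a line starting with 'Namespace', so including the colon in each prefix may be safer.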
I'd try something like this:
PkgInfo={}
#I assume it should be 'Package' not 'Package: '
Tags=['Package', 'Section', 'Name']
for line in reader.readlines():
    k, v = line.split(': ')
    if k in Tags:
        PkgInfo[k] = v
or an even quicker and dirtier two-liner:
#I assume it should be 'Package' not 'Package: '
Tags=['Package', 'Section', 'Name']
PkgInfo = dict(line.split(': ') for line in reader.readlines() if line.split(': ')[0] in Tags)
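The two-liner splits every line twice and keeps the trailing newline in each value. A possible variant that splits once is sketched below; the lines list is a made-up stand-in for reader, and the lowercase names are chosen only for this example.

```python
tags = {'Package', 'Section', 'Name'}

# Made-up sample lines standing in for the file contents
lines = [
    "Package: com.something.something\n",
    "Section: Utilities\n",
    "Size: 1234\n",
]

# Split each line once; maxsplit=1 keeps any later ': ' inside the value intact.
pairs = (line.rstrip('\n').split(': ', 1) for line in lines if ': ' in line)
pkg_info = {key: value for key, value in pairs if key in tags}
print(pkg_info)
```

The `': ' in line` guard skips lines that would otherwise fail to unpack into a key and a value.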