
Python: if string starts with a string from a list

I'm reading a file in which each line has a tag, followed by a colon and then the information that I want. A sample file would look like:

Package: com.something.something
Section: Utilities
Name: Something

etc. (It's an apt packages index, if you're wondering.) I want to loop through each line and see whether that line starts with an element from a list. I was thinking of something like:

PkgInfo={}
Tags=['Package', 'Section', 'Name']
for line in reader.readlines()
    if line.startswith(element in Tags):
        PkgInfo[element]=line.split(': ')[1]

This code doesn't work, but hopefully you understand what I am trying to do. How would I go about accomplishing this?

Working solution with slightly different logic:

PkgInfo={}
Tags=['Package', 'Section', 'Name']


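for line in reader.readlines():
    # Split at the first ': ' only, so a value containing ': ' stays intact
    entry = line.strip().split(': ', 1)
    if len(entry) != 2:
        continue  # no 'tag: value' structure on this line
    element, value = entry
    if element in Tags:
        PkgInfo[element] = value

print(PkgInfo)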

Note that iterating over the elements was not the only problem: 'Package' in Tags was defined as 'Package: ', Tags was referenced in the loop as tags, split.line was written instead of line.split(), and the value wasn't stripped.

I would suggest you just split the line at : and then test whether the first part is one of your keywords. This can easily be done using a set and the in operator:

tags = set(['Package', 'Section', 'Name'])
pkgInfo = {k: v.strip() for k, v in (line.split(':') for line in reader) if k in tags}

Or the longer version:

tags = set(['Package', 'Section', 'Name'])
pkgInfo = {}

for line in reader:
    k, v = line.split(':')
    if k in tags:
        pkgInfo[k] = v.strip()

But note that this will fail with a ValueError if a line does not contain exactly one colon.
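One way to tolerate lines with zero or several colons is str.partition, which splits only at the first occurrence of the separator and always returns three parts. The following is a minimal sketch, not a definitive implementation: the helper name parse_tags and the sample_lines list (standing in for the file) are made up for illustration.

```python
def parse_tags(lines, tags):
    """Collect 'Tag: value' pairs whose tag is in `tags`."""
    info = {}
    for line in lines:
        # partition() splits at the first ':' only; sep is '' if there is no colon
        key, sep, value = line.partition(':')
        if sep and key in tags:
            info[key] = value.strip()
    return info


# Hypothetical sample data standing in for the file contents.
sample_lines = [
    "Package: com.something.something\n",
    "Section: Utilities\n",
    "Name: Something\n",
    "Homepage: http://example.com/\n",  # unknown tag, more than one colon
    "a line without any colon\n",
]

pkg_info = parse_tags(sample_lines, {'Package', 'Section', 'Name'})
```

Lines whose tag is not in the set, or that contain no colon at all, are skipped instead of raising an error.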

Try this:

PkgInfo = {}
Tags = ['Package', 'Section', 'Name']

for line in reader.readlines():
    for element in Tags:
        if line.startswith(element):
            PkgInfo[element] = line.split(': ')[1]
            break
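As a side note, str.startswith also accepts a tuple of prefixes, so the "starts with any of these" test can be written without the inner loop. It does not report which prefix matched, so this sketch recovers the key with partition. The names prefixes and sample_lines are hypothetical, and appending ':' to each tag is an assumption made here so that 'Name' does not match a longer tag.

```python
tags = ['Package', 'Section', 'Name']
# Append ':' so that 'Name' does not match a tag such as 'Namespace'.
prefixes = tuple(tag + ':' for tag in tags)

# Hypothetical sample data standing in for the file contents.
sample_lines = [
    "Package: com.something.something\n",
    "Section: Utilities\n",
    "Namespace: ignored\n",
    "Name: Something\n",
]

pkg_info = {}
for line in sample_lines:
    # startswith() returns True if the line starts with any prefix in the tuple
    if line.startswith(prefixes):
        key, _, value = line.partition(':')
        pkg_info[key] = value.strip()
```

Note that the argument has to be a tuple; passing a list raises a TypeError.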

The problem with all solutions based on split() is that they will probably break if a colon appears more than once in a line. This is less elegant but more robust:

PkgInfo = {}
Tags = ['Package','Section','Name']
splitter = ': '
splitLen = len(splitter)
for line in reader.readlines():
  firstColon = line.find(splitter)
  if firstColon > 0: 
    key = line[:firstColon]
    if key in Tags:
      PkgInfo[key] = line[firstColon + splitLen:]
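For comparison, split can be limited to the first separator through its maxsplit argument, which gives the same "first colon only" behaviour as the find-based code. This is only a sketch; the helper name split_tag is hypothetical.

```python
def split_tag(line, splitter=': '):
    """Return (key, value) split at the first `splitter`, or None if absent."""
    # maxsplit=1 means at most one split, so later separators stay in the value
    parts = line.rstrip('\n').split(splitter, 1)
    if len(parts) != 2:
        return None
    return parts[0], parts[1]


with_extra_colon = split_tag("Description: a tool: fast\n")
without_separator = split_tag("no separator here\n")
```

Here with_extra_colon keeps the second ': ' inside the value, and without_separator is None.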

You need to iterate over Tags:

PkgInfo={}
Tags=['Package: ', 'Section', 'Name']
for line in reader.readlines():
    for tag in Tags:
        if line.startswith(tag):
            PkgInfo[tag]=line.split(': ')[1]
            break

I'd try something like this:

PkgInfo={}
#I assume it should be 'Package' not 'Package: '
Tags=['Package', 'Section', 'Name']

for line in reader.readlines():
    k, v = line.split(': ')
    if k in Tags:
        PkgInfo[k] = v

or an even quicker and dirtier two-liner:

#I assume it should be 'Package' not 'Package: '
Tags=['Package', 'Section', 'Name']

PkgInfo = dict(line.split(': ') for line in reader.readlines() if line.split(': ')[0] in Tags)

