How can you terminate a string after k consecutive numbers have been found?

Say I have a list of file names of the form *.1243.* , and I wish to obtain everything before these 4 digits. How do I do this efficiently?

An ugly, inefficient example of working code is:

names = []
for file in file_list:
    words = file.split('.')
    for i, word in enumerate(words):
        if word.isdigit():
            if int(word)>999 and int(word)<10000:
                names.append(' '.join(words[:i]))
                break
print(names)

Obviously, though, this is far from ideal, and I was wondering about better ways to do this.

You may want to use regular expressions for this.

import re

name = []
for file in file_list:
    m = re.match(r'^(.+?)\.\d{4}\.', file)
    if m:
        name.append(m.groups()[0])
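
As a self-contained sketch of the same idea, the pattern can be compiled once and wrapped in a small helper. The helper name prefix_before_digits and the sample file_list are made up for illustration; names without a four-digit component between dots are simply skipped, as in the loop above.

```python
import re

# Lazy prefix, then a dot, exactly four digits, and another dot.
PATTERN = re.compile(r'(.+?)\.\d{4}\.')

def prefix_before_digits(file_name):
    """Return the text before the first '.NNNN.' component, or None."""
    m = PATTERN.match(file_name)
    return m.group(1) if m else None

# Hypothetical sample data, not taken from the question.
file_list = ['hello.1235.sas', 'a.b.5678.txt', 'no_digits.txt', 'x.12345.y']

names = [p for p in map(prefix_before_digits, file_list) if p is not None]
print(names)  # ['hello', 'a.b']
```

Because the trailing \. must follow the fourth digit directly, a five-digit component such as 12345 is not treated as a match.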

Using a regular expression, this becomes simpler:

import re

names = ['hello.1235.sas','test.5678.hai']

for fn in names:
    myreg = r'(.*)\.(?:\d{4})\..*'
    output = re.findall(myreg,fn)
    print(output)

Output:

['hello']
['test']

If you know that all entries have the same format, here is a list comprehension approach:

# filter() passes each split result as one argument, so the lambda takes the whole list of parts
[parts[0] for parts in filter(lambda parts: len(parts[1]) == 4, (item.split('.') for item in file_list))]

To be fair, I also like the solution provided by @James. Note that the downside of this list comprehension is that it involves three iteration steps: (1) splitting every item, (2) filtering the items that match, and (3) building the result.

With a regular for loop it could be more efficient:

output = []
for item in file_list:
    # assumes every name has exactly three dot-separated parts
    beginning, digits, end = item.split('.')
    if len(digits) == 4:
        output.append(beginning)

It does only one loop, which is much better.
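
One caveat: the three-way unpacking above raises ValueError as soon as a name does not contain exactly two dots. A hedged variant, assuming the four digits may sit at any position between dots, scans the split parts in a single pass per name. The function name and the sample names are illustrative only.

```python
def names_before_four_digits(file_list):
    """Collect the part of each name that precedes its first 4-digit component."""
    output = []
    for item in file_list:
        parts = item.split('.')
        for i, part in enumerate(parts):
            # a component counts only if it is exactly four digit characters
            if len(part) == 4 and part.isdigit():
                output.append('.'.join(parts[:i]))
                break
    return output

# Hypothetical sample data; 'short.12.x' has no 4-digit part and is skipped.
print(names_before_four_digits(['hello.1235.sas', 'a.b.5678.txt', 'short.12.x']))
# ['hello', 'a.b']
```

Joining with '.' keeps the prefix exactly as it appeared in the file name.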

You can use a positive lookahead, (?=(\.\d{4})):

import re
pattern=r'(.*)(?=(\.\d{4}))'

text=['*hello.1243.*','*.1243.*','hello.1235.sas','test.5678.hai','a.9999']


print(list(map(lambda x:re.search(pattern,x).group(0),text)))

Output:

['*hello', '*', 'hello', 'test', 'a']
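
Two things to watch with the pattern above: the lookahead is also satisfied when five or more digits follow the dot, and re.search returns None for a name with no match, so calling .group(0) on it raises AttributeError. A minimal sketch that tightens both points, assuming the four digits must be followed by a dot or the end of the string (the helper name before_digits and the sample names are made up here):

```python
import re

# Lookahead: a dot, exactly four digits, then either a dot or the end of the string.
pattern = re.compile(r'.*?(?=\.\d{4}(?:\.|$))')

def before_digits(name):
    """Return the text before the 4-digit component, or None if there is none."""
    m = pattern.search(name)
    return m.group(0) if m else None

samples = ['hello.1235.sas', 'a.9999', 'x.12345.y', 'plain.txt']
print([before_digits(s) for s in samples])
# ['hello', 'a', None, None]
```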
