
Targeting words in a line of text

I have identified a line in a text file that looks like this:

FLAGS                    = WORD1 WORD2 WORD3

I am reading several files in which the number of words can vary from 0 to a maximum of 3.

I'm using this code:

flag_FLAG = 0
for i in range(len(materialfile)):
    if  "FLAG" in materialfile[i] and "=" in materialfile[i]:
        line_FLAG = i
        flag_FLAG = 1
        
    if flag_FLAG == 1:
        
        temp = materialfile[line_FLAG].split(" ")
        for elem in temp:
            if is_word(elem):
                flags = str(elem)

Unfortunately, this way I only get one word (the last one). "is_word" is a function that I created:

def is_word(s):
    try:
        str(s)
        return True
    except ValueError:
        return False

I would like to get all the words as targets. I hope I have been clear.

You want a nested loop, e.g.:

materialfile = [
    "FLAGS                    = WORD1 WORD2 WORD3",
]

flags = [
    flag
    for line in materialfile if "FLAGS" in line and "=" in line
    for flag in line.split(" = ")[1].split() if flag
]

print(flags)  # ['WORD1', 'WORD2', 'WORD3']

Hard to say whether this exact code will work with your actual file, since you didn't provide a sample file, but hopefully this gets you pointed in the right direction.

Note that your is_word function does nothing: the elements are already strings, so str() is a no-op that never raises an exception, and is_word always returns True. The if flag clause in the above comprehension will filter out values of flag that are empty (e.g. if you had a line like FLAGS = ).
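To make the two points above concrete, here is a small check, offered as a sketch rather than a definitive test. It reuses the is_word helper from the question and the sample line from this page. The variable names tokens, accepted, is_word_filters_nothing and empty_line_flags are made up for illustration. It also shows that str.split() called with no argument already discards empty strings, so a line with nothing after the = sign yields an empty list.

```python
def is_word(s):
    # Same helper as in the question: str() on a str never raises.
    try:
        str(s)
        return True
    except ValueError:
        return False


# Splitting on a single space leaves empty strings where the padding was.
tokens = "FLAGS                    = WORD1 WORD2 WORD3".split(" ")

# is_word accepts every token, including "FLAGS", "=" and the empty strings.
accepted = [t for t in tokens if is_word(t)]
is_word_filters_nothing = accepted == tokens

# split() with no argument collapses whitespace runs and drops empty strings,
# so a FLAGS line with no words after " = " produces an empty list.
empty_line_flags = "FLAGS                    = ".split(" = ")[1].split()

print(is_word_filters_nothing)
print(empty_line_flags)
```

One caveat, assuming your files may contain such lines: a line that ends right after the = sign, with no trailing space, does not contain " = ", so indexing [1] after split(" = ") would raise an IndexError.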

I solved it this way:

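flag_flag = 0  # stays 0 if no FLAGS line is found
for i in range(len(materialfile)):
    if "FLAGS" in materialfile[i] and "=" in materialfile[i]:
        line_flag = i
        flag_flag = 1
if flag_flag == 1:
    # Take everything after " = " and split it on whitespace.
    flags = materialfile[line_flag].split(" = ")[1].split()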

I don't know if this is an elegant way, but it seems to work. Thanks.
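For what it's worth, the same idea can be wrapped in a small function so that no index or flag variable is needed. This is only a sketch under a few assumptions: read_flags is a hypothetical name, it requires the text before the = sign to be exactly FLAGS (stricter than the substring test above), and, like the loop above, it keeps the last matching line if there are several. It uses str.partition, which never raises when the separator is missing.

```python
def read_flags(lines):
    """Return the words after '=' on the last FLAGS line, or [] if none."""
    flags = []
    for line in lines:
        # partition returns (before, separator, after); separator is ""
        # when the line contains no "=" at all.
        key, sep, value = line.partition("=")
        if sep and key.strip() == "FLAGS":
            # split() with no argument handles 0 to 3 (or more) words.
            flags = value.split()
    return flags


materialfile = [
    "FLAGS                    = WORD1 WORD2 WORD3",
]
print(read_flags(materialfile))  # ['WORD1', 'WORD2', 'WORD3']
```

Returning an empty list when no FLAGS line exists means the caller can always iterate over the result without first checking whether it was set.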
