
Extract substrings from a text file on each line?

Is there a way to extract substrings from each line of a text file? For example, say this is the text file, but with a lot more lines like this:

president, Donald Trump, 74, USA

Priminster, Boris Johnson, 56, UK

I need to loop through each line and get the substrings, which are separated by commas, so that the substring would be Donald Trump, 74, and so on for the other lines.

Here you go:

with open('data.file') as f:
    for line in f:
        parts = line.split(', ')
        if len(parts) == 4:
            print(', '.join(parts[1:3]).strip())

Output:

Donald Trump, 74
Boris Johnson, 56

You can use split to split a string at a specific character. You get a list, which you can join later on. Reading a file is easy.

with open('filename.txt', 'r') as rf:
    lines = rf.readlines()

For this specific example you can do:

for line in lines:
    # split once, then strip the spaces (and the trailing newline) from each field
    parts = [part.strip() for part in line.split(',')]
    row = "{}, {}".format(parts[1], parts[2])
    print(row)

Otherwise, please be more clear about what you would like to achieve.
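
If a different pair of fields is wanted, the split-then-join idea can be wrapped in a small helper. This is only a minimal sketch: the function name pick_fields and its parameters are illustrative and not part of the answer above, and the sample lines are kept in a list instead of being read from a file.

```python
def pick_fields(line, start=1, stop=3, sep=","):
    """Return fields start..stop-1 of a delimited line, joined with ', '."""
    # split() returns a list; strip() drops the spaces and the trailing newline
    fields = [field.strip() for field in line.split(sep)]
    return ", ".join(fields[start:stop])


# Sample lines standing in for the result of rf.readlines()
lines = [
    "president, Donald Trump, 74, USA\n",
    "Priminster, Boris Johnson, 56, UK\n",
]

for line in lines:
    print(pick_fields(line))
```

Calling pick_fields(line, 2, 4) would select the third and fourth fields instead.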

You could do it easily using the split() and join() string methods in Python.

Working code:

# You could open your file like this:
# file1 = open('myfile.txt', 'r')

# For now, assume the file contains the following lines of data.
# You could uncomment the line above and use it instead.

file1 = ['president, Donald Trump, 74, USA', 'president, Donald Trump, 74, USA']
for line in file1:
    # join with ',' (the pieces keep their leading space), then strip the ends
    print(",".join(line.split(',')[1:3]).strip())

Output:

Donald Trump, 74
Donald Trump, 74

Explanation

  • Basically, you are just splitting the string (each line in the file) at the commas, which turns the string into a list. So line.split(',') gives:

     ['president', ' Donald Trump', ' 74', ' USA']
  • Now we just join the 2nd and 3rd elements of the list obtained in the step above. This is done by ",".join(), which joins the elements of the list with ','.

  • Also note that we used [1:3], which selects only the elements at indices 1 and 2 (the 2nd and 3rd elements) of the list, since indexing starts at 0 and the stop index is excluded. That gives the result displayed above.
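
The steps above can be traced on a single line as a small sketch. The variable names are illustrative only, and the final strip() is an addition here that removes the leading space left over from the split.

```python
line = "president, Donald Trump, 74, USA"

# Step 1: split at each comma; the pieces keep their leading spaces
parts = line.split(",")
print(parts)      # ['president', ' Donald Trump', ' 74', ' USA']

# Step 2: [1:3] keeps indices 1 and 2 (the stop index 3 is excluded)
selected = parts[1:3]
print(selected)   # [' Donald Trump', ' 74']

# Step 3: join with ',' and strip the leading space
result = ",".join(selected).strip()
print(result)     # Donald Trump, 74
```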

Hope this helps!

Open the file, read it line by line, then use Python's str.split method with a comma as the delimiter to get a list of words you can filter through.

with open('filename.txt', 'r') as my_file:
    line = my_file.readline()
    while line:
        # strip each piece so the leading spaces and the newline are dropped
        word_list = [word.strip() for word in line.split(',')]
        print(f'{word_list[1]}, {word_list[2]}')
        line = my_file.readline()
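
As an alternative sketch, the standard-library csv module can do the splitting; this swaps in csv.reader for the manual split(',') shown above. Passing skipinitialspace=True drops the space after each comma. The io.StringIO object only stands in for an opened file so that the example is self-contained, and the variable names and the length check are illustrative assumptions.

```python
import csv
import io

# Stand-in for open('filename.txt', newline=''); the file contents are assumed
sample = io.StringIO(
    "president, Donald Trump, 74, USA\n"
    "Priminster, Boris Johnson, 56, UK\n"
)

results = []
# skipinitialspace=True ignores the whitespace that follows each comma
for row in csv.reader(sample, skipinitialspace=True):
    if len(row) >= 3:  # skip blank or short lines
        results.append(f"{row[1]}, {row[2]}")

for entry in results:
    print(entry)
```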
    

Try this:

lst = []
with open("textfile.txt", "r") as file:
    for line in file:
        stripped_line = line.strip()
        # save the middle fields (everything but the first and last) as a list
        lst.append([field.strip() for field in stripped_line.split(",")[1:-1]])
print(lst)

# print each entry
for i in lst:
    print(", ".join(i))

