
How do I concatenate lines starting with a letter?

I am trying to concatenate lines in a text file into two lists. The first list should contain the lines that start with an uppercase letter, and the second the lines that start with a '_'. For instance:

_CAA35997.1 unnamed protein product [Bos taurus]
MRTPMLLALLALATLCLAGRADAKPGDAESGKGAAFVSKQEGSEVVKRLRRYLDHWLGAPAPYPDPLEPK
REVCELNPDCDELADHIGFQEAYRRFYGPV

_CAA42669.1 beta-2-glycoprotein I, partial [Bos taurus]
PALVLLLGFLCHVAIAGRTCPKPDELPFSTVVPLKRTYEPGEQIVFSCQPGYVSRGGIRRFTCPLTGLWP
INTLKCMPRVCPFAGILENGTVRYTTFEYPNTISFSCHTGFYLKGASSAKCTEEGKWSPDLPVCAPITCP

First list=['MRTPMLLALLALATLCLAGRADAKPGDAESGKGAAFVSKQEGSEVVKRLRRYLDHWLGAPAPYPDPLEPK REVCELNPDCDELADHIGFQEAYRRFYGPV','PALVLLLGFLCHVAIAGRTCPKPDELPFSTVVPLKRTYEPGEQIVFSCQPGYVSRGGIRRFTCPLTGLWPINTLKCMPRVCPFAGILENGTVRYTTFEYPNTISFSCHTGFYLKGASSAKCTEEGKWSPDLPVCAPITCP']

Second list=['_CAA35997.1','_CAA42669.1']

I have tried the following, but it does not work. Each new line is stored as a new entry in the first list, instead of the lines being concatenated into one entry:

for i in seq.text:
  if (i=='_'):
    second_list.append(i)
  else:
    first_list.append(i)

The easiest way is to keep doing what you're currently doing (with the prefix test changed to startswith()), and then call str.join() afterwards to concatenate the entire list at once, in order:

for i in seq.text:
  if i.startswith('_'):
    second_list.append(i)
    # to more closely resemble the output you put in your question,
    # you might want to only append the part up to the first whitespace:
    # second_list.append(i.split()[0])
  else:
    first_list.append(i)

first_string = ''.join(first_list)
second_string = ''.join(second_list)

Using an empty string as the separator means that the items are concatenated directly to each other, with nothing in between. You can also use anything else as a separator: a comma ',', a space ' ', a newline '\n', or any combination, depending on what your desired output is.
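
To illustrate the separator choice, here is a small sketch. The two short strings are made-up stand-ins for sequence lines, not data from the question. Lines read from a file usually still end in '\n', so the sketch strips them before joining:

```python
# Made-up stand-ins for two sequence lines as read from a file.
raw_lines = ["MRTPML\n", "REVCEL\n"]

# Drop the trailing newline (and any surrounding whitespace) first.
parts = [line.strip() for line in raw_lines]

joined_directly = "".join(parts)       # nothing between the items
joined_with_comma = ",".join(parts)    # a comma between the items
joined_with_newline = "\n".join(parts) # a newline between the items

print(joined_directly)
print(joined_with_comma)
```

Without the strip() step, the empty-string join would still contain the newline characters from the file.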

import re  # only needed for the commented-out alternative below

list1 = []  # lines that are entirely uppercase (the sequence lines)
list2 = []  # every other non-empty line (e.g. the '_' header lines)
with open("your_path/test.txt", "r") as a_file:
    for line in a_file:
        stripped_line = line.strip()
        if not stripped_line:
            continue  # skip the empty line
        # To consider '_' in the first list
        #x = re.findall(r"\b_\w+", stripped_line)
        if stripped_line.isupper():  # if (x):
            list1.append(stripped_line)
        else:
            list2.append(stripped_line)
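
The snippets above handle each line on its own, so they produce either one entry per line or one joined string for the whole file. The output in the question has one entry per record. One possible way to get that is to buffer the sequence lines that follow a '_' header and join them when the next header (or the end of the input) is reached. This is only a sketch: the function name split_records is hypothetical, and it assumes every record starts with a line beginning with '_' and that the input is any iterable of lines (an open file would also work).

```python
def split_records(lines):
    """Return (sequences, ids): one joined sequence and one ID per record."""
    sequences = []
    ids = []
    current = None  # chunks of the record being read; None before the first header
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip empty lines
        if line.startswith("_"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                sequences.append("".join(current))
            ids.append(line.split()[0])  # keep only the part before the first whitespace
            current = []
        elif current is not None:
            current.append(line)
        # Lines that appear before any header are ignored (an assumption).
    if current is not None:
        sequences.append("".join(current))  # close the last record
    return sequences, ids


sample = [
    "_CAA35997.1 unnamed protein product [Bos taurus]\n",
    "MRTPMLLALLALATLCLAGRADAKPGDAESGKGAAFVSKQEGSEVVKRLRRYLDHWLGAPAPYPDPLEPK\n",
    "REVCELNPDCDELADHIGFQEAYRRFYGPV\n",
    "\n",
    "_CAA42669.1 beta-2-glycoprotein I, partial [Bos taurus]\n",
    "PALVLLLGFLCHVAIAGRTCPKPDELPFSTVVPLKRTYEPGEQIVFSCQPGYVSRGGIRRFTCPLTGLWP\n",
    "INTLKCMPRVCPFAGILENGTVRYTTFEYPNTISFSCHTGFYLKGASSAKCTEEGKWSPDLPVCAPITCP\n",
]

first_list, second_list = split_records(sample)
print(second_list)
print(len(first_list))
```

With a real file, the same function could be called as split_records(open_file_object), since iterating over a file yields its lines.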


