Python - From file to data structure?

Question

I have a large file comprising ~100,000 lines. Each line corresponds to a cluster, and each entry within a line is a reference ID for another file (a protein structure in this case), e.g.

1hgn 1dju 3nmj 8kfn
9opu 7gfb 
4bui

I need to read the file in as a list of lists where each line is a sublist, thus preserving the integrity of each cluster, e.g.

nested_list = [['1hgn', '1dju', '3nmj', '8kfn'], ['9opu', '7gfb'], ['4bui']]

My current code creates a nested list, but each sublist holds the whole line as a single string rather than separate, comma-separated entries. As a result, I cannot easily index or slice the sublists.

Any help greatly appreciated.

Thanks, S :-)

Answer 1

Super simple:

with open('myfile', 'r') as f:
    data = [line.split() for line in f]
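
As a self-contained sketch of the same idea, the snippet below writes the three sample lines from the question to a temporary file and reads them back. The helper name read_clusters and the choice to skip blank lines are assumptions made for illustration; they are not part of the answer above.

```python
import os
import tempfile


def read_clusters(path):
    """Return one list of IDs per non-blank line of the file at `path`."""
    with open(path, "r") as f:
        # split() with no arguments splits on runs of whitespace and drops
        # the trailing newline; the `if` filter skips empty or blank lines.
        return [line.split() for line in f if line.strip()]


# Sample data from the question (note the trailing space on the second line).
sample = "1hgn 1dju 3nmj 8kfn\n9opu 7gfb \n4bui\n"

# Hypothetical temporary file standing in for the real cluster file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

try:
    nested_list = read_clusters(tmp.name)
finally:
    os.remove(tmp.name)

print(nested_list)
print(nested_list[0][1:3])  # each sublist can now be indexed and sliced
```

Because every entry is its own string, ordinary indexing and slicing work on each cluster. For a ~100,000-line file this builds the whole nested list in memory, which should be acceptable for short ID lists like these.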

Answer 2

You'll want to investigate the str.split() method.

>>> '1hgn 1dju 3nmj 8kfn'.split()
['1hgn', '1dju', '3nmj', '8kfn']
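
Lines read from a file end with a newline and may carry trailing spaces, which is where the no-argument form helps. A small sketch comparing split() with split(" ") on such a line (the variable names are illustrative):

```python
line = "9opu 7gfb \n"  # as read from a file: trailing space plus newline

# No argument: split on runs of whitespace, ignoring leading/trailing whitespace.
by_whitespace = line.split()

# Explicit single-space separator: the trailing newline survives as an entry.
by_space = line.split(" ")

print(by_whitespace)  # ['9opu', '7gfb']
print(by_space)       # ['9opu', '7gfb', '\n']
```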

Answer 1
13 (accepted), 2010-05-28 12:30:37

Answer 2
6, 2010-05-28 12:28:53
