Extract column from tabular data
My task is to pull a column out of a table and report its length with len(). But my code emits the column one element at a time, which is why len() counts each element of the column separately, and not their total.
water = water.readlines()
for col in water:
    el = list(col.split()[2])
water.txt:
HETATM 1 H HOH A 1 27.265 36.739 58.126
HETATM 2 H HOH A 1 27.109 35.124 57.944
HETATM 3 O HOH A 1 27.486 35.958 57.542
...
HETATM 9999 O HOH A3333 30.490 83.899 10.929
Desired intermediate output:
H
H
O
H
H
O
You are not extracting the column correctly. The correct way is with a list comprehension:
with open(...) as water:
    el = [line.split()[2] for line in water]
With your sample data, I get ['H', 'H', 'O'] for el, which is the third column.
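To see the length as well, here is a minimal self-contained sketch. It assumes the three sample rows from the question and uses io.StringIO as a stand-in for the real file, so the names sample and the inline text are illustrative only. The original loop rebuilt el on every row as a one-character list, so len(el) was always 1; building the whole column first gives the total.

```python
import io

# Stand-in for open("water.txt"); the rows are the sample from the question.
sample = io.StringIO(
    "HETATM 1 H HOH A 1 27.265 36.739 58.126\n"
    "HETATM 2 H HOH A 1 27.109 35.124 57.944\n"
    "HETATM 3 O HOH A 1 27.486 35.958 57.542\n"
)

with sample as water:
    # Build the whole column in one list, skipping blank lines.
    el = [line.split()[2] for line in water if line.strip()]

print(el)       # ['H', 'H', 'O']
print(len(el))  # 3
```

With a real file, replace the StringIO object with your own open(...) call; the comprehension stays the same.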
In the future you'll likely use other means to import data in tabular form. But this is an important exercise, because the following will apply to most issues you will face.
The most important initial concept is to use plenty of print statements to understand what each step does.
file = "HETATM 1 H HOH A 1 27.265 36.739 58.126\nHETATM 2 H HOH A 1 27.109 35.124 57.944\nHETATM 3 O HOH A 1 27.486 35.958 57.542\n"
lines=file.split('\n')
print(lines)
The output is a list of strings:
['HETATM 1 H HOH A 1 27.265 36.739 58.126',
'HETATM 2 H HOH A 1 27.109 35.124 57.944',
'HETATM 3 O HOH A 1 27.486 35.958 57.542',
'']
Each line is still a string at this point, so you need to turn it into a list, for example:
a=lines[2].split()
print(a)
The output is a list of strings, one per column value for this particular line (row):
['HETATM', '3', 'O', 'HOH', 'A', '1', '27.486', '35.958', '57.542']
To do that for every line and keep the 3rd column (index 2):
col2 = []  # make an empty list to hold the column
for l in lines:
    if len(l) > 1:  # skip empty lines, including the one at the end of the file
        cols = l.split()
        col2.append(cols[2])
print(col2)
The output is a list representing your 3rd column (index 2):
['H', 'H', 'O']
Because Python is used with many packages that do a lot in a single line, and also because of duck typing, it is more important than in other languages to always know what the result of your last line is, both in type and in meaning.
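One way to check type and meaning at each step is to print both, as in this small sketch. It uses one row from the sample data; the variable names are only for illustration.

```python
line = "HETATM 3 O HOH A 1 27.486 35.958 57.542"

fields = line.split()
print(type(fields).__name__, len(fields))     # list 9

element = fields[2]
print(type(element).__name__, repr(element))  # str 'O'

# list() on a string splits it into characters, which is what the
# loop in the question did on every row.
wrapped = list(element)
print(type(wrapped).__name__, wrapped, len(wrapped))  # list ['O'] 1

# Numeric columns are still strings until converted explicitly.
x = float(fields[6])
print(type(x).__name__, x)                    # float 27.486
```

Printing the type next to the value makes it obvious that list(col.split()[2]) produces a one-element list per row and not the column.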
In the future you will likely use numpy or pandas to read in tabular data in a single line. But understanding that single line can sometimes be hard. It is also hard to memorize. Doing it yourself in low-level code as shown above will help you stay connected to your code. It will also help you read how other people implemented higher-level functions.
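As a rough illustration of how the low-level loop turns into a higher-level function, here is a sketch of a small helper. read_column is a hypothetical name made up for this example; it is not part of numpy or pandas, and it only handles whitespace-separated text.

```python
def read_column(lines, index):
    """Return the whitespace-separated field at `index` from each line."""
    column = []
    for line in lines:
        fields = line.split()
        if len(fields) > index:  # skip blank or too-short lines
            column.append(fields[index])
    return column


text = (
    "HETATM 1 H HOH A 1 27.265 36.739 58.126\n"
    "HETATM 2 H HOH A 1 27.109 35.124 57.944\n"
    "HETATM 3 O HOH A 1 27.486 35.958 57.542\n"
)

elements = read_column(text.splitlines(), 2)
print(elements, len(elements))  # ['H', 'H', 'O'] 3
```

Library readers do much more (type conversion, headers, missing values), but the core idea of splitting each row and picking one field is the same.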