Math operations on 3 lists at the same time

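I have six files (from the Protein Data Bank) containing the x, y, z coordinates of two amino acid residues, CYS and LYS. The end goal is to calculate the distance between every LYS and every CYS in each file.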

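I have already extracted the coordinates and put them in six separate lists. Now I need to compute the distance from the x, y, z coordinates as: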

dist = math.sqrt((xc - xl)**2 + (yc - yl)**2 + (zc - zl)**2)

But I don't know how to loop over the six lists to calculate the distance between CYS and LYS in each file.

Here is what the file contents look like (only the part containing LYS is copied from the file, as an example):

ATOM     43  CA  LYS A   7     106.336  41.686 -11.244  1.00 21.93           C
ATOM     44  C   LYS A   7     106.561  41.901 -12.727  1.00 21.10           C
ATOM     45  O   LYS A   7     106.327  43.032 -13.214  1.00 24.85           O
ATOM     46  CB  LYS A   7     107.553  41.913 -10.402  1.00 24.26           C
ATOM     47  CG  LYS A   7     107.550  41.181  -9.058  1.00 33.89           C
ATOM     48  CD  LYS A   7     108.522  41.766  -8.051  1.00 35.19           C
ATOM     49  CE  LYS A   7     109.455  40.737  -7.453  1.00 58.09           C
ATOM     50  NZ  LYS A   7     110.799  40.722  -8.120  1.00 55.93           N
ATOM     51  N   THR A   8     106.979  40.859 -13.401  1.00 19.73           N
ATOM     52  CA  THR A   8     107.196  40.777 -14.860  1.00 21.18           C
ATOM     53  C   THR A   8     105.925  41.136 -15.620  1.00 21.07           C
ATOM     54  O   THR A   8     105.925  42.020 -16.497  1.00 14.72           O

Here is my code:

import math
import os
from glob import glob

import numpy as np

BaseDir = os.getcwd()

all_files = np.sort(glob('*[0-600]*.ent'))

for filename in all_files:

    Xc = [] # X coordinate of CYS
    Yc = []
    Zc = []
    Xl = []  # X coordinate of LYS
    Yl = []
    Zl = []

    f = open(filename)
    Lines = f.readlines()
    for i in range(1, len(Lines)):
        if 'CA  CYS' in Lines[i]:
            linec = Lines[i].split()
            if linec[0] == 'ATOM':
                xc, yc, zc = linec[6], linec[7], linec[8]
                Xc.append(xc)
                Yc.append(yc)
                Zc.append(zc)
        if 'CA  LYS' in Lines[i]:
            linel = Lines[i].split()
            if linel[0] == 'ATOM':
                xl, yl, zl = linel[6], linel[7], linel[8]
                Xl.append(xl)
                Yl.append(yl)
                Zl.append(zl)
    dist = math.sqrt((xc - xl)**2 + (yc - yl)**2 + (zc - zl)**2)

When I print(Xc, filename), it returns:

['87.372', '73.504', '86.059', '82.490', '74.176', '80.312'] 1.ent
['22.872', '13.708'] 2.ent
[] 3.ent
['62.740', '33.741', '18.064', '46.480', '36.255', '63.534', '49.543', '22.826'] 4.ent
['23.404', '-2.617', '50.714', '11.544', '38.216', '-17.818', '-7.237', '21.019', '-19.612', '37.235', '8.371', '51.634'] 5.ent
['66.407', '63.032', '60.134', '14.158', '17.494', '20.312'] 6.ent

When I print(Xl, filename):

['106.336', '105.826', '101.645', '81.196', '90.656', '96.290', '97.616', '93.983'] 1.ent
['4.430', '5.438', '19.787', '14.569', '23.059', '22.801', '16.723', '15.916'] 2.ent
['22.609', '32.122', '43.387', '41.576', '41.878', '38.004', '33.163', '38.948', '30.836', '23.899'] 3.ent
['21.847', '11.694', '10.507', '11.545', '11.775', '19.945', '27.931', '37.720', '46.445', '32.629', '30.896', '20.769', '16.377', '9.590', '15.170', '14.925', '47.464', '41.800', '24.277', '51.964', '36.706', '30.401', '25.410', '30.474', '50.309', '49.434', '40.009', '44.067', '43.220', '47.551', '52.487', '48.386', '40.121', '37.329', '21.309', '29.918', '35.721', '16.986', '14.680', '11.808', '11.466', '12.679', '17.290', '27.441', '27.388', '16.853', '52.991', '63.359', '67.769', '73.203', '68.424', '71.665', '34.917', '43.296', '60.160', '34.711', '50.052', '56.439', '60.780', '55.977', '37.295', '37.875', '47.683', '44.875', '42.006', '37.175', '32.072', '39.541', '48.253', '49.848', '65.227', '57.237', '48.009', '67.401', '70.352', '73.582', '74.629', '73.458', '70.474', '61.632', '60.699', '68.440'] 4.ent
['-0.840', '32.630', '27.111', '5.772', '0.552', '5.795', '27.208', '25.416', '24.445', '15.503', '33.113', '19.430', '17.972', '22.147', '27.065', '16.759', '12.083', '-3.498', '10.533', '-10.681', '-8.709', '2.418', '-7.800', '-22.468', '-19.818', '-22.713', '-19.877', '-10.223', '-12.596', '-21.356', '1.043', '-4.927', '-21.858', '-21.388', '-15.276', '3.474', '1.652', '-0.966', '-8.278', '23.326', '-1.463', '9.358', '13.785', '18.642', '7.074', '1.475', '-6.532', '-3.374', '-14.994', '2.388', '18.468', '-1.254', '55.980'] 5.ent
['67.045', '49.407', '52.772', '52.214', '55.680', '55.832', '78.610', '67.134', '79.549', '80.258', '80.339', '74.666', '73.443', '65.523', '67.405', '70.133', '66.798', '61.540', '49.690', '49.952', '50.093', '43.900', '49.549', '45.703', '39.861', '54.826', '59.250', '66.840', '43.908', '37.976'] 6.ent

Here is a start:

import numpy as np
from scipy.spatial.distance import cdist


cys_coords = np.loadtxt("cys_data.txt", usecols=(6, 7, 8))
lys_coords = np.loadtxt("lys_data.txt", usecols=(6, 7, 8))  # assuming the same format
distances = cdist(cys_coords, lys_coords)
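
To make the shape of that result concrete: with its default Euclidean metric, `cdist` returns a matrix with one row per CYS point and one column per LYS point, and entry `[i, j]` should equal the formula from the question. The short check below uses made-up coordinates (they are not taken from the files) purely to illustrate this.

```python
import math

import numpy as np
from scipy.spatial.distance import cdist

# Made-up coordinates for illustration only (not taken from the .ent files).
cys_coords = np.array([[0.0, 0.0, 0.0],
                       [1.0, 1.0, 1.0]])
lys_coords = np.array([[1.0, 2.0, 2.0],
                       [3.0, 4.0, 0.0],
                       [1.0, 1.0, 2.0]])

# One row per CYS point, one column per LYS point: shape (2, 3).
distances = cdist(cys_coords, lys_coords)

# Recompute one entry by hand with the formula from the question.
xc, yc, zc = cys_coords[0]
xl, yl, zl = lys_coords[1]
manual = math.sqrt((xc - xl)**2 + (yc - yl)**2 + (zc - zl)**2)

print(distances.shape)
print(distances[0, 1], manual)
```

So a single `cdist` call covers every CYS/LYS pair, with no explicit loop over the coordinate lists.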

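You can adapt the `np.loadtxt`/`cdist` snippet to loop over a list of file path strings and read your data. If you know in advance how many data points you have, you can preallocate numpy arrays for your CYS and LYS data.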
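
One possible way to put this together is sketched below. It reads the CA atoms of CYS and LYS straight from each file and builds one distance matrix per file. The helper names `ca_coords` and `cys_lys_distances` and the `*.ent` glob pattern are assumptions made for this sketch, not part of the original code. It slices the fixed PDB columns instead of calling `split()`, because adjacent fields in a PDB record can run together without a space, and it converts the coordinates to `float`, since the lists in the question hold strings. A file with no CYS CA atoms (as `3.ent` appears to be) gives an empty matrix.

```python
import glob

import numpy as np
from scipy.spatial.distance import cdist


def ca_coords(lines, resname):
    """Return an (n, 3) float array of CA coordinates for one residue type."""
    coords = []
    for line in lines:
        # Fixed PDB columns: record name 1-6, atom name 13-16, residue name 18-20.
        if (line.startswith("ATOM")
                and line[12:16].strip() == "CA"
                and line[17:20] == resname):
            # x, y and z occupy columns 31-38, 39-46 and 47-54.
            coords.append([float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])])
    return np.array(coords, dtype=float).reshape(-1, 3)


def cys_lys_distances(lines):
    """Distance matrix with one row per CYS CA and one column per LYS CA."""
    cys = ca_coords(lines, "CYS")
    lys = ca_coords(lines, "LYS")
    if len(cys) == 0 or len(lys) == 0:
        # Nothing to pair up, e.g. a file without any CYS residues.
        return np.empty((len(cys), len(lys)))
    return cdist(cys, lys)


if __name__ == "__main__":
    # Assumed pattern: every .ent file in the current directory.
    for filename in sorted(glob.glob("*.ent")):
        with open(filename) as f:
            distances = cys_lys_distances(f.readlines())
        print(filename, distances.shape)
```

Here `distances[i, j]` is the distance between the i-th CYS CA atom and the j-th LYS CA atom of that file, in the order they appear in the file.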
