Subtracting matching rows in Python

Question

I have two files, each containing a "time" column and an "id" column, like this:

File 1:

time     id
11.24    1
11.26    2
11.27    3
11.29    5
11.30    6

File 2:

time     id
11.25    1
11.26    3
11.27    4
11.31    6
11.32    7
11.33    8

I'm trying to write a Python script that subtracts the times of rows with matching ids from each other. The files are of different lengths.

I tried using set(id's of file 1) & set(id's of file 2) to get the matching ids, but now I'm stuck. Any help will be much appreciated, thank you.

Answer 1

List comprehensions can do the trick very easily:

#read these from file if you want to, included in this form for brevity
F1 = {1: 11.24, 2: 11.26, 3:11.27, 5:11.29, 6:11.30}
F2 = {1:11.25, 3:11.26, 4:11.27, 6:11.31, 7:11.32, 8:11.33}

K1 = set(F1.keys())
K2 = set(F2.keys())

result = dict([ (k, F1[k] - F2[k]) for k in (K1 & K2)])
print(result)

This will output:

{1: -0.009999999999999787, 3: 0.009999999999999787, 6: -0.009999999999999787}

Edit: As mhawke points out, the last line could read:

result = {k: F1[k] - F2[k] for k in (K1 & K2)}

I had forgotten all about dict comprehensions.
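The dictionaries above are hard-coded for brevity. A minimal sketch of building them from the file contents might look like the following. It assumes Python 3, whitespace-separated columns, and a single "time id" header row as shown in the question; the helper names `read_times` and `subtract_matching` are hypothetical.

```python
def read_times(lines):
    """Map id -> time from lines of the form '<time> <id>'."""
    times = {}
    for line in lines:
        parts = line.split()
        # Skip blank lines and the assumed 'time id' header row.
        if len(parts) != 2 or parts[0] == "time":
            continue
        t, i = parts
        times[int(i)] = float(t)
    return times


def subtract_matching(f1, f2):
    """Return {id: f1[id] - f2[id]} for ids present in both dicts."""
    # In Python 3, dict key views support set operations directly.
    return {k: f1[k] - f2[k] for k in f1.keys() & f2.keys()}


# Sample data copied from the question; any iterable of lines works,
# including an open file object.
file1_lines = ["time     id", "11.24    1", "11.26    2", "11.27    3",
               "11.29    5", "11.30    6"]
file2_lines = ["time     id", "11.25    1", "11.26    3", "11.27    4",
               "11.31    6", "11.32    7", "11.33    8"]

result = subtract_matching(read_times(file1_lines), read_times(file2_lines))
print(sorted(result))  # ids present in both files: [1, 3, 6]
```

With real files, passing the open file object to `read_times` inside a `with open(...)` block should work the same way, since a file iterates line by line.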

Answer 2

Python sets do not preserve the order of their elements. I would store the data as a dictionary:

file1 = {1:'11:24', 2:'11:26', ... etc}
file2 = {1:'11:25', 3:'11:26', ... etc}

Then loop over the intersection of the keys (or the union, depending on your needs) to do the subtraction (time-based or numeric).
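As a rough sketch of the time-based variant, the loop could look like this. It assumes Python 3 and that the values are "HH:MM" strings; the question does not say what the numbers mean, so that reading is a guess, and the dictionary contents are simply the question's sample data written with colons.

```python
from datetime import datetime

# Sample values from the question, assumed here to be HH:MM strings.
file1 = {1: '11:24', 2: '11:26', 3: '11:27', 5: '11:29', 6: '11:30'}
file2 = {1: '11:25', 3: '11:26', 4: '11:27', 6: '11:31', 7: '11:32', 8: '11:33'}


def parse(t):
    """Parse an 'HH:MM' string into a datetime (the date part is a dummy)."""
    return datetime.strptime(t, "%H:%M")


diffs = {}
# Intersection of the keys: only ids that appear in both files.
for k in sorted(file1.keys() & file2.keys()):
    delta = parse(file1[k]) - parse(file2[k])
    diffs[k] = delta.total_seconds() / 60  # difference in minutes

print(diffs)  # {1: -1.0, 3: 1.0, 6: -1.0}
```

Subtracting two datetime objects yields a timedelta, which avoids doing arithmetic on the raw strings by hand.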

Answer 3

This is a bit old school. Look at using a defaultdict from the collections module for a more elegant approach.

This will work for any number of files; I've named mine f1, f2, etc. The general idea is to process each file and build up a list of time values for each id. After processing the files, iterate over the dictionary, subtracting each value as you go (via reduce on the values list).

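from functools import reduce  # reduce is a builtin on Python 2; Python 3 needs this import
from operator import sub

# assumes every line is "<time> <id>" with no header row
d = {}
for fname in ('f1', 'f2'):
    with open(fname) as f:
        for l in f:
            t, i = l.split()
            d[i] = d.get(i, []) + [float(t)]

# subtract the collected times for each id, left to right
results = {}
for k, v in d.items():
    results[k] = reduce(sub, v)

print(results)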
{'1': -0.009999999999999787, '3': 0.009999999999999787, '2': 11.26, '5': 11.29, '4': 11.27, '7': 11.32, '6': -0.009999999999999787, '8': 11.33}

Update

If you want to include only those ids with more than one value:

results = {}
for k,v in d.items():
    if len(v) > 1:
        results[k] = reduce(sub, v)
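For comparison, the defaultdict approach mentioned at the top of this answer might be sketched as follows. This assumes Python 3 and header-free "<time> <id>" lines; the helper name `collect` is hypothetical, and the inline lists stand in for the files so the example runs on its own.

```python
from collections import defaultdict
from functools import reduce
from operator import sub


def collect(sources):
    """Build {id: [time, time, ...]} from any number of line iterables."""
    times = defaultdict(list)
    for lines in sources:
        for line in lines:
            t, i = line.split()
            # Missing ids start as an empty list, so append always works.
            times[i].append(float(t))
    return times


# Stand-ins for the files f1 and f2 (sample data from the question).
f1 = ["11.24 1", "11.26 2", "11.27 3", "11.29 5", "11.30 6"]
f2 = ["11.25 1", "11.26 3", "11.27 4", "11.31 6", "11.32 7", "11.33 8"]

d = collect([f1, f2])
# Keep only ids seen more than once, subtracting left to right.
results = {k: reduce(sub, v) for k, v in d.items() if len(v) > 1}
print(sorted(results))  # ['1', '3', '6']
```

Appending to the list in place also avoids building a new list for every line, which the `d.get(i, []) + [...]` form does.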

Answer 4

You can use this as a base (rather than treating '11.24' as a float, I guess you will want to adapt it for hours/minutes or minutes/seconds). You can effectively union and subtract matching keys using a defaultdict.

As long as you can get your data into a format like this:

f1 = [
    [11.24, 1],
    [11.26, 2],
    [11.27, 3],
    [11.29, 5],
    [11.30, 6]
]

f2 = [
    [11.25, 1],
    [11.26, 3],
    [11.27, 4],
    [11.31, 6],
    [11.32, 7],
    [11.33, 8]
]

Then:

from collections import defaultdict
from itertools import chain

dd = defaultdict(float)
for k, v in chain(
    ((b, a) for a, b in f1),
    ((b, -a) for a, b in f2)): # negate a

    dd[k] += v

Results in:

{1: -0.009999999999999787,
 2: 11.26,
 3: 0.009999999999999787,
 4: -11.27,
 5: 11.29,
 6: -0.009999999999999787,
 7: -11.32,
 8: -11.33}

For matches only:

matches = dict( (k, v) for v, k in f1 )
d2 = dict( (k, v) for v, k in f2 )

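# iterate over a copy of the items: deleting from a dict while
# iterating over it directly raises RuntimeError on Python 3
for k, v in list(matches.items()):
    try:
        matches[k] = v - d2[k]
    except KeyError:
        # id is only in f1, so drop it
        del matches[k]

print(matches)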
# {1: -0.009999999999999787, 3: 0.009999999999999787, 6: -0.009999999999999787}
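The values such as -0.009999999999999787 in the outputs above come from binary floating-point rounding. If the times really are decimal numbers, which the question does not confirm, one possible sketch is to keep them as strings when reading the files and subtract them as `decimal.Decimal` values. The data layout below mirrors f1 and f2 above, with the times written as strings; Python 3 is assumed.

```python
from decimal import Decimal

# Same sample data as above, but with times kept as strings so that
# Decimal sees the exact digits rather than a rounded float.
f1 = [("11.24", 1), ("11.26", 2), ("11.27", 3), ("11.29", 5), ("11.30", 6)]
f2 = [("11.25", 1), ("11.26", 3), ("11.27", 4), ("11.31", 6),
      ("11.32", 7), ("11.33", 8)]

d1 = {k: Decimal(t) for t, k in f1}
d2 = {k: Decimal(t) for t, k in f2}

# Only ids present in both inputs.
matches = {k: d1[k] - d2[k] for k in sorted(d1.keys() & d2.keys())}
print(matches)
# {1: Decimal('-0.01'), 3: Decimal('0.01'), 6: Decimal('-0.01')}
```

Rounding the float results with `round(value, 2)` would be a lighter alternative if exact decimal arithmetic is not needed.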

4 answers

Answer 1
3 2012-07-20 12:03:04

Answer 2
2 2012-07-20 10:42:16

Answer 3
0 2012-07-20 10:55:14

Answer 4
0 2012-07-20 10:57:04
