
Match python pyfits table by column

I am trying to match two pyfits data objects by an OBJNO (object number) column. In IDL this is done with the match, A.objno, B.objno, ii, jj procedure. This returns two index arrays, ii and jj, which index A and B such that A[ii].objno == B[jj].objno.

Is there a numpy/pythonic way of doing this? I would rather not make any assumptions about the ordering or size of either A or B, since they may be large FITS tables. This is what I am doing now:

ii = np.in1d(A.OBJNO, B.OBJNO).nonzero()[0]
jj = [np.where(B.OBJNO == objno)[0][0] for objno in A[ii].field('OBJNO')]

Is there a better numpy array matching algorithm?

Denoting the sizes of your arrays by N and M (N > M), your solution is O(N*M).

Assuming your arrays are large enough, you'll be better off with an O(N*logN) solution. You can achieve that by first sorting the bigger array (call it A), and then performing a binary search (e.g. using bisect) for each element of B in sorted_A.
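The sort-then-search idea can be sketched in numpy roughly as follows. This is only an illustration, not a drop-in replacement for IDL's match: the function name match is hypothetical, np.searchsorted is used here as a vectorized stand-in for per-element bisect calls, and the values in the first array are assumed to be unique (as object numbers normally would be).

```python
import numpy as np

def match(a, b):
    """Return index arrays (ii, jj) such that a[ii] == b[jj].

    Hypothetical helper; assumes the values in `a` are unique.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if a.size == 0 or b.size == 0:
        empty = np.array([], dtype=np.intp)
        return empty, empty

    # Sort a once: O(N log N). `order` maps sorted positions back to a.
    order = np.argsort(a, kind="stable")
    sorted_a = a[order]

    # Binary search for every element of b in sorted_a: O(M log N).
    pos = np.searchsorted(sorted_a, b)
    # Values larger than max(a) get position len(a); clip to stay in bounds.
    pos = np.clip(pos, 0, len(a) - 1)

    # A b value is a real match only if the element found there equals it.
    found = sorted_a[pos] == b
    jj = np.nonzero(found)[0]
    ii = order[pos[found]]
    return ii, jj

a = np.array([10, 30, 20, 50])
b = np.array([20, 99, 10, 50, 7])
ii, jj = match(a, b)
```

With the tables from the question, the call would presumably look like ii, jj = match(A.field('OBJNO'), B.field('OBJNO')). Note that the pairs come back in the order of B rather than A, which differs from the ordering produced by the code in the question.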
