
How to store and query a lot of matrices?

Problem:

Given are k matrices of dimension N x M (e.g. M1 .. M5). The values are zeros and ones only. How would you find all matrices that collide with a query matrix, e.g. Q? A collision means that the query matrix has a "1" at the same position as a matrix from the database.

Example :

For this simple example the algorithm should find M1, M2, M3 and M4 for the query, but not M5, since none of its ones is at the same position as a one in the query matrix.

M1:                        M3:
+-----------------+        +-----------------+
| 0 0 0 0 0 0 0 0 |        | 0 0 0 0 0 0 0 1 |
| 0 1 1 0 0 0 0 0 |        | 0 0 0 0 0 0 0 0 |
| 0 1 1 0 0 1 1 0 |        | 0 0 1 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 |        | 0 0 1 0 0 0 0 1 |
+-----------------+        +-----------------+

M2:                        M4:
+-----------------+        +-----------------+
| 0 0 0 0 0 1 1 0 |        | 0 0 0 0 0 0 0 0 |
| 0 0 1 1 0 0 0 0 |        | 1 1 1 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 |        | 0 0 0 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 |        | 0 0 0 0 0 0 0 0 |
+-----------------+        +-----------------+

M5:
+-----------------+
| 0 0 0 0 0 0 0 0 |
| 0 0 0 0 1 0 0 0 |
| 0 0 0 0 0 1 0 0 |
| 0 0 0 0 0 0 0 0 |
+-----------------+

Q:
+-----------------+
| 0 0 0 0 0 0 0 0 |
| 0 0 1 0 0 0 0 0 |
| 0 0 1 0 0 0 0 0 |
| 0 0 0 0 0 0 0 0 |
+-----------------+

Naive solution:

Iterate over all matrices and do a bitwise AND:

Match :

    M1: 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 0 1 1 0 0 0 0 0 0 0 0 0
     Q: 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
--------------------------------------------------------------------------
M1 & Q: 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
--------------------------------------------------------------------------

No match :

    M5: 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
     Q: 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
--------------------------------------------------------------------------
M5 & Q: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
--------------------------------------------------------------------------
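As a rough sketch of this naive scan (not code from the original post), each matrix can be flattened row by row into one Python integer and tested with `&`. The helper names `pack` and `colliding` and the dictionary-based "database" are assumptions made only for illustration.

```python
def pack(matrix):
    """Flatten a 0/1 matrix row by row into a single Python int."""
    bits = 0
    for row in matrix:
        for cell in row:
            bits = (bits << 1) | cell
    return bits


def colliding(database, query):
    """Return the names of all stored matrices that share a 1 with the query."""
    q = pack(query)
    # One bitwise AND per stored matrix: linear in the number of matrices.
    return [name for name, bits in database.items() if bits & q]


M1 = [[0, 0, 0, 0, 0, 0, 0, 0],
      [0, 1, 1, 0, 0, 0, 0, 0],
      [0, 1, 1, 0, 0, 1, 1, 0],
      [0, 0, 0, 0, 0, 0, 0, 0]]
M5 = [[0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 1, 0, 0, 0],
      [0, 0, 0, 0, 0, 1, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0]]
Q = [[0, 0, 0, 0, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0, 0]]

database = {"M1": pack(M1), "M5": pack(M5)}
print(colliding(database, Q))  # ['M1']
```

Python integers have arbitrary precision, so the same sketch would also handle 100x100 matrices in memory; whether a database column can hold and AND that many bits is a separate matter.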

Questions:

  1. Can this be done in sub-linear time?
  2. Are there better algorithms than the given naive approach?
  3. What would be a good way to store and query the matrix data in a database?

Note on question 3: I thought about storing the integer values of the matrices in a MySQL table and using a MySQL bitwise query to find them. Would this work (scale) if the matrices become much bigger, e.g. 100x100?

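1 & 2. A check that is sub-linear in the size of one matrix ( < O(n*m) ) can be achieved using sparse matrix approaches and terminating at the first collision. Basically, for each row you keep a list of the column indexes that hold a 1, and you check whether the query's list for that row contains one of the same indexes. Technically this can still take O(n*m), for example if Q is all 0's except for a last column of 1's and M is the exact inverse of that: every index list has to be examined and no collision is ever found.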

3. The answer to this part depends on the restrictions of your system and on how the matrices are composed. If the matrices are not sparse and you are concerned about memory usage, you could store each row as a collection of ints whose bits encode the 1's and 0's. If the matrices are sparse, you could simply store the collection of points that hold a 1.
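The two representations mentioned in this answer could look roughly like the sketch below. It is only an illustration under assumptions: the function names are made up, the dense form uses one int per row, and the sparse form uses a Python set of column indexes per row instead of a list.

```python
def rows_to_ints(matrix):
    """Dense form: one int per row, most significant bit = column 0."""
    return [int("".join(str(cell) for cell in row), 2) for row in matrix]


def to_sparse(matrix):
    """Sparse form: row index -> set of column indexes that hold a 1."""
    sparse = {}
    for i, row in enumerate(matrix):
        cols = {j for j, cell in enumerate(row) if cell}
        if cols:
            sparse[i] = cols
    return sparse


def collides_dense(a, b):
    """Row-wise AND; any() stops at the first non-zero result."""
    return any(x & y for x, y in zip(a, b))


def collides_sparse(a, b):
    """Compare per-row index sets and return at the first shared position."""
    for i, cols in a.items():
        other = b.get(i)
        if other is not None and not cols.isdisjoint(other):
            return True
    return False


M3 = [[0, 0, 0, 0, 0, 0, 0, 1],
      [0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 1, 0, 0, 0, 0, 0],
      [0, 0, 1, 0, 0, 0, 0, 1]]
M5 = [[0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 1, 0, 0, 0],
      [0, 0, 0, 0, 0, 1, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0]]
Q = [[0, 0, 0, 0, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0, 0]]

print(rows_to_ints(Q))                                  # [0, 32, 32, 0]
print(collides_sparse(to_sparse(M3), to_sparse(Q)))     # True
print(collides_sparse(to_sparse(M5), to_sparse(Q)))     # False
print(collides_dense(rows_to_ints(M3), rows_to_ints(Q)))  # True
```

Rows without any 1 are left out of the sparse form, so an empty row costs nothing at query time.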

Construct another matrix, say P, of size N*M where every element is a bitset of size k. P(i,j) has the t-th bit set if the t-th matrix has a 1 at (i,j). Given Q, start off with an empty k-bitset, Result. For every (i,j) such that Q(i,j)==1, do Result |= P(i,j). The bits that are set in Result at the end identify the colliding matrices. This algorithm requires O(k*N*M) preprocessing time. Every subsequent query scans Q in O(N*M) and performs one k-bit OR per 1 in Q, i.e. O(N*M + k*(number of 1s in Q)) bit operations in total.
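A minimal sketch of this inverted index, using Python ints as the bitsets. The function names `build_index` and `query` are assumptions for illustration, and matrices are identified by their 0-based position in the input list (so M1 is 0 and M5 is 4).

```python
def build_index(matrices):
    """P[i][j] is an int whose bit t is set when matrices[t] has a 1 at (i, j)."""
    n, m = len(matrices[0]), len(matrices[0][0])
    P = [[0] * m for _ in range(n)]
    for t, matrix in enumerate(matrices):
        for i in range(n):
            for j in range(m):
                if matrix[i][j]:
                    P[i][j] |= 1 << t
    return P


def query(P, Q):
    """OR together P(i,j) for every 1 in Q and list the matrices that were hit."""
    result = 0
    for i, row in enumerate(Q):
        for j, cell in enumerate(row):
            if cell:
                result |= P[i][j]
    return [t for t in range(result.bit_length()) if (result >> t) & 1]


M1 = [[0, 0, 0, 0, 0, 0, 0, 0],
      [0, 1, 1, 0, 0, 0, 0, 0],
      [0, 1, 1, 0, 0, 1, 1, 0],
      [0, 0, 0, 0, 0, 0, 0, 0]]
M2 = [[0, 0, 0, 0, 0, 1, 1, 0],
      [0, 0, 1, 1, 0, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0]]
M3 = [[0, 0, 0, 0, 0, 0, 0, 1],
      [0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 1, 0, 0, 0, 0, 0],
      [0, 0, 1, 0, 0, 0, 0, 1]]
M4 = [[0, 0, 0, 0, 0, 0, 0, 0],
      [1, 1, 1, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0]]
M5 = [[0, 0, 0, 0, 0, 0, 0, 0],
      [0, 0, 0, 0, 1, 0, 0, 0],
      [0, 0, 0, 0, 0, 1, 0, 0],
      [0, 0, 0, 0, 0, 0, 0, 0]]
Q = [[0, 0, 0, 0, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0, 0, 0]]

P = build_index([M1, M2, M3, M4, M5])
print(query(P, Q))  # [0, 1, 2, 3], i.e. M1..M4 but not M5
```

The index is built once, and each query then touches only the cells where Q has a 1 instead of visiting every stored matrix.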
