Python in-memory table

What is the right way to form an in-memory table in Python with direct lookups for rows and columns?
I thought of using a dict of dicts, like this:

class Table(dict):
    def __getitem__(self, key):
        if key not in self:
            self[key] = {}
        return dict.__getitem__(self, key)

table = Table()
table['row1']['column1'] = 'value11'
table['row1']['column2'] = 'value12'
table['row2']['column1'] = 'value21'
table['row2']['column2'] = 'value22'
>>> table
{'row1':{'column1':'value11','column2':'value12'},'row2':{'column1':'value21','column2':'value22'}}

I had difficulty looking up values in columns.

>>> 'row1' in table
True
>>> 'value11' in table['row1'].values()
True

Now how do I look up whether 'column1' has 'value11'?
Is this method of forming tables wrong?
Is there a better way to implement such tables with easier lookups?

I'd use an in-memory SQLite database for this. The sqlite3 module has been in the standard library since Python 2.5, so it adds no external dependency to your requirements.
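As a rough sketch of what that could look like, assuming one database row per table cell (the `cells` schema and the helper names `column_has_value` and `rows_with_value` are made up for illustration, not something from the question):

```python
import sqlite3

# An in-memory database exists only for the lifetime of this connection.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one record per (row, column) cell of the table.
conn.execute(
    "CREATE TABLE cells ("
    "row_name TEXT, col_name TEXT, val TEXT, "
    "PRIMARY KEY (row_name, col_name))"
)
conn.executemany(
    "INSERT INTO cells VALUES (?, ?, ?)",
    [
        ("row1", "column1", "value11"),
        ("row1", "column2", "value12"),
        ("row2", "column1", "value21"),
        ("row2", "column2", "value22"),
    ],
)


def column_has_value(conn, col, val):
    """Return True if any row stores val under column col."""
    cur = conn.execute(
        "SELECT 1 FROM cells WHERE col_name = ? AND val = ? LIMIT 1",
        (col, val),
    )
    return cur.fetchone() is not None


def rows_with_value(conn, col, val):
    """Return the names of all rows that store val under column col."""
    cur = conn.execute(
        "SELECT row_name FROM cells WHERE col_name = ? AND val = ? "
        "ORDER BY row_name",
        (col, val),
    )
    return [record[0] for record in cur]


print(column_has_value(conn, "column1", "value11"))
print(rows_with_value(conn, "column1", "value21"))
```

If column lookups dominate, an index on `(col_name, val)` would probably help, but that is worth measuring rather than assuming.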

Now how do I do lookup if 'column1' has 'value11'

# Iterate over the row dicts (the values), not over (key, row) pairs.
any(arow.get('column1') == 'value11' for arow in table.values())

Is this method of forming tables wrong?

No, it's just very "exposed", perhaps too much so. It could usefully be encapsulated in a class that exposes the methods you need; then the question of how best to implement them does not affect the rest of your application.

Is there a better way to implement such tables with easier lookups?

Once you have designed a class whose interface you'd like to use, you can experiment with very different implementation approaches and benchmark them on a workload that's representative of your usage pattern, so you can find out what's best for you (assuming, of course, that table manipulation and lookup are a big part of your application's runtime; to find out, profile your app).
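A minimal sketch of such a comparison, assuming two candidate lookup strategies (a linear scan and a precomputed per-column index) and a synthetic workload. The function names and the data are placeholders, and the index only works for hashable values:

```python
import timeit


def scan_lookup(table, col, val):
    """Linear scan over a dict of dicts."""
    return any(row.get(col) == val for row in table.values())


def build_index(table):
    """Precompute {column: set of values} for fast membership tests."""
    index = {}
    for row in table.values():
        for col, val in row.items():
            index.setdefault(col, set()).add(val)
    return index


def indexed_lookup(index, col, val):
    """Membership test against the precomputed index."""
    return val in index.get(col, ())


# Synthetic workload (an assumption): replace it with data that is
# representative of your application.
table = {"row%d" % i: {"column1": "value%d" % i} for i in range(10000)}
index = build_index(table)

scan_time = timeit.timeit(
    lambda: scan_lookup(table, "column1", "value9999"), number=100
)
index_time = timeit.timeit(
    lambda: indexed_lookup(index, "column1", "value9999"), number=100
)
print("scan: %.4fs  index: %.4fs" % (scan_time, index_time))
```

The index has to be kept in sync whenever the table changes, which is exactly the kind of cost a representative workload should reveal.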

I had similar but not identical needs in a large internal app I maintain at work: there the row indices are integers (only the column names are strings), the column order is important, and the workload is more about "editing" the table (adding, removing, reordering rows or columns, renaming columns, etc). I started with a table exposing the functionality I needed, with the simplest rough-and-ready implementation internally (a list of dicts, plus a list of column names for the column ordering); by now I have evolved it (independently of the actual "application-level" parts, but based on profiling and benchmarking them) to completely different implementations (currently based on numpy).
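To make that first rough-and-ready layout concrete, here is a guess at its general shape: a list of dicts for the rows plus a list of column names for the ordering. The class name `SimpleTable` and its methods are invented for this sketch and are not the code from that app:

```python
class SimpleTable:
    """Rows are dicts addressed by integer position; column order is explicit."""

    def __init__(self, columns):
        self.columns = list(columns)  # ordered column names
        self.rows = []                # one dict per row

    def append_row(self, values):
        """Add a row from values given in the current column order."""
        self.rows.append(dict(zip(self.columns, values)))

    def rename_column(self, old, new):
        """Rename a column in the ordering and in every row."""
        self.columns[self.columns.index(old)] = new
        for row in self.rows:
            row[new] = row.pop(old)

    def move_column(self, name, position):
        """Move a column to a new position; row dicts are unaffected."""
        self.columns.remove(name)
        self.columns.insert(position, name)

    def as_lists(self):
        """Return rows as lists of values in the current column order."""
        return [[row[c] for c in self.columns] for row in self.rows]
```

Reordering columns only touches the name list, while renaming has to visit every row; trade-offs like that are what profiling would later expose.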

I think you should proceed along similar lines: "clothe" your current implementation in a nice "interface" with all the methods you need, then profile your app. Unless this table object is a performance bottleneck, you're done; if it is a bottleneck, you can optimize the implementation (experiment, measure, repeat;-) without disturbing the rest of your application.

Inheriting from dict is not a good idea because you probably don't want to expose all of dict's rich functionality; plus, what you're doing is, roughly, an inefficient implementation of collections.defaultdict(dict). So, encapsulate the latter:

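import collections

class Table(object):
    def __init__(self):
        # Missing rows are created on demand as empty dicts.
        self.d = collections.defaultdict(dict)
    def add(self, row, col, val):
        self.d[row][col] = val
    def get(self, row, col, default=None):
        return self.d[row].get(col, default)
    def inrow(self, row, col):
        return col in self.d[row]
    def incol(self, col, val):
        # Iterate over the row dicts (values), not (key, row) pairs,
        # and skip rows that lack this column instead of raising KeyError.
        return any(col in x and x[col] == val for x in self.d.values())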

And so on: write all the methods your app needs, with useful, short names, then see whether some of them are worth aliasing as special methods if they're often used that way, e.g. (assuming Python 2.*; Python 3.* requires slightly different syntax):

    def __setitem__(self, (row, col), val):
        self.add(row, col, val)

and so forth. Once you have the code working, then comes the right time for profiling, benchmarking, and (just perhaps) internal optimization of the implementation.
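Python 3 no longer accepts a tuple in a function's parameter list (PEP 3113), so there the `__setitem__` alias would unpack the key inside the method body. A sketch, with a matching `__getitem__` added purely as an illustration:

```python
import collections


class Table:
    def __init__(self):
        self.d = collections.defaultdict(dict)

    def add(self, row, col, val):
        self.d[row][col] = val

    def __setitem__(self, key, val):
        row, col = key  # unpack the (row, col) tuple by hand
        self.add(row, col, val)

    def __getitem__(self, key):
        row, col = key
        if row not in self.d:
            # Avoid creating an empty row as a side effect of a failed read.
            raise KeyError(key)
        return self.d[row][col]


table = Table()
table["row1", "column1"] = "value11"
print(table["row1", "column1"])
```

This form also runs unchanged on Python 2, so it is a reasonable default if both versions matter.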

A nested list should be able to do the job here. I would only use nested dictionaries if the elements are spread thinly across the grid.

# height, width and value are assumed to be defined elsewhere.
grid = []
for row in range(height):
    grid.append([])
    for cell in range(width):
        grid[-1].append(value)

Checking rows is easy:

def valueInRow(value, row):
  return value in grid[row]

Checking columns takes a little more work, because the grid is a list of rows, not a list of columns:

def columnIterator(column):
    # Yield the cell at index `column` from each row of the grid.
    for row in grid:
        yield row[column]

def valueInColumn(value, column):
    return value in columnIterator(column)
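An alternative worth considering is to transpose the grid with `zip(*grid)`, which gives direct access to whole columns. This is only a sketch: it assumes a rectangular grid (`zip` silently truncates to the shortest row), and the function names are illustrative:

```python
grid = [
    ["value11", "value12"],
    ["value21", "value22"],
]


def columns_of(grid):
    """Transpose a rectangular grid: one tuple per column."""
    return list(zip(*grid))


def value_in_column(grid, value, column):
    """Check membership in a single column of the transposed grid."""
    return value in columns_of(grid)[column]


print(value_in_column(grid, "value21", 0))
```

Transposing on every call copies the whole grid, so for repeated column lookups it may be better to transpose once and reuse the result.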

Now how do I do lookup if 'column1' has 'value11'

Are you asking about this?

found = False
for r in table:
    # .get() tolerates rows that have no 'column1' entry.
    if table[r].get('column1') == 'value11':
        found = True
        break

Is this what you're trying to do?
