
Skip keys without type checking in Python (pymssql)

I need to access all the non-integer keys of a dict that looks like this:

result = {
    0: "value 1",
    1: "value 2",
    "key 1": "value 1",
    "key 2": "value 2",
}

I am currently doing this with:

headers = [header for header in tmp_dict.keys() if not isinstance(header, int)]

My questions:

  1. Is there a way to do this without type checking?
  2. This tmp_dict comes from a pymssql query made with as_dict=True. It returns all the column names with their data as expected, but for some reason it also includes the same data indexed by integers. How can I get my query result as a dictionary containing only the column names and data?

Thanks for your help!

PS: Even though an answer to question 2 would probably resolve my issue, I'm still curious how this can be done without type checking, mainly for the people who say "never do type checking, ever."

Looking at the source code of pymssql (1.0.2), it is clear that the module has no option to suppress the data indexed by integers. Note, however, that the entry indexed by column name is omitted when the column name is empty.

/* mssqldbmodule.c */
PyObject *fetch_next_row_dict(_mssql_connection *conn, int raise) {
    [...]
    for (col = 1; col <= conn->num_columns; col++) {
        [...]
        // add key by column name, do not add if name == ''
        if (strlen(PyString_AS_STRING(name)) != 0)
            if ((PyDict_SetItem(dict, name, val)) == -1)
                return NULL;

        // add key by column number
        if ((PyDict_SetItem(dict, PyInt_FromLong(col-1), val)) == -1)
            return NULL;
    }
    [...]
}
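In Python terms, the loop above behaves roughly like the sketch below. The function name build_row_dict and the sample columns are hypothetical, chosen only to show how each value ends up stored twice and how an unnamed column is stored only under its index.

```python
def build_row_dict(names, values):
    """Mimic the C loop: store each value by column name and by column index."""
    row = {}
    for index, (name, value) in enumerate(zip(names, values)):
        if name != "":
            # Named columns get a string key...
            row[name] = value
        # ...and every column gets an integer key, named or not.
        row[index] = value
    return row


# Hypothetical result row with one unnamed column in the middle.
row = build_row_dict(["key 1", "", "key 2"], ["a", "b", "c"])
print(row)
```

The unnamed second column appears only as row[1], which is why dropping the integer keys can lose data when a query returns unnamed columns.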

Regarding your first question, filtering the result set by type checking is surely the best way to do it. This is exactly what pymssql itself does to return data when as_dict is False:

if self.as_dict:
    row = iter(self._source).next()
    self._rownumber += 1
    return row
else:
    row = iter(self._source).next()
    self._rownumber += 1
    return tuple([row[r] for r in sorted(row.keys()) if type(r) == int])
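The excerpt is Python 2 code. A rough Python 3 rendering of the same idea might look like the sketch below; row_as_tuple is a hypothetical helper name, not part of pymssql, and the keys are filtered before sorting because Python 3 cannot order int and str keys against each other.

```python
def row_as_tuple(row):
    """Rebuild the positional tuple from the integer keys of a row dict."""
    # Keep only the integer keys, then sort them to restore column order.
    index_keys = sorted(key for key in row if type(key) == int)
    return tuple(row[key] for key in index_keys)


row = {"key 1": "value 1", 0: "value 1", "key 2": "value 2", 1: "value 2"}
print(row_as_tuple(row))
```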

With regard to your question about type checking, the duck-typing approach would be to see whether the key can be converted to or used as an int.

def can_be_int(obj):
    try:
        int(obj)
    except (TypeError, ValueError):
        return False
    return True

headers = [header for header in tmp_dict.keys() if not can_be_int(header)]

Note that floats can be converted to ints by truncation, so this isn't exactly equivalent to the isinstance check.
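To make the difference concrete, here is a small comparison of the two filters. The float key is a made-up example (pymssql would not produce one); it is there only to show where the results diverge.

```python
def can_be_int(obj):
    try:
        int(obj)
    except (TypeError, ValueError):
        return False
    return True


# Hypothetical dict with a float key added for illustration.
tmp_dict = {0: "a", 1: "b", "key 1": "a", 2.5: "c"}

duck_typed = [key for key in tmp_dict if not can_be_int(key)]
type_checked = [key for key in tmp_dict if not isinstance(key, int)]

print(duck_typed)    # ['key 1']
print(type_checked)  # ['key 1', 2.5]
```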

A slight variation on the above would be to use coerce(0, obj) (a Python 2 builtin) in place of int(obj). This accepts any kind of object that can be converted to a common type with an integer. You could also do something like 0 + obj and 1 * obj, which checks for something that can be used in a mathematical expression with integers.
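Since coerce() no longer exists in Python 3, the arithmetic variant is the one that carries over. A minimal sketch, with usable_as_number as a hypothetical helper name:

```python
def usable_as_number(obj):
    """Return True if obj can take part in integer arithmetic."""
    try:
        0 + obj
        1 * obj
    except TypeError:
        return False
    return True


tmp_dict = {0: "a", 1: "b", "key 1": "a", "2024": "b"}
headers = [key for key in tmp_dict if not usable_as_number(key)]
print(headers)  # ['key 1', '2024']
```

Both checks are needed: 1 * "key 1" succeeds for strings (it is sequence repetition), and it is 0 + "key 1" that raises TypeError.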

You could also check whether its string representation is all digits:

headers = [header for header in tmp_dict.keys() if not str(header).isdigit()]

This is probably closer to a solution that doesn't use type checking, although it will be slower, and it's of course entirely possible that a column name would be a string consisting only of digits! (To be honest, many of these approaches would fail in that case.)
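The digit-only column name problem is easy to reproduce. In this sketch the column name "2024" is an invented example:

```python
# Hypothetical row where one real column is named "2024".
tmp_dict = {0: "a", 1: "b", "key 1": "a", "2024": "b"}

by_isdigit = [key for key in tmp_dict if not str(key).isdigit()]
by_isinstance = [key for key in tmp_dict if not isinstance(key, int)]

print(by_isdigit)     # ['key 1']  (the "2024" column is lost)
print(by_isinstance)  # ['key 1', '2024']
```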

Sometimes explicit type checking really is the best choice, which is why the language has tools for letting you check types. In this situation I think you're fine, especially since the result dictionary is documented to have only integers and strings as keys. And you're doing it the right way by using isinstance() rather than explicitly checking type() == int.
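The practical difference between the two checks is how they treat subclasses of int. The ColumnIndex class below is hypothetical and exists only to illustrate this; pymssql does not define it.

```python
class ColumnIndex(int):
    """Hypothetical int subclass, used only to show the difference."""


keys = [0, ColumnIndex(1), True, "key 1"]

# isinstance() treats subclasses of int (including bool) as integers.
by_isinstance = [key for key in keys if not isinstance(key, int)]

# An exact type comparison only recognises plain int objects.
by_exact_type = [key for key in keys if type(key) != int]

print(by_isinstance)       # ['key 1']
print(len(by_exact_type))  # 3
```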

The rationale behind as_dict=True is that you can access columns both by index and by name. Normally you'd get a tuple to index into, but for compatibility reasons a dict that can be indexed as though it were a tuple means that code depending on column numbers still works, without needing to know that column names are available.

If you're just using result to retrieve columns (either by name or by index), I don't see why you're concerned about removing the integer keys. Just carry on regardless. (Unless for some reason you plan to pickle the data or otherwise persist it elsewhere...)

The best way to filter them out, though, is using isinstance; duck typing in this case is actually unpythonic and inefficient. E.g.:

names_only = dict( (k, v) for k,v in result.iteritems() if not isinstance(k, int) )

Use that instead of a try/except dance.
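The one-liner is written for Python 2 (iteritems). On Python 3 the same filter could be written as a dict comprehension, sketched here with the sample data from the question:

```python
result = {
    0: "value 1",
    1: "value 2",
    "key 1": "value 1",
    "key 2": "value 2",
}

# Keep only the entries whose key is not an integer.
names_only = {k: v for k, v in result.items() if not isinstance(k, int)}
print(names_only)  # {'key 1': 'value 1', 'key 2': 'value 2'}
```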

>>> sorted(result)[len(result)/2:]
['key 1', 'key 2']

This will remove the duplicated integer-keyed entries. I think what you're doing is fine, though.
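The sorted() trick depends on Python 2 behaviour: integers sort before strings there, and / on two ints is integer division. It also assumes every column has a non-empty name, so that exactly half the keys are integers. On Python 3, sorting mixed keys raises TypeError. One possible substitute that still avoids an explicit type check (an assumption on my part, not something from the answers above) is to rely on the integer keys being 0..n-1 and test membership in that range:

```python
result = {
    0: "value 1",
    1: "value 2",
    "key 1": "value 1",
    "key 2": "value 2",
}

# On Python 3, int and str keys cannot be ordered against each other.
try:
    sorted(result)
    mixed_sort_works = True
except TypeError:
    mixed_sort_works = False

# Integer keys are column positions, so they all fall inside
# range(len(result)); string keys never compare equal to a position.
names = [key for key in result if key not in range(len(result))]
print(names)  # ['key 1', 'key 2']
```

A float key such as 1.0 would also count as a position here, so this is not a strict replacement for isinstance either.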

