How can I retrieve the original order of keyword arguments passed to a function call?

Question

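Retrieving the order of keyword arguments passed via **kwargs would be extremely useful in the project I am working on. It is about making a kind of n-d numpy array with meaningful dimensions (currently called dimarray), which is particularly useful for geophysical data handling.

For now, say we have: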

import numpy as np
from dimarray import Dimarray   # the handy class I am programming

def make_data(nlat, nlon):
    """ generate some example data
    """
    values = np.random.randn(nlat, nlon)
    lon = np.linspace(-180,180,nlon)
    lat = np.linspace(-90,90,nlat)
    return lon, lat, values

What works:

>>> lon, lat, values = make_data(180,360)
>>> a = Dimarray(values, lat=lat, lon=lon)
>>> print a.lon[0], a.lat[0]
-180.0 -90.0

What does not:

>>> lon, lat, values = make_data(180,180) # square, no shape checking possible !
>>> a = Dimarray(values, lat=lat, lon=lon)
>>> print a.lon[0], a.lat[0] # is random 
-90.0 -180.0  # could be (actually I raise an error in such ambiguous cases)

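The reason is that the signature of Dimarray's __init__ method is (values, **kwargs), and since kwargs is an unordered dictionary (dict), the best it can do is check the axis lengths against the shape of values.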
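To make the shape check concrete, here is a minimal sketch of what such a lookup could look like. The helper name order_axes_by_shape is made up for illustration and is not part of dimarray; it is written for Python 3 and uses plain lists instead of numpy arrays so that it stands alone.

```python
def order_axes_by_shape(shape, **axes):
    """Return the axis names ordered to match `shape`.

    Hypothetical helper, not part of dimarray: it relies only on axis
    lengths, so it must give up when two dimensions have the same size.
    """
    sizes = {name: len(values) for name, values in axes.items()}
    if sorted(sizes.values()) != sorted(shape):
        raise ValueError(
            "axis lengths {} do not match shape {}".format(sizes, tuple(shape)))
    if len(set(shape)) != len(shape):
        raise ValueError(
            "ambiguous: several dimensions share a size in shape {}; "
            "pass the axis names explicitly".format(tuple(shape)))
    # All sizes are distinct here, so each size identifies exactly one axis.
    name_by_size = {size: name for name, size in sizes.items()}
    return [name_by_size[n] for n in shape]


lat = [-90.0, 0.0, 90.0]             # 3 points
lon = [-180.0, -90.0, 0.0, 90.0]     # 4 points
print(order_axes_by_shape((3, 4), lon=lon, lat=lat))   # ['lat', 'lon']
```

With a square shape such as (3, 3) the same call raises ValueError, which is the ambiguous case described above.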

Of course, I want it to work for any number of dimensions:

a = Dimarray(values, x1=.., x2=...,x3=...)

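So it has to be coded with **kwargs. The chance of an ambiguous case increases with the number of dimensions. There are ways around that: for instance, with the signature (values, axes, names, **kwargs) it is possible to do: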

a = Dimarray(values, [lat, lon], ["lat","lon"])

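But this syntax is cumbersome for interactive use (ipython), and I would like this package to really become part of my (and others'!) daily use of Python, as an actual replacement for numpy arrays in geophysics.

I would be very interested in a way around that. The best I can think of right now is to use the inspect module's stack function to parse the caller's statement: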

import inspect
def f(**kwargs):
    print inspect.stack()[1][4]
    return tuple([kwargs[k] for k in kwargs])

>>> print f(lon=360, lat=180)
[u'print f(lon=360, lat=180)\n']
(180, 360)

>>> print f(lat=180, lon=360)
[u'print f(lat=180, lon=360)\n']
(180, 360)

One could work something out from that, but there are unsolvable issues, since stack() captures everything on the line:

>>> print (f(lon=360, lat=180), f(lat=180, lon=360))
[u'print (f(lon=360, lat=180), f(lat=180, lon=360))\n']
[u'print (f(lon=360, lat=180), f(lat=180, lon=360))\n']
((180, 360), (180, 360))

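Is there any other inspect trick I am not aware of that could solve this problem? (I am not familiar with this module.) I would imagine that getting the piece of code right between the parentheses, lon=360, lat=180, should be feasible, no?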
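As a rough illustration of why parsing the caller's line is fragile, the standard ast module can extract the keyword order of each call found in a source string. This is only a sketch in Python 3 syntax: keyword_orders is a made-up helper, it assumes the whole statement fits on the single line returned by inspect.stack(), and it assumes the function is called by its plain name.

```python
import ast

def keyword_orders(source_line, func_name):
    """List the keyword-name order of every call to `func_name` in a line."""
    tree = ast.parse(source_line.strip())
    orders = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == func_name):
            # kw.arg is None for a `**mapping` expansion; skip those.
            orders.append([kw.arg for kw in node.keywords if kw.arg is not None])
    return orders

# A single call on the line: the order is recoverable.
print(keyword_orders("f(lon=360, lat=180)", "f"))        # [['lon', 'lat']]

# Two calls on the same line: two candidates, and nothing in the source
# text says which one is currently executing.
print(len(keyword_orders("(f(lon=360, lat=180), f(lat=180, lon=360))", "f")))  # 2
```

The second case is the problem shown above: the text of the line alone cannot tell the two calls apart.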

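So, for the first time in Python, I have the feeling of hitting a hard wall when trying to do something that is theoretically feasible given all the available information (the ordering provided by the user is valuable information!).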

I read Nick's interesting suggestions here: https://mail.python.org/pipermail/python-ideas/2011-January/009054.html and was wondering whether this idea has moved forward somehow?

I see why it is not desirable to have an ordered **kwargs in general, but a patch for these rare cases would be neat. Is anyone aware of a reliable hack?

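NOTE: this is not about pandas. I am actually trying to develop a lightweight alternative to it whose usage remains very close to numpy. I will post the GitHub link soon.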

EDIT: note that this is relevant for interactive use of dimarray. The dual syntax is needed anyway.

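EDIT 2: I also see the counter-argument that knowing the data is not ordered can be regarded as valuable information too, since it leaves Dimarray free to check the shape of values and adjust the order automatically. It may even be that forgetting the dimension order of the data happens more often than two dimensions having the same size. So for now, I guess it is fine to raise an error in ambiguous cases and ask the user to provide the names argument. Nevertheless, it would be neat to be free to make that kind of choice (how the Dimarray class should behave) rather than being constrained by a missing Python feature.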

EDIT 3, SOLUTIONS: following kazagistar's suggestion:

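I did not mention that there are other optional attribute parameters, such as name="" and units="", plus a couple of other parameters related to slicing, so the *args construct would need to come with keyword-name testing on kwargs.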

In summary, there are many possibilities:

*Choice a: keep the current syntax

a = Dimarray(values, lon=mylon, lat=mylat, name="myarray")
a = Dimarray(values, [mylat, mylon], ["lat", "lon"], name="myarray")

*Choice b: kazagistar's 2nd suggestion, dropping axis definition via **kwargs

a = Dimarray(values, ("lat", mylat), ("lon",mylon), name="myarray")

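*Choice c: kazagistar's 2nd suggestion, with optional axis definition via **kwargs (note that this requires names= to be extracted from **kwargs, see the background below)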

a = Dimarray(values, lon=mylon, lat=mylat, name="myarray")
a = Dimarray(values, ("lat", mylat), ("lon",mylon), name="myarray")

*Choice d: kazagistar's 3rd suggestion, with optional axis definition via **kwargs

a = Dimarray(values, lon=mylon, lat=mylat, name="myarray")
a = Dimarray(values, [("lat", mylat), ("lon",mylon)], name="myarray")

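Hmm, it comes down to aesthetics and to some design questions (is lazy ordering an important feature in interactive mode?). I am hesitating between b) and c). I am not sure that **kwargs really brings anything. Ironically, what I started out criticizing turned into a feature once I thought more about it.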

Thank you very much for the answers. I will mark the question as answered, but you are most welcome to vote for a), b), c) or d)!

=====================

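EDIT 4: a better solution: choice a)!!, but with an added from_tuples class method. The reason is to allow one more degree of freedom. If the axis names are not provided, they are automatically generated as "x0", "x1", etc. It can be used just like pandas, but with axis naming. This also avoids mixing axes and attributes in **kwargs, leaving it for the axes only. There will be more once I am done with the documentation.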

a = Dimarray(values, lon=mylon, lat=mylat, name="myarray")
a = Dimarray(values, [mylat, mylon], ["lat", "lon"], name="myarray")
a = Dimarray.from_tuples(values, ("lat", mylat), ("lon",mylon), name="myarray")

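EDIT 5: a more pythonic solution?: similar to EDIT 4 above in terms of the user API, but via a wrapper function dimarray, while being very strict about how Dimarray is instantiated. This is also in the spirit of what kazagistar proposed.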

 from dimarray import dimarray, Dimarray 

 a = dimarray(values, lon=mylon, lat=mylat, name="myarray") # error if lon and lat have same size
 b = dimarray(values, [("lat", mylat), ("lon",mylon)], name="myarray")
 c = dimarray(values, [mylat, mylon, ...], ['lat','lon',...], name="myarray")
 d = dimarray(values, [mylat, mylon, ...], name="myarray2")

From the class itself:

 e = Dimarray.from_dict(values, lon=mylon, lat=mylat) # error if lon and lat have same size
 e.set(name="myarray", inplace=True)
 f = Dimarray.from_tuples(values, ("lat", mylat), ("lon",mylon), name="myarray")
 g = Dimarray.from_list(values, [mylat, mylon, ...], ['lat','lon',...], name="myarray")
 h = Dimarray.from_list(values, [mylat, mylon, ...], name="myarray")

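In cases d) and h), the axes are automatically named "x0", "x1", and so on, unless mylat and mylon are actually instances of the Axis class (which I do not describe in this post, but Axes and Axis do their job of building axes and handling indexing).

Explanations: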

class Dimarray(object):
    """ ndarray with meaningful dimensions and clean interface
    """
    def __init__(self, values, axes, **kwargs):
        assert isinstance(axes, Axes), "axes must be an instance of Axes"
        self.values = values
        self.axes = axes
        self.__dict__.update(kwargs)

    @classmethod
    def from_tuples(cls, values, *args, **kwargs):
        # args are (name, axis_values) pairs, in order
        axes = Axes.from_tuples(*args)
        return cls(values, axes, **kwargs)

    @classmethod
    def from_list(cls, values, axes, names=None, **kwargs):
        if names is None:
            names = ["x{}".format(i) for i in range(len(axes))]
        # from_tuples expects (name, axis_values) pairs
        return cls.from_tuples(values, *zip(names, axes), **kwargs)

    @classmethod
    def from_dict(cls, values, names=None,**kwargs):
        axes = Axes.from_dict(shape=values.shape, names=names, **kwargs)
        # with necessary assert statements in the above
        return cls(values, axes)

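Here is the trick (schematically):

def dimarray(values, axes=None, names=None, name=..,units=..., **kwargs):
    """ my wrapper with all fancy options
    """
    if len(kwargs) > 0:
        # axes given as keywords: order is inferred from values.shape
        new = Dimarray.from_dict(values, names=names, **kwargs) 

    elif isinstance(axes[0], tuple):
        # axes given as a list of (name, axis_values) tuples
        new = Dimarray.from_tuples(values, *axes, **kwargs) 

    else:
        # axes given as a list of axis values, with optional names
        new = Dimarray.from_list(values, axes, names=names, **kwargs) 

    # reserved attributes
    new.set(name=name, units=units, ..., inplace=True) 

    return new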
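The dispatch logic of the wrapper can be tried in isolation with a small stand-alone sketch. Everything below is an assumption made for illustration: normalize_axes is a made-up name, it returns plain (name, values) pairs instead of building Axes objects, and it is written for Python 3.

```python
def normalize_axes(axes=None, names=None, **kwaxes):
    """Turn any of the accepted call styles into a list of (name, values).

    Hypothetical helper for illustration; the wrapper above delegates to
    the Dimarray.from_* class methods instead.
    """
    if kwaxes:
        if axes is not None:
            raise TypeError("pass axes either positionally or as keywords, not both")
        # On Python 3.6+ this keeps the order written by the caller.
        return list(kwaxes.items())
    if axes is None:
        raise TypeError("no axes given")
    if all(isinstance(a, tuple) and len(a) == 2 and isinstance(a[0], str)
           for a in axes):
        return list(axes)                      # already (name, values) pairs
    if names is None:
        # same automatic naming as from_list: "x0", "x1", ...
        names = ["x{}".format(i) for i in range(len(axes))]
    if len(names) != len(axes):
        raise ValueError("got {} names for {} axes".format(len(names), len(axes)))
    return list(zip(names, axes))


mylat, mylon = [0, 1], [0, 1, 2]
print(normalize_axes([("lat", mylat), ("lon", mylon)]))
print(normalize_axes([mylat, mylon], ["lat", "lon"]))
print(normalize_axes([mylat, mylon]))          # names default to x0, x1
```

One design caveat of this sketch: an axis whose values happen to be a 2-tuple starting with a string would be mistaken for a (name, values) pair, which is one more argument for keeping the class constructor strict and the guessing in the wrapper.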

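The only thing we lose is indeed the *args syntax, which could not accommodate so many options. But that's fine.

It also makes subclassing easy. How does this sound to the Python experts here?

(This whole discussion could really be split into two parts.)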

=====================

A bit of background (EDIT: partly outdated, applies to cases a), b), c), d) only), just in case you are interested:

*Choice a involves:

def __init__(self, values, axes=None, names=None, units="",name="",..., **kwargs):
    """ schematic representation of Dimarray's init method
    """
    # automatic ordering according to values' shape (unless names is also provided)
    # the user is allowed to forget about the exact shape of the array
    if len(kwargs) > 0:
        axes = Axes.from_dict(shape=values.shape, names=names, **kwargs)

    # otherwise initialize from list
    # exact ordering + more freedom in axis naming 
    else:
        axes = Axes.from_list(axes, names)

    ...  # check consistency

    self.values = values
    self.axes = axes
    self.name = name
    self.units = units

*Choices b) and c) impose:

def __init__(self, values, *args, **kwargs):
    ...

b) All attributes are naturally passed via kwargs with self.__dict__.update(kwargs). This is clean.

c) Requires filtering the keyword arguments:

def __init__(self, values, *args, **kwargs):
   """ most flexible for interactive use
   """
   # filter out known attributes, falling back to their defaults
   default_attrs = {'name':'', 'units':'', ...} 
   for k in default_attrs:
       setattr(self, k, kwargs.pop(k, default_attrs[k]))
   names = kwargs.pop('names', None)  # used by Axes.from_dict below

   # same as before
   if len(kwargs) > 0:
       axes = Axes.from_dict(shape=values.shape, names=names, **kwargs)

   # same, just unzip
   else:
       names, numpy_axes = zip(*args)
       axes = Axes.from_list(numpy_axes, names)

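This actually works quite nicely; the only (minor) drawback is that the default values for name="", units="" and some other, more relevant parameters are not accessible through inspection or completion.

*Choice d: a clear __init__

def __init__(self, values, axes, name="", units="", ..., **kwaxes):

But it is indeed a bit verbose.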

==========

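EDIT, FYI: I ended up using a list of tuples for the axes parameter, or alternatively the parameters dims= and labels= for the axis names and axis values, respectively. The related project, dimarray, is on GitHub. Thanks again to kazagistar.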

Answer 1

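No, you cannot know the order in which items were added to a dictionary, since tracking it would significantly increase the complexity of the dictionary implementation. (For when you really, really need this, collections.OrderedDict has you covered.)

However, have you considered some basic alternative syntax? For example: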

a = Dimarray(values, 'lat', lat, 'lon', lon)

or (probably the best option)

a = Dimarray(values, ('lat', lat), ('lon', lon))

or (most explicit)

a = Dimarray(values, [('lat', lat), ('lon', lon)])

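At some level, though, arguments that need ordering are inherently positional. **kwargs is often abused for labeling, but argument names generally should not be "data", since they are a pain to set programmatically. Just make the association between the two parts of the data clear with a tuple, use a list to preserve the ordering, and provide strong assertions and error messages that make it clear when the input is invalid and why.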
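A minimal sketch of what "strong assertions and error messages" could look like for the list-of-tuples form. The function name check_axes and the exact messages are made up for illustration, and the code assumes Python 3.6+ for the f-strings.

```python
def check_axes(shape, axes):
    """Validate a list of (name, values) pairs against an array shape.

    Hypothetical helper: returns the axis names in order, or raises with a
    message saying which axis is wrong and why.
    """
    if len(axes) != len(shape):
        raise ValueError(
            f"expected {len(shape)} axes for shape {tuple(shape)}, got {len(axes)}")
    names = []
    for i, item in enumerate(axes):
        if not (isinstance(item, tuple) and len(item) == 2):
            raise TypeError(f"axis {i} must be a (name, values) tuple, got {item!r}")
        name, values = item
        if not isinstance(name, str):
            raise TypeError(f"axis {i}: name must be a string, got {name!r}")
        if name in names:
            raise ValueError(f"axis {i}: duplicate axis name {name!r}")
        if len(values) != shape[i]:
            raise ValueError(
                f"axis {name!r} has length {len(values)}, "
                f"but dimension {i} has size {shape[i]}")
        names.append(name)
    return names


lat, lon = [-90.0, 90.0], [-180.0, 0.0, 180.0]
print(check_axes((2, 3), [("lat", lat), ("lon", lon)]))   # ['lat', 'lon']
```

Because the order is given explicitly by the list, a square array is no longer ambiguous; passing the pairs in the wrong order for a non-square array fails with a message naming the offending axis.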

Answer 2

There is a module made specifically to handle this:

https://github.com/claylabs/ordered-keyword-args

Without using the module

def multiple_kwarguments(first , **lotsofothers):
    print first

    for i,other in lotsofothers.items():
         print other
    return True

multiple_kwarguments("first", second="second", third="third" ,fourth="fourth" ,fifth="fifth")

Output:

first
second
fifth
fourth
third
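For comparison, the shuffled output above is what Python 2 (the version used throughout this page) produces. As of Python 3.6, PEP 468 specifies that **kwargs preserves the order in which the keywords were passed, so plain keyword arguments are enough there. A minimal sketch, where axis_names is a made-up name used only for illustration:

```python
import sys

def axis_names(**kwargs):
    """Return the keyword names in the order the caller wrote them."""
    return list(kwargs)

# PEP 468 applies from Python 3.6 on; earlier versions give no guarantee.
assert sys.version_info >= (3, 6)

print(axis_names(lon=360, lat=180))   # ['lon', 'lat']
print(axis_names(lat=180, lon=360))   # ['lat', 'lon']
```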

Using the orderedkwargs module

from orderedkwargs import orderedkwargs
@orderedkwargs  
def mutliple_kwarguments(first , *lotsofothers):
    print first

    for i, other in lotsofothers:
        print other
    return True


mutliple_kwarguments("first", second="second", third="third" ,fourth="fourth" ,fifth="fifth")

Output:

first
second
third
fourth
fifth

Note: a single asterisk is required when using this module's decorator on the function.

Answer 1
4, accepted, 2013-12-01 17:54:30

Answer 2
1, 2014-10-03 10:55:22