Python design pattern to implement different variations of a function

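I have a generic smoothing function that, based on a configuration file (YAML that is loaded as a dictionary), calls different implementations (boxcar or gaussian). These implementations take different numbers of arguments: boxcar requires winsize, while gaussian requires winsize and variance.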

Here is my current implementation:

def smoothing(dataDf, selected_columns, kwargs):

    method = kwargs['method']

    if method == 'boxcar':
        boxcar(dataDf, selected_columns, kwargs['arguments'])
    elif method == 'gaussian':
        gaussian(dataDf, selected_columns, kwargs['arguments'])
    else:
        raise NotImplementedError

Is there a better way to implement this?


I would consider two options:

  • Use a dictionary of functions:

    methods = {
        'boxcar': function1,
        'gaussian': function2
    }

    try:
        method = methods[kwargs['method']]
        ...
    except KeyError:
        raise NotImplementedError

    You can make it a bit more user-friendly:

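    def smoothing(dataDf, selected_columns, method, *args, **kwargs):
        # `method` is a named parameter here, so look it up directly
        # rather than in kwargs. Only the lookup sits inside the try, so a
        # KeyError raised inside the implementation is not masked.
        try:
            func = methods[method]
        except KeyError:
            raise NotImplementedError('{} is not a valid method'.format(method))
        return func(dataDf, selected_columns, *args, **kwargs)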

  • Use multiple dispatch. It lets you dispatch on the function signature and the argument types:

    In [1]: from multipledispatch import dispatch

    In [2]: @dispatch(int, int)
       ...: def f(x, y):
       ...:     return x, y
       ...:

    In [3]: @dispatch(int)
       ...: def f(x):
       ...:     return x
       ...:

    In [5]: f(1)
    Out[5]: 1

    In [6]: f(1, 2)
    Out[6]: (1, 2)

    In your case:

    @dispatch(...list some/all argument types here...)
    def smoothing(...signature for boxcar...):
        pass

    @dispatch(...list some/all argument types here...)
    def smoothing(...signature for gaussian...):
        pass
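The dictionary-of-functions option can be sketched end to end as follows. This is only an illustration under stated assumptions: the toy `boxcar` and `gaussian` bodies work on plain lists rather than DataFrame columns, and `config['arguments']` is assumed to be a mapping of parameter names to values, which the question does not confirm. Expanding that mapping with `**` lets each implementation declare exactly the parameters it needs.

```python
import math


# Toy stand-ins for the real implementations (assumption): they smooth a
# plain list so that the sketch runs without pandas.
def boxcar(values, winsize):
    """Centered moving average; the window shrinks at the edges."""
    half = winsize // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half):i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def gaussian(values, winsize, variance):
    """Gaussian-weighted average over the same centered window."""
    half = winsize // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        weights = [math.exp(-((j - i) ** 2) / (2.0 * variance))
                   for j in range(lo, hi)]
        total = sum(w * values[j] for w, j in zip(weights, range(lo, hi)))
        smoothed.append(total / sum(weights))
    return smoothed


METHODS = {'boxcar': boxcar, 'gaussian': gaussian}


def smoothing(values, config):
    """Look up config['method'] and expand config['arguments'] as keywords."""
    try:
        func = METHODS[config['method']]
    except KeyError:
        raise NotImplementedError(
            '{!r} is not a valid method; choose from {}'.format(
                config.get('method'), sorted(METHODS)))
    return func(values, **config['arguments'])


# Assumed shape of the dictionary loaded from the YAML file.
config = {'method': 'boxcar', 'arguments': {'winsize': 3}}
print(smoothing([1.0, 2.0, 3.0, 4.0], config))  # [1.5, 2.0, 3.0, 3.5]
```

With this layout, a misspelled or missing argument name in the configuration surfaces as a `TypeError` from the implementation itself instead of being silently ignored.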

    methods = {
        'boxcar': boxcar,
        'gaussian': gaussian,
    }

    MESSAGES = {
        'MISSING_METHOD': 'No method named {} was found.'
    }

    def smooth(dataDf, selected_columns, **kwargs):
        """Smooth dataframe columns."""
        # Here, we are providing a default if the
        # user doesn't provide a keyword argument
        # for `method`. You're accessing it with
        # brackets and if it's not provided it will
        # raise a KeyError. If it's mandatory, put
        # it as such. But you can do better by
        # providing a default. Like below

        method_name = kwargs.get('method', 'gaussian')
        method = methods.get(method_name)

        if method is None:
            msg = MESSAGES['MISSING_METHOD'].format(method_name)
            raise NotImplementedError(msg)
        return method(dataDf, selected_columns, **kwargs)

    If method is a fairly important part of the call and the user is aware of that argument, you can write it as follows:

    def smooth(dataDf, selected_columns, method='gaussian', **kwargs):
        """Smooth dataframe columns."""
        # Note that kwargs no longer contains `method`;
        # our call to the 'real' method is not 'polluted'.

        _method = methods.get(method)

        if _method is None:
            msg = MESSAGES['MISSING_METHOD'].format(method)
            raise NotImplementedError(msg)
        return _method(dataDf, selected_columns, **kwargs)
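To see why pulling `method` out of `**kwargs` matters, here is a small runnable sketch. The stand-in `boxcar` and `gaussian` functions and the flat configuration layout are assumptions made for illustration; they only report what they received.

```python
def boxcar(values, winsize):
    # Hypothetical stand-in: returns what it was called with.
    return ('boxcar', winsize)


def gaussian(values, winsize, variance):
    # Hypothetical stand-in: returns what it was called with.
    return ('gaussian', winsize, variance)


methods = {'boxcar': boxcar, 'gaussian': gaussian}


def smooth(values, method='gaussian', **kwargs):
    """`method` is bound to its own parameter, so kwargs stays clean."""
    func = methods.get(method)
    if func is None:
        raise NotImplementedError('No method named {} was found.'.format(method))
    return func(values, **kwargs)


# A flat configuration (assumed layout): unpacking it routes 'method' to the
# named parameter and everything else to the implementation.
config = {'method': 'boxcar', 'winsize': 5}
print(smooth([1, 2, 3], **config))              # ('boxcar', 5)
print(smooth([1, 2, 3], winsize=3, variance=2.0))  # ('gaussian', 3, 2.0)
```

Had `method` stayed inside `kwargs`, the call `boxcar(values, method='boxcar', winsize=5)` would fail with a `TypeError`, because `boxcar` does not accept a `method` keyword.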

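    However, there is a problem with the methods dictionary:

    • Its keys are function names and its values are function objects, which requires us to have access to the functions.

    You would like a dictionary in which both keys and values are strings (such as you would get from a YAML file).

    The assumption I will make is that the functions exist in the current context, whether you defined them or imported them.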

    from third.party.library import really_ugly_name_you_didnt_choose_but_imported

    def gaussian(dataDf, selected_columns, **kwargs):
        pass

    _methods = globals()

    # This is a dictionary where keys and values
    # only contain strings, like from a YAML file.

    methods = {
        'boxcar': 'boxcar',
        'gaussian': 'gaussian',
        'roundcar': 'really_ugly_name_you_didnt_choose_but_imported',
    }

    MESSAGES = {
        'MISSING_METHOD': 'No method named {} was found.'
    }

    def smooth(dataDf, selected_columns, method='gaussian', **kwargs):
        """Smooth dataframe columns."""

        # Lets check if it is in authorized methods
        # We're using globals() and we don't want
        # the user to accidentally use a function
        # that has no relation to smoothing.

        # So we're looking at the dictionary from
        # the YAML file.

        # Let's get the "real name" of the function
        # so if `method` were (str) 'roundcar', `method_name`
        # would be (str) 'really_ugly_name_you_didnt_choose_but_imported'

        method_name = methods.get(method)

        # Now that we have the real name, let's look for
        # the function object, in _methods.

        _method = _methods.get(method_name)

        if None in (_method, method_name):
            msg = MESSAGES['MISSING_METHOD'].format(method)
            # Note that we raise the exception for the function
            # name the user required, i.e: roundcar, not
            # the real function name the user might be unaware
            # of, 'really_ugly_name_you_didnt_choose_but_imported'.
            raise NotImplementedError(msg)
        return _method(dataDf, selected_columns, **kwargs)
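A variation on the same idea, offered as a sketch rather than as part of the answer above: resolve the string names against one explicit namespace with `getattr` instead of `globals()`. The `smoothing_lib` object, the `ALIASES` mapping, and the helper `resolve` are hypothetical names used only for this example; in real code the namespace would be an imported module.

```python
import types


def _boxcar(values, winsize):
    return ('boxcar', winsize)


def _really_ugly_name(values, winsize):
    return ('roundcar', winsize)


# Hypothetical stand-in for an imported module of smoothing functions.
smoothing_lib = types.SimpleNamespace(
    boxcar=_boxcar,
    really_ugly_name=_really_ugly_name,
)

# String-to-string mapping, as it might come out of a YAML file.
ALIASES = {
    'boxcar': 'boxcar',
    'gaussian': 'gaussian',  # listed, but not provided by smoothing_lib
    'roundcar': 'really_ugly_name',
}


def resolve(method):
    """Return the callable for a user-facing method name, or raise."""
    real_name = ALIASES.get(method)
    func = getattr(smoothing_lib, real_name, None) if real_name else None
    if not callable(func):
        # Report the name the user asked for, not the internal one.
        raise NotImplementedError('No method named {} was found.'.format(method))
    return func


print(resolve('roundcar')([1, 2, 3], winsize=3))  # ('roundcar', 3)
```

Because the lookup is limited to one namespace and one alias table, unrelated names that happen to live in the calling module, such as `print`, cannot be reached through the configuration.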


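    I'd say this question is primarily opinion-based, but I'm kind of... bored, so here's my take:

    In these cases I always tend to give priority to readability for other users. We all know that code should be properly commented, explained, and come with plenty of documentation, right? But I should also be going to the gym, and yet here I am, writing this from the comfort of my couch (until... mid-April, when the weather gets better).

    To me, if other people are going to read your code, it's very important to take advantage of the fact that Python, if properly written, can be very, very clear (it's almost like pseudocode that runs, right?).

    So in your case, I wouldn't even create this kind of wrapper function. I would have a module smoothing.py containing all your smoothing functions. Not only that, I'd import the module (import smoothing) rather than use from smoothing import boxcar, gaussian, so that my calls can be very explicit: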

    import os
    import smoothing

    if method == 'boxcar':
        smoothing.boxcar(...)  # pass the boxcar arguments here
        # Someone reading this will be able to figure out that this is a
        # smoothing function from a module called "smoothing". Also, no magic
        # done with things like my_func = getattr(smoothing, method)... none
        # of that: be clear and explicit.
        # For instance, many IDEs allow you to navigate to the function's
        # definition, but for that to work properly, the code needs to be explicit
    elif method == 'gaussian':
        smoothing.gaussian(...)  # pass the gaussian arguments here
    else:
        raise ValueError(
            'Unknown smoothing method "%s".'
            ' Please see available methods in "%s" or add'
            ' a new smoothing entry into %s' % (
                method,
                os.path.abspath(smoothing.__file__),
                os.path.abspath(__file__),
            )
        )

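    Something like that. If someone gets an error, they can quickly understand where and why it happened.

    Otherwise, if you still want to keep your structure, I'd say that since you always need your "method", don't put it in your kwargs. Make it positional: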

    def smoothing(method, dataDf, selected_columns, kwargs):
        if method == 'boxcar':
            boxcar(dataDf, selected_columns, kwargs['arguments'])
        elif method == 'gaussian':
            gaussian(dataDf, selected_columns, kwargs['arguments'])
        else:
            raise NotImplementedError

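    Another thing you could do, instead of allowing bad arguments in the kwargs dict, is to force it to contain the proper arguments (what if someone passes method=boxcar but gives you the kwargs['arguments'] meant for gaussian?). Don't even let that be possible (make it crash as soon as possible):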

    def smoothing(method, dataDf, selected_columns, **kwargs):
        if method == 'boxcar':
            assert 'variance' not in kwargs  # `boxcar` shouldn't receive a "variance"
            boxcar(dataDf, selected_columns, kwargs['winsize'])
        elif method == 'gaussian':
            gaussian(dataDf, selected_columns, kwargs['winsize'], kwargs['variance'])
        else:
            raise NotImplementedError

    And always give a proper message in your exceptions (give your NotImplementedError a proper explanation).
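One possible way to get the "crash as soon as possible" behavior without hand-writing an assert per method is to check the arguments against the implementation's signature before calling it. This is a sketch under assumptions, not the approach the answer prescribes: it uses a lookup table, and the stand-in `boxcar` and `gaussian` functions only echo their arguments. `inspect.signature(...).bind(...)` raises `TypeError` when the arguments do not fit.

```python
import inspect


def boxcar(values, winsize):
    # Hypothetical stand-in for the real implementation.
    return ('boxcar', winsize)


def gaussian(values, winsize, variance):
    # Hypothetical stand-in for the real implementation.
    return ('gaussian', winsize, variance)


METHODS = {'boxcar': boxcar, 'gaussian': gaussian}


def smoothing(method, values, **kwargs):
    if method not in METHODS:
        raise NotImplementedError(
            'Unknown smoothing method {!r}; available: {}'.format(
                method, sorted(METHODS)))
    func = METHODS[method]
    try:
        # Fails on missing or unexpected arguments before any work is done.
        inspect.signature(func).bind(values, **kwargs)
    except TypeError as exc:
        raise TypeError(
            'Bad arguments for smoothing method {!r}: {}'.format(method, exc)
        ) from exc
    return func(values, **kwargs)


print(smoothing('gaussian', [1, 2, 3], winsize=3, variance=2.0))
```

Passing `variance` to `boxcar`, or omitting it for `gaussian`, now fails immediately with a message that names the smoothing method involved.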

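    Python has a lot of "magic". That doesn't mean you have to use it. For instance, you could get behavior fairly similar to what your smoothing function does by writing something like this: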

    def smoothing(dataDf, selected_columns, kwargs):
        return globals().get(
            kwargs.pop('method') or 'NOT_IMPLEMENTED_FOR_SURE!!',
            lambda *_: (_ for _ in ()).throw(NotImplementedError())
        )(dataDf, selected_columns, kwargs['arguments'])

    But if someone else has to read that... well... good luck to that person :-P


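    Your algorithm functions are Strategies. You can store them in a dictionary for easy lookup.

    For missing algorithms, use a defaultdict with a "not implemented" strategy: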

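    from collections import defaultdict

    def algorithm_not_implemented(*args, **kwargs):
        raise NotImplementedError

    # default_factory is called with no arguments and must *return* the
    # fallback strategy, hence the lambda.
    algorithms = defaultdict(lambda: algorithm_not_implemented)

    This means that if you try to access an algorithm that doesn't exist, the lookup returns algorithm_not_implemented, and calling that raises NotImplementedError: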

    >>> algorithms['pete'](1, 2, 3)
    Traceback (most recent call last):
    NotImplementedError

    You can add algorithms:

    algorithms['boxcar'] = boxcar
    algorithms['gaussian'] = gaussian

    And you can call them:

    def smoothing(dataDf, selected_columns, kwargs):
        method = kwargs['method']
        arguments = kwargs['arguments']

        algorithms[method](dataDf, selected_columns, arguments)
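Putting the strategy pieces together gives the following runnable sketch. The `boxcar` stand-in and the error message text are assumptions for illustration. Two details of `collections.defaultdict` are worth seeing in action: the factory is called with no arguments, so it has to return the fallback function, and a failed lookup stores the fallback under the missing key.

```python
from collections import defaultdict


def algorithm_not_implemented(*args, **kwargs):
    raise NotImplementedError('unknown smoothing algorithm')


def boxcar(values, arguments):
    # Hypothetical stand-in: echoes the arguments it was given.
    return ('boxcar', arguments)


# The factory takes no arguments and returns the fallback strategy.
algorithms = defaultdict(lambda: algorithm_not_implemented)
algorithms['boxcar'] = boxcar


def smoothing(values, kwargs):
    method = kwargs['method']
    arguments = kwargs['arguments']
    return algorithms[method](values, arguments)


result = smoothing([1, 2], {'method': 'boxcar', 'arguments': {'winsize': 3}})
print(result)  # ('boxcar', {'winsize': 3})

try:
    smoothing([1, 2], {'method': 'pete', 'arguments': {}})
except NotImplementedError as exc:
    error = str(exc)
    print(error)  # unknown smoothing algorithm

# Side effect of defaultdict: the missing key has now been inserted.
print('pete' in algorithms)  # True
```

If that side effect is unwanted, `algorithms.get(method, algorithm_not_implemented)` on a plain dict gives the same fallback without adding keys.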