Python Function Argument Unpacking Explained

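These are my notes on, and some extensions to, section 3.5 "Function Argument Unpacking" of Python Tricks: The Book. Function argument unpacking is the reverse of defining variable arguments (varargs) with *args and **kwargs.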

*args and **kwargs let a function declare a single parameter that receives an arbitrary number of arguments from the caller.
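As a minimal sketch of that collecting direction (the function name `describe` is made up for illustration and does not come from the book):

```python
def describe(*args, **kwargs):
    """Collect positional arguments into a tuple and keyword arguments into a dict."""
    return args, kwargs

positional, keyword = describe(1, 2, name='vec')
print(positional)  # (1, 2)
print(keyword)     # {'name': 'vec'}
```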

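Function argument unpacking goes the other way: the function declares several parameters, and the call passes a single collection object prefixed with * or **, such as a list, a tuple, a dict, or even a generator. Python then takes the matching values out of the collection and binds them to the parameters.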

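If you understand the unpacking in assignments shown below, and the * and ** operations added in Python 3.5, the feature described in this article is easy to follow.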

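The only difference is that a collection passed to a function must be prefixed with * or ** to declare that it should be unpacked rather than passed whole as a single argument. Java has no equivalent call-site marker (an array passed to a varargs parameter is spread automatically); the closest counterpart is Scala's foo(Array[String]("d", "e") : _*) syntax. See also the article "How Java and Scala Call Varargs Methods".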

Unpacking in Python assignments

>>> a, b = [1, 2]  # a, b = (1, 2) has the same effect
>>> print(a, b)
1 2
>>> a, b = {'x': 1, 'y':2}
>>> print(a, b)
x y
>>> a, b = {'x': 1, 'y':2}.keys()
>>> print(a, b)
x y
>>> a, b = {'x': 1, 'y':2}.values()
>>> print(a, b)
1 2
>>> a, b = (x * x for x in range(2))
>>> print(a, b)
0 1

Unpacking operations added in Python 3.5

>>> [1, 2, *range(3), *[4, 5], *(6, 7)]  # * spreads each iterable into the list (flatten/unwrap)
[1, 2, 0, 1, 2, 4, 5, 6, 7]
>>> {'x': 1, **{'y': 2, 'z': 3}}      # ** spreads a dict into the enclosing dict (flatten/unwrap)
{'x': 1, 'y': 2, 'z': 3}

This resembles the flatten or unwrap operations of functional programming.
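One practical consequence is worth a small sketch. When the same key is unpacked more than once into a dict, the value unpacked last wins, which makes ** a compact way to layer overrides on top of defaults. The names `defaults` and `overrides` and their values are invented for this illustration.

```python
defaults = {'host': 'localhost', 'port': 8080}
overrides = {'port': 9090}

# Keys unpacked later win, so overrides replaces the default port.
settings = {**defaults, **overrides}
print(settings)  # {'host': 'localhost', 'port': 9090}

# * accepts any iterable, so different kinds can be spread into one list.
combined = [*'ab', *range(2), *(x * 10 for x in (1, 2))]
print(combined)  # ['a', 'b', 0, 1, 10, 20]
```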

With that background, let's return to the example from the book. Suppose we define the following function, which prints a 3-D coordinate:

def print_vector(x, y, z):
  print('<%s, %s, %s>' % (x, y, z))

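Passing three arguments one by one is not worth discussing. Let's look instead at how argument unpacking lets us pass a single collection and have print_vector pick up the right x, y, and z values from it.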

Examples of calls with argument unpacking

>>> list_vec = [2, 1, 3]
>>> print_vector(*list_vec)
<2, 1, 3>
>>> print_vector(*(2, 1, 3))
<2, 1, 3>
>>> dict_vec = {'y': 2, 'z': 1, 'x': 3}
>>> print_vector(*dict_vec)  # same as print_vector(*dict_vec.keys())
<y, z, x>
>>> print_vector(**dict_vec)  # same as print_vector(x=dict_vec['x'], y=dict_vec['y'], z=dict_vec['z'])
<3, 2, 1>
>>> genexpr = (x * x for x in range(3))
>>> print_vector(*genexpr)
<0, 1, 4>
>>> print_vector(*dict_vec.values()) # same as print_vector(*list(dict_vec.values()))
<2, 1, 3>

Note that **dict_vec is a little different: the dict must contain exactly three items, keyed by print_vector's parameter names 'x', 'y', and 'z'.
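The two prefixes can also be mixed in one call. The sketch below repeats print_vector so that it runs on its own; the variable names `head` and `rest` are made up for illustration. It also shows that a parameter may be filled only once.

```python
def print_vector(x, y, z):
    print('<%s, %s, %s>' % (x, y, z))

head = [1]               # fills x by position
rest = {'z': 3, 'y': 2}  # fills y and z by name; key order does not matter
print_vector(*head, **rest)  # <1, 2, 3>

# x would arrive twice here, once by position and once by name.
try:
    print_vector(*head, **{'x': 9, 'y': 2, 'z': 3})
except TypeError as exc:
    print('TypeError:', exc)
```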

The various errors follow.

Errors when **dict_vec has the wrong number of items or mismatched keys

>>> print_vector(**{'y': 2, 'z': 1, 'x': 3})
<3, 2, 1>
>>> print_vector(**{'y': 2, 'z': 1, 'a': 3})    # three items, but one key is not x, y, or z
Traceback (most recent call last):
 File "<pyshell#39>", line 1, in <module>
  print_vector(**{'y': 2, 'z': 1, 'a': 3})
TypeError: print_vector() got an unexpected keyword argument 'a'
>>> print_vector(**{'y': 2, 'z': 1, 'x': 3, 'a': 4}) # has x, y, z, but four items; key 'a' is not recognized
Traceback (most recent call last):
 File "<pyshell#40>", line 1, in <module>
  print_vector(**{'y': 2, 'z': 1, 'x': 3, 'a': 4})
TypeError: print_vector() got an unexpected keyword argument 'a'
>>> print_vector(**{'y': 2, 'z': 1})     # no item for key 'x'
Traceback (most recent call last):
 File "<pyshell#41>", line 1, in <module>
  print_vector(**{'y': 2, 'z': 1})
TypeError: print_vector() missing 1 required positional argument: 'x'
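When a dict may carry extra keys, one way to avoid the "unexpected keyword argument" error is to filter it against the function's signature before unpacking. This is only a sketch: `call_with_known_keys` is a hypothetical helper, not something from the book, and it assumes a function with plain named parameters (no positional-only parameters and no **kwargs). A missing key still raises TypeError.

```python
import inspect

def print_vector(x, y, z):
    print('<%s, %s, %s>' % (x, y, z))

def call_with_known_keys(func, mapping):
    """Hypothetical helper: pass only the keys that func declares as parameters."""
    accepted = inspect.signature(func).parameters
    filtered = {key: value for key, value in mapping.items() if key in accepted}
    return func(**filtered)

# The extra key 'a' is dropped before the call.
call_with_known_keys(print_vector, {'y': 2, 'z': 1, 'x': 3, 'a': 4})  # <3, 2, 1>
```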

Error when the star is left out

>>> print_vector([2, 1, 3])
Traceback (most recent call last):
 File "<pyshell#44>", line 1, in <module>
  print_vector([2, 1, 3])
TypeError: print_vector() missing 2 required positional arguments: 'y' and 'z'

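The whole collection is passed as the first argument, so y and z receive nothing. The * or ** prefix is therefore required to tell Python to unpack the argument.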

Errors when the collection's length does not match the number of parameters

>>> print_vector(*[2, 1])        # unpacks to x=2, y=1, but what about z?
Traceback (most recent call last):
 File "<pyshell#47>", line 1, in <module>
  print_vector(*[2, 1])
TypeError: print_vector() missing 1 required positional argument: 'z'
>>> print_vector(*[2, 1, 3, 4])    # x=2, y=1, z=3 are filled, but a fourth element cannot be forced into a function with only three parameters
Traceback (most recent call last):
 File "<pyshell#48>", line 1, in <module>
  print_vector(*[2, 1, 3, 4])
TypeError: print_vector() takes 3 positional arguments but 4 were given

These two errors correspond to the errors raised when unpacking in an assignment and the number of elements does not match:

>>> a, b = [1]
Traceback (most recent call last):
 File "<pyshell#54>", line 1, in <module>
  a, b = [1]
ValueError: not enough values to unpack (expected 2, got 1)
>>> a, b = [1, 2, 3]
Traceback (most recent call last):
 File "<pyshell#55>", line 1, in <module>
  a, b = [1, 2, 3]
ValueError: too many values to unpack (expected 2)

In an assignment, of course, Python also allows the following:

>>> a, b, *c = [1, 2, 3, 4]
>>> print(a, b, c)
1 2 [3, 4]
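A function can tolerate surplus elements in the same way by declaring a *args-style parameter. The sketch below uses a made-up function, `first_two`, to show the parallel. One difference to keep in mind: a starred assignment target receives a list, while a starred parameter receives a tuple.

```python
def first_two(a, b, *rest):
    """Bind the first two arguments by position and collect the remainder."""
    return a, b, rest

print(first_two(*[1, 2, 3, 4]))  # (1, 2, (3, 4))
print(first_two(*[1, 2]))        # (1, 2, ())
```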

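Addendum (2020-07-02): the Python term for unpacking an iterable is Iterable Unpacking; the two relevant PEPs are PEP 448 and PEP 3132. It is very useful in practice, for example to pick out only the fields you care about when splitting a string: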

line = '2020-06-19 22:14:00    2688 abc.json'
date, time, size, name = line.split()  # get every field
_, time, _, name = line.split()     # only time and name are of interest
date, *_ = line.split()         # only the first field, date
*_, name = line.split()         # only the last field, name
date, *_, name = line.split()      # only the two ends, date and name

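This avoids referring to the split values by index, such as split[0] or split[2]; named variables are less error-prone. Note how smart Python's unpacking is: it works out which position maps to which name and, when a star (*) is used, how many elements to skip at the front or in the middle, or how many to collect at the end.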
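The same idea carries over to a loop over several lines. In the sketch below, the second sample line and the helper name `parse_listing` are invented for illustration. The last two statements show that a starred target always receives a list, which is empty when nothing is left over.

```python
lines = [
    '2020-06-19 22:14:00    2688 abc.json',
    '2020-06-20 08:01:30     512 def.json',  # made-up second sample line
]

def parse_listing(line):
    """Return (date, size, name); the size field is converted to int."""
    date, _time, size, name = line.split()
    return date, int(size), name

for date, size, name in map(parse_listing, lines):
    print(date, size, name)
# 2020-06-19 2688 abc.json
# 2020-06-20 512 def.json

# A starred target always yields a list, possibly an empty one.
first, *middle, last = 'a b'.split()
print(first, middle, last)  # a [] b
```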

Links:

PEP 448 -- Additional Unpacking Generalizations
PEP 3132 -- Extended Iterable Unpacking
