Grouping and Counting List Elements in Python

1. count_by

def count_by(arr, fn=lambda x: x):
  key = {}
  for el in map(fn, arr):
    key[el] = 1 if el not in key else key[el] + 1
  return key

# EXAMPLES
from math import floor
count_by([6.1, 4.2, 6.3], floor) # {6: 2, 4: 1}
count_by(['one', 'two', 'three'], len) # {3: 2, 5: 1}

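count_by groups the elements of a list according to the given function and returns the number of elements in each group. It uses map() to apply the given function to each value in the list, then iterates over the mapped values and increments the count for a key each time it occurs.

The function uses not in to check whether the dictionary already contains the key. If it does not, the key is added with a value of 1; if it does, the value is incremented by 1.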
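To make the step-by-step behaviour concrete, here is a small sketch that applies the same not in logic but also records a copy of the dictionary after each element. The name count_by_trace and the snapshots list are illustrative additions, not part of the original function.

```python
from math import floor

def count_by_trace(arr, fn=lambda x: x):
    """Same counting logic as count_by, but also records the dict after each step."""
    counts = {}
    snapshots = []
    for el in map(fn, arr):
        if el not in counts:
            counts[el] = 1      # first time this key is seen
        else:
            counts[el] += 1     # key already present, bump its count
        snapshots.append(dict(counts))  # copy, so later updates don't alter it
    return counts, snapshots

counts, snapshots = count_by_trace([6.1, 4.2, 6.3], floor)
for step in snapshots:
    print(step)
# {6: 1}
# {6: 1, 4: 1}
# {6: 2, 4: 1}
```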

2. Using a dict comprehension

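A dict comprehension has the form { key_expr: value_expr for value in collection if condition }. In the dict comprehension used by the group_by function, value_expr is a list, and that list is itself built with a list comprehension. That is: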

{ key_expr: [x for x in collection2 if condition2] for value in collection1 if condition1 }

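Note also that the dict comprehension in group_by can produce several items with the same key. Under Python's dict rules, when a key repeats, only the most recent key-value pair is kept. In practice, the values are identical whenever the keys are, because key is the only outer variable that the inner comprehension [el for el in lst if fn(el) == key] depends on.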

>>> d = {'one': 1, 'two': 2, 'three': 3, 'two': 2}
>>> d
{'one': 1, 'two': 2, 'three': 3}
>>> d = {'one': 1, 'two': 2, 'three': 3, 'two': 22}
>>> d
{'one': 1, 'two': 22, 'three': 3}
>>>
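The group_by discussed above is not listed in this article. As a rough sketch, assuming it takes the same (lst, fn) arguments as count_by and follows the nested-comprehension pattern just described, it might look like this:

```python
from math import floor

def group_by(lst, fn):
    # Assumed comprehension-based form: one key per element, and for each
    # key a list of every element that maps to it.
    return {key: [el for el in lst if fn(el) == key] for key in map(fn, lst)}

groups = group_by([6.1, 4.2, 6.3], floor)
print(groups)  # {6: [6.1, 6.3], 4: [4.2]}
```

Here the key 6 is generated twice (once for 6.1 and once for 6.3), but both times it is paired with the same list, so the later pair overwriting the earlier one changes nothing.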

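The same approach works here: group the elements, then take the length of each list directly. However, this version rescans the entire list once for every element, so for a list of n elements fn is called on the order of n² times, which makes the program less efficient.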

def count_by(lst, fn):
  return {key : len([el for el in lst if fn(el) == key]) for key in map(fn, lst)}
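One way to see the cost is to count how often fn is called. The sketch below wraps fn in a hypothetical counting helper (not part of the original code) and compares the comprehension version with the plain loop from section 1. The function names are renamed here only to keep the two side by side.

```python
from math import floor

def count_by_comprehension(lst, fn):
    return {key: len([el for el in lst if fn(el) == key]) for key in map(fn, lst)}

def count_by_loop(lst, fn):
    counts = {}
    for key in map(fn, lst):
        counts[key] = 1 if key not in counts else counts[key] + 1
    return counts

def counting(fn):
    """Wrap fn so the number of calls can be read from wrapper.calls."""
    def wrapper(x):
        wrapper.calls += 1
        return fn(x)
    wrapper.calls = 0
    return wrapper

data = [6.1, 4.2, 6.3, 4.9]

f = counting(floor)
result_a = count_by_comprehension(data, f)
calls_a = f.calls

g = counting(floor)
result_b = count_by_loop(data, g)
calls_b = g.calls

print(result_a == result_b)  # True
print(calls_a, calls_b)      # 20 4
```

For four elements, the comprehension calls fn 4 times to produce the keys and another 4 × 4 times inside the inner list comprehensions, while the loop calls it once per element.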

3. Simplifying the code with collections.defaultdict

class collections.defaultdict([default_factory[, ...]])

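collections.defaultdict has a default_factory attribute: when a missing key is looked up, default_factory is called with no arguments to produce that key's initial value. This makes it quick to build dictionaries of a particular shape.

With int as the default_factory, a defaultdict can be used for counting, so it can be used directly to simplify the code. Unlike the dict comprehension approach, it needs only a single pass over the list.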
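A short sketch of why int works as a counting factory: calling int() with no arguments returns 0, so a missing key starts at 0 and += 1 works straight away. The key names 'a' and 'b' are arbitrary examples.

```python
from collections import defaultdict

print(int())   # 0, the value int supplies for a missing key
print(list())  # [], the value list supplies for a missing key

counts = defaultdict(int)
counts['a'] += 1   # 'a' is missing, so it starts from int() == 0
counts['a'] += 1
print(dict(counts))  # {'a': 2}

# Merely reading a missing key also inserts it with the default value.
print(counts['b'])    # 0
print('b' in counts)  # True
```

Because a lookup can add keys, it may be worth converting the result with dict(...) before handing it to code that expects a KeyError for unknown keys.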

from collections import defaultdict

def count_by(lst, fn):
  d = defaultdict(int)
  for el in lst:
    d[fn(el)] += 1
  return d

With list as the default_factory, it is easy to turn a sequence (of key-value pairs) into a dictionary (of keys mapped to lists).

def group_by(lst, fn):
  d = defaultdict(list)
  for el in lst:
    d[fn(el)].append(el)
  return d

# EXAMPLES
from math import floor
group_by([6.1, 4.2, 6.3], floor) # {6: [6.1, 6.3], 4: [4.2]}
group_by(['one', 'two', 'three'], len) # {3: ['one', 'two'], 5: ['three']}
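The key-value-pair case mentioned above can be sketched in the same way. The sample pairs below are made-up illustrative data; each value is appended to the list stored under its key.

```python
from collections import defaultdict

# Illustrative input: a sequence of (key, value) pairs with repeated keys.
pairs = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]

d = defaultdict(list)
for key, value in pairs:
    d[key].append(value)  # a missing key starts as an empty list

print(dict(d))  # {'yellow': [1, 3], 'blue': [2, 4], 'red': [1]}
```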
