Splitting a Python list into a list of overlapping chunks

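This question is similar to slicing a list into a list of sub-lists, but in my case I want the last element of each sub-list to be repeated as the first element of the next sub-list. The last sub-list must also always contain at least two elements.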

For example:

list_ = ['a','b','c','d','e','f','g','h']

Result for sub-lists of size 3:

resultant_list = [['a','b','c'],['c','d','e'],['e','f','g'],['g','h']]

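The list comprehension in the answer you linked is easily adapted to support overlapping chunks: simply shorten the "step" argument passed to range by the overlap size: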

>>> list_ = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
>>> n = 3  # group size
>>> m = 1  # overlap size
>>> [list_[i:i+n] for i in range(0, len(list_), n-m)]
[['a', 'b', 'c'], ['c', 'd', 'e'], ['e', 'f', 'g'], ['g', 'h']]
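
One caveat with the comprehension above: when the input length lines up so that a slice starts inside the final overlap, the last chunk contains only elements that were already emitted (for a 7-item list with n = 3 and m = 1 it ends with ['g']). A minimal sketch of one way around that, assuming every chunk should contribute at least one new element, is to stop the range m items early. The function name overlapping_chunks is made up for illustration.

```python
def overlapping_chunks(seq, n, m):
    """Split seq into chunks of up to n items, each sharing m items with the previous one."""
    if n < 1 or not 0 <= m < n:
        raise ValueError("need n >= 1 and 0 <= m < n")
    # Stopping m items before the end skips any start index whose slice
    # would contain nothing but the previous chunk's overlap.
    return [seq[i:i + n] for i in range(0, len(seq) - m, n - m)]


print(overlapping_chunks(list("abcdefgh"), 3, 1))
# [['a', 'b', 'c'], ['c', 'd', 'e'], ['e', 'f', 'g'], ['g', 'h']]
print(overlapping_chunks(list("abcdefg"), 3, 1))
# [['a', 'b', 'c'], ['c', 'd', 'e'], ['e', 'f', 'g']]
```

Note that under this sketch an input with m or fewer items produces an empty result, which may or may not be what you want.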

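Other visitors to this question may not have the luxury of working with an input list (sliceable, of known length, finite). Here is a generator-based solution that works with arbitrary iterables: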

from collections import deque

def chunks(iterable, chunk_size=3, overlap=0):
    # we'll use a deque to hold the values because it automatically
    # discards any extraneous elements if it grows too large
    if chunk_size < 1:
        raise ValueError("chunk size too small")
    if overlap >= chunk_size:
        raise ValueError("overlap too large")
    queue = deque(maxlen=chunk_size)
    it = iter(iterable)
    i = 0
    try:
        # start by filling the queue with the first group
        for i in range(chunk_size):
            queue.append(next(it))
        while True:
            yield tuple(queue)
            # after yielding a chunk, get enough elements for the next chunk
            for i in range(chunk_size - overlap):
                queue.append(next(it))
    except StopIteration:
        # if the iterator is exhausted, yield any remaining elements
        i += overlap
        if i > 0:
            yield tuple(queue)[-i:]

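Note: I have since released this implementation as wimpy.util.chunks. If you don't mind adding the dependency, you can run "pip install wimpy" and use "from wimpy import chunks" rather than copy-pasting the code.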
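
For readers who want to see the "arbitrary iterables" point in action, here is a shorter sketch built on itertools.islice. It is not the implementation from this answer: the name iter_chunks is hypothetical, and it deliberately differs in one respect, namely that it never emits a final chunk consisting only of overlap, whereas the deque version (by my reading) does so when the input ends exactly on a chunk boundary.

```python
from itertools import count, islice


def iter_chunks(iterable, chunk_size=3, overlap=0):
    """Lazily yield overlapping tuples from any iterable (hypothetical helper)."""
    if chunk_size < 1 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size >= 1 and 0 <= overlap < chunk_size")
    it = iter(iterable)
    chunk = tuple(islice(it, chunk_size))
    while chunk:
        yield chunk
        # Pull only the items needed to advance by one step.
        fresh = tuple(islice(it, chunk_size - overlap))
        if not fresh:
            return  # nothing new left, so do not emit an overlap-only chunk
        # Keep the last `overlap` items and append the new ones.
        chunk = chunk[len(chunk) - overlap:] + fresh


# Works on a one-shot iterator, not just a list.
print(list(iter_chunks(iter("abcdefgh"), chunk_size=3, overlap=1)))
# [('a', 'b', 'c'), ('c', 'd', 'e'), ('e', 'f', 'g'), ('g', 'h')]

# Works on an infinite iterator as long as only a finite prefix is consumed.
print(list(islice(iter_chunks(count(), chunk_size=3, overlap=1), 3)))
# [(0, 1, 2), (2, 3, 4), (4, 5, 6)]
```

Because each step reads at most chunk_size - overlap new items, memory use stays bounded by the chunk size regardless of how long the input is.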


more_itertools has a windowing tool for overlapping iterables.

Given

import more_itertools as mit

iterable = list("abcdefgh")
iterable
# ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']

Code

windows = list(mit.windowed(iterable, n=3, step=2))
windows
# [('a', 'b', 'c'), ('c', 'd', 'e'), ('e', 'f', 'g'), ('g', 'h', None)]

If required, you can drop the None fillvalue by filtering the windows:

[list(filter(None, w)) for w in windows]
# [['a', 'b', 'c'], ['c', 'd', 'e'], ['e', 'f', 'g'], ['g', 'h']]
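
One thing to watch with filter(None, w): it discards every falsy item, not only the None padding, so values such as 0 or an empty string would vanish as well. A small sketch of the difference follows. The windows list here is hand-written for illustration (standing in for the output of windowed on the numbers 0 to 5) so that the example runs without more_itertools installed.

```python
# Hand-written stand-in for windowed(range(6), n=3, step=2).
windows = [(0, 1, 2), (2, 3, 4), (4, 5, None)]

# filter(None, ...) keeps only truthy items, so the legitimate 0 is lost too.
lossy = [list(filter(None, w)) for w in windows]
print(lossy)
# [[1, 2], [2, 3, 4], [4, 5]]

# An identity test against None removes only the padding.
clean = [[x for x in w if x is not None] for w in windows]
print(clean)
# [[0, 1, 2], [2, 3, 4], [4, 5]]
```

If None itself can occur in the data, windowed also accepts a fillvalue argument, so a unique sentinel object could be used as padding and removed by identity instead.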

See also the more_itertools docs for details on more_itertools.windowed.


What I came up with:

l = [1, 2, 3, 4, 5, 6]
x = zip(l[:-1], l[1:])
for i in x:
    print(i)

(1, 2)
(2, 3)
(3, 4)
(4, 5)
(5, 6)

zip accepts any number of iterables, and there is also itertools.zip_longest.
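
To tie the zip idea back to the original question, here is a rough sketch that builds windows of size n from n shifted slices and then keeps every (n - m)-th window. zip_longest pads the short tail with None, which is stripped afterwards. The variable names are illustrative, and the sketch assumes None is not a real value in the data.

```python
from itertools import zip_longest

list_ = list("abcdefgh")
n = 3  # group size
m = 1  # overlap size

# n shifted views of the list: list_[0:], list_[1:], list_[2:]
shifted = [list_[i:] for i in range(n)]
# One window per start index, padded with None near the end.
all_windows = list(zip_longest(*shifted))
# Keep every (n - m)-th window, then drop the padding.
result = [[x for x in w if x is not None] for w in all_windows[::n - m]]
print(result)
# [['a', 'b', 'c'], ['c', 'd', 'e'], ['e', 'f', 'g'], ['g', 'h']]
```

This materializes every window before discarding most of them, so for plain lists the slicing comprehension is likely the cheaper choice.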


[list_[i:i+n] for i in range(0, len(list_), n-m)]  # on Python 2, use xrange instead of range