# A guide to using itertools from the Python standard library

2022-02-02 05:09:43 XQGang

# 0 Preface

The built-in module itertools implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML. It standardizes a core set of fast, memory-efficient tools for manipulating iterable objects. Together they form an "iterator algebra" that makes it possible to build concise and efficient specialized tools in pure Python.

Use `next(iterator)` to get the next value produced by an iterator object.
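
As a small illustration of `next()`, the sketch below steps through an ordinary list iterator by hand. The variable names are only for illustration. The optional second argument of the built-in `next()` is a default value that is returned instead of raising `StopIteration`.

```python
numbers = iter([1, 2])       # any iterable can be turned into an iterator

first = next(numbers)        # 1
second = next(numbers)       # 2

# The iterator is now exhausted; a plain next(numbers) would raise StopIteration.
# Supplying a default value avoids the exception.
fallback = next(numbers, "done")

print(first, second, fallback)  # 1 2 done
```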

# 1 Infinite iterators

## count

Parameters: `itertools.count(start=0, step=1)`

Purpose: Create an iterator that returns evenly spaced values starting with start; that is, an arithmetic sequence whose first term is start and whose common difference is step. It is often used as an argument to map() to generate consecutive data points, and with zip() to add sequence numbers.

Examples:

```python
from itertools import count

count(10)  # 10 11 12 13 ...
count(10, 0.5)  # 10 10.5 11.0 11.5 ...
# Note: for floating-point counting, multiplying gives better accuracy than repeated addition
(10 + 0.5 * i for i in count())  # 10.0 10.5 11.0 11.5 ...
map(lambda x: 2 * x**2 + 1, count())  # 1 3 9 19 33 51 ...
zip(count(), ['a', 'b', 'c'])  # (0, 'a') (1, 'b') (2, 'c')
```
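
Because `count()` never terminates, converting it directly to a list would hang. A minimal sketch of consuming it safely, assuming `islice()` (covered later in this guide) is used to take a finite prefix:

```python
from itertools import count, islice

# Take only the first five values of an infinite arithmetic sequence.
evens = list(islice(count(0, 2), 5))

# Number items starting from 1; zip() stops when the shorter input ends.
numbered = list(zip(count(1), ["a", "b", "c"]))

print(evens)     # [0, 2, 4, 6, 8]
print(numbered)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```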

## cycle

Parameters: `itertools.cycle(iterable)`

Purpose: Create an iterator that returns the elements of iterable while saving a copy of each. Once iterable is exhausted, it returns the elements from the saved copy, repeating indefinitely.

Examples:

```python
from itertools import cycle

cycle(['a','b','c'])  # a b c a b c a ...
cycle(range(3))  # 0 1 2 0 1 2 0 ...
```
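
One way `cycle()` might be used in practice is round-robin assignment. The sketch below is illustrative only; the worker and task names are made up.

```python
from itertools import cycle

workers = cycle(["alice", "bob"])          # hypothetical worker names, repeated forever
tasks = ["t1", "t2", "t3", "t4", "t5"]     # hypothetical task ids

# zip() stops when the finite task list is exhausted.
assignment = list(zip(tasks, workers))

print(assignment)
# [('t1', 'alice'), ('t2', 'bob'), ('t3', 'alice'), ('t4', 'bob'), ('t5', 'alice')]
```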

## repeat

Parameters: `itertools.repeat(object, times=None)`

Purpose: Create an iterator that returns object over and over again. It runs indefinitely unless the times argument is specified. It is commonly used as an argument to map() to supply an invariant parameter to the called function, and with zip() to create the constant part of a tuple record.

Examples:

```python
from itertools import repeat

repeat('abc')  # abc abc abc ...
repeat(range(3))  # range(0, 3) range(0, 3) range(0, 3) ...
repeat(1, 3)  # 1 1 1
map(pow, range(5), repeat(2))  # 0 1 4 9 16
zip(repeat('num'), [1,2,3])  # ('num', 1) ('num', 2) ('num', 3)
```

# 2 Iterators terminating on the shortest input sequence

## accumulate

[Changed in Python 3.8: added the optional initial parameter.]

Parameters: `itertools.accumulate(iterable, func=operator.add, *, initial=None)`

Purpose: Create an iterator that returns accumulated sums, or the accumulated results of another binary function (specified via the optional func argument). Usually, the number of output elements matches the input iterable. However, if the keyword argument initial is provided, the accumulation starts with the initial value, so the output has one more element than the input iterable.

Examples:

```python
import operator
from functools import reduce
from itertools import accumulate

accumulate([1,2,3,4,5])  # 1 3 6 10 15
accumulate([1,2,3,4,5], operator.mul)  # 1 2 6 24 120
accumulate([1,2,3,4,5], min)  # 1 1 1 1 1
accumulate([1,2,3,4,5], max)  # 1 2 3 4 5
accumulate([2,4,5,8,10], lambda x, _: 1/x)  # 2 0.5 2.0 0.5 2.0
accumulate([2,4,5,8,10], lambda _, x: 1/x)  # 2 0.25 0.2 0.125 0.1
accumulate([2,4,5,8,10], lambda x, y: x*y)  # 2 8 40 320 3200
# Note: functools.reduce() returns only the final accumulated value
reduce(lambda x, y: x*y, [2,4,5,8,10])  # 3200
accumulate([1,2,3,4,5], initial=100)  # 100 101 103 106 110 115
```
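
To make the effect of `initial` and of a custom `func` concrete, here is a small runnable sketch. The transaction amounts are invented for illustration, and `initial` requires Python 3.8 or later.

```python
from itertools import accumulate

deposits = [50, -20, 30, -10]              # made-up transactions

# Running balance that starts from an opening balance of 100.
balances = list(accumulate(deposits, initial=100))

# Running maximum: each output is the largest value seen so far.
running_max = list(accumulate([3, 1, 4, 1, 5, 9, 2], max))

print(balances)     # [100, 150, 130, 160, 150]
print(running_max)  # [3, 3, 4, 4, 5, 9, 9]
```

Note that `balances` has five elements for four inputs, matching the "one more element" rule described above.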

## chain

Parameters: `itertools.chain(*iterables)`

Purpose: Create an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. It is used for treating consecutive sequences as a single sequence.

Examples:

```python
from itertools import chain

chain([1,2], [3,4], [5])  # 1 2 3 4 5
chain([[1,2], [3,4], [5]])  # [1, 2] [3, 4] [5]
chain(*[[1,2], [3,4], [5]])  # 1 2 3 4 5
chain(['ABC', 'DEF'])  # ABC DEF
chain(*['ABC', 'DEF'])  # A B C D E F
```

## chain.from_iterable

Parameters: `itertools.chain.from_iterable(iterable)`

Purpose: An alternate constructor for chain(). It gets the chained inputs from a single iterable argument, which is evaluated lazily.

Examples:

```python
from itertools import chain

chain.from_iterable([1,2], [3,4], [5])  # TypeError: takes exactly one argument
chain.from_iterable([[1,2], [3,4], [5]])  # 1 2 3 4 5
chain.from_iterable(*[[1,2], [3,4], [5]])  # TypeError: takes exactly one argument
chain.from_iterable(['ABC', 'DEF'])  # A B C D E F
chain.from_iterable(*['ABC', 'DEF'])  # TypeError: takes exactly one argument
```
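
The lazy evaluation of the argument matters when the outer iterable is large or unbounded. The following sketch assumes a hypothetical generator `rows()` that never ends. Unpacking it with `chain(*rows())` would never return, whereas `chain.from_iterable()` pulls one inner list at a time.

```python
from itertools import chain, count, islice

def rows():
    """Hypothetical endless stream of small lists: [1, 10], [2, 20], ..."""
    for n in count(1):
        yield [n, n * 10]

# Only as many inner lists as needed are requested from rows().
flat = chain.from_iterable(rows())
first_six = list(islice(flat, 6))

print(first_six)  # [1, 10, 2, 20, 3, 30]
```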

## compress

Parameters: `itertools.compress(data, selectors)`

Purpose: Create an iterator that returns the elements of data whose corresponding element in selectors evaluates to `True`. The iterator stops when either data or selectors is exhausted. It is used to filter one sequence by another, related sequence.

Examples:

```python
from itertools import compress

compress('ABCDEF', [1,0,1,0,1,1])  # A C E F
compress('ABCDEF', [1])  # A
compress('ABCDEF', [True, -1, 1, 0.5, 'C', [1]])  # A B C D E F
compress('ABCDEF', [False, 0, 0.0, '', [], None])  # (empty)
# Note: falsy values include 0, -0, 0.0, 0j, [], (), {}, None, "", and so on
```
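
A typical use of `compress()` is filtering one sequence with a mask computed from another. This is only a sketch; the names and scores are invented, and the pass mark of 60 is an arbitrary assumption.

```python
from itertools import compress

names = ["ann", "ben", "cat", "dan"]     # made-up data
scores = [82, 47, 91, 60]

# The selectors can be any iterable, including a lazy generator expression.
passed = list(compress(names, (score >= 60 for score in scores)))

print(passed)  # ['ann', 'cat', 'dan']
```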

## dropwhile

Parameters: `itertools.dropwhile(predicate, iterable)`

Purpose: Create an iterator that drops elements from iterable as long as predicate is true, and afterwards returns every remaining element. Note that the iterator produces no output until predicate first becomes false, so it may have a lengthy start-up time. It is used to skip the beginning of an iterable.

Examples:

```python
from itertools import dropwhile

dropwhile(lambda x: x<3, [1,2,3,4,5])  # 3 4 5
# f is an open file object: skip the leading header lines that start with '#'
dropwhile(lambda line: line.startswith('#'), f)
```
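
The header-skipping idea can be tried without a real file. The sketch below uses `io.StringIO` as a stand-in for an open file (an assumption made so the example is self-contained), and shows that only the leading `#` lines are dropped.

```python
import io
from itertools import dropwhile

# StringIO stands in for an open text file.
f = io.StringIO("# header 1\n# header 2\nvalue = 1\n# not dropped\nvalue = 2\n")

body = [
    line.rstrip("\n")
    for line in dropwhile(lambda line: line.startswith("#"), f)
]

print(body)  # ['value = 1', '# not dropped', 'value = 2']
```

Once the predicate has returned false for the first time, it is no longer consulted, which is why the later comment line survives.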

## filterfalse

Parameters: `itertools.filterfalse(predicate, iterable)`

Purpose: Create an iterator that returns only those elements of iterable for which predicate is `False`. If predicate is `None`, it returns the elements that are themselves false.

Examples:

```python
from itertools import filterfalse

filterfalse(lambda x: x%2, range(10))  # 0 2 4 6 8
filterfalse(None, range(10))  # 0
# Note: filter() returns the elements of iterable for which predicate is True
filter(lambda x: x%2, range(10))  # 1 3 5 7 9
filter(None, range(10))  # 1 2 3 4 5 6 7 8 9
```

## groupby

Parameters: `itertools.groupby(iterable, key=None)`

Purpose: Create an iterator that returns consecutive keys and groups from iterable. key is a function that computes a key value for each element. If it is not specified or is `None`, key defaults to the identity function and returns the element unchanged. Generally, iterable needs to be sorted beforehand with the same key function. Each returned group is itself an iterator that shares the underlying iterable with `groupby()`. Because the source is shared, the previous group is no longer available once the `groupby()` object is advanced.

Examples:

```python
from itertools import groupby

# Note: in the actual results each group is an iterator; the dicts below show {key: list(group)}
groupby('AABB')  # {'A': ['A', 'A'], 'B': ['B', 'B']}
groupby('AaBb', key=lambda x: x.upper())  # {'A': ['A', 'a'], 'B': ['B', 'b']}
div_size = lambda x: 'small' if x < 3 else 'medium' if x == 3 else 'big'
groupby([1,2,3,4,5], key=div_size)  # {'small': [1, 2], 'medium': [3], 'big': [4, 5]}
```
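
Two points from the description are easy to miss: `groupby()` only merges adjacent elements with equal keys, and each group must be consumed before moving on. A minimal sketch, using an invented word list:

```python
from itertools import groupby

words = ["apple", "bob", "avocado", "cherry", "banana"]   # made-up data

def initial(word):
    return word[0]

# Without sorting, equal keys that are not adjacent form separate groups.
unsorted_keys = [key for key, _ in groupby(words, key=initial)]

# Sorting with the same key function first yields one group per key.
# list(group) materializes each group before groupby() advances.
by_initial = {
    key: list(group)
    for key, group in groupby(sorted(words, key=initial), key=initial)
}

print(unsorted_keys)  # ['a', 'b', 'a', 'c', 'b']
print(by_initial)     # {'a': ['apple', 'avocado'], 'b': ['bob', 'banana'], 'c': ['cherry']}
```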

## islice

Parameters: `itertools.islice(iterable, stop)` or `itertools.islice(iterable, start, stop[, step])`

Purpose: Create an iterator that returns selected elements from iterable. If start is non-zero, elements of iterable are skipped until start is reached. Afterward, elements are returned consecutively unless step is set higher than one, which results in items being skipped. If stop is `None`, iteration continues until the iterator is exhausted; otherwise, it stops at the specified position. Unlike regular slicing, `islice()` does not support negative values for start, stop, or step.

Examples:

```python
from itertools import islice

islice('ABCDEFG', 2)  # A B
islice('ABCDEFG', 2, 5)  # C D E
islice('ABCDEFG', 2, None)  # C D E F G
islice('ABCDEFG', 2, 5, 2)  # C E
islice('ABCDEFG', 2, None, 2)  # C E G
# Note: if start is None, iteration starts at 0; if step is None, the step defaults to 1
islice('ABCDEFG', None, None)  # A B C D E F G
```
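
`islice()` is most useful on iterators that do not support `[start:stop:step]` slicing at all, such as generators. A short sketch with an illustrative generator of squares:

```python
from itertools import islice

# A generator cannot be sliced with squares[2:8:2], but islice() works on it.
squares = (n * n for n in range(10))

# Positions 2, 4 and 6 of the stream: 2*2, 4*4, 6*6.
picked = list(islice(squares, 2, 8, 2))

print(picked)  # [4, 16, 36]
```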

## pairwise

[New in Python 3.10.]

Parameters: `itertools.pairwise(iterable)`

Purpose: Create an iterator that returns successive overlapping pairs taken from iterable.

Examples:

```python
from itertools import pairwise  # Python 3.10+

# Note: the output iterator yields one fewer 2-tuple than the number of inputs
pairwise('ABCDEFG')  # AB BC CD DE EF FG
# Note: if the input iterable has fewer than two values, the output is empty
pairwise('A')  # (empty)
```

## takewhile

Parameters: `itertools.takewhile(predicate, iterable)`

Purpose: Create an iterator that returns elements from iterable as long as predicate is true.

Examples:

```python
from itertools import takewhile

takewhile(lambda x: x<5, [1,2,3,4,5])  # 1 2 3 4
takewhile(lambda x: x.isdigit(), '123abc456')  # 1 2 3
```
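
Unlike `filter()`, `takewhile()` stops at the first element that fails the predicate, even if later elements would pass. A minimal comparison on made-up data:

```python
from itertools import takewhile

data = [1, 4, 6, 4, 1]

def below_five(x):
    return x < 5

taken = list(takewhile(below_five, data))   # stops at 6
filtered = list(filter(below_five, data))   # tests every element

print(taken)     # [1, 4]
print(filtered)  # [1, 4, 4, 1]
```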

## tee

Parameters: `itertools.tee(iterable, n=2)`

Purpose: Return n independent iterators from a single iterable. Once `tee()` has made a split, the original iterable should not be used anywhere else; otherwise, the iterable could be advanced without the tee objects being informed. `tee` iterators are not thread-safe.

Examples:

```python
from itertools import tee

# Note: the actual result is a tuple of iterators; their contents are shown here as lists
tee(range(3))  # [0, 1, 2] [0, 1, 2]
```
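
A small sketch of why `tee()` is useful: a generator can normally be consumed only once, but the two iterators returned by `tee()` can each be traversed independently. The generator here is purely illustrative.

```python
from itertools import tee

source = (n * n for n in range(4))   # yields 0, 1, 4, 9 exactly once

a, b = tee(source)                   # do not use `source` after this point

total = sum(a)                       # consumes the first copy
largest = max(b)                     # the second copy is still complete

print(total, largest)  # 14 9
```

Since one copy is fully consumed before the other starts, `tee()` has to buffer every element here; for that access pattern, building a list may be just as practical.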

## zip_longest

Parameters: `itertools.zip_longest(*iterables, fillvalue=None)`

Purpose: Create an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled in with fillvalue. Iteration continues until the longest iterable is exhausted.

Examples:

```python
from itertools import zip_longest

zip_longest('ABCD', 'xy')  # ('A','x') ('B','y') ('C',None) ('D',None)
zip_longest('ABCD', 'xy', fillvalue='-')  # ('A','x') ('B','y') ('C','-') ('D','-')
```

# 3 Combinatoric iterators

## product

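Parameters: `itertools.product(*iterables, repeat=1)`

Purpose: Create an iterator that returns the Cartesian product of the input iterables. To compute the product of an iterable with itself, specify the number of repetitions with the optional repeat keyword argument.

Examples: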

```python
from itertools import product

product('AB', range(3))  # ('A',0) ('A',1) ('A',2) ('B',0) ('B',1) ('B',2)
product(range(2), repeat=3)  # (0,0,0) (0,0,1) (0,1,0) (0,1,1) (1,0,0) (1,0,1) (1,1,0) (1,1,1)
```
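
`product()` behaves like nested for-loops, with the rightmost iterable advancing fastest. The sketch below checks that equivalence and uses `repeat` to enumerate all 3-bit strings; the variable names are illustrative.

```python
from itertools import product

# Equivalent nested loops: the last iterable varies fastest.
nested = [(x, y) for x in "AB" for y in range(2)]
flat = list(product("AB", range(2)))

# repeat=3 is the same as product(range(2), range(2), range(2)).
bits = ["".join(map(str, combo)) for combo in product(range(2), repeat=3)]

print(flat == nested)  # True
print(bits)  # ['000', '001', '010', '011', '100', '101', '110', '111']
```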

## permutations

Parameters: `itertools.permutations(iterable, r=None)`

Purpose: Create an iterator that returns successive r-length permutations of the elements in iterable. If r is not specified or is `None`, r defaults to the length of iterable, and all possible full-length permutations are generated.

Examples:

```python
from itertools import permutations

permutations(range(3))  # (0,1,2) (0,2,1) (1,0,2) (1,2,0) (2,0,1) (2,1,0)
permutations('ABC')  # ('A','B','C') ('A','C','B') ('B','A','C') ('B','C','A') ('C','A','B') ('C','B','A')
permutations('ABC', 2)  # ('A','B') ('A','C') ('B','A') ('B','C') ('C','A') ('C','B')
```

## combinations

Parameters: `itertools.combinations(iterable, r)`

Purpose: Create an iterator that returns r-length subsequences (combinations) of the elements in the input iterable.

Examples:

```python
from itertools import combinations

combinations(range(3), 2)  # (0,1) (0,2) (1,2)
combinations('ABC', 3)  # ('A','B','C')
combinations('ABC', 2)  # ('A','B') ('A','C') ('B','C')
```

## combinations_with_replacement

Parameters: `itertools.combinations_with_replacement(iterable, r)`

Purpose: Create an iterator that returns r-length subsequences (combinations) of the elements in the input iterable, allowing individual elements to be repeated.

Examples:

```python
from itertools import combinations_with_replacement

# Note: combinations samples without replacement; combinations_with_replacement samples with replacement
combinations_with_replacement(range(3), 2)  # (0,0) (0,1) (0,2) (1,1) (1,2) (2,2)
combinations_with_replacement('ABC', 2)  # ('A','A') ('A','B') ('A','C') ('B','B') ('B','C') ('C','C')
```
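
As a sanity check on the three combinatoric functions above, the number of results can be compared with the standard counting formulas. This sketch assumes Python 3.8 or later for `math.comb` and `math.perm`; the input string is arbitrary.

```python
from itertools import combinations, combinations_with_replacement, permutations
from math import comb, perm  # available since Python 3.8

items = "ABCD"
n, r = len(items), 2

n_permutations = len(list(permutations(items, r)))
n_combinations = len(list(combinations(items, r)))
n_with_replacement = len(list(combinations_with_replacement(items, r)))

print(n_permutations, perm(n, r))              # 12 12
print(n_combinations, comb(n, r))              # 6 6
print(n_with_replacement, comb(n + r - 1, r))  # 10 10
```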

# 4 itertools recipes

The itertools recipes build an extended toolset using the existing itertools as building blocks. They are available in the more-itertools package, which can be installed with pip:

```shell
pip install more-itertools
```

The extended tools offer the same high performance as the underlying toolset. Memory use stays low because elements are processed one at a time rather than loading the entire iterable into memory. Code volume stays small because the tools are linked together in a functional style, which helps eliminate temporary variables. Speed stays high because the recipes prefer "vectorized" building blocks over for-loops and generators, which incur interpreter overhead.
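
To give a feel for this style, here is a sketch of three small helpers written in the spirit of the recipes, composed only from `islice`, `chain`, and `repeat`. Treat the names and bodies as illustrative; the versions shipped in more-itertools may differ in details.

```python
from itertools import chain, islice, repeat

def take(n, iterable):
    """Return the first n items of the iterable as a list."""
    return list(islice(iterable, n))

def flatten(list_of_lists):
    """Flatten one level of nesting, lazily."""
    return chain.from_iterable(list_of_lists)

def ncycles(iterable, n):
    """Return the sequence of elements n times."""
    return chain.from_iterable(repeat(tuple(iterable), n))

print(take(3, flatten([[1, 2], [3], [4, 5]])))  # [1, 2, 3]
print(list(ncycles("ab", 2)))                   # ['a', 'b', 'a', 'b']
```

No helper contains an explicit loop or temporary variable; each simply connects existing iterators.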

For details, see the more-itertools project on the Python Package Index (PyPI).