
A guide to using itertools from the Python standard library

2022-02-02 05:09:43 XQGang

0 Preface

The built-in itertools module implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML. It standardizes a core set of fast, memory-efficient tools for working with iterables. Together they form an "iterator algebra" that makes it possible to build concise, efficient, specialized tools in pure Python.

Calling next(iterator) returns the next value produced by an iterator object.
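As a small illustration of next(), the sketch below pulls values from an iterator one at a time. The variable names are made up for this example.

```python
from itertools import count

counter = count(10)        # lazy: no value is computed until one is requested
first = next(counter)      # 10
second = next(counter)     # 11

# next() also accepts a default, returned once the iterator is exhausted
# instead of raising StopIteration.
empty = iter([])
fallback = next(empty, "done")

print(first, second, fallback)
```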

1 Infinite iterators

count

Parameters: itertools.count(start=0, step=1)

Purpose: Creates an iterator that returns evenly spaced values beginning with start; that is, an arithmetic sequence whose first term is start and whose common difference is step. It is often used with map() to generate consecutive data points, and with zip() to add sequence numbers.

Usage:

count(10)  # 10 11 12 13 ...
count(10, 0.5)  # 10 10.5 11.0 11.5 ...
# Note: when counting with floats, the form below can give better accuracy
(10 + 0.5 * i for i in count())  # 10.0 10.5 11.0 11.5 ...
map(lambda x: 2 * x**2 + 1, count())  # 1 3 9 19 33 51 ...
zip(count(), ['a', 'b', 'c'])  # (0, 'a') (1, 'b') (2, 'c')
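Because count() never ends, it has to be bounded by whatever consumes it. The sketch below, with made-up sample data, bounds it once with zip() and once with islice().

```python
from itertools import count, islice

names = ["a", "b", "c"]

# zip() stops at the shortest input, so pairing with an infinite counter is safe.
numbered = list(zip(count(1), names))

# islice() takes a fixed number of values from an otherwise endless generator.
points = list(islice((10 + 0.5 * i for i in count()), 4))

print(numbered)  # [(1, 'a'), (2, 'b'), (3, 'c')]
print(points)    # [10.0, 10.5, 11.0, 11.5]
```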

cycle

Parameters: itertools.cycle(iterable)

Purpose: Creates an iterator that returns the elements of iterable while saving a copy of each. Once iterable is exhausted, it returns the elements from the saved copy, repeating indefinitely.

Usage:

cycle(['a','b','c'])  # a b c a b c a ...
cycle(range(3))  # 0 1 2 0 1 2 0 ...
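One common use of cycle() is round-robin assignment. The following is only a sketch; the worker and task names are hypothetical.

```python
from itertools import cycle, islice

workers = ["w1", "w2", "w3"]             # hypothetical worker names
tasks = ["t1", "t2", "t3", "t4", "t5"]   # hypothetical task names

# zip() ends when tasks runs out, so the endless cycle is cut off naturally.
assignment = list(zip(tasks, cycle(workers)))

# islice() is another way to take a finite prefix of the cycle.
first_seven = list(islice(cycle("abc"), 7))

print(assignment)
print(first_seven)  # ['a', 'b', 'c', 'a', 'b', 'c', 'a']
```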

repeat

Parameters: itertools.repeat(object[, times])

Purpose: Creates an iterator that returns object over and over. It repeats indefinitely unless the times argument is given. It can be passed to map() to supply a constant argument to the called function, or to zip() to create a constant part of each tuple record.

Usage:

repeat('abc')  # abc abc abc ...
repeat(range(3))  # range(0, 3) range(0, 3) range(0, 3) ...
repeat(1, 3)  # 1 1 1
map(pow, range(5), repeat(2))  # 0 1 4 9 16
zip(repeat('num'), [1,2,3])  # ('num', 1) ('num', 2) ('num', 3)
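One point worth keeping in mind is that repeat() yields the same object every time rather than a copy. The sketch below shows the constant-argument use with map() and then the sharing behavior with a mutable object. The variable names are illustrative.

```python
from itertools import repeat

# repeat(2) supplies the constant exponent for every call to pow().
squares = list(map(pow, range(5), repeat(2)))

# Every item produced by repeat() is the very same object.
shared = list(repeat([], 2))
shared[0].append(1)           # also visible through shared[1]

# A comprehension builds independent lists instead.
independent = [[] for _ in range(2)]
independent[0].append(1)

print(squares)      # [0, 1, 4, 9, 16]
print(shared)       # [[1], [1]]
print(independent)  # [[1], []]
```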

2 Finite iterators (iterators terminating on the shortest input sequence)

accumulate

【Changed in Python 3.8: added the optional initial parameter】

Parameters: itertools.accumulate(iterable, func=operator.add, *, initial=None)

Purpose: Creates an iterator that returns accumulated sums, or the accumulated results of another binary function given through the optional func argument. The output normally has the same number of elements as the input iterable. If the keyword argument initial is provided, however, the accumulation starts with initial, so the output has one more element than the input.

Usage:

accumulate([1,2,3,4,5])  # 1 3 6 10 15
accumulate([1,2,3,4,5], operator.mul)  # 1 2 6 24 120
accumulate([1,2,3,4,5], min)  # 1 1 1 1 1
accumulate([1,2,3,4,5], max)  # 1 2 3 4 5
accumulate([2,4,5,8,10], lambda x, _: 1/x)  # 2 0.5 2.0 0.5 2.0
accumulate([2,4,5,8,10], lambda _, x: 1/x)  # 2 0.25 0.2 0.125 0.1
accumulate([2,4,5,8,10], lambda x, y: x*y)  # 2 8 40 320 3200
# Note: functools.reduce() returns only the final accumulated value
reduce(lambda x, y: x*y, [2,4,5,8,10])  # 3200
accumulate([1,2,3,4,5], initial=100)  # 100 101 103 106 110 115
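To make the running-total behavior concrete, here is a minimal sketch using made-up deposit amounts. It assumes Python 3.8 or later because of the initial argument.

```python
from itertools import accumulate
import operator

deposits = [100, -30, 50, -80]   # made-up sample data

balances = list(accumulate(deposits))                      # running sum
with_opening = list(accumulate(deposits, initial=1000))    # one extra leading element
running_max = list(accumulate([3, 1, 4, 1, 5], max))       # largest value seen so far
factorials = list(accumulate(range(1, 6), operator.mul))   # running product

print(balances)      # [100, 70, 120, 40]
print(with_opening)  # [1000, 1100, 1070, 1120, 1040]
print(running_max)   # [3, 3, 4, 4, 5]
print(factorials)    # [1, 2, 6, 24, 120]
```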

chain

Parameters: itertools.chain(*iterables)

Purpose: Creates an iterator that returns all elements of the first iterable, then all elements of the next, and so on until every iterable is exhausted. It lets several sequences be treated as a single sequence.

Usage:

chain([1,2], [3,4], [5])  # 1 2 3 4 5
chain([[1,2], [3,4], [5]])  # [1, 2] [3, 4] [5]
chain(*[[1,2], [3,4], [5]])  # 1 2 3 4 5
chain(['ABC', 'DEF'])  # ABC DEF
chain(*['ABC', 'DEF'])  # A B C D E F

chain.from_iterable

Parameters: itertools.chain.from_iterable(iterable)

Purpose: An alternate constructor for chain() that takes the chained inputs from a single iterable argument, which is evaluated lazily.

Usage:

chain.from_iterable([1,2], [3,4], [5])  # Error
chain.from_iterable([[1,2], [3,4], [5]])  # 1 2 3 4 5
chain.from_iterable(*[[1,2], [3,4], [5]])  # Error
chain.from_iterable(['ABC', 'DEF'])  # A B C D E F
chain.from_iterable(*['ABC', 'DEF'])  # Error
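Because chain.from_iterable() takes a single iterable and reads it lazily, it can flatten the output of a generator without unpacking it first. In the sketch below, batches() is a hypothetical stand-in for any source that produces lists one at a time.

```python
from itertools import chain

def batches():
    """Hypothetical source that yields small lists one at a time."""
    for start in (0, 3, 6):
        yield list(range(start, start + 3))

# The generator is consumed lazily, one batch at a time.
flat = list(chain.from_iterable(batches()))

# chain(*...) gives the same result but unpacks all arguments up front.
same = list(chain(*[[0, 1, 2], [3, 4, 5], [6, 7, 8]]))

print(flat)  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```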

compress

Parameters: itertools.compress(data, selectors)

Purpose: Creates an iterator that returns the elements of data whose corresponding element in selectors evaluates to true. It stops when either data or selectors is exhausted. It is intended for filtering one sequence by another, related sequence.

Usage:

compress('ABCDEF', [1,0,1,0,1,1])  # A C E F
compress('ABCDEF', [1])  # A
compress('ABCDEF', [True, -1, 1, 0.5, 'C', [1]])  # A B C D E F
compress('ABCDEF', [False, 0, 0.0, '', [], None])  # (Empty)
# Note: values that test as False include 0, -0, 0.0, 0j, [], (), {}, None, and ""
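The "filter one sequence by a related sequence" idea might look like the following sketch, where the names and scores are invented for illustration and the selectors are computed on the fly.

```python
from itertools import compress

names = ["ann", "bob", "cy", "dee"]   # made-up sample data
scores = [82, 47, 91, 60]

# The selectors can be any iterable of truthy/falsy values, including a generator.
passed = list(compress(names, (s >= 60 for s in scores)))

print(passed)  # ['ann', 'cy', 'dee']
```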

dropwhile

Parameters: itertools.dropwhile(predicate, iterable)

Purpose: Creates an iterator that drops elements from iterable as long as predicate is true, then returns every remaining element. Note that the iterator produces no output until predicate first becomes false, so it may have a lengthy start-up time. It is intended for skipping the beginning of an iterable.

Usage:

dropwhile(lambda x: x<3, [1,2,3,4,5])  # 3 4 5
dropwhile(lambda line: line.startswith('#'), f)  # when reading a file, skip the leading lines that begin with #
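The file-header case can be tried without a real file. The sketch below uses io.StringIO as a stand-in for an open file (an assumption made only to keep the example self-contained) and shows that only the leading comment lines are dropped.

```python
import io
from itertools import dropwhile

text = "# header\n# more header\nvalue 1\n# not dropped\nvalue 2\n"

with io.StringIO(text) as f:   # stand-in for open("some_file")
    body = [
        line.rstrip("\n")
        for line in dropwhile(lambda line: line.startswith("#"), f)
    ]

# Once the predicate fails for the first time, it is never consulted again.
print(body)  # ['value 1', '# not dropped', 'value 2']
```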

filterfalse

Parameters: itertools.filterfalse(predicate, iterable)

Purpose: Creates an iterator that returns only those elements of iterable for which predicate is false. If predicate is None, it returns the elements that are themselves false.

Usage:

filterfalse(lambda x: x%2, range(10))  # 0 2 4 6 8
filterfalse(None, range(10))  # 0
# Note: filter() returns the elements of iterable for which predicate is true
filter(lambda x: x%2, range(10))  # 1 3 5 7 9
filter(None, range(10))  # 1 2 3 4 5 6 7 8 9
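Since filter() and filterfalse() are complements, the pair can split a sequence in two. This is a minimal sketch; is_odd is a helper defined just for the example.

```python
from itertools import filterfalse

def is_odd(n):
    """Helper predicate used only in this example."""
    return n % 2 == 1

numbers = range(10)   # a range can be iterated more than once

odds = list(filter(is_odd, numbers))
evens = list(filterfalse(is_odd, numbers))

print(odds)   # [1, 3, 5, 7, 9]
print(evens)  # [0, 2, 4, 6, 8]
```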

groupby

Parameters: itertools.groupby(iterable, key=None)

Purpose: Creates an iterator that returns consecutive keys and groups from iterable. key is a function that computes a key value for each element. If it is not specified or is None, key defaults to the identity function and returns the element unchanged. Because only consecutive elements with the same key are grouped, iterable generally needs to be sorted by the same key function beforehand. Each returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, the previous group is no longer available once the groupby() object advances.

Usage:

# Note: in the actual result each group is an iterator; the dicts below only show the keys and group contents
groupby('AABB')  # {'A': ['A', 'A'], 'B': ['B', 'B']}
groupby('AaBb', key=lambda x: x.upper())  # {'A': ['A', 'a'], 'B': ['B', 'b']}
div_size = lambda x: 'small' if x < 3 else 'medium' if x == 3 else 'big'
groupby([1,2,3,4,5], key=div_size)  # {'small': [1, 2], 'medium': [3], 'big': [4, 5]}
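The effect of pre-sorting is easiest to see side by side. In this sketch the word list is made up, and each group is turned into a list immediately, before groupby() advances to the next key.

```python
from itertools import groupby

words = ["apple", "avocado", "banana", "blueberry", "apricot"]   # made-up data

def first_letter(word):
    return word[0]

# Without sorting, only consecutive runs are grouped, so 'a' appears twice.
unsorted_groups = [(k, list(g)) for k, g in groupby(words, key=first_letter)]

# Sorting by the same key first yields one group per key.
sorted_groups = {
    k: list(g)
    for k, g in groupby(sorted(words, key=first_letter), key=first_letter)
}

print(unsorted_groups)
print(sorted_groups)
```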

islice

Parameters: itertools.islice(iterable, stop) or itertools.islice(iterable, start, stop[, step])

Purpose: Creates an iterator that returns selected elements from iterable. If start is non-zero, elements of iterable are skipped until start is reached. After that, elements are returned consecutively unless step is greater than one, in which case elements are skipped. If stop is None, iteration continues until the iterator is exhausted; otherwise it stops at the specified position. Unlike regular slicing, islice() does not support negative values for start, stop, or step.

Usage:

islice('ABCDEFG', 2)  # A B
islice('ABCDEFG', 2, 5)  # C D E
islice('ABCDEFG', 2, None)  # C D E F G
islice('ABCDEFG', 2, 5, 2)  # C E
islice('ABCDEFG', 2, None, 2)  # C E G
# Note: if start is None, iteration starts from 0. If step is None, the step defaults to 1.
islice('ABCDEFG', None, None)  # A B C D E F G
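Unlike slicing a list, islice() consumes the underlying iterator. The following sketch illustrates this with an infinite counter; the variable names are illustrative only.

```python
from itertools import count, islice

evens = count(0, 2)   # 0, 2, 4, ... without end

first_five = list(islice(evens, 5))
# The same iterator continues where the previous slice stopped.
next_three = list(islice(evens, 3))

# start/stop/step work on any iterable, including strings.
every_other = list(islice("ABCDEFG", 1, None, 2))

print(first_five)   # [0, 2, 4, 6, 8]
print(next_three)   # [10, 12, 14]
print(every_other)  # ['B', 'D', 'F']
```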

pairwise

【New in Python 3.10】

Parameters: itertools.pairwise(iterable)

Purpose: Creates an iterator that returns successive overlapping pairs taken from iterable.

Usage:

# Note: the number of 2-tuples in the output is one fewer than the number of inputs.
pairwise('ABCDEFG')  # AB BC CD DE EF FG
# Note: the output is empty if the input iterable has fewer than two values.
pairwise('A')  # (Empty)

takewhile

Parameters: itertools.takewhile(predicate, iterable)

Purpose: Creates an iterator that returns elements from iterable as long as predicate is true.

Usage:

takewhile(lambda x: x<5, [1,2,3,4,5])  # 1 2 3 4
takewhile(lambda x: x.isdigit(), '123abc456')  # 1 2 3
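takewhile() and dropwhile() split a sequence at the first element that fails the predicate. The sketch below uses made-up readings and also shows a subtlety: when the input is an iterator, the element that ended takewhile() has already been consumed.

```python
from itertools import dropwhile, takewhile

readings = [1, 2, 6, 3, 7]   # made-up sample data

def small(x):
    return x < 5

head = list(takewhile(small, readings))   # stops at 6, even though 3 comes later
tail = list(dropwhile(small, readings))   # everything from 6 onward

# With a one-shot iterator, the failing element (6) is consumed and lost.
it = iter(readings)
taken = list(takewhile(small, it))
rest = list(it)

print(head, tail)   # [1, 2] [6, 3, 7]
print(taken, rest)  # [1, 2] [3, 7]
```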

tee

Parameters: itertools.tee(iterable, n=2)

Purpose: Returns n independent iterators from a single iterable. Once tee() has made a split, the original iterable should not be used anywhere else; otherwise it could be advanced without the tee objects being informed. tee iterators are not thread-safe.

Usage:

# Note: the actual result is a tuple of iterators
tee(range(3))  # [0, 1, 2] [0, 1, 2]
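A typical use of tee() is to walk the same one-shot iterator at two offsets. The sketch below builds overlapping pairs this way, which is one way to get pairwise-style output on Python versions before 3.10. The function name pairs is made up for this example.

```python
from itertools import tee

def pairs(iterable):
    """Return overlapping (previous, current) pairs using two tee'd copies."""
    a, b = tee(iterable)
    next(b, None)        # advance the second copy by one element
    return zip(a, b)

# A generator can only be consumed once, so tee() is what makes this work.
result = list(pairs(ch for ch in "ABCD"))

print(result)  # [('A', 'B'), ('B', 'C'), ('C', 'D')]
```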

zip_longest

Parameters: itertools.zip_longest(*iterables, fillvalue=None)

Purpose: Creates an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled in with fillvalue. Iteration continues until the longest iterable is exhausted.

Usage:

zip_longest('ABCD', 'xy')  # ('A','x') ('B','y') ('C',None) ('D',None)
zip_longest('ABCD', 'xy', fillvalue='-')  # ('A','x') ('B','y') ('C','-') ('D','-')
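The difference from the built-in zip() shows up when one input is short. In this sketch the headers and row are invented; zip() silently drops the unmatched header, while zip_longest() keeps it with a fill value.

```python
from itertools import zip_longest

headers = ["name", "age", "city"]   # made-up sample data
row = ["Ann", 31]                   # one value short

truncated = dict(zip(headers, row))
padded = dict(zip_longest(headers, row, fillvalue=""))

print(truncated)  # {'name': 'Ann', 'age': 31}
print(padded)     # {'name': 'Ann', 'age': 31, 'city': ''}
```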

3 Combinatoric iterators

product

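Parameters: itertools.product(*iterables, repeat=1)

Purpose: Creates an iterator that returns the Cartesian product of the input iterables. To compute the product of an iterable with itself, specify the number of repetitions with the optional repeat keyword argument.

Usage:

product('AB', range(3))  # ('A',0) ('A',1) ('A',2) ('B',0) ('B',1) ('B',2)
product(range(2), repeat=3)  # (0,0,0) (0,0,1) (0,1,0) (0,1,1) (1,0,0) (1,0,1) (1,1,0) (1,1,1)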
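One way to read product() is as a replacement for nested for-loops, with the rightmost input varying fastest. The sketch below checks that equivalence for repeat=3.

```python
from itertools import product

bits = list(product(range(2), repeat=3))

# The same tuples, written as three nested loops.
nested = [(x, y, z) for x in range(2) for y in range(2) for z in range(2)]

grid = list(product("AB", range(2)))

print(len(bits), bits[0], bits[-1])  # 8 (0, 0, 0) (1, 1, 1)
print(bits == nested)                # True
print(grid)                          # [('A', 0), ('A', 1), ('B', 0), ('B', 1)]
```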

permutations

Parameters: itertools.permutations(iterable, r=None)

Purpose: Creates an iterator that returns successive permutations of length r of the elements in iterable. If r is not specified or is None, r defaults to the length of iterable and all full-length permutations are generated.

Usage:

permutations(range(3))  # (0,1,2) (0,2,1) (1,0,2) (1,2,0) (2,0,1) (2,1,0)
permutations('ABC')  # ('A','B','C') ('A','C','B') ('B','A','C') ('B','C','A') ('C','A','B') ('C','B','A')
permutations('ABC', 2)  # ('A','B') ('A','C') ('B','A') ('B','C') ('C','A') ('C','B')

combinations

Parameters: itertools.combinations(iterable, r)

Purpose: Creates an iterator that returns subsequences (combinations) of length r made up of elements from the input iterable.

Usage:

combinations(range(3), 2)  # (0,1) (0,2) (1,2)
combinations('ABC', 3)  # ('A','B','C')
combinations('ABC', 2)  # ('A','B') ('A','C') ('B','C')

combinations_with_replacement

Parameters: itertools.combinations_with_replacement(iterable, r)

Purpose: Creates an iterator that returns subsequences (combinations) of length r made up of elements from the input iterable, allowing individual elements to be repeated.

Usage:

# Note: combinations corresponds to sampling without replacement; combinations_with_replacement corresponds to sampling with replacement
combinations_with_replacement(range(3), 2)  # (0,0) (0,1) (0,2) (1,1) (1,2) (2,2)
combinations_with_replacement('ABC', 2)  # ('A','A') ('A','B') ('A','C') ('B','B') ('B','C') ('C','C')
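The three combinatoric iterators differ in how many results they produce. As a rough check, the sketch below compares their output sizes with the standard counting formulas; it assumes Python 3.8 or later for math.perm and math.comb.

```python
import math
from itertools import combinations, combinations_with_replacement, permutations

items = "ABCD"
n, r = len(items), 2

n_perm = len(list(permutations(items, r)))                   # order matters
n_comb = len(list(combinations(items, r)))                   # order ignored, no repeats
n_cwr = len(list(combinations_with_replacement(items, r)))   # order ignored, repeats allowed

print(n_perm, math.perm(n, r))         # 12 12
print(n_comb, math.comb(n, r))         # 6 6
print(n_cwr, math.comb(n + r - 1, r))  # 10 10
```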

4 itertools recipes

The itertools recipes create an extended toolset using the existing itertools as building blocks.

pip install more-itertools

The extended tools offer the same high performance as the underlying toolset. Memory use stays low because elements are processed one at a time rather than loading the whole iterable into memory. Code volume stays small because the tools are linked together in a functional style, which helps eliminate temporary variables. Speed stays high because the recipes prefer "vectorized" building blocks over for-loops and generators, which incur interpreter overhead.
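To give a feel for what such a recipe looks like, here is a sketch of two small helpers written in the style of the recipes, using only the standard library. They are illustrations built on islice(), not a copy of any particular package's code.

```python
from itertools import count, islice

def take(n, iterable):
    """Return the first n items of the iterable as a list."""
    return list(islice(iterable, n))

def nth(iterable, n, default=None):
    """Return the item at index n, or default if the iterable is too short."""
    return next(islice(iterable, n, None), default)

print(take(3, count()))     # [0, 1, 2]
print(nth("abcde", 2))      # c
print(nth("ab", 5, "?"))    # ?
```

Both helpers process one element at a time and never build the full input in memory, which is what lets take() work on an infinite counter.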

For details, see the more-itertools project on the Python Package Index (PyPI).


copyright notice
Author: XQGang. Please include the original link when reprinting. Thank you.
https://en.pythonmana.com/2022/02/202202020509399535.html
