/documentations/python_libraries/itertools/

Itertools

The module standardizes a core set of fast, memory efficient tools that are useful by themselves or in combination. Together, they form an "iterator algebra" making it possible to construct specialized tools succinctly and efficiently in pure Python.(see docs)

Before looking at itertools one should have an idea about iterables an iterators (see at the end of this text). It's also good to know how to produce a list given an iterator. For finite iterators this can be achieved with the built-in function list:

>>> import itertools as it
>>> it.repeat(0, 10)
repeat(0, 10)
>>> list(it.repeat(0, 10))
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Applying list to an infinite iterator leads to an infinite loop. For infinite or really long iterators islice returns a slice of the original iterator, which then in turn can be converted to a list:

>>> import itertools as it
>>> list(it.islice(it.count(), 1000, 1004))
[1000, 1001, 1002, 1003]

The best place to get an idea on how to use itertools, is at the end of the itertools documentation at: Recipes. This recipes show how to use itertools as building blocks to create some useful functions. An example is take:

import itertools as it
def take(n, iterable):
    "Return first n items of the iterable as a list"
    return list(it.islice(iterable, n))

Then it's easy to have a look at the first n elements:

>>> import itertools as it
>>> numbers = it.count()
>>> take(10, numbers)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Basic tools

This functions return infinite iterators. repeat returns a finite iterator only if times is specified. This allows to work with infinite series, like all the natural numbers count(). Some caution is required to avoid infinite loops.

count(start=0, step=1)

import itertools as it
it.count()      # yields: 0, 1, 2, 3, ...
it.count(8, -4) # yields: 8, 4, 0, -4, ...

cycle(iterable)

import itertools as it
it.cycle(xrange(3)) # yields: 0, 1, 2, 0, 1, ...
it.cycle('abc')    # yields: a, b, c, a, b, ...

repeat(object[, times])

import itertools as it
it.repeat(1)     # yields: 1, 1, 1, 1, ...
it.repeat(1, 3)  # yields: 1, 1, 1

Combinatoric Generators:

The basic combinatoric tools:

product(*iterables[, repeat]):

# below bc means (b, c)
import itertools as it
it.product('abc', 2)        # yields: aa, ab, ac, ba, bb, bc, ca, cb, cc
it.product('abc', xrange(2)) # yields: a0, a1, b0, b1, c0, c1

permutations(iterable[, r])

combinations(iterable, r)

combinations_with_replacement(iterable, r)

combinatoric

Others

For the full list see the python documentation. All the following tools operating on multiple iterators, terminate on the shortest iterator. Before giving some examples it's useful to know about anonymous functions(see lambda): f = lambda x: x*x is equivalent to def f(x): return x*x. Another useful python module in this context is the operator module:

The operator module exports a set of efficient functions corresponding to the intrinsic operators of Python. For example, operator.add(x, y) is equivalent to the expression x+y.

imap(function, *iterables):

if function takes n arguments, then imap requires n iterables. Then imap(f, a, b) yields[f(a0, b0), f(a1, b1), ...]:

import itertools as it
it.imap(lambda x: x*x, xrange(5))     # yields: 0, 1, 4, 9, 16
import operator as op
it.imap(op.add, xrange(5), it.count()) # yields: 0, 2, 4, 6, 8

ifilter(predicate, iterable):

yields the elements in iterable satisfying the predicate:

import itertools as it
it.ifilter(lambda x: (x%2 != 0), it.count()) # yields: 1, 3, 5, ...

dropwhile(predicate, iterable):

yields all elements in iterable, starting with the first not satisfying predicate

import itertools as it
it.dropwhile(lambda x: x<1000, it.count()) # yields: 1000, 1001, 1002, ...

izip(*iterables):

yields all tuples of i-th entries. izip(a, b, c) yields (a0, b0, c0), (a1, b1, c1), ...:

import itertools as it
it.izip('abc', it.count()) # yields: a0, b1, c2

The prefix * allows to pass the entries of a list as arguments: izip(*[a, b, c]) is equivalent to izip(a, b, c):

import itertools as it
it.izip(*(it.izip([1, 2, 3],[3, 4, 5))) # yields: (1, 2, 3), (3, 4, 5)

groupby(iterable[, key]):

groups consecutive elements having the same image under key. key defaults to the identity function:

import itertools as it
# below aaa means an iterator 
it.groupby('aaabbaa') # yields: (a, iter('aaa')), (b, iter('bb')), (a, iter('aa'))
it.groupby(xrange(7), lambda x: x//3) # yields: (0, iter([0,1,2])), (1, iter([3,4,5])), (2, iter([6]))

Examples

The examples below were taken from sahandsaba.com. The reader is encouraged to have a look at sahandsaba.com/category/python.html .

[...] let's use generator expressions and itertools.combinations and itertools.permutations to calculate the inversion number of a permutation, then sum the inversion numbers of all the permutations of a list. The sum of inversion numbers of all the permutations of n is OEIS A001809 and it is relatively easy to show that the closed form is n!n(n - 1)/4.

import itertools
import math


def inversion_number(A):
    """Return the number of inversions in list A."""
    return sum(1 for x, y in itertools.combinations(xrange(len(A)), 2) if A[x] > A[y])


def total_inversions(n):
    """Return total number of inversions in permutations of n."""
    return sum(inversion_number(A) for A in itertools.permutations(xrange(n)))

# Usage:
>>> [total_inversions(n) for n in xrange(10)]
[0, 0, 1, 9, 72, 600, 5400, 52920, 564480, 6531840]

>>> [math.factorial(n)*n*(n-1)/4 for n in xrange(10)]
[0, 0, 1, 9, 72, 600, 5400, 52920, 564480, 6531840]

[...] let's calculate the recontres numbers using a brute-force counting method. Given n and k , the recontres number Dn,k is defined as the number of permutations of a set of size n that have exactly k fixed points. First we write a function that uses a generator expression inside a sum to count the number of fixed points of a permutation. Then using itertools.permutations and another generator expression inside a sum, we count the total number of permutations of n numbers which have exactly k fixed points. Here's the result.

def count_fixed_points(p):
    """Return the number of fixed points of p as a permutation."""
    return sum(1 for x in p if p[x]==x)

# Note: This is NOT an efficient way of doing this.
# Just to demonstrate the use of itertools and generators
def count_partial_derangements(n,k):
    """Returns the number of permutations of n with k fixed points."""
    return sum(1 for p in itertools.permutations(xrange(n)) if count_fixed_points(p) == k)

# Usage:
>>> [count_partial_derangements(6,i) for i in xrange(7)]
[265, 264, 135, 40, 15, 0, 1]

Iterators and iterables

iterable:

A container object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict and file [...] Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), ...). [...] This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you [...] (see: Glossary)

iterator:

An object representing a stream of data. Repeated calls to the iterator's next() method return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its next() method just raise StopIteration again.[...] A container object (such as a list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop.[...] (see: Glossary)

Example

Here a simple example that instantiates an iterator object for the list [1,2,3], and calls next() till the iterator is exhausted:

>>> # instantiate an iterator
>>> list_iterator = iter([1,2,3])
>>> # call next till exhaustion
>>> list_iterator.next()
1
>>> list_iterator.next()
2
>>> list_iterator.next()
3
>>> # now the iterator is exhausted
>>> # and the next call of next() raise an exception
>>> list_iterator.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

It's possible to define a new iterator type by following the pattern given at: Iterator Types.

''' My stupid iterator
file: myiter.py
'''
class MyIter:
    i = 0
    def __iter__(self): return self
    def next(self):
        if self.i < 3:
            self.i += 1
            return self.i
        else:
            raise StopIteration()

Which can be used as:

>>> from myiter import MyIter
>>> mi = MyIter()
>>> for element in mi:
...     print element
... 
1
2
3

This just to show it's possible. A better idea is to use generators (see week6) ore the itertools module.

Links