Generators in Python

This article explains generators in Python.

Overview

Python generators are a kind of iterator that produces values lazily, one at a time. Because they never need to hold a whole sequence in memory, they let you write memory-efficient code when dealing with large amounts of data.

What is a Generator?

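A generator function in Python is a function whose body contains the yield keyword. Calling it does not run the body; it returns a generator object that produces one value at a time. Its defining characteristic is that it pauses at each yield while retaining its local state, and resumes from that point later.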

Basics of yield

yield is a keyword that returns a value and pauses the function execution at the same time.

def simple_generator():
    yield 1
    yield 2
    yield 3

gen = simple_generator()

print(next(gen))  # 1
print(next(gen))  # 2
print(next(gen))  # 3
  • When called, this function returns a generator object that yields its values one by one.
  • If you call next() when there are no more values, a StopIteration exception is raised.
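
To make the pause-and-resume behavior concrete, the following sketch records when each part of a generator body actually runs. The names traced_generator and events are made up for this illustration.

```python
events = []  # hypothetical log, used only to record the order of execution

def traced_generator():
    events.append("start")      # runs on the first next(), not when the function is called
    yield 1
    events.append("resumed")    # runs on the second next()
    yield 2

gen = traced_generator()
events.append("created")        # the body has not started running yet

first = next(gen)               # runs the body up to the first yield
second = next(gen)              # resumes right after the first yield

print(events)         # ['created', 'start', 'resumed']
print(first, second)  # 1 2
```

Note that "created" is logged before "start": creating the generator object does not execute any of the function body.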

next() and StopIteration

def simple_generator():
    yield 1
    yield 2
    yield 3

gen = simple_generator()

try:
    while True:
        value = next(gen)
        print(value)
except StopIteration:
    print("Finished")
  • By explicitly handling the StopIteration exception like this, you can detect when a generator has finished.
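
As an alternative to catching the exception, the built-in next() accepts a second argument that is returned instead of raising StopIteration. A minimal sketch, where the sentinel name DONE is an assumption of this example:

```python
def simple_generator():
    yield 1
    yield 2
    yield 3

DONE = object()  # hypothetical sentinel that the generator can never yield

gen = simple_generator()
values = []
while True:
    value = next(gen, DONE)  # returns DONE instead of raising StopIteration
    if value is DONE:
        break
    values.append(value)

print(values)  # [1, 2, 3]
```

A dedicated sentinel object is used here rather than None so that a generator that legitimately yields None is not mistaken for a finished one.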

send(value)

Calling send(value) resumes the generator and makes value the result of the yield expression at which it was paused, so the generator can receive it by assigning the yield expression to a variable. A newly created generator has not reached a yield yet, so the first call must be next() or send(None); sending any other value raises a TypeError.

def gen():
    x = yield 1
    print(f"x = {x}")
    y = yield 2
    print(f"y = {y}")

g = gen()
print(next(g))       # -> 1 (value from yield 1)
print(g.send(10))    # -> prints "x = 10", then 2 (value from yield 2)
try:
    g.send(20)       # -> prints "y = 20"; the generator then ends
except StopIteration:
    print("Finished")
  • With send(10), the paused yield 1 expression evaluates to 10, and 10 is assigned to x.
  • After send(20) there is no further yield, so the generator ends and StopIteration is raised.
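
A common use of send() is a generator that keeps state between calls. The sketch below is one illustrative pattern, not the only way to structure it; the name running_average is made up for this example.

```python
def running_average():
    """Receive numbers through send() and yield the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average   # yields the current average, then waits for send()
        total += value
        count += 1
        average = total / count

averager = running_average()
next(averager)                  # advance to the first yield (it yields None)

print(averager.send(10))  # 10.0
print(averager.send(20))  # 15.0
print(averager.send(30))  # 20.0
```

The initial next() call is required for the reason given above: a value can only be sent once the generator is paused at a yield.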

throw()

Calling throw resumes the generator and raises an exception at the position of the paused yield. You can handle the exception inside the generator to continue processing. If the exception is not caught, it propagates outward and the generator ends.

def gen():
    try:
        yield 1
    except ValueError as e:
        print(f"Caught: {e}")
        yield "recovered"

g = gen()
print(next(g))   # -> 1
print(g.throw(ValueError("boom")))  # -> prints "Caught: boom", then recovered
  • In this code, throw is called to inject an exception into the generator. The generator handles the exception and yields "recovered", which becomes the return value of throw().
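
The other case described above, where the generator does not catch the exception, can be sketched as follows. The name plain_gen is made up for this example.

```python
def plain_gen():
    yield 1
    yield 2   # never reached in this example

g = plain_gen()
print(next(g))  # 1

try:
    g.throw(ValueError("boom"))   # nothing inside plain_gen catches this
except ValueError as e:
    outcome = f"Propagated: {e}"
print(outcome)  # Propagated: boom

# The generator is now finished, so next() falls back to the default value.
print(next(g, "finished"))  # finished
```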

close()

Calling close() terminates the generator by raising GeneratorExit at the paused yield. Inside the generator, you can perform cleanup using finally. Calling next() or send() after calling close() raises a StopIteration exception.

def gen():
    try:
        yield 1
    finally:
        print("Cleaning up...")

g = gen()
print(next(g))  # -> 1
g.close()       # -> Cleaning up...
  • This code shows that calling close() terminates the generator and triggers the cleanup process in finally.
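
The behavior after close() can be checked with a short sketch like the one below. The variable name state is an assumption of this example.

```python
def gen():
    try:
        yield 1
        yield 2
    finally:
        print("Cleaning up...")

g = gen()
print(next(g))  # 1
g.close()       # prints "Cleaning up..."
g.close()       # closing an already finished generator does nothing

try:
    next(g)
    state = "still running"
except StopIteration:
    state = "closed"
print(state)  # closed
```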

yield from

yield from is syntax for delegating to a subgenerator. It is a concise way to run another generator inside a generator and pass all of its values through to the caller.

def sub_gen():
    yield 1
    yield 2

def main_gen():
    yield from sub_gen()
    yield 3

print(list(main_gen()))  # -> [1, 2, 3]
  • This code delegates all values from the subgenerator to the outer generator using yield from, and then yields 3.
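
Two further details of yield from may be worth knowing: the expression itself evaluates to the subgenerator's return value, and it accepts any iterable, not only generators. A minimal sketch, with the string "sub done" chosen arbitrarily for illustration:

```python
def sub_gen():
    yield 1
    yield 2
    return "sub done"          # becomes the value of the yield from expression

def main_gen():
    result = yield from sub_gen()
    yield result
    yield from [10, 20]        # yield from also works with a plain list

print(list(main_gen()))  # [1, 2, 'sub done', 10, 20]
```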

Relationship with Iterators

Generators internally implement __iter__() and __next__(), making them a type of iterator. Therefore, they are fully compatible with iterable operations such as for loops.
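
This relationship can be verified directly. The sketch below uses collections.abc.Iterator from the standard library for the check.

```python
from collections.abc import Iterator

def simple_generator():
    yield 1
    yield 2

gen = simple_generator()

print(isinstance(gen, Iterator))  # True
print(iter(gen) is gen)           # True: a generator is its own iterator
print(list(gen))                  # [1, 2]
print(list(gen))                  # []: a generator can be consumed only once
```

Because a generator is its own iterator, iterating over it a second time yields nothing; call the generator function again to get a fresh one.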

Integration with for Loops

In Python, a for loop calls iter() on the object once and then calls next() repeatedly to retrieve the values automatically.

def simple_generator():
    yield 1
    yield 2
    yield 3

for value in simple_generator():
    print(value)

The for loop also catches StopIteration and ends the loop, so you do not need to handle it yourself.
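
What the for loop does behind the scenes can be written out by hand roughly as follows. This is a simplified sketch of the protocol, not the interpreter's actual implementation; the names collected and iterator are made up for this example.

```python
def simple_generator():
    yield 1
    yield 2
    yield 3

collected = []
iterator = iter(simple_generator())   # the for loop calls iter() once
while True:
    try:
        value = next(iterator)        # and next() once per iteration
    except StopIteration:
        break                         # StopIteration ends the loop silently
    collected.append(value)           # the loop body goes here

print(collected)  # [1, 2, 3]
```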

Creating Infinite Generators

def count_up(start=0):
    while True:
        yield start
        start += 1

counter = count_up()
print(next(counter))  # 0
print(next(counter))  # 1

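A generator can run forever like this because it only computes a value when one is requested. The consumer must limit how many values it takes, however: passing an infinite generator to list() or looping over it without a break never terminates.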
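
Two common ways to consume an infinite generator safely are itertools.islice from the standard library and an explicit break. A minimal sketch, with the limits 5 and 12 chosen arbitrarily:

```python
from itertools import islice

def count_up(start=0):
    while True:
        yield start
        start += 1

# Take only the first five values; list(count_up()) would never finish.
first_five = list(islice(count_up(), 5))
print(first_five)  # [0, 1, 2, 3, 4]

# Alternatively, stop the loop explicitly with break.
over_limit = None
for n in count_up(10):
    if n > 12:
        over_limit = n
        break
print(over_limit)  # 13
```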

Generator Expressions

Generator expressions, written using parentheses, allow you to define generators with syntax similar to list comprehensions.

# List comprehension (generates the entire list at once)
squares_list = [x**2 for x in range(5)]
print(squares_list)

# Generator expression
squares_gen = (x**2 for x in range(5))
for square in squares_gen:
    print(square)

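Unlike list comprehensions, generator expressions compute each element only when it is requested instead of building the whole list in memory, which makes them more memory efficient. The trade-off is that a generator expression can be iterated only once.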
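
The difference in memory use can be observed with sys.getsizeof. This is a rough sketch: the size n is arbitrary, and the reported numbers depend on the Python version and platform.

```python
import sys

n = 100_000  # arbitrary size chosen for this illustration

squares_list = [x**2 for x in range(n)]   # builds every element immediately
squares_gen = (x**2 for x in range(n))    # builds nothing until it is iterated

# sys.getsizeof reports the size of the container object itself, not the
# integers it refers to, and the exact numbers vary by Python version.
print(sys.getsizeof(squares_list))
print(sys.getsizeof(squares_gen))

# A generator expression can be passed directly to functions such as sum().
total = sum(x**2 for x in range(5))
print(total)  # 30
```

The generator object stays small regardless of n, while the list grows with the number of elements.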

Error Handling in Generators

Exceptions may occur inside a generator. In such cases, you use try-except just like in regular Python code.

def safe_divide_generator(numbers, divisor):
    """Yields results of dividing numbers by a given divisor safely."""
    for number in numbers:
        try:
            yield number / divisor  # Attempt to divide and yield result.
        except ZeroDivisionError:
            yield float('inf')  # Return infinity if division by zero occurs.

# Example usage
numbers = [10, 20, 30]
gen = safe_divide_generator(numbers, 0)  # Create generator with divisor as 0.
for value in gen:
    print(value)  # Output: inf, inf, inf

In this example, proper error handling is performed in the event of a division by zero.

Stack Trace of a Generator

If an exception is raised inside a generator and is not caught there, it propagates to the caller at the next() or send() call that resumed the generator.

def error_generator():
    """A generator that yields values and raises an error."""
    yield 1
    raise ValueError("An error occurred")  # Raise a ValueError intentionally.
    yield 2  # Never reached: the generator ends when the exception is raised.

gen = error_generator()
print(next(gen))       # Output: 1 (first value yielded)
try:
    print(next(gen))   # Attempt to get the next value, which raises an error
except ValueError as e:
    print(e)           # Output: An error occurred (exception message is printed)
  • This generator yields 1 first. The error raised when it is resumed is caught by the caller, and its message is printed.
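
A related detail is that a generator cannot be resumed after an exception has escaped from it. The sketch below collects the outcomes in a list named results, which is made up for this example.

```python
def error_generator():
    yield 1
    raise ValueError("An error occurred")
    yield 2   # unreachable

gen = error_generator()
results = [next(gen)]

try:
    next(gen)
except ValueError as e:
    results.append(f"error: {e}")

# After the exception the generator is finished; it does not continue to yield 2.
results.append(next(gen, "exhausted"))
print(results)  # [1, 'error: An error occurred', 'exhausted']
```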

Examples of Using Generators

Reading a file line by line (suitable for large files)

def read_large_file(filepath):
    with open(filepath, 'r') as f:
        for line in f:
            yield line.strip()
  • This function reads a text file line by line using the file object as an iterator, strips leading and trailing whitespace from each line, and yields it, allowing large files to be processed with low memory usage.
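
One possible way to use this function is to filter lines as they are read. The sketch below creates a small temporary file to stand in for a large one; the file contents and the "ERROR" prefix are assumptions made for illustration.

```python
import os
import tempfile

def read_large_file(filepath):
    with open(filepath, 'r') as f:
        for line in f:
            yield line.strip()

# Create a small temporary file to stand in for a large one.
with tempfile.NamedTemporaryFile('w', suffix='.log', delete=False) as tmp:
    tmp.write("INFO start\nERROR disk full\n\nINFO done\n")
    path = tmp.name

# Only one line is held in memory at a time while filtering.
error_lines = [line for line in read_large_file(path) if line.startswith("ERROR")]
print(error_lines)  # ['ERROR disk full']

os.remove(path)
```

The file stays open only while the generator is being consumed; once the last line has been yielded, the with block inside the generator closes it.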

Generator for the Fibonacci Sequence

def fibonacci(limit):
    a, b = 0, 1
    while a < limit:
        yield a
        a, b = b, a + b

for n in fibonacci(100):
    print(n)
  • This code uses a generator to sequentially generate Fibonacci numbers less than the upper limit and outputs them using a for loop.

Use Cases

Generators can also be used in the following scenarios.

  • Sequential processing of large CSV or log files
  • API pagination
  • Processing streaming data (e.g., Kafka, IoT devices)
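
As a rough sketch of the pagination case, a generator can hide page boundaries from the caller and request the next page only when it is needed. Everything here is hypothetical: fetch_page and FAKE_ITEMS stand in for a real API client and its data, and a real API would define its own paging parameters.

```python
# Hypothetical data source standing in for a remote API.
FAKE_ITEMS = [f"item-{i}" for i in range(7)]

def fetch_page(page, page_size=3):
    """Hypothetical API call: return one page of items (empty when past the end)."""
    start = page * page_size
    return FAKE_ITEMS[start:start + page_size]

def iter_items():
    """Yield items one by one, requesting the next page only when needed."""
    page = 0
    while True:
        items = fetch_page(page)
        if not items:
            return          # ending the function ends the generator
        yield from items
        page += 1

all_items = list(iter_items())
print(len(all_items))  # 7
print(all_items[:2])   # ['item-0', 'item-1']
```

A caller that stops iterating early never triggers the requests for the remaining pages.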

Summary

Concept               Key Point
yield                 Pauses and returns a value
Generator Function    A function that contains yield and returns an iterator when called
Advantages            Memory-efficient and ideal for processing large data sets
Generator Expression  Allows concise syntax like (x for x in iterable)

By using generators, you can efficiently process large data sets while conserving memory and keeping your code concise.
