Python's `copy` Module
This article explains Python's copy module, focusing on the differences between shallow and deep copies. It covers everything from the basic mechanics of object duplication to applications in custom classes, with practical examples throughout.
Python's copy Module
Python's copy module is the standard module for handling object duplication (copying). The goal is to understand the difference between shallow and deep copies and to be able to control copying behavior for custom objects.
Basics: What is a shallow copy?
Here we illustrate the behavior of shallow copies for mutable objects such as lists and dictionaries. A shallow copy duplicates only the top-level object and shares the references inside.
```python
# Example: shallow copy of a nested list
import copy

# A nested list
original = [1, [2, 3], 4]
shallow = copy.copy(original)

# Mutate a nested element
shallow[1].append(99)

print("original:", original)  # nested change visible in original
print("shallow: ", shallow)
```

In this code, the top-level list is duplicated, but the inner sublist is shared by reference, so `shallow[1].append(99)` is also reflected in `original`. This is typical shallow-copy behavior.
What is a deep copy?
A deep copy recursively duplicates the object and all of its internal referents, producing an independent object. Use it when you want to safely duplicate complex nested structures.
```python
# Example: deep copy with nested structures
import copy

original = [1, [2, 3], {'a': [4, 5]}]
deep = copy.deepcopy(original)

# Mutate nested elements of the deep copy
deep[1].append(100)
deep[2]['a'].append(200)

print("original:", original)  # original remains unchanged
print("deep:    ", deep)
```

In this example, changes to `deep` do not affect `original`. `deepcopy` copies the entire object graph, yielding an independent replica.
How to decide which one to use
Which copy to use should be determined by the object's structure and your purpose. If you do not want changes to internal mutable elements to affect the original object, deepcopy is appropriate. This is because it recursively duplicates the entire object, creating a completely independent copy.
On the other hand, if duplicating only the top-level object is sufficient and you value speed and memory efficiency, copy (a shallow copy) is more suitable.
- You want to avoid the original changing when you modify internal mutable elements → use `deepcopy`.
- Duplicating only the top level is enough and you want to prioritize performance (speed/memory) → use `copy.copy` (shallow copy); see the rough timing sketch below.
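To get a feel for the performance gap the second bullet refers to, here is a minimal timing sketch using the standard timeit module. The sample data is made up for this sketch, and the exact numbers will vary with your structures and interpreter.

```python
# Rough timing comparison of shallow vs. deep copy (illustrative only)
import copy
import timeit

# Sample nested data (made up for this sketch)
data = {"rows": [[i, i * 2, i * 3] for i in range(1000)]}

shallow_time = timeit.timeit(lambda: copy.copy(data), number=1000)
deep_time = timeit.timeit(lambda: copy.deepcopy(data), number=1000)

print(f"copy.copy:     {shallow_time:.3f} s")
print(f"copy.deepcopy: {deep_time:.3f} s")  # typically far slower on nested data
```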
Handling built-in and immutable objects
Immutable objects (e.g., int, str, and tuples whose contents are immutable) usually do not need to be copied. copy.copy may return the same object when appropriate.
```python
# Example: copying immutables
import copy

a = 42
b = copy.copy(a)
print(a is b)  # True: copy.copy returns the same int object

s = "hello"
t = copy.deepcopy(s)
print(s is t)  # True: strings are immutable, so no copy is made
```

Since copying immutable objects provides little efficiency benefit, the same object may be reused. This is rarely something you need to worry about in application design.
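One subtlety worth knowing: a tuple is returned as-is only if everything inside it is also immutable. The short sketch below illustrates this; the variable names are arbitrary.

```python
# Sketch: tuples are only truly duplicated when they contain mutable elements
import copy

t1 = (1, 2, 3)
print(copy.deepcopy(t1) is t1)  # True: nothing inside needs copying

t2 = (1, [2, 3])
t2_copy = copy.deepcopy(t2)
print(t2_copy is t2)            # False: the inner list forces a real copy
print(t2_copy[1] is t2[1])      # False: the inner list itself is duplicated
```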
Custom classes: defining __copy__ and __deepcopy__
If the default copying is not what you expect for your class, you can provide your own copy logic. If you define __copy__ and __deepcopy__, copy.copy and copy.deepcopy will use them.
```python
# Example: customizing copy behavior
import copy

class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

    def __repr__(self):
        return f"Node({self.value!r}, children={self.children!r})"

    def __copy__(self):
        # Shallow copy: create a new Node, but reuse the children list reference
        return self.__class__(self.value, self.children)

    def __deepcopy__(self, memo):
        # Deep copy: register the new node in memo first, then copy the children
        new = self.__class__(copy.deepcopy(self.value, memo))
        memo[id(self)] = new
        new.children = [copy.deepcopy(child, memo) for child in self.children]
        return new
```

- By implementing `__copy__` and `__deepcopy__`, you can flexibly control class-specific copying behavior. For example, you can handle cases where certain attributes should refer to (share) the same object, while other attributes should be duplicated as entirely new objects.
- The `memo` argument passed to `__deepcopy__` is a dictionary that records objects already processed during recursive copying and prevents infinite loops caused by circular references. Registering the new node in `memo` before copying the children is what keeps this safe for cycles.
```python
# Build a small tree
root_node = Node('root', [Node('child1'), Node('child2')])

shallow_copy = copy.copy(root_node)    # shallow copy
deep_copy = copy.deepcopy(root_node)   # deep copy

# Modify children
# affects both root_node and shallow_copy
shallow_copy.children.append(Node('child_shallow'))
# affects deep_copy only
deep_copy.children.append(Node('child_deep'))

# Print results
print("root_node:", root_node)
print("shallow_copy:", shallow_copy)
print("deep_copy:", deep_copy)
```

- In this code, the variable `shallow_copy` is created via a shallow copy, so only the top level of the object is duplicated and the objects it references internally (in this case, the `children` list) are shared with the original `root_node`. As a result, when you add a new node to `shallow_copy`, the shared `children` list is updated and the change is reflected in `root_node` as well.
- On the other hand, the variable `deep_copy` is created via a deep copy, so the internal structure is recursively duplicated, producing a tree completely independent of `root_node`. Even if you add a new node to `deep_copy`, it does not affect `root_node`.
Cyclic references (recursive objects) and the importance of memo
When a complex object graph references itself (cyclic references), deepcopy uses a memo dictionary to track objects already copied and prevent infinite loops.
```python
# Example: recursive list and deepcopy memo demonstration
import copy

a = []
b = [a]
a.append(b)  # a -> [b], b -> [a] (cycle)

# deepcopy can handle cycles
deep = copy.deepcopy(a)
print("deep copy succeeded, length:", len(deep))
```

Internally, `deepcopy` uses `memo` to look up objects it has already duplicated, enabling safe copying even in the presence of cycles. You need a similar mechanism when performing the recursion manually.
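The memo also guarantees that an object appearing multiple times in the graph is copied only once, so shared structure is preserved inside the copy. A minimal demonstration, with made-up sample data:

```python
# Sketch: shared objects are duplicated only once thanks to the memo
import copy

shared = [1, 2, 3]
data = {"first": shared, "second": shared}  # the same list appears twice

duplicate = copy.deepcopy(data)
print(duplicate["first"] is duplicate["second"])  # True: sharing is preserved
print(duplicate["first"] is shared)               # False: it is a new list
```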
copyreg and serialization-style custom copying (advanced)
If you want to register special copying behavior in coordination with libraries, you can use the standard copyreg module to register how objects are (re)constructed. This is useful for complex objects and C extension types.
```python
# Example: using copyreg to customize pickling/copy behavior (brief example)
import copy
import copyreg

class Wrapper:
    def __init__(self, data):
        self.data = data

def reduce_wrapper(obj):
    # Return a callable and args so the object can be reconstructed
    return (Wrapper, (obj.data,))

copyreg.pickle(Wrapper, reduce_wrapper)

w = Wrapper([1, 2, 3])
w_copy = copy.deepcopy(w)
print("w_copy.data:", w_copy.data)
```

Using `copyreg`, you can provide reconstruction rules that affect both `pickle` and `deepcopy`. It's an advanced API, and in most cases `__deepcopy__` is sufficient.
Practical caveats and pitfalls
There are several important points to ensure correct behavior when using the copy module. Below we explain common pitfalls you may encounter in development and how to avoid them.
- Performance: `deepcopy` can consume significant memory and CPU, so consider carefully whether it's necessary.
- Elements you want to share: if you want some attributes, such as large caches, to remain shared, deliberately keep the reference instead of deep-copying it inside `__deepcopy__` (see the sketch after this list).
- Immutable internal state: if you hold only immutable data internally, copying may be unnecessary.
- Threads and external resources: resources that cannot be copied, such as sockets and file handles, are either meaningless to copy or will cause errors, so avoid copying them by design.
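As an illustration of the "elements you want to share" point, here is a minimal sketch of a `__deepcopy__` that duplicates ordinary state but deliberately keeps a cache attribute shared. `ServiceClient` and its attributes are made up for this example.

```python
# Sketch: deep-copy most attributes but intentionally share a large cache
import copy

class ServiceClient:  # hypothetical example class
    def __init__(self, settings, cache):
        self.settings = settings  # per-instance state: should be duplicated
        self.cache = cache        # large shared cache: keep the same reference

    def __deepcopy__(self, memo):
        new = self.__class__(
            copy.deepcopy(self.settings, memo),  # independent copy
            self.cache,                          # intentionally shared, not copied
        )
        memo[id(self)] = new
        return new

client = ServiceClient({"retries": 3}, cache={})
clone = copy.deepcopy(client)
print(clone.settings is client.settings)  # False: duplicated
print(clone.cache is client.cache)        # True: shared
```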
Practical example: a pattern for safely updating dictionaries
When updating a complex configuration object, this example uses deepcopy to protect the original settings.
```python
# Example: safely update a nested configuration using deepcopy
import copy

default_config = {
    "db": {"host": "localhost", "ports": [5432]},
    "features": {"use_cache": True, "cache_sizes": [128, 256]},
}

# Create a working copy to modify without touching defaults
working = copy.deepcopy(default_config)
working["db"]["ports"].append(5433)
working["features"]["cache_sizes"][0] = 512

print("default_config:", default_config)
print("working:", working)
```

Using `deepcopy`, you can safely create derived configurations without risking damage to the default settings. This is especially useful for nested mutable structures such as configuration objects.
Best practices
To use the copy module safely and effectively, it's important to keep the following practical guidelines in mind.
- Choose between shallow and deep copies based on whether changes will occur and their scope.
- Implement `__copy__` and `__deepcopy__` in custom classes to achieve the expected behavior.
- Because deep copies are costly, reduce the need for copying through design when possible. Consider immutability and explicit clone methods, among other techniques (see the sketch after this list).
- When dealing with cyclic references, either rely on `deepcopy` or provide a memo-like mechanism manually.
- Design so that external resources such as file handles and threads are not copied.
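As one way to follow the "immutability and explicit clone methods" advice, here is a minimal sketch using a frozen dataclass with an explicit clone() method. `Point` and its fields are made up for illustration.

```python
# Sketch: an explicit clone() method on an immutable value object
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Point:  # hypothetical example class
    x: int
    y: int

    def clone(self, **changes):
        # Return a new instance, overriding selected fields explicitly
        return replace(self, **changes)

p = Point(1, 2)
q = p.clone(x=10)
print(p, q)  # Point(x=1, y=2) Point(x=10, y=2)
```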
Conclusion
The copy module is the standard tool for object duplication in Python. By correctly understanding the differences between shallow and deep copies and implementing custom copying behavior when needed, you can perform duplication safely and predictably. By clarifying at the design stage whether copying is truly necessary and what should be shared, you can avoid unnecessary bugs and performance issues.
You can follow along with this article using Visual Studio Code on our YouTube channel, so please check out the channel as well.