Python is very simple and high level language that everyone can learn to prototype their ideas. But this does not come without a price. Because of its high level library, it is much harder to know what is implemented inside it. I know that you can now say “well you don’t need to know, that’s beauty of encapsulation etc…” but when it comes to performance, you should really be careful of what data structures you use. Here in this article I am going to explore few builtin data structures in Python and explain how to implement missing ones.

Lists … or are they?

You probably know about lists in Python, if you don’t, here is example:

l = [1,2,3]

Pretty simple, right? You can access elements by indexing:

l[2] # will be 3 
l[0] # will be 1

you can even add elements to it:

l.append(5)

Now the problem with these lists is that they are not actually lists! I bet computer science professors are angry at Python developers for assigning wrong names to things. Lists in Python are actually dynamic arrays. They are dynamic arrays of pointers to objects. In C++, one can implement this behaviour using std::vector for instance like this:

std::vector<PythonObject> l;
l.push_back(PythonObject::Integer(1));
l.push_back(PythonObject::Integer(2));
l.push_back(PythonObject::Integer(3));

so lists in Python behave like vector in C++. Accessing elements is fast and adding elements is fast on average. When capacity of array is full, addition will double its capacity and copy contents to new array.

But what about real lists?

Well, as far as I know, there is no implementation of linked lists in Python. Please correct me if I am wrong down in comment section. Linked lists can be easily implemented and I will explain its implementation now.

First of all, we will need simple class called list_node to represent node of our list. This object will contain value and pointer to next node.Python provides really nice way to implement such simple class and it is called named tuple:

from collections import namedtuple
list_node = namedtuple("list_node", ["value", "next"])
my_node = list_node(value=2, next=None)

This is implementation of simple node of single-linked list.

This is implementation of whole container:

class list_node:
    def __init__(self, value, next):
        self.value = value
        self.next = next

class linked_list:
    
    def __init__(self, values=None):
        self.head = None
        self.first = None
        if values is not None:
            for v in values:
                self.append(v)

    def append(self, value):
        if self.head is None:
            self.head = list_node(value=value, next=None)
            self.first = self.head
        else:
            self.head.next = list_node(value=value, next=None)
            self.head = self.head.next

    def __iter__(self):
        curr = self.first
        while curr is not None:
            yield curr.value
            curr = curr.next

and now you can do something like:

l = linked_list([1,2,3])
l.append(4)
for v in l:
    print(v)

From here, we can implement double-linked list:

class list_node:
    def __init__(self, value, next, prev):
        self.value = value
        self.next = next
        self.prev = prev

class linked_list:
    
    def __init__(self, values=None):
        self.head = None
        self.first = None
        if values is not None:
            for v in values:
                self.append(v)

    def append(self, value):
        if self.head is None:
            self.head = list_node(value=value, next=None, prev=None)
            self.first = self.head
        else:
            self.head.next = list_node(value=value, next=None, prev=self.head)
            self.head = self.head.next

    def __iter__(self):
        curr = self.first
        while curr is not None:
            yield curr.value
            curr = curr.next

    def pop_last(self):
        if self.head is not None:
            self.head = self.head.prev
            if self.head is None:
                self.first = None
            else:
                self.head.next = None
        else:
            raise ValueError("List is empty")

Now, other data structures can be implemented from this.