4: Data structures (Lists and Tuples)

A data structure is an object that can hold several values at once. A data structure forms a sequence when its values are arranged in a definite order, which is the case for lists and tuples. A dictionary, by contrast, is not a sequence.

1. Creating Lists and Tuples

A list or a tuple can contain values of any type (int, float, bool, str), and a single list or tuple can mix several types. We say that they are heterogeneous structures.

The difference between the two is that a list is mutable whereas a tuple is immutable: once a tuple is created, its elements cannot be changed, added, or removed.

# Lists
list_1 = [1, 4, 2, 7, 35, 84]
cities = ['Paris', 'Berlin', 'London', 'Brussels']
nested_list = [list_1, cities] # a list can even contain lists! This is called a nested list

# Tuples
tuple_1 = (1, 2, 6, 2)

print(cities)
['Paris', 'Berlin', 'London', 'Brussels']
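
The mutability difference described above can be checked directly. The following is a minimal sketch (the names `numbers_list`, `numbers_tuple` and `tuple_was_modified` are made up for this illustration): assigning to an index works on a list, while the same operation on a tuple raises a `TypeError`.

```python
# Hypothetical example names, not part of the original notebook
numbers_list = [1, 2, 6, 2]
numbers_tuple = (1, 2, 6, 2)

numbers_list[0] = 100  # allowed: lists are mutable
print(numbers_list)

try:
    numbers_tuple[0] = 100  # not allowed: tuples are immutable
    tuple_was_modified = True
except TypeError as error:
    tuple_was_modified = False
    print('TypeError:', error)
```

The tuple is left unchanged after the failed assignment, which is why tuples are a reasonable choice for values that should stay fixed.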

2. Indexing and Slicing

In a sequence, each element has a position given by its index. The first element has index 0, and negative indices count from the end (-1 is the last element).

To access a single element of a list or a tuple, we use a technique called indexing.

To access several elements of a list or a tuple at once, we use a technique called slicing.

# INDEXING

print('complete sequence:', cities)
print('index 0:', cities[0])
print('index 1:', cities[1])
print('last index (-1):', cities[-1])
complete sequence: ['Paris', 'Berlin', 'London', 'Brussels']
index 0: Paris
index 1: Berlin
last index (-1): Brussels
# SLICING [start (included) : end (excluded) : step]

print('complete sequence:', cities)
print('index 0-2:', cities[0:3])
print('index 1-2:', cities[1:3])
print('reverse order:', cities[::-1])
complete sequence: ['Paris', 'Berlin', 'London', 'Brussels']
index 0-2: ['Paris', 'Berlin', 'London']
index 1-2: ['Berlin', 'London']
reverse order: ['Brussels', 'London', 'Berlin', 'Paris']
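
Each part of `[start:end:step]` can be left out, in which case Python uses a default (start of the sequence, end of the sequence, step of 1). The sketch below illustrates a few common combinations; the variable names are chosen only for this example.

```python
cities = ['Paris', 'Berlin', 'London', 'Brussels']

first_two = cities[:2]     # start omitted: begins at index 0
from_index_2 = cities[2:]  # end omitted: runs to the end of the list
every_other = cities[::2]  # step of 2: indices 0 and 2
last_two = cities[-2:]     # negative start: counts from the end
full_copy = cities[:]      # everything omitted: a new list with the same elements

print(first_two)
print(from_index_2)
print(every_other)
print(last_two)
print(full_copy is cities)  # a slice is a new list object, not the original
```

Because a slice always builds a new list, modifying `full_copy` afterwards does not modify `cities`.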

3. Useful actions on lists

cities = ['Paris', 'Berlin', 'London', 'Brussels'] # initial list
print(cities)

cities.append('Dublin') # add one element at the end of the list
print(cities)

cities.insert(2, 'Madrid') # insert an element at the given index
print(cities)

cities.extend(['Amsterdam', 'Rome']) # add every element of another list at the end of our list
print(cities)

print('length of the list:', len(cities)) # display the length of the list

cities.sort(reverse=False) # sort the list in place, in ascending (here alphabetical) order
print(cities)

print(cities.count('Paris')) # count the number of times an element appears in the list
['Paris', 'Berlin', 'London', 'Brussels']
['Paris', 'Berlin', 'London', 'Brussels', 'Dublin']
['Paris', 'Berlin', 'Madrid', 'London', 'Brussels', 'Dublin']
['Paris', 'Berlin', 'Madrid', 'London', 'Brussels', 'Dublin', 'Amsterdam', 'Rome']
length of the list: 8
['Amsterdam', 'Berlin', 'Brussels', 'Dublin', 'London', 'Madrid', 'Paris', 'Rome']
1
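
Two of these methods are easy to confuse with close relatives. The sketch below, using illustrative variable names that are not in the original notebook, contrasts `append` with `extend`, and the in-place `sort` method with the built-in `sorted` function.

```python
cities = ['Paris', 'Berlin', 'London']

# append adds its argument as ONE element, even when that argument is a list
appended = cities.copy()
appended.append(['Rome', 'Madrid'])
print(len(appended), appended)

# extend adds each element of its argument separately
extended = cities.copy()
extended.extend(['Rome', 'Madrid'])
print(len(extended), extended)

# sorted() returns a new sorted list and leaves the original untouched
sorted_copy = sorted(cities)
print(sorted_copy, cities)

# sort() reorders the list in place and returns None
result = cities.sort()
print(result, cities)
```

Writing `cities = cities.sort()` is therefore a common mistake: it replaces the list with `None`.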

Lists and tuples work well with the if/else and for control structures: the in operator tests whether a value is present, and a for loop visits the elements one by one.

if 'Paris' in cities:
  print('yes')
else:
  print('no')

for element in cities:
  print(element)
yes
Amsterdam
Berlin
Brussels
Dublin
London
Madrid
Paris
Rome

The enumerate function is very useful when you need both the elements of a list and their indices. It is widely used in data science.

for index, element in enumerate(cities):
  print(index, element)
0 Amsterdam
1 Berlin
2 Brussels
3 Dublin
4 London
5 Madrid
6 Paris
7 Rome
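
By default enumerate counts from 0, but it accepts an optional `start` argument. The short sketch below assumes we want a human-friendly numbering starting at 1; the list `numbered` is introduced only for this example.

```python
cities = ['Amsterdam', 'Berlin', 'Brussels']

numbered = []
for rank, city in enumerate(cities, start=1):
    # rank starts at 1 here; the elements themselves are unchanged
    numbered.append(f'{rank}. {city}')
    print(rank, city)
```

Note that `start` only changes the counter, so `rank` is no longer a valid index into `cities` (the element with rank 1 is `cities[0]`).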

The zip function is also very useful for iterating through two lists in parallel. If one list is shorter than the other, the for loop stops as soon as the shorter list is exhausted, and the remaining elements of the longer list are ignored.

list_2 = [312, 52, 654, 23, 65, 12, 678]
for element_1, element_2 in zip(cities, list_2):
  print(element_1, element_2)
Amsterdam 312
Berlin 52
Brussels 654
Dublin 23
London 65
Madrid 12
Paris 678
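
In the output above, 'Rome' is silently dropped because `list_2` has only seven values. If the extra elements matter, one option is `itertools.zip_longest` from the standard library, which pads the shorter input instead of stopping. The sketch below uses small made-up lists to show both behaviours side by side.

```python
from itertools import zip_longest

cities = ['Amsterdam', 'Berlin', 'Brussels']
values = [312, 52]  # deliberately one element shorter

# zip stops at the shorter list
truncated = list(zip(cities, values))
print(truncated)

# zip_longest continues to the longer list and fills the gaps with fillvalue
padded = list(zip_longest(cities, values, fillvalue=None))
print(padded)
```

Which behaviour is appropriate depends on the data: truncation is convenient when the lists are expected to match, while padding makes a length mismatch visible.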

4. Exercise and Solution

Modify the following function, which prints terms of the Fibonacci sequence, so that it stores the terms in a list and returns that list at the end of the function.

Exercise:

def fibonacci(n):
    a = 0
    b = 1
    while b < n:
        a, b = b, a+b
        print(a)

Solution:

def fibonacci(n):
    a = 0
    b = 1
    fib = [a] # create the list fib, starting with the first term (0)
    while b < n:
        a, b = b, a+b
        fib.append(a) # append the new value of a to the end of fib
    return fib

print(fibonacci(1000))
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987]
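
One way to check the returned list is to verify the Fibonacci rule itself: every term from the third onward should equal the sum of the two terms before it. The sketch below repeats the solution so that it runs on its own, then combines slicing and zip from the earlier sections; the name `recurrence_holds` is introduced only for this check.

```python
def fibonacci(n):
    a = 0
    b = 1
    fib = [a]
    while b < n:
        a, b = b, a + b
        fib.append(a)
    return fib

fib = fibonacci(1000)

# Walk through the list three terms at a time:
# fib pairs with fib[1:] (shifted by one) and fib[2:] (shifted by two)
recurrence_holds = all(
    current == previous + before_previous
    for before_previous, previous, current in zip(fib, fib[1:], fib[2:])
)
print(recurrence_holds)
```

Since zip stops at the shortest input, the loop ends automatically when `fib[2:]` runs out, so no index arithmetic is needed.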