Python: the basics

Python is a general purpose programming language that supports rapid development of scripts and applications.

Python's main advantages:

Interpreter

Python is an interpreted language that can be used in two ways: interactively, by typing commands at the >>> prompt, or by running a script saved in a file:

user:host:~$ python
Python 3.5.1 (default, Oct 23 2015, 18:05:06)
[GCC 4.8.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 2 + 2
4
>>> print("Hello World")
Hello World
user:host:~$ python my_script.py
Hello World
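
When Python is run with a filename, as in the last command above, it executes the statements in that file from top to bottom. The lesson does not show the contents of my_script.py, so the following is only an assumed sketch of a file that would produce the same output:

```python
# my_script.py (assumed contents; the lesson does not show the actual file)
greeting = "Hello World"
print(greeting)
```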

Using interactive Python in Jupyter-style notebooks

A convenient and powerful way to use interactive-mode Python is via a Jupyter Notebook, or similar browser-based interface.

This particularly lends itself to data analysis since the notebook records a history of commands and shows output and graphs immediately in the browser.

There are several ways you can run a Jupyter(-style) notebook - locally installed on your computer or hosted as a service on the web. Today we will use a Jupyter notebook service provided by Google: https://colab.research.google.com (Colaboratory).

Jupyter-style notebooks: a quick tour

Go to https://colab.research.google.com and log in with your Google account.

Select NEW NOTEBOOK → NEW PYTHON 3 NOTEBOOK - a new notebook will be created.


Type some Python code in the top cell, e.g.:

print("Hello Jupyter !")

Press Shift-Enter to run the contents of the cell.


You can add new cells.

Insert → Insert Code Cell


NOTE: When the text on the left-hand side of the cell is In [*] (with an asterisk rather than a number), the cell is still running. It's usually best to wait until one cell has finished running before running the next.

Let's begin writing some code in our notebook.

print("Hello Jupyter !")
output
Hello Jupyter !

In Jupyter/Colaboratory, typing the name of a variable on the last line of a cell displays its representation:

message = "Hello again !"
message
output
'Hello again !'
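
The quotes in the output above come from the value's "representation", which differs from what print shows. A small sketch of the difference, using the built-in repr function (the variable name shown_by_notebook is just illustrative):

```python
message = "Hello again !"

# repr() returns the same text the notebook displays for a bare variable name
shown_by_notebook = repr(message)

print(shown_by_notebook)  # the representation includes the quotes
print(message)            # print() shows the text without quotes
```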
# A 'hash' symbol denotes a comment
# This is a comment. Anything after the 'hash' symbol on the line is ignored by the Python interpreter

print("No comment")  # comment
output
No comment

Variables and data types

Integers, floats, strings

a = 5
a
output
5
type(a)
output
int

Adding a decimal point creates a float

b = 5.0
b
output
5.0
type(b)
output
float

int and float are collectively called 'numeric' types

(There are also other numeric types, such as complex for complex numbers. Integers can also be written in hexadecimal notation, for example 0xff.)
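
As a brief sketch of these other numeric forms: complex is a built-in numeric type, whereas hexadecimal is a way of writing an integer rather than a separate type. The variable names below are illustrative only:

```python
z = 3 + 4j            # a complex number; j marks the imaginary part
type_of_z = type(z)
magnitude = abs(z)    # distance from zero

h = 0xff              # an integer written in hexadecimal notation
type_of_h = type(h)

print(type_of_z, magnitude)
print(type_of_h, h)
```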

Challenge - Types

What is the type of the variable letters defined below?

letters = "ABACBS"

Write some code that outputs the type, then paste your answer into the Etherpad.

Strings

some_words = "Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇"
some_words
output
'Python3 strings are Unicode (UTF-8) ❤❤❤ 😸 蛇'
type(some_words)
output
str

The variable some_words is of type str, short for "string". Strings hold sequences of characters, which can be letters, numbers, punctuation or more exotic forms of text (even emoji!).
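
Because a string is a sequence of characters, its length is counted in characters rather than bytes. A minimal sketch, assuming an example string of our own choosing (not from the lesson):

```python
word = "snake 蛇"

char_count = len(word)                  # number of characters
byte_count = len(word.encode("utf-8"))  # number of bytes when stored as UTF-8
shouted = word.upper()                  # string methods return a new string

print(char_count, byte_count)
print(shouted)
```

The non-ASCII character counts as one character but takes several bytes when encoded.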

Operators

We can perform mathematical calculations in Python using the basic operators:

+ - * / % **

2 + 2  # Addition
output
4
6 * 7  # Multiplication
output
42
2 ** 16  # Power
output
65536
13 % 5  # Modulo
output
3
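
Subtraction and division from the operator list work the same way. One detail worth a quick sketch is that / always returns a float in Python 3; the // operator (floor division, not in the list above) is included here only for comparison:

```python
difference = 7 - 2      # subtraction
quotient = 7 / 2        # true division always returns a float
even_division = 6 / 3   # still a float, even when the result is whole
whole = 7 // 2          # floor division discards the fractional part
remainder = 7 % 2       # modulo gives what is left over

print(difference, quotient, even_division, whole, remainder)
```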
# int + int = int
a = 5
a + 1
output
6
# float + int = float
b = 5.0
b + 1
output
6.0
a + b
output
10.0
some_words = "I'm a string"
a = 6
a + some_words

Outputs:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-781eba7cf148> in <module>()
      1 some_words = "I'm a string"
      2 a = 6
----> 3 a + some_words

TypeError: unsupported operand type(s) for +: 'int' and 'str'
str(a) + " " + some_words
output
"6 I'm a string"
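
Converting between types works in the other direction too. A short sketch using the built-in str, int and float functions, with made-up example values:

```python
count = 5
label = "apples"

sentence = str(count) + " " + label   # int -> str, then concatenate
number = int("42") + 1                # str -> int, then add
ratio = float("2.5") * 2              # str -> float, then multiply

print(sentence)
print(number, ratio)
```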
# Shorthand: operators with assignment
a += 1
a

# Equivalent to:
# a = a + 1
output
7

Boolean operations

We can also use comparison operators (<, >, ==, !=, <=, >=) and the logical operators and, or and not. The result of these expressions is a value of type bool (a boolean), which is either True or False.

3 > 4
output
False
True and True
output
True
True or False
output
True
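
Comparisons and logical operators are often combined. A minimal sketch, using an arbitrary example value x:

```python
x = 7

in_range = x > 0 and x < 10   # both comparisons must be True
outside = x < 0 or x > 10     # at least one comparison must be True
different = x != 7            # "not equal to"
negated = not in_range        # flips True to False
result_type = type(in_range)

print(in_range, outside, different, negated, result_type)
```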

Lists and sequence types

Lists

numbers = [2, 4, 6, 8, 10]
numbers
output
[2, 4, 6, 8, 10]
# `len` gets the length of a list
len(numbers)
output
5
# Lists can contain multiple data types, including other lists
mixed_list = ["asdf", 2, 3.142, numbers, ['a','b','c']]
mixed_list
output
['asdf', 2, 3.142, [2, 4, 6, 8, 10], ['a', 'b', 'c']]

You can retrieve items from a list by their index. In Python, the first item has an index of 0 (zero).

numbers[0]
output
2
numbers[3]
output
8
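
Indexing has two further forms that this lesson does not cover in detail but that are handy to know: negative indices count from the end, and a slice returns a new list containing a range of items. A brief sketch:

```python
numbers = [2, 4, 6, 8, 10]

last = numbers[-1]          # negative indices count from the end
first_three = numbers[0:3]  # slice: items at index 0, 1 and 2 (the stop index is excluded)

print(last)
print(first_three)
```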

You can also assign a new value to any position in the list.

numbers[3] = numbers[3] * 100
numbers
output
[2, 4, 6, 800, 10]

You can append items to the end of the list.

numbers.append(12)
numbers
output
[2, 4, 6, 800, 10, 12]

You can add multiple items to the end of a list with extend.

numbers.extend([14, 16, 18])
numbers
output
[2, 4, 6, 800, 10, 12, 14, 16, 18]
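
The difference between append and extend is easiest to see when both are given a list. A small sketch with two fresh example lists:

```python
appended = [2, 4, 6]
appended.append([8, 10])   # adds the whole list as a single item

extended = [2, 4, 6]
extended.extend([8, 10])   # adds each item individually

print(appended)
print(extended)
```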

Loops

A for loop can be used to access the elements in a list or other Python data structure one at a time. We will learn more about loops in another lesson.

for num in numbers:
    print(num)
output
2
4
6
800
10
12
14
16
18

Indentation is very important in Python. Note that the second line in the example above is indented, indicating the code that is the body of the loop.
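
To illustrate how indentation decides what belongs to the loop, here is a minimal sketch that adds up a list (the list and the name total are just for this example):

```python
numbers = [2, 4, 6, 8, 10]

total = 0
for num in numbers:
    total = total + num   # indented: runs once for every item in the list

print(total)              # not indented: runs once, after the loop has finished
```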

To find out what methods are available for an object, we can use the built-in help command:

help(numbers)
output
Help on list object:

class list(object)
 |  list() -> new empty list
 |  list(iterable) -> new list initialized from iterable's items
 |
 |  Methods defined here:
 |
 |  __add__(self, value, /)
 |      Return self+value.
 |
 |  __contains__(self, key, /)
 |      Return key in self.
 |
 |  __delitem__(self, key, /)
 |      Delete self[key].
 |
 |  __eq__(self, value, /)
 |      Return self==value.
 |
 |  __ge__(self, value, /)
 |      Return self>=value.
 |
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |
 |  __getitem__(...)
 |      x.__getitem__(y) <==> x[y]
 |
 |  __gt__(self, value, /)
 |      Return self>value.
 |
 |  __iadd__(self, value, /)
 |      Implement self+=value.
 |
 |  __imul__(self, value, /)
 |      Implement self*=value.
 |
 |  __init__(self, /, *args, **kwargs)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |
 |  __iter__(self, /)
 |      Implement iter(self).
 |
 |  __le__(self, value, /)
 |      Return self<=value.
 |
 |  __len__(self, /)
 |      Return len(self).
 |
 |  __lt__(self, value, /)
 |      Return self<value.
 |
 |  [...]
 |
 |  append(...)
 |      L.append(object) -> None -- append object to end
 |
 |  clear(...)
 |      L.clear() -> None -- remove all items from L
 |
 |  copy(...)
 |      L.copy() -> list -- a shallow copy of L
 |
 |  count(...)
 |      L.count(value) -> integer -- return number of occurrences of value
 |
 |  extend(...)
 |      L.extend(iterable) -> None -- extend list by appending elements from the iterable
 |
 |  index(...)
 |      L.index(value, [start, [stop]]) -> integer -- return first index of value.
 |      Raises ValueError if the value is not present.
 |
 |  insert(...)
 |      L.insert(index, object) -- insert object before index
 |
 |  pop(...)
 |      L.pop([index]) -> item -- remove and return item at index (default last).
 |      Raises IndexError if list is empty or index is out of range.
 |
 |  remove(...)
 |      L.remove(value) -> None -- remove first occurrence of value.
 |      Raises ValueError if the value is not present.
 |
 |  reverse(...)
 |      L.reverse() -- reverse *IN PLACE*
 |
 |  sort(...)
 |      L.sort(key=None, reverse=False) -> None -- stable sort *IN PLACE*
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  __hash__ = None
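
The full help text is long. For a quick overview, the built-in dir function lists an object's attribute names; the sketch below filters out the names starting with an underscore to leave only the everyday methods (the filtering step is our own addition, not part of dir):

```python
numbers = [2, 4, 6]

public_methods = []
for name in dir(numbers):
    if not name.startswith("_"):   # skip special methods such as __len__
        public_methods.append(name)

print(public_methods)
```

Help for a single method is also available, for example help(numbers.append).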

Tuples

A tuple is similar to a list in that it's an ordered sequence of elements. However, tuples cannot be changed once created (they are "immutable"). Tuples are created by placing comma-separated values inside parentheses ().

tuples_are_immutable = ("bar", 100, 200, "foo")
tuples_are_immutable
output
('bar', 100, 200, 'foo')
tuples_are_immutable[1]
output
100
tuples_are_immutable[1] = 666

Outputs:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-39-c91965b0815a> in <module>()
----> 1 tuples_are_immutable[1] = 666

TypeError: 'tuple' object does not support item assignment
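
Since a tuple cannot be modified in place, one possible workaround is to convert it to a list, change the list, and build a new tuple from it. A minimal sketch (the variable names are illustrative):

```python
original = ("bar", 100, 200, "foo")

as_list = list(original)   # lists are mutable, so make a list copy
as_list[1] = 666
updated = tuple(as_list)   # build a new tuple from the modified list

print(original)            # the original tuple is unchanged
print(updated)
```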

Dictionaries

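Dictionaries are containers that store key-value pairs. Since Python 3.7 they keep their items in insertion order; in older versions, such as the Python 3.5 shown earlier, the order is not guaranteed.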

Other programming languages might call this a 'hash', 'hashtable' or 'hashmap'.

pairs = {'Apple': 1, 'Orange': 2, 'Pear': 4}
pairs
output
{'Apple': 1, 'Orange': 2, 'Pear': 4}
pairs['Orange']
output
2
pairs['Orange'] = 16
pairs
output
{'Apple': 1, 'Orange': 16, 'Pear': 4}
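
Assigning to a key that does not exist yet adds a new pair, and the in operator tests whether a key is present. A short sketch, in which the extra keys are invented for the example:

```python
pairs = {'Apple': 1, 'Orange': 2, 'Pear': 4}

pairs['Banana'] = 8             # a new key is added by assignment
has_apple = 'Apple' in pairs    # membership test on the keys
has_kiwi = 'Kiwi' in pairs
kiwi_count = pairs.get('Kiwi', 0)  # .get() returns a default instead of raising KeyError

print(pairs)
print(has_apple, has_kiwi, kiwi_count)
```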

The items method returns a sequence of the key-value pairs as tuples.

values returns a sequence of just the values.

keys returns a sequence of just the keys.


In Python 3, the .items(), .values() and .keys() methods return a 'dictionary view' object that behaves like a list or tuple in for loops but doesn't support indexing. 'Dictionary views' stay in sync even when the dictionary changes.

You can turn them into a normal list or tuple with the list() or tuple() functions.

pairs.items()
# list(pairs.items())
output
dict_items([('Apple', 1), ('Orange', 16), ('Pear', 4)])
pairs.values()
# list(pairs.values())
output
dict_values([1, 16, 4])
pairs.keys()
# list(pairs.keys())
output
dict_keys(['Apple', 'Orange', 'Pear'])
len(pairs)
output
3
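
The statement that dictionary views stay in sync can be checked with a small sketch: a view reflects later changes to the dictionary, whereas a list made from it is a fixed copy. The names keys_view and snapshot are illustrative:

```python
pairs = {'Apple': 1, 'Orange': 16, 'Pear': 4}

keys_view = pairs.keys()    # a view onto the dictionary's keys
snapshot = list(keys_view)  # an independent list copy

pairs['Banana'] = 8         # change the dictionary afterwards

print(keys_view)            # the view now includes 'Banana'
print(snapshot)             # the list copy does not

# Views can be looped over directly; items() yields (key, value) tuples
for fruit, count in pairs.items():
    print(fruit, count)
```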
dict_of_dicts = {'first': {1:2, 2: 4, 4: 8, 8: 16}, 'second': {'a': 2.2, 'b': 4.4}}
dict_of_dicts
output
{'first': {1: 2, 2: 4, 4: 8, 8: 16}, 'second': {'a': 2.2, 'b': 4.4}}
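
Values inside a nested dictionary can be reached by chaining square brackets, one key per level. A brief sketch using the same dictionary:

```python
dict_of_dicts = {'first': {1: 2, 2: 4, 4: 8, 8: 16}, 'second': {'a': 2.2, 'b': 4.4}}

inner = dict_of_dicts['first']          # the inner dictionary stored under 'first'
value = dict_of_dicts['first'][4]       # then look up key 4 inside it
b_value = dict_of_dicts['second']['b']

print(inner)
print(value, b_value)
```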

Challenge - Dictionaries

Given the dictionary:

jam_ratings = {'Plum': 6, 'Apricot': 2, 'Strawberry': 8}

How would you change the value associated with the key Apricot to 9?

A) jam_ratings = {'apricot': 9}

B) jam_ratings[9] = 'Apricot'

C) jam_ratings['Apricot'] = 9

D) jam_ratings[2] = 'Apricot'