In [146]:
%matplotlib inline

LAB 1 - Python basics and course teaser¶

MARI4600/BIO5600: Ecosystem Modelling for Aquaculture (Dalhousie University)


Lab mechanics (how to work through the labs of this course)¶

For the labs in this course you will need 3 programs (sometimes 4):

  1. Lab manual: It is a web-page with instructions on what to do in the lab. If you are reading "this", you already found the Lab manual. The Lab manual is viewed inside your browser (e.g. Firefox, Chrome, Safari, etc).
  2. Brightspace LAB Quiz: In the Lab manual there are several questions that you will need to answer in the corresponding quiz in Brightspace. Like the Lab manual, the Brightspace LAB Quiz is viewed inside your browser (e.g. Firefox, Chrome, Safari, etc). You can get to the Quizzes section in Brightspace by following Assessment > Quizzes.
  3. Spyder: This is the main program that you will use to write and run Python code. We'll talk more about Spyder below.
  4. Terminal: Every once in a while you will need to install additional Python modules. This is done in the Terminal. More on this below.

Work flow in a typical Lab¶

In most labs, you will be required to read along in the Lab manual and copy-paste code from it into Spyder to run it. This will create output in the form of numbers, graphs, maps, etc. Every once in a while you will need to answer questions, which are written inside orange boxes like the one below. The questions need to be answered in the corresponding Brightspace LAB quiz. So you will be jumping back and forth between the Lab manual, Spyder and the LAB quiz throughout the lab.

Let's do a test question...


Sample question: Can you see THIS question in your Brightspace **LAB quiz?**

Where is my code?¶

Data, code and the results produced by the code can be stored in many places, including "the cloud" (e.g. BrightSpace, your OneDrive, GitHub, a website, etc) and physical computers including the computer that you are working on at this moment. Additionally, within your computer, data and code can be stored in the hard-drive, in memory and on the screen.

Note that:

  • Computers can ONLY see and work with code and data stored in memory
  • Humans can ONLY see code and data shown on the screen
  • When turning your computer off, code/data in both the memory and the screen are lost forever! Therefore, make sure that you save your work on your hard-drive or the cloud, before turning your computer off.

We will see below how to move code (and data) between all these storage areas (i.e. memory, hard-drive, screen, cloud). Losing track of where your code/data is stored is one of the most common sources of errors during coding.

Anaconda¶

Anaconda is a free "bundle" of Python and many modules, libraries and other programs commonly used in scientific research. Anaconda is preinstalled on all Lab computers; however, if you want to follow along on your own laptop, you can install Anaconda (make sure you download the version with Python 3.8) from here: https://www.anaconda.com/products/individual#Downloads

Note that when you installed Anaconda, a lot of code was copied from "The Cloud" into your hard-drive, in a location that we will call "the Python Directory". The actual location of "the Python Directory" is not important to us in this course. However if you are curious and try to find it, just make sure that you DO NOT move, delete or add anything within "the Python Directory"! You can break Python's proper functioning.

Spyder¶

Spyder is a free "Integrated Development Environment" that comes included in Anaconda and that is especially designed to work with Python.

  • To open Spyder, search in the taskbar for "Spyder", the program looks as follows:

Note that when you open Spyder, behind the scenes a bunch of code was loaded into memory from "the Python Directory" (i.e. hard-drive).

The Spyder screen is divided in multiple panels. The two most important are the Console and the Editor.

Spyder's "Console"¶

The Console (also known as the interpreter) is a live instance of Python. It is the most direct way to connect your computer's memory (i.e. where Python is loaded) to your screen. This is the place where Python will display error messages, warnings, and some code output. The Console is also where you can quickly interact with Python by writing code and then pressing [Enter] (note that when you press [Enter], anything you wrote in the Console will be loaded into memory and executed immediately). Let's try it:

Type the following code in the Console, then press [Enter]

In [147]:
print('hello world!')
hello world!

After you typed print('hello world!')... What got displayed on the screen?

Spyder's "Working Directory"¶

In the top right corner (usually) Spyder will show you its working directory.

The working directory is the location in your computer where Python is reading and writing files.

When you are working on a lab that requires ancillary data files or ancillary Python code files, you need to make sure these files are in the working directory. You can download them directly to the working directory, or you can change the working directory to be the folder where you downloaded your files.

To change the working directory, simply click on the "folder" icon and select the new location. When you execute Python code with instructions to write data or a figure to a file, this file will be saved in the working directory.

You can also see the location of the working directory by typing in the Console the following (then press [Enter]):

In [148]:
%pwd
Out[148]:
'C:\\Users\\Diego\\Documents\\2.Marine Ecology\\aquaculture-modelling\\Week1'
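
As a small sketch of the same idea in plain Python (assuming only the standard `os` module, which the lab does not otherwise use here), you can ask Python where its working directory is and which files it can see there:

```python
import os

# Ask Python for its current working directory (same information as %pwd)
working_dir = os.getcwd()
print(working_dir)

# List the files and folders that Python can see in the working directory
files_here = os.listdir(working_dir)
print(files_here)
```

If an ancillary file that you downloaded does not appear in that list, it is probably sitting in a different folder than the working directory.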

You are starting to work on a lab that requires 2 ancillary data files and 3 ancillary python code files. You download all of these files from BrightSpace into your Laptop. Which of the following describes best what happened to the data/code?


You are starting to work on a lab that requires 2 ancillary data files and 3 ancillary python code files. **Where** in your laptop do you have to put these files?

Spyder's "Editor"¶

The Editor is the space where you write your code so that you can execute it later. The Editor is shown below inside the green box.

The Editor is just a text editor like Notepad (in Windows) or TextEdit (in Mac). However, note that Spyder's Editor color-codes your text so that it can be read with ease. It will also perform some basic quality control and will tell you if you made a mistake (more on this later).

You can run the code you wrote in the editor by clicking on the "run" button (i.e. green triangular button shown below).

Note that the first time you run your code the Editor will prompt you to save your file. You can name your file anything you want, but the extension will be automatically set as .py, which is the default extension for Python files. Also note that, by default, you will be prompted to save your file in the current working directory.

After you write some code in the Editor, note that clicking the "save" button technically transfers code from the Screen into a file in your hard-drive. Also, if you click the "run" button, it automatically saves your code and then loads it into memory for execution.

Let's do a test. Type the following in the Editor and click the "run" button:

In [149]:
print('hello world!')
hello world!

As you see, when you run the code in the Editor, the output is displayed in the Console


After you write some code in the **Editor**, what happens when you click the **"save"** button?


After you write some code in the **Editor**, what happens when you click the **"run"** button?

The Terminal (Anaconda Prompt)¶

When you need to install additional Python modules or libraries, you will need the Terminal. To open it (if you are in Windows), search in the task bar for "Anaconda Prompt", it should look like this:

If you are a Mac user, you can simply use Mac's built-in Terminal instead of the "Anaconda Prompt".

We will use the Terminal a bit later (at the end of this lab). For now, you can close Terminal, we'll open it again when we need it.


Comments and statements¶

  • Comments are lines of text meant to be read ONLY by humans (i.e. the computer ignores these lines). Comments usually contain annotations and additional information that make the code easier for the programmer to understand. Comments are preceded by a # (hashtag)

  • A Statement is a line of text read by the computer. This is "the code". A statement contains instructions for the computer to do a simple task

Take a look at the sequence of statements below:

In [150]:
# This is a Comment because it is preceded by a # (hashtag)
a = 2

# Lets print our variables to screen
print(a) # Note that comments can be written after a statement (but not before)
2

Above...

  • The first line is a comment
  • The second line is a statement assigning the value of a to be equal to 2
  • Then there is a blank line. Blank lines are ignored by Python (they exist to space the code and make it easier to read).
  • The fourth line is a comment
  • The fifth line is a statement asking Python to print the value of a to screen. After the statement, there is another comment

If you copy-paste the code above into Spyder's Editor and click [run], it will print a 2 to the Console.

Once `a = 2` has been executed, the value of `a` will be stored "in memory" until you turn off python, until you delete `a`, or until you update `a` with a different value.
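
The short sketch below illustrates the last two cases (updating and deleting a variable). It uses `del` and a `try`/`except` block, which are standard Python but have not been introduced in this lab, so treat it as a preview rather than something you need to memorize:

```python
a = 2          # "a" is now stored in memory with the value 2
print(a)

a = 5          # updating "a" replaces the old value
print(a)

del a          # deleting "a" removes it from memory

# Asking for "a" now fails, because the name no longer exists
try:
    print(a)
except NameError as err:
    message = str(err)
    print('Python says:', message)
```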

Variables¶

A Variable is a user-defined label that can be used to name anything. For example, in the statement above (a = 2) "a" is a variable. You can name variables almost anything you want, as long as the name uses only letters, digits and underscores (no spaces or punctuation) and does not start with a digit, for example:

In [151]:
my_cute_variable = 35

print(my_cute_variable)
35

Note that it is a good practice to choose variable names that describe what is stored within the variable.

Also note that you can use Spyder's Variable explorer to see what variables are currently loaded into memory.


Basic data types¶

In every computer language there are many data types. Each data type behaves under different rules, thus it is important to always be aware of what data type each statement is working with. Below are some of the most common data types used in Python:

Integers¶

Integers are "round" numbers (i.e. no decimal point)

In [152]:
my_int = 2

print(my_int)
2

Floats¶

Floats are numbers with a decimal point

In [153]:
my_float = 2.3

print(my_float)
2.3

Strings¶

Strings are text (i.e. sequences of characters such as letters, digits and symbols). Note that you need to wrap the content of the string within quotes to tell Python that it is a string

In [154]:
# You can use single quotes ' or double quotes " to create strings
my_string_1 = 'hello'

my_string_2 = "2.3"

print(my_string_1)
print(my_string_2)
hello
2.3

Note that even though "2.3" is a number, it will be treated by Python as "letters". Therefore you won't be able to do math with "2.3"
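
If you do need to do math with a number that is stored as a string, one option is to convert it first. The sketch below uses Python's built-in `float` function; the variable names `my_number` and `result` are made up for this illustration:

```python
my_string_2 = "2.3"

# Convert the string into a float so that Python treats it as a number
my_number = float(my_string_2)

print(type(my_string_2))   # the original is still a string
print(type(my_number))     # the converted copy is a float

# Now math works
result = my_number + 1
print(result)
```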


What data type is the following:

324.91


What data type is the following:

'the house is green'


What data type is the following:

87572


If you execute the following line in Python...

a = 32 Then, what is "**32**"?


If you execute the following line in Python...

a = 32 Then, what is "**a**"?

Simple operations between different data types¶

Adding an integer plus a float yields a float

In [155]:
my_int + my_float
Out[155]:
4.3

... but adding a string to a float (or to an integer) is not allowed; if you try to do this operation, Python will raise an ERROR

In [156]:
my_float + my_string_1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-156-47d68115815b> in <module>
----> 1 my_float + my_string_1

TypeError: unsupported operand type(s) for +: 'float' and 'str'

"Summing" two strings simply concatenates them together:

In [157]:
my_string_1 + my_string_2
Out[157]:
'hello2.3'
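
As a hedged illustration of how the two cases above fit together: if you actually want to glue a number onto a string, you can first convert the number to a string with the built-in `str` function. The variable name `combined` is just for this example:

```python
my_float = 2.3
my_string_1 = 'hello'

# Adding a float to a string directly raises a TypeError
try:
    my_float + my_string_1
except TypeError as err:
    print('Python refused:', err)

# Converting the float to a string first makes it a string + string operation
combined = my_string_1 + str(my_float)
print(combined)
```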

If you execute the following lines in Python... a = 3 b = 1 Then, what is the result of: *a + b*


If you execute the following lines in Python... a = '3' b = '1' Then, what is the result of: *a + b*


Objects (lists and dictionaries)¶

Everything in Python is an object! Even the basic data types are objects: Integers are "Integer Objects", floats are "Float Objects" etc. Each type of object behaves under its own set of rules, therefore it is ESSENTIAL to be aware of what type of object each of your variables is.

We already talked about the sometimes called "primitive objects" (i.e. integers, floats, strings), however python comes included with some more advanced objects (e.g. lists and dictionaries). Also, when you install new modules, often they come with their own objects (each with their own rules).

Following a "cooking analogy", I like to think of "primitive objects" as types of raw ingredients (e.g. liquid, powder, solid), and the more advanced objects can be thought of as containers (e.g. bag, basket, jar, shelf).

If you don't know what type of object a variable is, just type type(variableName), for example:

In [158]:
my_var = 12.8

type(my_var)
Out[158]:
float
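
The sketch below applies `type` to one example of each object type mentioned so far. It uses a `for` loop and the `__name__` attribute, neither of which has been covered in this lab, so read it as a preview; the values in `examples` are arbitrary:

```python
# One example of each object type discussed in this lab
examples = [12, 12.8, '12.8', [1, 2], {'a': 1}]

type_names = []
for value in examples:
    # type(value).__name__ gives the short name of the type (e.g. 'float')
    type_names.append(type(value).__name__)
    print(value, 'is a', type(value).__name__)
```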

Lists¶

Lists are similar to "cell-arrays" in Matlab. "Lists" are great when you just want to append items at the end of the list, but you don't care too much about retrieving a particular item within the list.

Let's do a "list"

In [159]:
my_list = [2,5,8,7,1,2,7]

my_list
Out[159]:
[2, 5, 8, 7, 1, 2, 7]

To see if a variable is a list, you can also use type. See below:

In [160]:
type(my_list)
Out[160]:
list

To access an item within the list, type its index within square brackets:

In [161]:
my_list[0]
Out[161]:
2

Note that in Python, the first index is zero!

In [162]:
my_list[1]
Out[162]:
5
In [163]:
my_list[2]
Out[163]:
8

Negative indices are counted backwards from the last item:

In [164]:
my_list[-1]
Out[164]:
7
In [165]:
my_list[-2]
Out[165]:
2

To access several items within the list, use a colon (i.e. :)

In [166]:
my_list[2:4]
Out[166]:
[8, 7]
In [167]:
my_list[:3]
Out[167]:
[2, 5, 8]
In [168]:
my_list[-3:]
Out[168]:
[1, 2, 7]

Note that you can put anything you want in a list (i.e. not only integers). For example:

In [169]:
my_other_list = ['a','5',8,'b',1,2.88,7,'x']

my_other_list[-3:]
Out[169]:
[2.88, 7, 'x']
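
A detail that often trips people up when slicing with a colon: the index before the colon is included, but the index after the colon is NOT. The sketch below shows this on a small made-up list called `letters`:

```python
letters = ['a', 'b', 'c', 'd', 'e']

# Start at index 1 (included) and stop at index 3 (not included)
middle = letters[1:3]
print(middle)

# Leaving out the start means "from the beginning"
first_two = letters[:2]
print(first_two)

# Leaving out the stop means "until the end"
last_two = letters[-2:]
print(last_two)

# The number of items returned is stop minus start
print(len(middle))
```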

Given the following List:

my_list = [2,5,8,7,1,2,7]

How do you access the 3rd item in the list (i.e. 8)?


Given the following List:

my_list = [2,5,8,7,1,2,7]

How do you access the last 4 items in the list (i.e. 7,1,2,7)?

Dictionaries¶

Dictionaries are similar to "structure-arrays" in Matlab. A "Dictionary" stores items in a way that resembles a dictionary. You look up a "keyword" (e.g. "apple") and you immediately get the contents (i.e. "Red round fruit that tastes good").

Dictionaries are great to quickly find a specific item within the dictionary.

Let's make a dictionary:

In [170]:
my_dict = {'name':'Juan', 'age':27, 'gender':'male'}

my_dict
Out[170]:
{'name': 'Juan', 'age': 27, 'gender': 'male'}

To see if a variable is a Dictionary, you can also use type. See below:

In [171]:
type(my_dict)
Out[171]:
dict

Let's query the dictionary using the keywords:

In [172]:
my_dict['age']
Out[172]:
27
In [173]:
my_dict['name']
Out[173]:
'Juan'
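
Dictionaries can also be changed after they are created. The sketch below adds a new entry, updates an existing one, and checks which keywords are available; the 'city' entry is a made-up addition for illustration:

```python
my_dict = {'name': 'Juan', 'age': 27, 'gender': 'male'}

# Add a new keyword with its contents (hypothetical entry)
my_dict['city'] = 'Halifax'

# Update the contents stored under an existing keyword
my_dict['age'] = 28

# See all the keywords currently in the dictionary
keys = list(my_dict.keys())
print(keys)

# Check whether a keyword exists before querying it
has_name = 'name' in my_dict
print(has_name)
```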

Given the following Dictionary:

my_dict = {'name':'Juan', 'age':27, 'gender':'male'}

How do you access the contents stored under the 'age' key (i.e. 27)?


If you execute the following lines in Python... a = [3,7,1,9] b = {'1':3, '2':7, '3':1, '4':9 } Then, what type of object is **a**?


If you execute the following lines in Python... a = [3,7,1,9] b = {'1':3, '2':7, '3':1, '4':9 } Then, what type of object is **b**?


Functions and methods¶

Functions and methods are how you "do stuff" in Python. With one line of code you can do things like calculate a mean, or do a plot, or do a complicated statistical analysis. In essence, both functions and methods are pieces of code that accept an "input", do something to the input, and spit out an "output". The only thing you need to know and remember is the name of the function (or method). Arguably, learning Python is mostly learning the names of the available functions and methods.

Functions¶

Functions are used following the "nomenclature" below:

output = FunctionName(input)

We already used a few functions above, namely the "print" function...

In [174]:
print('hello world!')
hello world!

...and the "type" function

In [175]:
type(124.87)
Out[175]:
float

In both cases the output was simply displayed on the screen, which happens when you do not specify what to do with the output. Below are a few more examples, including one where the output is saved to a variable:

First let's make a list... and then apply some functions to it.

In [176]:
mylist = [4,7,2,9,6,3,7,8,5,10]


# Calculate the sum
sum(mylist)
Out[176]:
61

In the case above, FunctionName is sum, input is mylist, and output is 61

In [177]:
# Figure out the maximum of a list
a = max(mylist)

a
Out[177]:
10

In the case above, FunctionName is max, input is mylist, and output is a

In [178]:
# Figure out the minimum of a list
min(mylist)
Out[178]:
2
In [179]:
# Get the "length" of a list
len(mylist)
Out[179]:
10
In [180]:
# Get the "type" of Object
type(mylist)
Out[180]:
list

Here is a list of the basic Functions built-in within Python: https://docs.python.org/3/library/functions.html

Given the following statement:

c = max(my_list)

What is the name of the function?


Given the following statement:

c = max(my_list)

What is the "input"?


Given the following statement:

c = max(my_list)

What is the "output"?


Given the following List:

my_list = [2,3,1]

... and the following statement:

output = max(my_list)

What is the value of "output"?


Given the following List:

my_list = [2,3,1]

... and the following statement:

output = len(my_list)

What is the value of "output"?

Methods¶

Methods are similar to functions, but they follow a different "nomenclature":

output = input.MethodName(arguments)

The arguments are additional information sometimes required by the method.

Note that before you can "apply" MethodName to the input object, we first need to create the input object.

For example...

In [181]:
# First we create a "list" object
mylist = [4,7,2,9,6,3,3,8,5,10]

# Then, this appends the number "2" to the end of mylist
mylist.append(2)

print(mylist)
[4, 7, 2, 9, 6, 3, 3, 8, 5, 10, 2]

Here the number 2 in the parentheses is an argument.

Let's do some more...

In [182]:
# This finds the number "7" in mylist, and removes it
mylist.remove(7)

print(mylist)
[4, 2, 9, 6, 3, 3, 8, 5, 10, 2]
In [183]:
#This sorts mylist. Note that this method does not require any arguments, thus the parentheses are left empty ()
mylist.sort()

print(mylist)
[2, 2, 3, 3, 4, 5, 6, 8, 9, 10]
In [184]:
#This finds the number "8" in mylist and returns its index
mylist.index(8)
Out[184]:
7
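
One difference worth noticing: some list methods change the list itself instead of handing back a new one. The sketch below compares the built-in function `sorted` with the list method `.sort` on a short made-up list; the names `new_list` and `result` are just for this example:

```python
mylist = [4, 7, 2, 9]

# The FUNCTION sorted() returns a new sorted list and leaves mylist untouched
new_list = sorted(mylist)
print(new_list)
print(mylist)

# The METHOD .sort() re-orders mylist itself and returns nothing (None)
result = mylist.sort()
print(mylist)
print(result)
```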

By now you probably figured out that each object type (e.g. lists or dictionaries) has its own built in methods. Use Google to find out the available methods for a particular object type. For example, if you want to see the available methods for "list" objects, Google: python methods list
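
Besides Google, Python itself can list the methods of an object type. The sketch below uses the built-in `dir` function and skips the names that start with an underscore (those are internal); the variable name `list_methods` is made up for this example:

```python
# Collect the names of the methods available for "list" objects,
# skipping the internal ones that start with an underscore
list_methods = [name for name in dir(list) if not name.startswith('_')]

print(list_methods)
```

You can then type, for example, help(list.append) in the Console to read a short description of a particular method.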


Given the following statement:

c = mylist.remove(7)

What is the "input"?


Given the following statement:

c = mylist.remove(7)

What is the "output"?


Given the following statement:

c = mylist.remove(7)

What is the "MethodName"?


Given the following statement:

c = mylist.remove(7)

What is the "argument"?

Modules¶

So far we used some of the built-in functions of Python (e.g. type, len, max), as well as some methods of Python's built-in objects (e.g. methods of the list object); however, there are A LOT MORE! You can import functions and objects (with their associated methods) from external "modules" (or "packages")... therefore you have thousands of functions and methods at your disposal.

To be able to use a "module", you first need to (1) install the module, and then you need to (2) import the module.

  1. When you "install" a module, behind the scenes the module's code is downloaded from the cloud and written to the default "Python Directory" in your hard-drive. This step only needs to be done once. The module will remain installed in your hard-drive "forever" (or until you decide to uninstall it).

  2. When you "import" a module, behind the scenes the module's code is loaded into Memory (from the default "Python Directory" where it was installed in step 1). This step needs to be done every time you run the program that needs the module. Conveniently, you can simply use an import statement at the beginning of your code so the required module is loaded automatically every time you run your code.

When you install Anaconda, the Anaconda installer "installs" Python and also "installs" a bunch of commonly used modules. Therefore you can simply import them and use them right away. Below is an example about how to import the Module math and use one of its functions (note that we'll learn how to "install" new modules at the end of this lab).

In [185]:
# First import the module "math"
import math

# Then use the function "sin" within the module "math"
math.sin(mylist[0])
Out[185]:
0.9092974268256817

When using a function from an imported module, you need to add the module name to the function name. Thus the "nomenclature" is slightly modified as shown below:

output = ModuleName.FunctionName(input)
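
Here is a short sketch of that nomenclature using the math module again. It also shows that a module can be imported under a nickname with `as`; the nickname `m` and the variable names are arbitrary choices for this example:

```python
# Import the module, then call its functions as ModuleName.FunctionName(input)
import math

root = math.sqrt(16)
print(root)

# Modules can also contain values, not only functions
print(math.pi)

# The same module imported under a shorter nickname
import math as m

same_root = m.sqrt(16)
print(same_root)
```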

To "Editor" or to "Console"?¶

To execute Python code, you can (1) write code in the Editor and click the "run" button, or (2) you can write code in the Console and press [Enter]. The question is when do you use the Editor and when do you use the Console?

Editor¶

When you use the Editor you must save your code in a .py file before you can run it. The advantage of this is that you can re-use your code at a later date. Simply open your .py file and run it again!

Use the Editor to write self-contained code (i.e. scripts or little programs) that you may want to re-use at a later date. Examples of these self-contained scripts are shown in the real-life case studies below.

Console¶

The Console is the quickest way to enter code into Python's "memory" and the quickest way to query or inspect something that is already in Python's "memory". The drawback is that code written in the Console quickly disappears as you write new code... or when you close Spyder or turn off your computer. Code written in the Console is not very re-usable. Therefore, use the Console when you need to do "one-offs".



In the real-life case studies below we'll use both, the Editor and the Console. To make things easier...

  • I will explicitly tell you to write (or copy-paste) code in a specific .py file and then click "run" when you need to use the Editor
  • I will explicitly tell you when you need to use the Console

Real-life case study #1: Downloading real-time data from an oceanographic buoy in Oregon¶

Up to here, everything probably sounds easy, but not really that useful. Let's put it all together in a real-life application using 1 function and 1 method.

There is an autonomous buoy in the Columbia River (Oregon, USA) that broadcasts real-time ocean data via the following server: http://columbia.loboviz.com/

LOBO Buoy Columbia River (Oregon, USA)

The following code connects to the buoy's server, downloads data, and makes a simple plot.

In Spyder's Editor open a new file and save it as buoy.py. Copy-paste the code below to your new file and click the "run" button.

In [186]:
# Import Pandas module
import pandas as pd

# Define URL buoy
URL = 'http://columbia.loboviz.com/cgi-data/nph-data.cgi?x=date&y=temperature&min_date=20180820&max_date=20180907&node=32&data_format=text'

# Read data from buoy
buoy_data = pd.read_csv(URL,sep='\t',header=2)

# Do quick plot
buoy_data.plot(x='date [PST]',y='temperature [C]', rot=90)
Out[186]:
<matplotlib.axes._subplots.AxesSubplot at 0x1f0d754cb20>

To see the plot you just made, click on the "Plot" tab (beside your "Variable explorer"). See below...

Code Explanation:¶

Line 1:¶

In [187]:
import pandas as pd

Here we imported the pandas module, which is a fantastic data analysis library (learn more: http://pandas.pydata.org/ ). Note that we also "nicknamed" the module pd so that we don't have to write the long name "pandas" before each function.

Line 2:¶

In [212]:
URL = 'http://columbia.loboviz.com/cgi-data/nph-data.cgi?x=date&y=temperature&min_date=20180820&max_date=20180907&node=32&data_format=text'

Here we assigned a string to a new variable that we decided to call URL. Note that if you just copy-paste the full url into a browser, you would actually display the data in your browser as text. Of course, the tricky part is to figure out what url to query?... It is not hard, but I'll show you how to do this in another lab.

Line 3:¶

In [213]:
buoy_data = pd.read_csv(URL,sep='\t',header=2)

This line queries the server, downloads the data and puts it into a new variable that we called buoy_data. Below is the breakdown:

  • ModuleName.FunctionName: pd.read_csv
  • Input: URL
  • Argument 1: sep='\t' (this specifies that the .csv file uses tabs as separators instead of commas)
  • Argument 2: header=2 (this specifies that the column names are on row number 2, counting from zero; the rows above it are skipped)
  • output: buoy_data

So, we used 2 arguments in this example (i.e. sep and header), but how do we know what arguments are available and how to use them?

ANSWER: Google the function name (i.e. Google: pd.read_csv)

Note that the output, buoy_data, is a pandas DataFrame object. We'll talk more about this below.
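
To see what sep and header do without connecting to the server, the sketch below reads a tiny tab-separated text from memory instead of a URL. The first two lines of `fake_text` are invented filler (the real server's preamble may look different), and the three temperature values are copied from the buoy table shown later in this lab. It assumes pandas is installed, as it is with Anaconda:

```python
import io
import pandas as pd

# A tiny stand-in for the text returned by the server.
# The first two lines are made-up filler; row number 2 (counting from zero)
# holds the column names, and the data start right after it.
fake_text = (
    "Hypothetical filler line\n"
    "Another hypothetical filler line\n"
    "date [PST]\ttemperature [C]\n"
    "2018-08-20 00:00:00\t22.31\n"
    "2018-08-20 01:00:00\t22.29\n"
    "2018-08-20 02:00:00\t22.24\n"
)

# io.StringIO lets pandas read a string as if it were a file
small_table = pd.read_csv(io.StringIO(fake_text), sep='\t', header=2)

print(small_table)
```

Because header=2, the two filler lines are skipped and the third line becomes the column names.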

Line 4:¶

In [218]:
buoy_data.plot(x='date [PST]',y='temperature [C]', rot=90)
Out[218]:
<matplotlib.axes._subplots.AxesSubplot at 0x1f0d75df460>

Here we applied the method plot to the pandas DataFrame object buoy_data. The output is a plot!


Inspecting your code¶

Before we do anything with our newly downloaded data, let's take a minute to examine our code...


In the code above (i.e. Real-life case study 1: Downloading real-time data from an oceanographic buoy in Oregon)...

...What is: URL


In the code above (i.e. Real-life case study 1: Downloading real-time data from an oceanographic buoy in Oregon)...

...What is: pandas


In the code above (i.e. Real-life case study 1: Downloading real-time data from an oceanographic buoy in Oregon)...

...What is: pd.read_csv


In the code above (i.e. Real-life case study 1: Downloading real-time data from an oceanographic buoy in Oregon)...

...What is: buoy_data


In the code above (i.e. Real-life case study 1: Downloading real-time data from an oceanographic buoy in Oregon)...

...What is: .plot


Take a look at the plot of temperature in the Columbia River (Oregon, USA).

What do you think causes the oscillations in temperature?

Inspecting buoy_data:¶

Making a quick plot is useful, but we likely also want to look at the actual data. Let's start by inspecting what type of object buoy_data is

In [219]:
type(buoy_data)
Out[219]:
pandas.core.frame.DataFrame

As I said above, buoy_data, is a pandas DataFrame object (a pandas.core.frame.DataFrame to be specific). It is very similar to the core DataFrame in R. If you are not familiar with R, a good way to describe a pandas DataFrame is a table like an Excel spread-sheet, with rows and columns.

To actually take a look into buoy_data, simply type its name:

In [220]:
buoy_data
Out[220]:
date [PST] temperature [C]
0 2018-08-20 00:00:00 22.31
1 2018-08-20 01:00:00 22.29
2 2018-08-20 02:00:00 22.24
3 2018-08-20 03:00:00 22.18
4 2018-08-20 04:00:00 22.13
... ... ...
451 2018-09-07 19:00:00 20.21
452 2018-09-07 20:00:00 20.18
453 2018-09-07 21:00:00 20.16
454 2018-09-07 22:00:00 20.18
455 2018-09-07 23:00:00 20.29

456 rows × 2 columns

This shows you the top and bottom of the table as well as some descriptors (e.g. 456 rows × 2 columns). If you only want to show the first few rows, apply the method .head to the buoy_data object:

In [221]:
buoy_data.head()
Out[221]:
date [PST] temperature [C]
0 2018-08-20 00:00:00 22.31
1 2018-08-20 01:00:00 22.29
2 2018-08-20 02:00:00 22.24
3 2018-08-20 03:00:00 22.18
4 2018-08-20 04:00:00 22.13

After you ran the following statement

buoy_data.head()

What is the temperature (C) in the first record?

Similarly you can show only the end of the table by applying the .tail method to the buoy_data object:

In [222]:
buoy_data.tail()
Out[222]:
date [PST] temperature [C]
451 2018-09-07 19:00:00 20.21
452 2018-09-07 20:00:00 20.18
453 2018-09-07 21:00:00 20.16
454 2018-09-07 22:00:00 20.18
455 2018-09-07 23:00:00 20.29

After running the following statement

buoy_data.tail()

What is the last temperature in your table?

Instead of referring to the entire table (buoy_data), you can just refer to one column. The table behaves like a Python dictionary where the "header" of each column is the dictionary key, which will return the entire column. For example, to refer ONLY to the temperature column, type:

In [223]:
buoy_data['temperature [C]']
Out[223]:
0      22.31
1      22.29
2      22.24
3      22.18
4      22.13
       ...  
451    20.21
452    20.18
453    20.16
454    20.18
455    20.29
Name: temperature [C], Length: 456, dtype: float64

Let's calculate the average temperature by applying the .mean method only to the 'temperature [C]' column:

In [224]:
buoy_data['temperature [C]'].mean()
Out[224]:
20.861206140350866

Similarly, you can calculate the minimum, maximum, median, standard deviation, mode, etc...

In [225]:
buoy_data['temperature [C]'].min()
Out[225]:
19.64
In [226]:
buoy_data['temperature [C]'].max()
Out[226]:
22.42
In [227]:
buoy_data['temperature [C]'].median()
Out[227]:
20.74
In [228]:
buoy_data['temperature [C]'].std()
Out[228]:
0.731239256140188
In [229]:
buoy_data['temperature [C]'].mode()
Out[229]:
0    20.12
dtype: float64
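
If you want several of these statistics at once, pandas also offers the `.describe` method. The sketch below applies it to a small stand-in table built from the first five temperatures shown by buoy_data.head() above, so it runs without downloading anything; the variable names `sample` and `summary` are made up for this example:

```python
import pandas as pd

# Small stand-in table using the first five temperatures shown above
sample = pd.DataFrame({'temperature [C]': [22.31, 22.29, 22.24, 22.18, 22.13]})

# .describe() returns count, mean, std, min, quartiles and max in one go
summary = sample['temperature [C]'].describe()
print(summary)

# Individual statistics can be queried by name, like in a dictionary
print(summary['mean'])
```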

Real-life case study #2: Downloading real-time data from a Glider (Autonomous Underwater Vehicle) from Nova Scotia¶

Slocum Glider Current deployment

The code below pulls data from a glider currently deployed off the coast of Nova Scotia (http://gliders.oceantrack.org/). The code also does a simple quality control and makes a rough depth-vs-time scatter plot.

In Spyder open a new file, name it glider.py, copy-paste the code below and click the "run" button.

In [230]:
# Import Pandas module
import pandas as pd

# Define URL of glider
URL = 'http://gliders.oceantrack.org/data/live/otn201_sci_water_temp_live.csv'

# Read data from glider server
glider_data = pd.read_csv(URL,sep=',')

# Quality Control: Filter out data above 40 °C and below -8 °C
glider_data = glider_data[(glider_data.sci_water_temp < 40) & (glider_data.sci_water_temp > -8)]

# Make a scatter plot
glider_data.plot.scatter('unixtime',
                         'depth',
                         c='sci_water_temp',
                         marker='o',
                         edgecolor='none',
                         cmap='viridis',
                         ylim=[glider_data['depth'].max(),0],
                         figsize=[19,9])
Out[230]:
<matplotlib.axes._subplots.AxesSubplot at 0x1f0d55ca8b0>

The left side of the graph shows the data collected at the beginning of the deployment. The right side of the graph shows the most recently collected data. The top of the graph shows the temperature close to the surface (0 m) while data points at the bottom of the graph represent temperature near the bottom (~150 m).
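
The quality-control line in the code above may look cryptic, so here is a sketch of the same filtering idea on a tiny table. The numbers are invented for illustration (they are not real glider data), and the column is selected with square brackets, which is equivalent to the dot notation used in glider.py:

```python
import pandas as pd

# Made-up values for illustration only (not real glider data)
toy = pd.DataFrame({'depth': [0.0, 50.0, 90.0, 120.0],
                    'sci_water_temp': [0.98, 99.0, 1.20, -50.0]})

# Each comparison gives one True/False per row; "&" means "and"
keep = (toy['sci_water_temp'] < 40) & (toy['sci_water_temp'] > -8)
print(keep)

# Putting the True/False column inside square brackets keeps only the True rows
clean = toy[keep]
print(clean)
```

Rows with impossible temperatures (here 99.0 and -50.0) are dropped, and the remaining rows keep their original row labels.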

Take a look inside glider_data

In [231]:
glider_data.head()
Out[231]:
unixtime lat lon depth sci_water_temp
0 1.469369e+09 48.337975 -64.175423 0.000000 0.0000
1 1.469369e+09 48.337438 -64.171928 90.088686 0.9753
2 1.469369e+09 48.337437 -64.171921 90.123090 0.9856
3 1.469369e+09 48.337436 -64.171909 90.179436 0.9772
4 1.469369e+09 48.337435 -64.171897 90.240740 0.9767

Tough question: What is the maximum depth recorded in this dataset? Hint: Apply the .max method on the "depth" column.


What is the **maximum** depth recorded in glider_data?



More tough questions:


What is the **maximum** water temperature recorded in the "sci_water_temp" column within glider_data?




What is the **minimum** water temperature recorded in the "sci_water_temp" column within glider_data?




How many records are in the "sci_water_temp" column within glider_data?



Installing new modules¶

Before we do our last "Real-life case study", we need to install 3 modules that do not come pre-installed with Anaconda. There are several ways to install Python modules and libraries; here we will use conda. Note that normally you only need to install a module once (it will remain installed until you manually remove it). However, because the lab computers are wiped clean every night, you may need to install the same module several times during this course. Let's install 3 modules (i.e. netcdf4, cartopy and cmocean). I'll explain what these modules do later.

  • Open the Terminal by typing "Anaconda Prompt" in the Windows taskbar search box (on Mac, search for "Terminal" in Launchpad)
  • To install netcdf4, type in the Terminal:
    conda install -c anaconda netcdf4
  • Press [Enter]
  • If it asks you "are you sure?", type "y" and press [Enter] again.
  • To install cartopy, type in the Terminal:
    conda install -c conda-forge cartopy
  • Press [Enter]
  • If it asks you "are you sure?", type "y" and press [Enter] again.
  • To install cmocean, type in the Terminal:
    conda install -c conda-forge cmocean
  • Press [Enter]
  • If it asks you "are you sure?", type "y" and press [Enter] again.

Done! The new modules are now installed (again, I'll explain how to use these new modules below).
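If you are not sure whether the installation worked (or whether the lab computer was wiped since last time), a quick check from Spyder can help. The sketch below is only one possible way to do it; the helper name `check_modules` is made up for this example. Note that the module names used in Python can differ from the conda package names (e.g. the conda package netcdf4 is imported as netCDF4).

```python
import importlib.util

def check_modules(names):
    """Return {module name: True/False} without actually importing the modules."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Import names (not conda package names) of the 3 modules installed above
status = check_modules(["netCDF4", "cartopy", "cmocean"])

for name, found in status.items():
    print(name, "is installed" if found else "is MISSING")
```

Any module reported as missing needs its conda install command to be run again in the Terminal.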


Real-life case study #3: Downloading Satellite data from an ERDDAP server¶

ERDDAP is a data server that simplifies the download of subsets of scientific datasets to make graphs and maps (We'll learn more about ERDDAP in another lab).

Here we will be using one of NOAA's ERDDAP: https://coastwatch.pfeg.noaa.gov/erddap/info/index.html?page=1&itemsPerPage=2000


If you haven't installed netcdf4, cartopy and cmocean yet.... take a look at the section above.


First, let's get ourselves some Sea Surface Temperature (SST) data from the Nova Scotia region. For this we will use an ERDDAP product consisting of monthly composites from several POES satellites.

In Spyder's Editor open a new file and save it as satellite.py. Copy-paste the code below and click the Run button.

Note that the code below is pretty much "the real deal". It is as complicated as it gets with Python, yet it is just a series of statements declaring variables and applying functions and methods. Note that, by convention, at the top of the file you import all the modules you need.

In [232]:
import urllib.request
import netCDF4
import numpy as np 
import matplotlib.pyplot as plt 
import cartopy.crs as ccrs 
import cartopy.feature as cfeature 
import cmocean 

# Define the spatial and temporal box of interest
year = 2015
month = 8
minlat = 38
maxlat = 48
minlon = -67
maxlon = -50
isub = 0.5

# Make "minday" and "maxday" strings by concatenating several pieces
minday = str(year)+'-'+str(month).zfill(2)+'-11T12:00:00Z'
maxday = str(year)+'-'+str(month+1).zfill(2)+'-11T12:00:00Z'

# Create the URL
base_url='http://coastwatch.pfeg.noaa.gov/erddap/griddap/erdAGsstamday_LonPM180.nc?'
query='sst[('+minday+'):'+str(isub)+':('+maxday+')][(0.0):'+str(isub)+':(0.0)][('+str(minlat)+'):'+str(isub)+':('+str(maxlat)+')][('+str(minlon)+'):'+str(isub)+':('+str(maxlon)+')]'
url = base_url+query

# Download data and store it in a NetCDF file
filename='satellite_data_tempfile.nc'
urllib.request.urlretrieve (url, filename)

# open NetCDF data
nc = netCDF4.Dataset(filename)
ncv = nc.variables

# Extract variables of interest from inside the NetCDF file
lon = ncv['longitude'][:]
lat = ncv['latitude'][:]
sst = ncv['sst'][0,0,:,:]

# Make grids of lats and lons for later use to make maps
lons, lats = np.meshgrid(lon,lat)


#%% Create map (PlateCarree Projection) 
fig = plt.figure(figsize=(13,13)) # Create figure
ax = plt.axes(projection=ccrs.PlateCarree()) # Create axis within figure (with projection)
ax.pcolormesh(lons, lats, sst, transform=ccrs.PlateCarree()) # Add colormap to axis (specifying projection of data) 
ax.coastlines(resolution='10m') # Add coastline to axis
Out[232]:
<cartopy.mpl.feature_artist.FeatureArtist at 0x1f0d56a0100>

Take a look at the generated map. You can see the Gulf Stream in the south (bottom of the graph), which has temperatures in excess of 25$^\circ$C. You can also see that in the Bay of Fundy (Nova Scotia) the temperature is cold (about 13$^\circ$C), likely because the extreme tides cause a lot of vertical mixing which brings cold water to the surface.
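Going back to the script for a moment: the hardest line to read in satellite.py is probably the one that builds `query` by gluing many strings together. As a sketch of an alternative, the same text can be assembled with f-strings inside two small functions. The function names `month_window` and `build_query` are made up for this example, and the December roll-over is an addition that the original script does not have (there, `month+1` would give month 13).

```python
def month_window(year, month, day=11):
    """Return (start, end) time strings one month apart, rolling over after December."""
    if month == 12:
        next_year, next_month = year + 1, 1
    else:
        next_year, next_month = year, month + 1
    start = f"{year}-{month:02d}-{day:02d}T12:00:00Z"
    end = f"{next_year}-{next_month:02d}-{day:02d}T12:00:00Z"
    return start, end

def build_query(var, start, end, minlat, maxlat, minlon, maxlon, isub):
    """Assemble the griddap query: one [(start):stride:(stop)] block per dimension."""
    time_part = f"[({start}):{isub}:({end})]"
    alt_part = f"[(0.0):{isub}:(0.0)]"
    lat_part = f"[({minlat}):{isub}:({maxlat})]"
    lon_part = f"[({minlon}):{isub}:({maxlon})]"
    return var + time_part + alt_part + lat_part + lon_part

# Same values as in satellite.py
minday, maxday = month_window(2015, 8)
query = build_query("sst", minday, maxday, 38, 48, -67, -50, 0.5)
print(query)
```

With the values from satellite.py this produces the same query text as the original concatenation, so it could be appended to `base_url` in the same way.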

What do you think is the average temperature in the whole map?

Luckily this is easy to find by applying the .mean method to our Sea Surface Temperature data (sst):

In [233]:
sst.mean()
Out[233]:
21.849518880725444
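A satellite cannot measure SST over land, so some pixels in the grid have no valid value. The sketch below assumes that sst is a NumPy "masked array" (which is what netCDF4 typically returns when a variable defines a fill value) and uses a tiny made-up grid to show why that matters for the mean. The -999.0 fill value is hypothetical.

```python
import numpy as np

# Hypothetical 2x2 grid; -999.0 is a made-up fill value marking a land pixel
raw = np.array([[10.0, 20.0],
                [30.0, -999.0]])

# Mask the fill value so it is excluded from calculations
toy_sst = np.ma.masked_equal(raw, -999.0)

print(raw.mean())        # plain array: the fill value contaminates the mean
print(toy_sst.mean())    # masked array: mean of the 3 valid pixels only
print(toy_sst.count())   # number of unmasked (valid) pixels
```

In other words, a mean computed on a masked array describes the ocean pixels only, not every cell of the rectangle.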

Let's see if you can figure out the minimum and maximum temperatures in sst.

In [237]:
sst.min()
Out[237]:
10.875

What is the **minimum** recorded Sea Surface Temperature in the sst data?




What is the **maximum** recorded Sea Surface Temperature in the sst data?



Let's do another map, this time using a "Lambert Conformal Conic Projection". Luckily, you can use the same data (i.e. no need to download data again), and simply create the new map with the new projection.

In this map I added some extras: a colorbar with a label, shaded land, gridlines and a title.

We'll only use this map once, thus copy-paste the code below into Spyder's Console and press [Enter].

In [235]:
import matplotlib.pyplot as plt

#%% Create map (LambertConformal Projection)
# Create figure
fig = plt.figure(figsize=(13,13))
# Create axis within figure (with projection)
ax = plt.axes(projection=ccrs.LambertConformal(central_longitude=(maxlon+minlon)/2, central_latitude=(maxlat+minlat)/2)) 
# Add colormap to axis (specifying projection of data) 
cs = ax.pcolormesh(lons, lats, sst, cmap=cmocean.cm.thermal, transform=ccrs.PlateCarree()) 
# Add colorbar
cbar = fig.colorbar(cs, shrink=0.6, orientation='vertical', extend='both') 
# Add label to colorbar (raw string, so that "\c" in "\circ" is not read as an escape sequence)
cbar.set_label(r'Sea Surface Temperature ($^\circ$C)') 
# Add land to axis
ax.add_feature(cfeature.NaturalEarthFeature(category='physical', scale='10m',facecolor='none', name='coastline'),
               edgecolor='#666666', facecolor='#bfbfbf')
# Add gridlines to axis
ax.gridlines()
# Add title to figure
plt.title('Satellite SST (Monthly composite for '+str(year)+'/'+str(month)+')') 
Out[235]:
Text(0.5, 1.0, 'Satellite SST (Monthly composite for 2015/8)')

We will have a more in-depth Lab about making maps with Cartopy. In the meantime, feel free to explore other projections and make some extra maps. Here are some Galleries:

https://scitools.org.uk/cartopy/docs/latest/crs/projections.html

https://scitools.org.uk/cartopy/docs/latest/gallery/index.html


From the code above... which of the following are **"functions"** or **"methods"**? Note that these **"functions"** or **"methods"** have fixed names, the coder has no choice but to use the correct **"function"** or **"method"** names.


From the code above... which of the following are **"variables"**? Note that these **variables** have names that the coder chose (e.g. a coder can write **a** = 1 ... or **b** = 1 ... or **alpha** = 1 ... or **my_variable** = 1).


From the code above... in the line
fig = plt.figure(figsize=(13,13)) What is **fig**?


From the code above... in the line
fig = plt.figure(figsize=(13,13)) What is **plt.figure**?


From the code above... in the line
cbar = fig.colorbar(cs, shrink=0.6, orientation='vertical', extend='both') What is **.colorbar**?

This is the end of lab¶



Code below is for formatting of this lab. Do not alter!

In [240]:
# Loads css file and applies it to lab HTML
from IPython.core.display import HTML
def css():
    style = open("../css/custom.css", "r").read()
    return HTML(style)
css()