# NCI data access training

Monash University, July 2017

### Trainers
- Dr Jingbo Wang
- Dr Joseph Antony
- Dr Adam Steer

### Aims:
Students should leave with an understanding of how to:

*1. Access data from NCI remotely using a Web Coverage Service (WCS) request*
- directly in a web browser or through a programmatic web request
- using Python's OWSLib

*2. Access data remotely using the NetCDF Subset Service (NCSS)*
- directly in a web browser or through a programmatic web request
- using Python's Siphon library

### Assumptions:
1. Some familiarity with Python 3 and the Jupyter environment
2. Some familiarity with netCDF files
3. Students have been provided NCI materials on data discovery


##### Caution: many NCI training examples were developed using Python 2.7. This is a Python 3 environment. If in doubt, ask us

(or refer to the cheatsheet...)

In [2]:
# First, set up the basic libraries needed for the exercise. More specific software is imported later.

# matplotlib is needed later to draw plots, so import it now
import matplotlib.pyplot as plt
import numpy as np

# This is a Jupyter line magic, which tells the notebook to draw plots in a notebook cell
%matplotlib inline

# The next two modules let us stream data from a binary blob into a numpy array,
# so that we can visualise data without saving and reopening a file.

# Note: if you do this at home, you need to install pillow (an imaging library) to use scipy.misc.imread.
# scipy.misc.imread was deprecated in SciPy 1.0 and removed in SciPy 1.2; on newer installs use imageio.imread instead.

from scipy.misc import imread
import io

## Task: extracting a data subset from a massive file - ocean colour, 15.65 GB

We want to grab ocean colour data for a small region off the south-east coast of Australia (say, the coast of Victoria), but we don't want to download the whole 15.65 GB file to do that. Here is our dataset in the NCI THREDDS catalogue:

http://dapds00.nci.org.au/thredds/catalog/u39/public/data/modis/oc.stacked/v201503/catalog.html

We'll use two different services, the Web Coverage Service (WCS) and the NetCDF Subset Service (NCSS), to get some data.

## 1. Web Coverage Service

Required libraries:
- OWSLib
- matplotlib
- scipy
- numpy
- io

Reference notebooks:

- https://github.com/nci/nci-notebooks/blob/master/Data_Access/Using_Thredds/THREDDS_WMS_WCS.ipynb
- https://github.com/nci/Data-Intensive-Workshop-Nov-2016/blob/master/01_Data_Services/THREDDS_WCS.ipynb

This example is based on material here: https://github.com/geopython/OWSLib/blob/master/examples/wcs-thredds-prism.py

In [3]:
# import the WCS client from OWSLib
from owslib.wcs import WebCoverageService

In [0]:
### construct your WCS request from the reference notebooks here! Find some data you like and see how you go! Or follow along...
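
If you want a starting point, here is a minimal sketch of the same kind of request made through OWSLib, loosely following the OWSLib THREDDS example linked above. The coverage name, bounding box, and time stamp are hypothetical, and `coverage_request` and `fetch_coverage` are helper names made up for this sketch. Some servers may also need `crs`, `width`, and `height` arguments, so treat this as an outline to adapt rather than a finished recipe.

```python
def coverage_request(coverage, bbox, time, fmt="GeoTIFF"):
    """Collect the keyword arguments for OWSLib's WCS 1.0.0 getCoverage call."""
    min_lon, min_lat, max_lon, max_lat = bbox
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("bbox must be (min_lon, min_lat, max_lon, max_lat)")
    return {"identifier": coverage, "bbox": tuple(bbox), "time": [time], "format": fmt}


def fetch_coverage(endpoint, coverage, bbox, time, fmt="GeoTIFF"):
    """Request one coverage subset and return the raw response bytes."""
    # imported here so coverage_request() can be used without OWSLib installed
    from owslib.wcs import WebCoverageService

    wcs = WebCoverageService(endpoint, version="1.0.0")
    if coverage not in wcs.contents:
        raise KeyError(f"{coverage!r} not offered; available: {list(wcs.contents)}")
    response = wcs.getCoverage(**coverage_request(coverage, bbox, time, fmt))
    return response.read()


# Hypothetical values; no request is sent until fetch_coverage() is called
request = coverage_request("chl_oc3", (144.0, -40.0, 150.0, -37.5), "2013-01-01T00:00:00Z")
print(request)
```

The bytes returned by `fetch_coverage` can be wrapped in `io.BytesIO` and handed to an image reader, which is why `io` and `imread` were imported in the first cell.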

## 2. NetCDF Subset Service and Siphon

Required libraries:
- netCDF4
- Siphon
- numpy
- matplotlib
- datetime

Reference notebooks:

- https://github.com/nci/nci-notebooks/blob/master/Data_Access/Using_Thredds/THREDDS_DataAccess.ipynb
- https://github.com/nci/nci-notebooks/blob/master/Data_Access/Using_Siphon/Python_Siphon_II.ipynb
- https://github.com/nci/Data-Intensive-Workshop-Nov-2016/blob/master/01_Data_Services/NetcdfSubset_Examples.ipynb

This material is mainly based on the notebook https://github.com/nci/nci-notebooks/blob/master/Data_Access/Using_Siphon/Python_Siphon_II.ipynb and material here: https://unidata.github.io/siphon/examples/ncss/NCSS_Example.html#sphx-glr-examples-ncss-ncss-example-py


In [10]:
from netCDF4 import Dataset
from siphon import catalog, ncss
import datetime
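
As with WCS, an NCSS request made "directly in a web browser" is just a URL with query parameters. The sketch below assembles one with the standard library, using the usual NCSS parameter names (`var`, `north`, `south`, `east`, `west`, `time_start`, `time_end`, `accept`). The endpoint, variable name, and bounds are hypothetical placeholders; take the real ones from the dataset's NetcdfSubset link in the THREDDS catalogue.

```python
from urllib.parse import urlencode


def ncss_url(endpoint, variables, west, east, south, north,
             time_start, time_end, accept="netcdf"):
    """Build a NetCDF Subset Service URL for a lon/lat box and a time range."""
    params = {
        "var": ",".join(variables),  # one or more variable names
        "north": north,
        "south": south,
        "east": east,
        "west": west,
        "time_start": time_start,  # ISO 8601 time stamps
        "time_end": time_end,
        "accept": accept,  # response format requested from the server
    }
    # keep commas and colons readable instead of percent-encoding them
    return endpoint + "?" + urlencode(params, safe=",:")


# Hypothetical endpoint and variable name, for illustration only
example_ncss_url = ncss_url(
    "http://example.org/thredds/ncss/path/to/file.nc",
    ["chl_oc3"],
    west=144.0, east=150.0, south=-40.0, north=-37.5,
    time_start="2013-01-01T00:00:00Z",
    time_end="2013-01-31T00:00:00Z",
)
print(example_ncss_url)
```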

In [0]:
### construct your NCSS request from the reference notebooks here! Find some data you like and see how you go! Or follow along...
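
If you want a starting point, here is a minimal sketch of an NCSS request through Siphon, following the pattern of the Siphon NCSS example linked above: build a query, restrict it by bounding box, time, and variable, then fetch the data. The bounds and date are hypothetical, and `lonlat_box_kwargs` and `fetch_subset` are helper names made up for this sketch.

```python
from datetime import datetime


def lonlat_box_kwargs(west, east, south, north):
    """Validate a bounding box and return it as keyword arguments."""
    if not (-90 <= south < north <= 90):
        raise ValueError("need -90 <= south < north <= 90")
    if west >= east:
        raise ValueError("need west < east")
    return {"west": west, "east": east, "south": south, "north": north}


def fetch_subset(endpoint, variable, box, when):
    """Ask NCSS for one variable inside `box` at the time step nearest `when`."""
    # imported here so lonlat_box_kwargs() can be used without Siphon installed
    from siphon.ncss import NCSS

    ncss = NCSS(endpoint)
    query = ncss.query()
    query.lonlat_box(**box)
    query.time(when)
    query.variables(variable)
    query.accept("netcdf4")
    return ncss.get_data(query)


# Hypothetical values; no request is sent until fetch_subset() is called
box = lonlat_box_kwargs(west=144.0, east=150.0, south=-40.0, north=-37.5)
when = datetime(2013, 1, 1)
print(box, when.isoformat())
```

Only the subset crosses the network, so the 15.65 GB source file never has to be downloaded in full.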