Kernel: Python 2 (SageMath)
GHCND
Sean Paradiso
In [1]:
Out[1]:
As described by GHCND_documentation.pdf, this data is 'a composite of climate records from numerous sources that were merged and then subjected to a suite of quality assurance reviews.'
In [2]:
Out[2]:
In [3]:
In [4]:
In [5]:
The following code lists the column names in a compact form for easy reference.
In [6]:
Out[6]:
Index([u'STATION', u'STATION_NAME', u'DATE', u'MDPR', u'DAPR', u'PRCP',
u'TMAX', u'TMIN', u'TOBS'],
dtype='object')
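A minimal sketch of how such a column listing can be produced with pandas. The original cell is not shown, so the small frame below is made-up stand-in data and the variable names `df` and `column_names` are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the GHCND table (the real notebook loads the full data set).
df = pd.DataFrame({
    "STATION": ["GHCND:EXAMPLE1", "GHCND:EXAMPLE2"],
    "STATION_NAME": ["ALPINE CA US", "ANAHEIM CA US"],
    "DATE": [20150601, 20150601],
    "PRCP": [0, -9999],
    "TMAX": [250, 300],
    "TMIN": [120, -9999],
})

# df.columns is an Index holding the column labels in order.
print(df.columns)

# A plain list is sometimes easier to read or reuse.
column_names = list(df.columns)
```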
Here we display the names of the stations with recorded data along with the number of records for each. The counts are shown so that, when selecting three stations, we can compare sample sizes and choose stations with a similar number of data points.
In [7]:
Out[7]:
STATION_NAME
ACTON CALIFORNIA CA US 610
ACTON ESCONDIDO CANYON CA US 273
ADELANTO 3.1 S CA US 22
ADIN MOUNTAIN CA US 596
ADIN RANGER STATION CA US 582
AHWAHNEE 2.5 NNW CA US 571
ALAMO 1.0 WSW CA US 5
ALBION 4.0 SE CA US 563
ALDER POINT CALIFORNIA CA US 562
ALDER SPRINGS CALIFORNIA CA US 610
ALPINE CA US 610
ALPINE CALIFORNIA CA US 610
ALTA SIERRA 0.4 WSW CA US 567
ALTA SIERRA 1.3 S CA US 45
ALTA SIERRA 1.4 SSW CA US 63
ALTA SIERRA 2.3 WSW CA US 55
ALTADENA 0.7 ESE CA US 583
ALTADENA CA US 577
ALTURAS CA US 497
ALTURAS MUNICIPAL AIRPORT CA US 608
AMBOY CA US 370
AMERICAN CANYON 0.3 S CA US 138
AMERICAN CANYON 3.5 NE CA US 16
ANAHEIM 4.9 E CA US 580
ANAHEIM 4.9 ENE CA US 602
ANAHEIM 7.3 E CA US 394
ANAHEIM CA US 610
ANAHEIM HILLS 1.1 SE CA US 38
ANDERSON 2.6 NE CA US 104
ANDERSON 8.5 WNW CA US 205
...
WINDSOR 0.6 NNE CA US 597
WINDSOR 1.2 NNW CA US 54
WINDSOR 1.4 SE CA US 609
WINDSOR 1.5 WNW CA US 81
WINDSOR 1.8 SE CA US 63
WINTERS CA US 604
WOFFORD HEIGHTS CALIFORNIA CA US 610
WOLVERTON CALIFORNIA CA US 610
WOODACRE 0.6 SW CA US 145
WOODACRE CALIFORNIA CA US 610
WOODLAND 1 WNW CA US 527
WOODLAND 2.8 SE CA US 449
WOODLAND HILLS PIERCE COLLEGE CA US 610
WOODSIDE 3.4 S CA US 558
WOODSIDE FIRE STATION 1 CA US 356
WRIGHTWOOD 1.2 WNW CA US 395
YOLLA BOLLA CALIFORNIA CA US 610
YOSEMITE LAKES 4.7 S CA US 597
YOSEMITE PARK HDQUARTERS CA US 502
YOSEMITE VILLAGE 12 W CA US 606
YREKA 0.9 WNW CA US 481
YREKA 4.5 S CA US 295
YREKA CA US 608
YUCAIPA 1.5 NNE CA US 500
YUCCA MESA CA US 577
YUCCA VALLEY 1.1 SW CA US 29
YUCCA VALLEY 2.7 ENE CA US 531
YUCCA VALLEY CA US 577
YUCCA VALLEY CALIFORNIA CA US 575
YUROK CALIFORNIA CA US 601
Length: 1345, dtype: int64
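A listing like the one above can be obtained by grouping on the station name and counting rows. The sketch below uses a tiny made-up frame; `station_sizes`, `target`, and the tolerance of 1 are illustrative assumptions, not the notebook's own code.

```python
import pandas as pd

# Hypothetical stand-in data: two stations with 3 rows each, one with a single row.
df = pd.DataFrame({
    "STATION_NAME": ["ALPINE CA US"] * 3
                    + ["ANAHEIM CA US"] * 3
                    + ["ALAMO 1.0 WSW CA US"],
    "TMAX": [250, 260, 255, 300, 310, 305, 280],
})

# Number of rows recorded per station, indexed by station name.
station_sizes = df.groupby("STATION_NAME").size()
print(station_sizes)

# Keep only stations whose row count is close to a chosen target size,
# so that the stations picked for comparison have similar sample sizes.
target = 3
similar = station_sizes[(station_sizes - target).abs() <= 1]
```

Filtering on the counts first makes it easier to pick stations that are comparable before any plotting is done.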
The code below is a redundant alternative that prints the station names using a for loop.
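One way such a loop might look, assuming the station names live in a `STATION_NAME` column; the frame is made-up stand-in data.

```python
import pandas as pd

# Hypothetical stand-in data with repeated station names.
df = pd.DataFrame({
    "STATION_NAME": ["YREKA CA US", "ALPINE CA US", "YREKA CA US", "AMBOY CA US"],
})

# unique() drops the repeats; sorted() gives a stable alphabetical order.
names = sorted(df["STATION_NAME"].unique())
for name in names:
    print(name)
```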
In [8]:
The next three cells select the three stations. The three cells after them (tmm1, tmm2, tmm3) reduce each selection to the columns we are interested in, namely the minimum and maximum temperature.
In [9]:
In [10]:
In [11]:
In [12]:
In [13]:
In [14]:
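The selection and column reduction described above can be sketched as follows. The station name used here is an arbitrary pick from the listing, not necessarily one the notebook chose, and `station1` is an assumed variable name; only `tmm1` appears in the text.

```python
import pandas as pd

# Hypothetical stand-in data for two stations.
df = pd.DataFrame({
    "STATION_NAME": ["ALPINE CA US", "ALPINE CA US", "ANAHEIM CA US"],
    "DATE": [20150601, 20150602, 20150601],
    "PRCP": [0, 3, 0],
    "TMAX": [250, 261, 300],
    "TMIN": [120, 131, 180],
})

# Boolean mask: keep only the rows belonging to one station.
station1 = df[df["STATION_NAME"] == "ALPINE CA US"]

# Keep only the minimum and maximum temperature columns.
tmm1 = station1[["TMIN", "TMAX"]]
print(tmm1)
```

The same two steps would be repeated with a different station name for `tmm2` and `tmm3`.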
Here we correct the data. It contains numerous entries of -9999, which is clearly not a recorded value and is most likely a placeholder for missing data. To make everything readable and plottable, we replace every instance of -9999 with NaN (not a number).
Below the long columns of data is the plot of the minimum and maximum temperatures for our first selection.
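A minimal sketch of the placeholder replacement, using made-up values in a frame named `tmm1` as in the text; `tmm1_clean` is an assumed name.

```python
import numpy as np
import pandas as pd

# Hypothetical temperature data with -9999 standing in for missing readings.
tmm1 = pd.DataFrame({
    "TMIN": [120, -9999, 131],
    "TMAX": [250, 261, -9999],
})

# Replace every -9999 with NaN so it is treated as missing, not as a real value.
tmm1_clean = tmm1.replace(-9999, np.nan)
print(tmm1_clean)

# With matplotlib installed, tmm1_clean.plot() would draw both series
# and leave gaps where the values are NaN.
```

Without this step the -9999 entries would dominate the vertical axis and flatten the real temperature range.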
In [15]:
Out[15]:
In [16]:
Out[16]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f521c28bad0>
We again correct for the -9999 values and plot the necessary data for our second selection.
In [17]:
Out[17]:
In [18]:
Out[18]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f521c5fd8d0>
Finally, we again correct the -9999 values and plot the data for our third station.
In [19]:
Out[19]:
In [20]:
Out[20]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f521a06bb50>
Here we extract the precipitation for every date in our data. We then narrow this selection to the required June 2015 span.
In [21]:
Out[21]:
PRCP
-9999 171378
0 403317
2 66
3 7160
4 26
5 5698
6 19
7 10
8 3587
9 8
10 2696
11 14
12 3
13 2494
14 4
15 1761
16 5
17 3
18 1614
19 6
20 1638
21 5
22 5
23 1376
24 8
25 3093
26 9
27 9
28 1277
29 6
...
1740 1
1753 1
1758 1
1765 1
1791 1
1793 1
1808 1
1811 1
1822 1
1859 1
1872 1
1880 1
1892 1
1918 1
1923 1
2017 1
2019 1
2052 2
2090 1
2096 1
2159 1
2179 1
2256 1
2271 1
2304 1
2413 1
2558 1
2753 1
4699 1
12344 1
Length: 707, dtype: int64
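The count per precipitation value and the June 2015 restriction can be sketched as below. This assumes `DATE` is stored as an integer in YYYYMMDD form, which the notebook output does not confirm; the data and the names `prcp_counts` and `june_2015` are illustrative.

```python
import pandas as pd

# Hypothetical stand-in data; DATE is assumed to be an integer like 20150601.
df = pd.DataFrame({
    "DATE": [20150531, 20150601, 20150615, 20150630, 20150701],
    "PRCP": [0, -9999, -9999, 3, 0],
})

# How often each precipitation value occurs, including the -9999 placeholder.
prcp_counts = df.groupby("PRCP").size()
print(prcp_counts)

# Restrict to dates from 1 June 2015 through 30 June 2015, inclusive.
june_2015 = df[(df["DATE"] >= 20150601) & (df["DATE"] <= 20150630)]
print(june_2015)
```

If the dates were stored as strings or timestamps instead, the comparison bounds would need to be written in that form.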
In [22]:
Out[22]:
As the data above show, every precipitation value for June 2015 is -9999. We double-checked the data and replaced these unwanted values with NaN, as seen below. This confirmed our initial result, so we did not plot the June 2015 precipitation because no valid data remained.
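A short sketch of how one might confirm that nothing usable is left after the replacement. The series below is made up to mirror the all-placeholder case described above; `june_prcp`, `cleaned`, `all_missing`, and `n_valid` are assumed names.

```python
import numpy as np
import pandas as pd

# Hypothetical June 2015 precipitation values, all set to the placeholder.
june_prcp = pd.Series([-9999, -9999, -9999], name="PRCP")

# Turn the placeholder into NaN, then check whether any real value remains.
cleaned = june_prcp.replace(-9999, np.nan)
all_missing = bool(cleaned.isna().all())  # True when every entry is NaN
n_valid = int(cleaned.count())            # count() ignores NaN entries

print(all_missing, n_valid)
```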
In [23]:
Out[23]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f5219f73810>
In [24]:
Out[24]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f5219f9fb90>
In [25]:
Out[25]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f52141b30d0>
In [26]:
Out[26]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f521410d290>