()\n",
" 1 myarray = np.array([])\n",
"----> 2 myarray.append(7)\n",
"\n",
"AttributeError: 'numpy.ndarray' object has no attribute 'append'\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"To create a NumPy array, you must create an array with the correct shape from the beginning. However, the array doesn't have to have all the correct values from the very beginning: these you can fill in later.
"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"There are a few ways to create a new array with a particular size:\n",
"\n",
"* `np.empty(size)` -- creates an empty array of size `size`\n",
"* `np.zeros(size)` -- creates an array of size `size` and sets all the elements to zero\n",
"* `np.ones(size)` -- creates an array of size `size` and sets all the elements to one\n",
"\n",
"So the way that we would create an array like the list above is:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"myarray = np.empty(2) # create an array of size 2\n",
"myarray[0] = 7\n",
"myarray[1] = 2\n",
"myarray"
]
},
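{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"As a minimal sketch of how the three constructors listed above differ (the variable names `zeros`, `ones`, and `blank` are illustrative and not part of the original notebook), the following creates one array with each of them:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"zeros = np.zeros(3)  # every element is 0.0\n",
"ones = np.ones(3)    # every element is 1.0\n",
"blank = np.empty(3)  # uninitialized: the contents are arbitrary\n",
"\n",
"# np.empty only reserves memory, so assign every element before reading it\n",
"blank[:] = [7, 2, 5]\n",
"```\n",
"\n",
"All three functions return arrays of floating-point numbers by default."
]
},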
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"\n",
"Another very useful function for creating arrays is np.arange
, which will create an array containing a sequence of numbers (it is very similar to the built-in range
or xrange
functions in Python).\n",
"
"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"Here are a few examples of using `np.arange`. Try playing around with them and make sure you understand how it works:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"# create an array of numbers from 0 to 3\n",
"np.arange(3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"# create an array of numbers from 1 to 5\n",
"np.arange(1, 5)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"# create an array of every third number between 2 and 10\n",
"np.arange(2, 10, 3)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"# create an array of numbers between 0.1 and 1.1 spaced by 0.1\n",
"np.arange(0.1, 1.1, 0.1)"
]
},
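{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"The following sketch, restricted to integer arguments, illustrates that `np.arange` follows the same start/stop/step convention as the built-in `range`. The names `from_arange`, `from_range`, and `last_value` are illustrative only:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# np.arange(start, stop, step) visits the same values as range(start, stop, step)\n",
"from_arange = np.arange(2, 10, 3)\n",
"from_range = list(range(2, 10, 3))\n",
"\n",
"# the stop value itself is never included\n",
"last_value = np.arange(1, 5)[-1]\n",
"```\n",
"\n",
"Unlike `range`, `np.arange` also accepts non-integer arguments, as in the `np.arange(0.1, 1.1, 0.1)` example."
]
},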
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"## \"Vectorized\" computations"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"Another very useful thing about NumPy is that it comes with many so-called \"vectorized\" operations. A vectorized operation (or computation) works across the entire array. For example, let's say we want to add together all the numbers in a list. In regular Python, we might do it like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"mylist = [3, 6, 1, 10, 22]\n",
"total = 0\n",
"for number in mylist:\n",
" total += number\n",
"total"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"Using NumPy arrays, we can just use the `np.sum` function:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"# you can also just do np.sum(mylist) -- it converts it to an\n",
"# array for you!\n",
"myarray = np.array(mylist)\n",
"np.sum(myarray)"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"\n",
"There are many other vectorized computations that you can do on NumPy arrays, including multiplication (np.prod
), mean (np.mean
), and variance (np.var
). They all act essentially the same way as np.sum
-- give the function an array, and it computes the relevant function across all the elements in the array.\n",
"
"
]
},
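{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"As a small illustration, here is a sketch that applies these functions to the same numbers used in the `np.sum` example. The names `product`, `average`, and `variance` are chosen for this example only:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"myarray = np.array([3, 6, 1, 10, 22])\n",
"\n",
"product = np.prod(myarray)   # 3 * 6 * 1 * 10 * 22\n",
"average = np.mean(myarray)   # sum of the elements divided by their count\n",
"variance = np.var(myarray)   # mean squared deviation from the average\n",
"```\n",
"\n",
"Note that `np.var` computes the population variance by default (it divides by the number of elements)."
]
},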
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"### Exercise: Euclidean distance (2 points)"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"Recall that the Euclidean distance $d$ is given by the following equation:\n",
"\n",
"$$\n",
"d(a, b) = \\sqrt{\\sum_{i=1}^N (a_i - b_i) ^ 2}\n",
"$$\n",
"\n",
"In NumPy, this is a fairly simple computation because we can rely on array computations and the `np.sum` function to do all the heavy lifting for us."
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"\n",
"Complete the function euclidean_distance
below to compute $d(a,b)$, as given by the equation above. Note that you can compute the square root using np.sqrt
.\n",
"
"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"grade_id": "euclidean_distance",
"locked": false,
"schema_version": 1,
"solution": true
}
},
"outputs": [],
"source": [
"def euclidean_distance(a, b):\n",
" \"\"\"Computes the Euclidean distance between a and b.\n",
" \n",
" Hint: your solution can be done in a single line of code!\n",
" \n",
" Parameters\n",
" ----------\n",
" a, b : numpy arrays or scalars with the same size\n",
" \n",
" Returns\n",
" -------\n",
" the Euclidean distance between a and b\n",
" \n",
" \"\"\"\n",
" ### BEGIN SOLUTION\n",
" return np.sqrt(np.sum((a - b) ** 2))\n",
" ### END SOLUTION"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"Remember that you need to execute the cell above (with your definition of euclidean_distance
), and then run the cell below to check your answer. If you make changes to the cell with your answer, you will need to first re-run that cell, and then re-run the test cell to check your answer again.
"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"# add your own test cases in this cell!\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": true,
"grade_id": "test_euclidean_distance",
"locked": false,
"points": 2.0,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"from nose.tools import assert_equal, assert_raises\n",
"\n",
"# check euclidean distance of size 3 integer arrays\n",
"a = np.array([1, 2, 3])\n",
"b = np.array([4, 5, 6])\n",
"assert_equal(euclidean_distance(a, b), 5.196152422706632)\n",
"\n",
"# check euclidean distance of size 4 float arrays\n",
"x = np.array([3.6, 7., 203., 3.])\n",
"y = np.array([6., 20.2, 1., 2.])\n",
"assert_equal(euclidean_distance(x, y), 202.44752406487959)\n",
"\n",
"# check euclidean distance of scalars\n",
"assert_equal(euclidean_distance(1, 0.5), 0.5)\n",
"\n",
"# check that an error is thrown if the arrays are different sizes\n",
"a = np.array([1, 2, 3])\n",
"b = np.array([4, 5])\n",
"assert_raises(ValueError, euclidean_distance, a, b)\n",
"assert_raises(ValueError, euclidean_distance, b, a)\n",
"\n",
"print(\"Success!\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"## Creating multidimensional arrays"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"Previously, we saw that functions like `np.zeros` or `np.ones` could be used to create a 1-D array. We can also use them to create N-D arrays. Rather than passing an integer as the first argument, we pass a list or tuple with the *shape* of the array that we want. For example, to create a $3\\times 4$ array of zeros:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"arr = np.zeros((3, 4))\n",
"arr"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"\n",
"The *shape* of the array is a very important concept. You can always get the shape of an array by accessing its shape
attribute:\n",
"
"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"arr.shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"Note that for 1-D arrays, the shape returned by the `shape` attribute is still a tuple, even though it only has a length of one:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"np.zeros(3).shape"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"This also means that we can *create* 1-D arrays by passing a length one tuple. Thus, the following two arrays are identical:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"np.zeros((3,))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"np.zeros(3)"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"There is a warning that goes with this, however: be careful to always use tuples to specify the shape when you are creating multidimensional arrays. For example, to create an array of zeros with shape (3, 4)
, we must use np.zeros((3, 4))
. The following will not work:
\n",
"\n",
"```python\n",
"np.zeros(3, 4)\n",
"```\n",
"\n",
"It will give an error like this:\n",
"\n",
"```\n",
"---------------------------------------------------------------------------\n",
"TypeError Traceback (most recent call last)\n",
" in ()\n",
"----> 1 np.zeros(3, 4)\n",
"\n",
"TypeError: data type not understood\n",
"```\n",
"\n",
"This is because the second argument to `np.zeros` is the data type, so numpy thinks you are trying to create an array of zeros with shape `(3,)` and datatype `4`. It (understandably) doesn't know what you mean by a datatype of `4`, and so throws an error."
]
},
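{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"For contrast, here is a minimal sketch of passing both a shape and a data type correctly. The names `int_zeros` and `float_zeros` are illustrative:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# the shape goes in a single tuple; the second argument is the data type\n",
"int_zeros = np.zeros((3, 4), dtype=int)\n",
"\n",
"# when no data type is given, the elements are floats\n",
"float_zeros = np.zeros((3, 4))\n",
"```\n",
"\n",
"Writing the data type as a keyword argument (`dtype=...`) makes it harder to confuse with part of the shape."
]
},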
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"\n",
"Another important concept is the size of the array -- in other words, how many elements are in it. This is equivalent to the length of the array, for 1-D arrays, but not for multidimensional arrays. You can also see the total size of the array with the size
attribute:\n",
"
"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"arr = np.zeros((3, 4))\n",
"arr.size"
]
},
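{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"To make the difference between length, size, and shape concrete, here is a short sketch for a 2-D array. The names `n_rows`, `n_elements`, and `n_from_shape` are illustrative:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"arr = np.zeros((3, 4))\n",
"\n",
"n_rows = len(arr)        # length of the first axis only\n",
"n_elements = arr.size    # total number of elements\n",
"\n",
"# for a 2-D array, the size is the product of the two entries of the shape\n",
"n_from_shape = arr.shape[0] * arr.shape[1]\n",
"```\n",
"\n",
"Here `n_rows` is 3, while `n_elements` and `n_from_shape` are both 12."
]
},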
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"We can also create arrays and then reshape them into any shape, provided the new array has the same size as the old array:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"arr = np.arange(32).reshape((8, 4))\n",
"arr"
]
},
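{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"The size requirement can be checked directly. In this sketch (the names `flat`, `grid`, and `reshape_failed` are illustrative), the first reshape succeeds and the second raises a `ValueError` because the sizes do not match:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"flat = np.arange(32)\n",
"grid = flat.reshape((8, 4))    # 8 * 4 == 32, so this works\n",
"\n",
"try:\n",
"    flat.reshape((5, 7))       # 5 * 7 == 35, which is not 32\n",
"    reshape_failed = False\n",
"except ValueError:\n",
"    reshape_failed = True\n",
"```\n",
"\n",
"The elements are filled in row by row, so `grid[1, 0]` holds the value 4."
]
},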
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"## Accessing and modifying multidimensional array elements"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"To access or set individual elements of the array, we can index with a sequence of numbers:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"# set the 3rd element in the 1st row to 0\n",
"arr[0, 2] = 0\n",
"arr"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"We can also access the element on it's own, without having the equals sign and the stuff to the right of it:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"arr[0, 2]"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"We frequently will want to access ranges of elements. In NumPy, the first dimension (or *axis*) corresponds to the rows of the array, and the second axis corresponds to the columns. For example, to look at the first row of the array:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"# the first row\n",
"arr[0]"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"To look at columns, we use the following syntax:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"# the second column\n",
"arr[:, 1]"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"The colon in the first position essentially means \"select from every row\". So, we can interpret `arr[:, 1]` as meaning \"take the second element of every row\", or simply \"take the second column\".\n",
"\n",
"Using this syntax, we can select whole regions of an array. For example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"# select a rectangular region from the array\n",
"arr[2:5, 1:3]"
]
},
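{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"Here is a short sketch of a few more slices on a freshly created 8 by 4 array, so the values are easy to predict. The names `region`, `first_rows`, and `last_column` are illustrative:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"arr = np.arange(32).reshape((8, 4))\n",
"\n",
"region = arr[2:5, 1:3]     # rows 2, 3, 4 and columns 1, 2\n",
"first_rows = arr[:2]       # the first two rows, all columns\n",
"last_column = arr[:, -1]   # negative indices count from the end\n",
"```\n",
"\n",
"As with `np.arange`, the end of each slice is not included, so `2:5` selects three rows."
]
},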
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"Note: be careful about setting modifying an array if what you really want is a copy of an array. Remember that in Python, variables are really just pointers to objects.
"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"For example, if I want to create a second array that mutliples every other value in `arr` by two, the following code will work but will have unexpected consequences:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"arr = np.arange(10)\n",
"arr2 = arr\n",
"arr2[::2] = arr2[::2] * 2\n",
"print(\"arr: \" + str(arr))\n",
"print(\"arr2: \" + str(arr2))"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"Note that `arr` and `arr2` both have the same values! This is because the line `arr2 = arr` doesn't actually copy the array: it just makes another pointer to the same object. To truly copy the array, we need to use the `.copy()` method:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"arr = np.arange(10)\n",
"arr2 = arr.copy()\n",
"arr2[::2] = arr2[::2] * 2\n",
"print(\"arr: \" + str(arr))\n",
"print(\"arr2: \" + str(arr2))"
]
},
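{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"One way to check whether two variables refer to the same underlying data is `np.shares_memory`. The sketch below (with illustrative names `alias`, `view`, and `copied`) also shows that a slice behaves like an alias rather than a copy:\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"arr = np.arange(10)\n",
"alias = arr           # another name for the same array\n",
"view = arr[::2]       # a slice is a view onto the same data\n",
"copied = arr.copy()   # an independent array\n",
"\n",
"alias_shares = np.shares_memory(arr, alias)\n",
"view_shares = np.shares_memory(arr, view)\n",
"copy_shares = np.shares_memory(arr, copied)\n",
"```\n",
"\n",
"Only `copy_shares` is false, so `copied` is the only one of the three that can be modified without also changing `arr`."
]
},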
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"### Exercise: Border (2 points)"
]
},
{
"cell_type": "markdown",
"metadata": {
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"source": [
"\n",
"Write a function to create a 2D array of arbitrary shape. This array should have all zero values, except for the elements around the border (i.e., the first and last rows, and the first and last columns), which should have a value of one.\n",
"
"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"nbgrader": {
"grade": false,
"grade_id": "border",
"locked": false,
"schema_version": 1,
"solution": true
}
},
"outputs": [],
"source": [
"def border(n, m):\n",
" \"\"\"Creates an array with shape (n, m) that is all zeros\n",
" except for the border (i.e., the first and last rows and\n",
" columns), which should be filled with ones.\n",
"\n",
" Hint: you should be able to do this in three lines\n",
" (including the return statement)\n",
"\n",
" Parameters\n",
" ----------\n",
" n, m: int\n",
" Number of rows and number of columns\n",
"\n",
" Returns\n",
" -------\n",
" numpy array with shape (n, m)\n",
"\n",
" \"\"\"\n",
" ### BEGIN SOLUTION\n",
" arr = np.ones((n, m))\n",
" arr[1:-1, 1:-1] = 0\n",
" return arr\n",
" ### END SOLUTION"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": false,
"locked": false,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"# add your own test cases in this cell!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"nbgrader": {
"grade": true,
"grade_id": "test_border",
"locked": false,
"points": 2.0,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"from numpy.testing import assert_array_equal\n",
"from nose.tools import assert_equal\n",
"\n",
"# check a few small examples explicitly\n",
"assert_array_equal(border(1, 1), [[1]])\n",
"assert_array_equal(border(2, 2), [[1, 1], [1, 1]])\n",
"assert_array_equal(border(3, 3), [[1, 1, 1], [1, 0, 1], [1, 1, 1]])\n",
"assert_array_equal(border(3, 4), [[1, 1, 1, 1], [1, 0, 0, 1], [1, 1, 1, 1]])\n",
"\n",
"# check a few large and random examples\n",
"for i in range(10):\n",
" n, m = np.random.randint(2, 1000, 2)\n",
" result = border(n, m)\n",
"\n",
" # check dtype and array shape\n",
" assert_equal(result.dtype, np.float)\n",
" assert_equal(result.shape, (n, m))\n",
"\n",
" # check the borders\n",
" assert (result[0] == 1).all()\n",
" assert (result[-1] == 1).all()\n",
" assert (result[:, 0] == 1).all()\n",
" assert (result[:, -1] == 1).all()\n",
"\n",
" # check that everything else is zero\n",
" assert np.sum(result) == (2*n + 2*m - 4)\n",
"\n",
"print(\"Success!\")"
]
}
],
"metadata": {
"celltoolbar": "Create Assignment",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.4.0"
}
},
"nbformat": 4,
"nbformat_minor": 0
}