{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Basics of working with the `numpy` library" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Getting to know arrays" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The name `numpy` is short for *Numerical Python*." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1, 2, 4, 0]\n", "[[1, 0, 3], [3, 6, 7], [1, 2, 3]]\n", "[[1, 3, 6], ['a', 'b', 'c', 7]]\n", "<class 'list'>\n", "<class 'list'>\n" ] } ], "source": [ "# Lists\n", "L = [1, 2, 4, 0]\n", "E = [[1, 0, 3], [3, 6, 7], [1, 2, 3]]\n", "D = [[1, 3, 6], ['a', 'b', 'c', 7]]\n", "\n", "# everything works\n", "print(L)\n", "print(E)\n", "print(D)\n", "print(type(L))\n", "print(type(D))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`numpy` arrays closely resemble lists (nested lists even more so), but all elements of an array must be of the same type." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Advantages of `numpy` arrays:\n", "1. Arrays are faster to process and take less memory to store than lists, which matters a great deal when working with large amounts of data. \n", "2. `numpy` functions are vectorized: they can be applied element-wise to a whole array at once. 
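As a rough illustration of both points (a sketch only; the names `mixed`, `prices` and `doubled` are invented for this example and do not appear elsewhere in the notebook): mixed input values are converted to a single common type, and an arithmetic operation is applied to every element at once.

```python
import numpy as np

# A list may mix types; an array converts everything to one common type.
mixed = np.array([1, 2.5, 3])   # the integers become floats
print(mixed.dtype)              # float64

# Vectorized arithmetic: no explicit loop is needed.
prices = np.array([10, 20, 30])
doubled = prices * 2
print(doubled.tolist())         # [20, 40, 60]
```

With a plain list, the last step would need a loop or a list comprehension such as `[p * 2 for p in prices]`.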
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Импортируем библиотеку и сократим название до `np`" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Получить массив `numpy` можно из обычного списка, используя функцию `array()`:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 4, 0])" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array(L)\n", "A" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 2, 4, 0]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 4, 0])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = np.array([1, 2, 4, 0]) #Главное - не забыть квадратные скобки\n", "A" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "ename": "ValueError", "evalue": "only 2 non-keyword arguments accepted", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mA\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0marray\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m2\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m4\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m0\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# error - нет квадратных 
скобок\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mValueError\u001b[0m: only 2 non-keyword arguments accepted" ] } ], "source": [ "A = np.array(1, 2, 4, 0) # error - нет квадратных скобок" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array(['1', '2', '4', '0', 'yes', '6'], dtype='\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mA\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;31m# index error\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mIndexError\u001b[0m: invalid index to scalar variable." ] } ], "source": [ "A[0][0] # index error" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A.size # Общее число элементов в массиве (аналог `len()` для списков)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Разные описательные статистики:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A.max() # максимум" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A.min() # минимум" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.75" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A.mean() # среднее" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "О других полезных методах можно узнать, нажав *Tab* после 
`np.`." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 2, 4, 0]" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A.tolist() # convert a `numpy` array to a list" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Multidimensional arrays" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [], "source": [ "S = np.array([[8, 1, 2], [2, 8, 9]]) # create a multidimensional array from a nested list" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[8, 1, 2],\n", " [2, 8, 9]])" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S.ndim # number of dimensions (axes): rows and columns" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2, 3)" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S.shape # two rows (two lists) and three columns (three elements in each list)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "6" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S.size # total number of elements in the array" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When an array has more than one dimension, many operations need to be told which dimension (axis) to work along: rows or columns. 
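A minimal sketch of how the `axis` argument selects the direction, assuming a small array like the one above (the names `table`, `col_sums` and `row_sums` are made up for illustration):

```python
import numpy as np

table = np.array([[8, 1, 2],
                  [2, 8, 9]])

# axis=0 collapses the rows: one result per column.
col_sums = table.sum(axis=0)
# axis=1 collapses the columns: one result per row.
row_sums = table.sum(axis=1)

print(col_sums.tolist())  # [10, 9, 11]
print(row_sums.tolist())  # [11, 19]
```

The same convention applies to `max()`, `min()` and `mean()`.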
" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[8, 1, 2],\n", " [2, 8, 9]])" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S.max()" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([8, 8, 9])" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S.max(axis=0) # максимальное значение по столбцам - три столбца и три максимальных значения" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([8, 9])" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S.max(axis=1) # максимальное значение по строкам - две строки и два максимальных значения" ] }, { "cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([5. 
, 4.5, 5.5])" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S.mean(axis=0)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([3.66666667, 6.33333333])" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S.mean(axis=1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To access an element of a two-dimensional array, specify two indices: first the index of the inner array (row) that holds the element, then the index of the element within that row:" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "8" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S[0][0]" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S[1][2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we give only one index, we simply get the inner array (row) with that index:" ] }, { "cell_type": "code", "execution_count": 39, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([8, 1, 2])" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S[0]" ] }, { "cell_type": "code", "execution_count": 40, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[8, 1, 2],\n", " [2, 8, 6]])" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "S[1][2] = 6 # arrays are mutable objects \n", "S" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To select several elements at once, you can use slices, just as with lists. 
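For two-dimensional arrays, `numpy` also accepts one slice per axis inside a single pair of brackets, separated by a comma. A short sketch (the names `grid`, `block` and `first_col` are illustrative only):

```python
import numpy as np

grid = np.array([[1, 3, 7],
                 [8, 10, 1],
                 [2, 8, 9],
                 [1, 0, 5]])

# Rows 0 and 1, columns 1 and 2, selected in a single indexing step.
block = grid[0:2, 1:3]
print(block.tolist())        # [[3, 7], [10, 1]]

# A whole column: every row, column index 0.
first_col = grid[:, 0]
print(first_col.tolist())    # [1, 8, 2, 1]
```

Chained indexing such as `grid[0:2][1]` slices the rows first and then indexes the result, so it cannot pick out a column in this way.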
" ] }, { "cell_type": "code", "execution_count": 41, "metadata": {}, "outputs": [], "source": [ "T = np.array([[1, 3, 7], [8, 10, 1], [2, 8, 9], [1, 0, 5]])" ] }, { "cell_type": "code", "execution_count": 42, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 1, 3, 7],\n", " [ 8, 10, 1],\n", " [ 2, 8, 9],\n", " [ 1, 0, 5]])" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "T" ] }, { "cell_type": "code", "execution_count": 43, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 1, 3, 7],\n", " [ 8, 10, 1]])" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "T[0:2] # массивы с индексами 0 и 1" ] }, { "cell_type": "code", "execution_count": 45, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1, 3, 7],\n", " [2, 8, 9]])" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "T[0::2] # Можно выставить шаг среза - старт, двоеточие, двоеточие, шаг" ] }, { "cell_type": "code", "execution_count": 46, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "T[::2][1][2]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Способы создания массива" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Способ 1**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Уже познакомились - можно получить массив из готового списка, воспользовавшись функцией `array()`" ] }, { "cell_type": "code", "execution_count": 52, "metadata": {}, "outputs": [], "source": [ "B = np.array([10.5, 45, 2.4])" ] }, { "cell_type": "code", "execution_count": 53, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([10.5, 45. 
, 2.4])" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B" ] }, { "cell_type": "code", "execution_count": 56, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B[1] += 0.1\n", "B[1] -= 0.1\n", "B[1] == 45" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In addition, once an array has been created from a list, its shape can be changed with the `reshape()` method." ] }, { "cell_type": "code", "execution_count": 57, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2, 5, 6],\n", " [9, 8, 0]])" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "old = np.array([[2, 5, 6], [9, 8, 0]])\n", "old " ] }, { "cell_type": "code", "execution_count": 58, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(2, 3)" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "old.shape # 2 by 3" ] }, { "cell_type": "code", "execution_count": 59, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2, 5],\n", " [6, 9],\n", " [8, 0]])" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "new = old.reshape(3, 2) # change it to 3 by 2\n", "new" ] }, { "cell_type": "code", "execution_count": 60, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(3, 2)" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "new.shape # 3 by 2" ] }, { "cell_type": "code", "execution_count": 61, "metadata": {}, "outputs": [ { "ename": "ValueError", "evalue": "cannot reshape array of size 6 into shape (2,4)", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in 
\u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mold\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mreshape\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m2\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m4\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# a mismatched number of elements raises an error\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[1;31mValueError\u001b[0m: cannot reshape array of size 6 into shape (2,4)" ] } ], "source": [ "old.reshape(2, 4) # a mismatched number of elements raises an error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Method 2**\n", "\n", "An array can be built from a range produced by the `numpy` function `arange()` (similar to `range()`, but more flexible)." ] }, { "cell_type": "code", "execution_count": 62, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([2, 3, 4, 5, 6, 7, 8])" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(2, 9) # by default it behaves like the ordinary range()" ] }, { "cell_type": "code", "execution_count": 63, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([2, 5, 8])" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(2, 9, 3) # with a step of 3" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3. , 3.1, 3.2,\n", " 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4. , 4.1, 4.2, 4.3, 4.4, 4.5,\n", " 4.6, 4.7, 4.8, 4.9, 5. , 5.1, 5.2, 5.3, 5.4, 5.5, 5.6, 5.7, 5.8,\n", " 5.9, 6. , 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9, 7. , 7.1,\n", " 7.2, 7.3, 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 8. 
, 8.1, 8.2, 8.3, 8.4,\n", " 8.5, 8.6, 8.7, 8.8, 8.9])" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(2, 9, 0.1) # with a fractional step " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`arange()` and `reshape()` can be combined to build an array of the desired shape:" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 2. , 2.5, 3. , 3.5, 4. , 4.5],\n", " [ 5. , 5.5, 6. , 6.5, 7. , 7.5],\n", " [ 8. , 8.5, 9. , 9.5, 10. , 10.5]])" ] }, "execution_count": 66, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(2, 11, 0.5).reshape(3, 6)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Method 3**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "An array can also be created from scratch if its shape is known. `numpy` can create arrays filled with zeros or ones, as well as \"empty\" arrays (which are not really empty). 
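A brief sketch of two related options, for the case where integer zeros or a fill value other than 0 or 1 is wanted. The names `zeros_int` and `sevens` are hypothetical; `np.full()` and the `dtype` argument are standard `numpy` features that this notebook does not otherwise use. In every case the shape is passed as a tuple.

```python
import numpy as np

zeros_int = np.zeros((2, 3), dtype=int)   # integer zeros instead of floats
sevens = np.full((2, 3), 7)               # every element set to 7

print(zeros_int.tolist())  # [[0, 0, 0], [0, 0, 0]]
print(sevens.tolist())     # [[7, 7, 7], [7, 7, 7]]
```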
" ] }, { "cell_type": "code", "execution_count": 67, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[0., 0., 0.],\n", " [0., 0., 0.],\n", " [0., 0., 0.]])" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Z = np.zeros((3, 3)) # размеры в виде кортежа - не теряйте еще одни круглые скобки\n", "Z" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [ { "ename": "TypeError", "evalue": "data type not understood", "output_type": "error", "traceback": [ "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[1;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[1;32m----> 1\u001b[1;33m \u001b[0mZ1\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mzeros\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;36m3\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;36m3\u001b[0m\u001b[1;33m)\u001b[0m \u001b[1;31m# ошибка - забыли скобки\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 2\u001b[0m \u001b[0mZ1\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n", "\u001b[1;31mTypeError\u001b[0m: data type not understood" ] } ], "source": [ "Z1 = np.zeros(3, 3) # ошибка - забыли скобки\n", "Z1" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 1.],\n", " [1., 1.],\n", " [1., 1.],\n", " [1., 1.]])" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "O = np.ones((4, 2)) # массив из единиц\n", "O" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Пустой (*empty*) массив имеет особенности" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[6.03e-322, 0.00e+000],\n", " [0.00e+000, 0.00e+000],\n", " [0.00e+000, 0.00e+000]])" ] }, 
"execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Emp = np.empty((3, 2))\n", "Emp" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "-6.03e-322" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "0-Emp[0][0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Массив *Emp* ‒ не совсем пустой, в нем содержатся какие-то (псевдо)случайные элементы, которые примерно равны 0. Теоретически создавать массив таким образом можно, но не рекомендуется: лучше создать массив из \"чистых\" нулей, чем из какого-то непонятного \"мусора\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Задание:** Дан массив `ages` (см. ниже). Напишите программу с циклом, которая позволит получить массив `ages_bin` такой же размерности, что и `ages`, состоящий из 0 и 1 (0 - младше 18, 1 - не младше 18).\n", "\n", "*Подсказка:* используйте вложенный цикл." ] }, { "cell_type": "code", "execution_count": 72, "metadata": {}, "outputs": [], "source": [ "ages = np.array([[12, 16, 17, 18, 14], [20, 22, 18, 17, 23], [32, 16, 44, 16, 23]])" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[12, 16, 17, 18, 14],\n", " [20, 22, 18, 17, 23],\n", " [32, 16, 44, 16, 23]])" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ages" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Решение:*" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0.]\n", " [0. 0. 0. 0. 0.]]\n", "[[0. 0. 0. 1. 0.]\n", " [1. 1. 1. 0. 1.]\n", " [1. 0. 1. 0. 
1.]]\n" ] } ], "source": [ "shape = ages.shape\n", "ages_bin = np.zeros(shape)\n", "print(ages_bin)\n", "\n", "for i in range(0, shape[0]):\n", "    for j in range(shape[1]):\n", "        if ages[i][j] >= 18:\n", "            ages_bin[i][j] = 1\n", "print(ages_bin)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Element-wise processing of `numpy` arrays" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Operations on arrays can be performed element-wise, without loops or their equivalents." ] }, { "cell_type": "code", "execution_count": 75, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 4, 0])" ] }, "execution_count": 75, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's square all of its elements:" ] }, { "cell_type": "code", "execution_count": 76, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 1, 4, 16, 0], dtype=int32)" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A = A ** 2\n", "A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now add 5 to every element:" ] }, { "cell_type": "code", "execution_count": 108, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[10. 13. 25. 6.71828183]\n" ] } ], "source": [ "A = A + 5\n", "print(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Just as easily, you can apply your own functions to the elements of an array. Let's write a function that adds 1 to an element and then takes the natural logarithm of the result."
] }, { "cell_type": "code", "execution_count": 109, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.0" ] }, "execution_count": 109, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.log(np.e)" ] }, { "cell_type": "code", "execution_count": 110, "metadata": {}, "outputs": [], "source": [ "def my_log(x):\n", "    return np.log(x + 1)" ] }, { "cell_type": "code", "execution_count": 93, "metadata": {}, "outputs": [], "source": [ "A = [float(x) for x in A]" ] }, { "cell_type": "code", "execution_count": 101, "metadata": {}, "outputs": [], "source": [ "A = np.array(A)" ] }, { "cell_type": "code", "execution_count": 111, "metadata": {}, "outputs": [], "source": [ "A[3] = np.e - 1" ] }, { "cell_type": "code", "execution_count": 112, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([10. , 13. , 25. , 1.71828183])" ] }, "execution_count": 112, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's apply it:" ] }, { "cell_type": "code", "execution_count": 113, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([2.39789527, 2.63905733, 3.25809654, 1. ])" ] }, "execution_count": 113, "metadata": {}, "output_type": "execute_result" } ], "source": [ "my_log(A)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And there are no loops at all! " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A multidimensional array can be turned into a one-dimensional one (like a flat list) with the `flatten()` method or with `ravel()`. `flatten()` always returns a copy, while `ravel()` returns a view of the original data when possible."
] }, { "cell_type": "code", "execution_count": 114, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1 2 3]\n", " [4 5 6]]\n", "[1 2 3 4 5 6]\n", "[[1 2 3]\n", " [4 5 6]]\n" ] } ], "source": [ "my = np.array([[1, 2, 3], [4, 5, 6]])\n", "print(my)\n", "print(my.flatten()) # \"flat\" array\n", "print(my)" ] }, { "cell_type": "code", "execution_count": 115, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 2 3 4 5 6]\n", "[[1 2 3]\n", " [4 5 6]]\n" ] } ], "source": [ "print(my.ravel()) # \"flat\" array\n", "print(my)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### More advantages of `numpy`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. It supports mathematical calculations, so there is no need to load the `math` module separately." ] }, { "cell_type": "code", "execution_count": 116, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "1.0986122886681098" ] }, "execution_count": 116, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.log(3) # natural logarithm" ] }, { "cell_type": "code", "execution_count": 117, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "2.6457513110645907" ] }, "execution_count": 117, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.sqrt(7) # square root" ] }, { "cell_type": "code", "execution_count": 118, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "7.38905609893065" ] }, "execution_count": 118, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.exp(2) # e^2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. It supports operations on vectors and matrices. Suppose we have two vectors, `a` and `b`. " ] }, { "cell_type": "code", "execution_count": 119, "metadata": {}, "outputs": [], "source": [ "a = np.array([1, 2, 3])\n", "b = np.array([0, 4, 7])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we simply multiply `a` by `b` with the `*` operator, we get an array containing the products of the corresponding elements of `a` and `b`:" ] }, { "cell_type": "code", "execution_count": 120, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 8, 21])" ] }, "execution_count": 120, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a * b" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And if we use the `dot()` function, we get the [scalar product](https://ru.wikipedia.org/wiki/%D0%A1%D0%BA%D0%B0%D0%BB%D1%8F%D1%80%D0%BD%D0%BE%D0%B5_%D0%BF%D1%80%D0%BE%D0%B8%D0%B7%D0%B2%D0%B5%D0%B4%D0%B5%D0%BD%D0%B8%D0%B5) of the vectors (the *dot product*)." ] }, { "cell_type": "code", "execution_count": 121, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "29" ] }, "execution_count": 121, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.dot(a, b) # the result is a number" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If desired, we can also get the [vector product](https://ru.wikipedia.org/wiki/%D0%92%D0%B5%D0%BA%D1%82%D0%BE%D1%80%D0%BD%D0%BE%D0%B5_%D0%BF%D1%80%D0%BE%D0%B8%D0%B7%D0%B2%D0%B5%D0%B4%D0%B5%D0%BD%D0%B8%D0%B5) (the *cross product*): " ] }, { "cell_type": "code", "execution_count": 122, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 2, -7, 4])" ] }, "execution_count": 122, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.cross(a, b) # the result is a vector" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's create a matrix from a string and work with it."
] }, { "cell_type": "code", "execution_count": 124, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2, 4, 3],\n", " [1, 6, 5],\n", " [7, 1, 7]])" ] }, "execution_count": 124, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m = np.array(np.matrix('2 4 3; 1 6 5; 7 1 7')) # np.matrix (np.mat in older numpy versions) builds a matrix from a string, np.array turns the matrix into an array \n", "m" ] }, { "cell_type": "code", "execution_count": 125, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2, 1, 7],\n", " [4, 6, 1],\n", " [3, 5, 7]])" ] }, "execution_count": 125, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.T # transpose it, i.e. swap its rows and columns" ] }, { "cell_type": "code", "execution_count": 126, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([2, 6, 7])" ] }, "execution_count": 126, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.diagonal() # get its diagonal elements" ] }, { "cell_type": "code", "execution_count": 127, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "15" ] }, "execution_count": 127, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m.trace() # compute the trace of the matrix: the sum of its diagonal elements" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Task.** Create a 3 by 3 [identity matrix](https://ru.wikipedia.org/wiki/%D0%95%D0%B4%D0%B8%D0%BD%D0%B8%D1%87%D0%BD%D0%B0%D1%8F_%D0%BC%D0%B0%D1%82%D1%80%D0%B8%D1%86%D0%B0) by creating an array of zeros and then filling its diagonal elements with the value 1.\n", "\n", "*Hint:* the `fill_diagonal()` function."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Solution:*" ] }, { "cell_type": "code", "execution_count": 128, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 0., 0.],\n", " [0., 1., 0.],\n", " [0., 0., 1.]])" ] }, "execution_count": 128, "metadata": {}, "output_type": "execute_result" } ], "source": [ "I = np.zeros((3, 3))\n", "np.fill_diagonal(I, 1)\n", "I" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That said, `numpy` already has a ready-made function for creating an identity matrix (alongside `zeros` and `ones`):" ] }, { "cell_type": "code", "execution_count": 129, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[1., 0., 0.],\n", " [0., 1., 0.],\n", " [0., 0., 1.]])" ] }, "execution_count": 129, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.eye(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A note on the [inverse matrix](https://ru.wikipedia.org/wiki/%D0%9E%D0%B1%D1%80%D0%B0%D1%82%D0%BD%D0%B0%D1%8F_%D0%BC%D0%B0%D1%82%D1%80%D0%B8%D1%86%D0%B0): `np.invert()` does not compute it. That function applies a bitwise NOT to every element, as the cells below show; the matrix inverse is computed with `np.linalg.inv(m)`." ] }, { "cell_type": "code", "execution_count": 148, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2, 4, 3],\n", " [1, 6, 5],\n", " [7, 1, 7]])" ] }, "execution_count": 148, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m" ] }, { "cell_type": "code", "execution_count": 150, "metadata": {}, "outputs": [], "source": [ "mm = np.invert(m) # bitwise NOT of every element (-x - 1), not the matrix inverse" ] }, { "cell_type": "code", "execution_count": 151, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-3, -5, -4],\n", " [-2, -7, -6],\n", " [-8, -2, -8]], dtype=int32)" ] }, "execution_count": 151, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mm" ] }, { "cell_type": "code", "execution_count": 144, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2, 4, 3],\n", " [1, 6, 5],\n", " [7, 1, 7]], dtype=int32)" ] }, "execution_count": 144, "metadata": {}, "output_type": "execute_result" } ], "source": 
[ "mm1 = np.invert(mm) # applying the bitwise NOT twice returns the original values\n", "mm1" ] }, { "cell_type": "code", "execution_count": 143, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-38, -44, -56],\n", " [-55, -57, -80],\n", " [-79, -56, -90]])" ] }, "execution_count": 143, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.dot(m, mm) # not the identity matrix, because mm is not the inverse of m" ] }, { "cell_type": "code", "execution_count": 142, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-38, -44, -56],\n", " [-55, -57, -80],\n", " [-79, -56, -90]])" ] }, "execution_count": 142, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mmm = m.dot(mm)\n", "mmm" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For other matrix operations (and linear-algebra computations in general), use the functions in the `linalg` submodule. For example, this is how to find the determinant of a matrix:" ] }, { "cell_type": "code", "execution_count": 146, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[2, 4, 3],\n", " [1, 6, 5],\n", " [7, 1, 7]])" ] }, "execution_count": 146, "metadata": {}, "output_type": "execute_result" } ], "source": [ "m" ] }, { "cell_type": "code", "execution_count": 145, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "62.99999999999999" ] }, "execution_count": 145, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.linalg.det(m) # the exact value is 63; the difference is floating-point rounding error" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And the eigenvalues:" ] }, { "cell_type": "code", "execution_count": 147, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([12.33303543+0.j , 1.33348228+1.82484424j,\n", " 1.33348228-1.82484424j])" ] }, "execution_count": 147, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.linalg.eigvals(m)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For the full list of functions with descriptions, see the [documentation](https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.linalg.html)."
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Loops vs vectorized operations" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": 183, "metadata": {}, "outputs": [], "source": [ "size = 1000\n", "O = np.ones((size, size)) # an array of ones\n" ] }, { "cell_type": "code", "execution_count": 193, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Wall time: 14 ms\n" ] } ], "source": [ "%%time\n", "O1 = O ** 2 " ] }, { "cell_type": "code", "execution_count": 185, "metadata": {}, "outputs": [], "source": [ "O = np.ones((size, size)) # an array of ones" ] }, { "cell_type": "code", "execution_count": 186, "metadata": {}, "outputs": [], "source": [ "O2 = np.zeros((size, size)) # an array of zeros, to be filled in by the loop" ] }, { "cell_type": "code", "execution_count": 194, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Wall time: 2.08 s\n" ] } ], "source": [ "%%time\n", "for i in range(size):\n", "    for j in range(size):\n", "        O2[i][j] = O[i][j] * O[i][j]" ] }, { "cell_type": "code", "execution_count": 158, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ True, True, True, ..., True, True, True],\n", " [ True, True, True, ..., True, True, True],\n", " [ True, True, True, ..., True, True, True],\n", " ...,\n", " [ True, True, True, ..., True, True, True],\n", " [ True, True, True, ..., True, True, True],\n", " [ True, True, True, ..., True, True, True]])" ] }, "execution_count": 158, "metadata": {}, "output_type": "execute_result" } ], "source": [ "O1 == O2" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `numpy` library is often used together with the visualization library `matplotlib`. 
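Going back to the loop-versus-vectorized comparison in this section: outside Jupyter, where the `%%time` magic is not available, a similar measurement can be sketched with the standard `time` module. This is only an illustration. The names `ones`, `squared_vec`, `squared_loop`, `vec_seconds` and `loop_seconds` are invented here, the array is smaller than the 1000 by 1000 one used above, and the timings depend on the machine.

```python
import time
import numpy as np

size = 200  # smaller than the 1000 used above, to keep the loop quick
ones = np.ones((size, size))

start = time.perf_counter()
squared_vec = ones ** 2                      # vectorized operation
vec_seconds = time.perf_counter() - start

squared_loop = np.zeros((size, size))
start = time.perf_counter()
for i in range(size):                        # explicit Python loops
    for j in range(size):
        squared_loop[i, j] = ones[i, j] * ones[i, j]
loop_seconds = time.perf_counter() - start

# A single True/False answer instead of a full boolean array.
print(np.array_equal(squared_vec, squared_loop))  # True
print(vec_seconds, loop_seconds)
```

`np.array_equal()` gives one boolean for the whole comparison, which is easier to read than the element-wise result of `==`.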
" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 2 }