{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Review\n", "\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "import sympy as sy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Problem: Summations\n", "\n", "Use the summation formulas below to evaluate the given summations.\n", "\n", "$$\\sum_{i = 1}^n i = \\frac{n^{2}}{2} + \\frac{n}{2}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\sum_{i = 1}^n i^2 = \\frac{n^{3}}{3} + \\frac{n^{2}}{2} + \\frac{n}{6}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\sum_{i = 1}^n i^3 = \\frac{n^{4}}{4} + \\frac{n^{3}}{2} + \\frac{n^{2}}{4}\n", "$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "a. $\\sum_{i = 1}^{10} i^2 - i$\n", "\n", "b. $\\sum_{i = 1}^{20} 4i^3 + i^2 - 6$\n", "\n", "c. $\\sum_{i = 4}^{10} 2i^2$ " ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "i = sy.Symbol('i')" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/latex": [ "$\\displaystyle 330$" ], "text/plain": [ "330" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sy.summation(i**2 - i, (i, 1, 10))" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/latex": [ "$\\displaystyle n^{4} + \\frac{7 n^{3}}{3} + \\frac{3 n^{2}}{2} - \\frac{35 n}{6}$" ], "text/plain": [ "n**4 + 7*n**3/3 + 3*n**2/2 - 35*n/6" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sy.summation(4*i**3 + i**2 - 6, (i, 1, n))" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "n = sy.Symbol('n')" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/latex": [ "$\\displaystyle 742$" ], "text/plain": [ "742" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sy.summation(2*i**2, (i, 4, 10))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Problem 2: Mean, Variance, Standard Deviation\n", "\n", "Below is a sample of historic data relating to NYPD arrests. We count the 20 most frequent violations, and ask you to use these values to compute the *mean number of incidents*, *variance in incident counts*, and *standard deviation of incident counts*. " ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "nyc_arrests = pd.read_json('https://data.cityofnewyork.us/resource/8h9b-rp9u.json')" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
arrest_keyarrest_datepd_cdpd_descky_cdofns_desclaw_codelaw_cat_cdarrest_boroarrest_precinctjurisdiction_codeage_groupperp_sexperp_racex_coord_cdy_coord_cdlatitudelongitude
01731306022017-12-31T00:00:00.000566MARIJUANA, POSSESSION678.0MISCELLANEOUS PENAL LAWPL 2210500VQ105025-44MBLACK106305620746340.735772-73.715638
11731144632017-12-31T00:00:00.000478THEFT OF SERVICES, UNCLASSIFIED343.0OTHER OFFENSES RELATED TO THEFTPL 1651503MQ114025-44MASIAN / PACIFIC ISLANDER100911321961340.769437-73.910241
21731135132017-12-31T00:00:00.000849NY STATE LAWS,UNCLASSIFIED VIOLATION677.0OTHER STATE LAWSLOC000000VVK73118-24MBLACK101071918685740.679525-73.904572
31731134232017-12-31T00:00:00.000101ASSAULT 3344.0ASSAULT 3 & RELATED OFFENSESPL 1200001MM18025-44MWHITE98783121744640.763523-73.987074
41731134212017-12-31T00:00:00.000101ASSAULT 3344.0ASSAULT 3 & RELATED OFFENSESPL 1200001MM18045-64MBLACK98707321607840.759768-73.989811
\n", "
" ], "text/plain": [ " arrest_key arrest_date pd_cd \\\n", "0 173130602 2017-12-31T00:00:00.000 566 \n", "1 173114463 2017-12-31T00:00:00.000 478 \n", "2 173113513 2017-12-31T00:00:00.000 849 \n", "3 173113423 2017-12-31T00:00:00.000 101 \n", "4 173113421 2017-12-31T00:00:00.000 101 \n", "\n", " pd_desc ky_cd \\\n", "0 MARIJUANA, POSSESSION 678.0 \n", "1 THEFT OF SERVICES, UNCLASSIFIED 343.0 \n", "2 NY STATE LAWS,UNCLASSIFIED VIOLATION 677.0 \n", "3 ASSAULT 3 344.0 \n", "4 ASSAULT 3 344.0 \n", "\n", " ofns_desc law_code law_cat_cd arrest_boro \\\n", "0 MISCELLANEOUS PENAL LAW PL 2210500 V Q \n", "1 OTHER OFFENSES RELATED TO THEFT PL 1651503 M Q \n", "2 OTHER STATE LAWS LOC000000V V K \n", "3 ASSAULT 3 & RELATED OFFENSES PL 1200001 M M \n", "4 ASSAULT 3 & RELATED OFFENSES PL 1200001 M M \n", "\n", " arrest_precinct jurisdiction_code age_group perp_sex \\\n", "0 105 0 25-44 M \n", "1 114 0 25-44 M \n", "2 73 1 18-24 M \n", "3 18 0 25-44 M \n", "4 18 0 45-64 M \n", "\n", " perp_race x_coord_cd y_coord_cd latitude longitude \n", "0 BLACK 1063056 207463 40.735772 -73.715638 \n", "1 ASIAN / PACIFIC ISLANDER 1009113 219613 40.769437 -73.910241 \n", "2 BLACK 1010719 186857 40.679525 -73.904572 \n", "3 WHITE 987831 217446 40.763523 -73.987074 \n", "4 BLACK 987073 216078 40.759768 -73.989811 " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nyc_arrests.head()" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [], "source": [ "arrests = nyc_arrests['pd_desc'].value_counts().nlargest(20).to_frame()" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pd_desc 38.25\n", "dtype: float64" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.mean(arrests)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pd_desc 1031.2875\n", "dtype: float64" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.var(arrests)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "pd_desc 32.113665\n", "dtype: float64" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.std(arrests)" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pd_desc
ASSAULT 3142
LARCENY,PETIT FROM OPEN AREAS,UNCLASSIFIED89
TRAFFIC,UNCLASSIFIED MISDEMEAN74
MARIJUANA, POSSESSION 4 & 567
INTOXICATED DRIVING,ALCOHOL53
ASSAULT 2,1,UNCLASSIFIED47
ROBBERY,UNCLASSIFIED,OPEN AREAS37
THEFT OF SERVICES, UNCLASSIFIED32
LARCENY,GRAND FROM OPEN AREAS,UNCLASSIFIED29
CONTROLLED SUBSTANCE, POSSESSION 725
PUBLIC ADMINISTRATION,UNCLASSIFIED FELONY23
OBSTR BREATH/CIRCUL20
MENACING,UNCLASSIFIED19
RESISTING ARREST16
WEAPONS, POSSESSION, ETC16
FORGERY,ETC.,UNCLASSIFIED-FELONY16
CONTROLLED SUBSTANCE,INTENT TO SELL 316
WEAPONS POSSESSION 315
TRAFFIC,UNCLASSIFIED MISDEMEANOR15
WEAPONS POSSESSION 1 & 214
\n", "
" ], "text/plain": [ " pd_desc\n", "ASSAULT 3 142\n", "LARCENY,PETIT FROM OPEN AREAS,UNCLASSIFIED 89\n", "TRAFFIC,UNCLASSIFIED MISDEMEAN 74\n", "MARIJUANA, POSSESSION 4 & 5 67\n", "INTOXICATED DRIVING,ALCOHOL 53\n", "ASSAULT 2,1,UNCLASSIFIED 47\n", "ROBBERY,UNCLASSIFIED,OPEN AREAS 37\n", "THEFT OF SERVICES, UNCLASSIFIED 32\n", "LARCENY,GRAND FROM OPEN AREAS,UNCLASSIFIED 29\n", "CONTROLLED SUBSTANCE, POSSESSION 7 25\n", "PUBLIC ADMINISTRATION,UNCLASSIFIED FELONY 23\n", "OBSTR BREATH/CIRCUL 20\n", "MENACING,UNCLASSIFIED 19\n", "RESISTING ARREST 16\n", "WEAPONS, POSSESSION, ETC 16\n", "FORGERY,ETC.,UNCLASSIFIED-FELONY 16\n", "CONTROLLED SUBSTANCE,INTENT TO SELL 3 16\n", "WEAPONS POSSESSION 3 15\n", "TRAFFIC,UNCLASSIFIED MISDEMEANOR 15\n", "WEAPONS POSSESSION 1 & 2 14" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nyc_arrests['pd_desc'].value_counts().nlargest(20).to_frame()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Problem: Root Mean Squared Error\n", "\n", "Suppose we build a model to predict the price of an apartment using the square footage and bedrooms as follows:\n", "\n", "$$c(x,y) = 10x + 1.2y + 200$$\n", "\n", "where $x$ represents square footage and $y$ the number of bedrooms. The **Root Mean Squared Error** is defined by:\n", "\n", "$$RMSE = \\sqrt{\\frac{\\sum_{i = 1}^n (\\hat{y} - y_i)^2 }{n}}$$\n", "\n", "essentially the square root of summed squared errors between real and predicted cost." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use the formula to find the **RMSE** of our models predictions below." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
square footage bedrooms real_cost predicted_cost
0 400 1 4300.76 4201.2
1 500 1 5301.01 5201.2
2 600 3 6302.63 6203.6
3 700 1 7300.91 7201.2
4 800 1 8301.07 8201.2
5 900 3 9303.85 9203.6
6 1000 2 10302.35 10202.4
" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "9912.19360000008" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "(4300.76-4201.2)**2 + (5301.01 - 5201.2)**2" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [], "source": [ "arrests['predictions'] = arrests['pd_desc'] + np.random.randint(1, 20, 20)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "157.0" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.mean((arrests['pd_desc'] - arrests['predictions'])**2)" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "12.529964086141668" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.sqrt(157)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Problem: Area under a curve. \n", "\n", "Evaluate the definite integrals below and represent the solution visually as the area under the curve $f(x)$.\n", "\n", "a. $\\int_{1}^3 x^3 - \\frac{1}{x} dx$\n", "\n", "b. $\\int_{\\pi}^{4\\pi} 2 \\cos(x) dx$\n", "\n", "c. $\\int_4^9 1.04(5)^x dx$" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [], "source": [ "from scipy.integrate import quad" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": [ "def f(x): return x**3 - 1/x\n", "def g(x): return 2*np.cos(x)\n", "def h(x): return 1.04*5**x" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(18.90138771133189, 2.0984755834487654e-13)" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quad(f, 1, 3)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(-1.5809616649023117e-15, 1.306472773737261e-13)" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quad(g, np.pi, 4*np.pi)" ] }, { "cell_type": "code", "execution_count": 38, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(1261682.718116748, 1.4007492034248646e-08)" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "quad(h, 4, 9)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Problem: Area between curves\n", "\n", "Given the functions:\n", "\n", "$$f(x) = 1 + x + e^{x^2 - 2x} \\quad g(x) = x^4 - 6.5x^2 + 6x + 2$$\n", "\n", "define the regions R and S shown below.\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " \n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. Prove that the lines intersect at $x = 1$.\n", "\n", "2. Set up definite integrals to represent the areas $R$ and $S$\n", "\n", "3. Evaluate the integrals using technology." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Problem: Volumes and Revolution\n", "\n", "1. Find the volume of the solid generated by rotating the region bounded by $y = x$, $x = 0$, and $y = (x-1)^2 + 1$. Sketch an image of this region or try to use Python to visualize.\n", "\n", "2. Find the volume of the solid formed by rotating the region R from previous problem about the $x$-axis. Sketch an image of this region.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### Problem: Gini Index\n", "\n", "The World Bank provides access to data about world GINI Indicies [here](https://data.worldbank.org/indicator/SI.POV.GINI?end=2017&start=1985). Take a look around at a country of your choice. What does the GINI Index say about this country? \n", "\n", "The United States Census gathers and provides data related to Income and Poverty in the United States. Visit their site [here](https://www.census.gov/library/publications/2019/demo/p60-266.html), and explore the data available. Download one data table and discuss the information your found and what it says about income and poverty in the United States." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# def c(x, y): return 10*x + 1.2*y + 200\n", "\n", "# x = np.arange(400, 1100, 100)\n", "\n", "# y = np.random.randint(1, 4, 7)\n", "\n", "# zhat = np.round(c(x, y), 2)\n", "\n", "# z = zhat + np.round(np.random.normal(100, size = 7), 2)\n", "\n", "# df = pd.DataFrame({'square footage': x, 'bedrooms': y, 'real_cost': z, 'predicted_cost': zhat})\n", "# df.to_html()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# def f(x): return 1 + x + np.e**(x**2 - 2*x)\n", "# def g(x): return x**4 - 6.5*x**2 + 6*x + 2\n", "# x = np.linspace(0, 2, 1000)\n", "# plt.plot(x, f(x), label = '$f(x)$')\n", "# plt.plot(x, g(x), label = '$g(x)$')\n", "# plt.legend()\n", "# plt.fill_between(x, f(x), g(x), color = 'grey', alpha = 0.1)\n", "# plt.text(0.4, 2.5, 'R', fontsize = 30)\n", "# plt.text(1.4, 2.2, 'S', fontsize = 30)\n", "# plt.savefig('images/a4p4.png')" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.4" } }, "nbformat": 4, "nbformat_minor": 4 }