[24]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
Wealth Distribution¶
One way to understand equality is through the distribution of wealth in a society. Under a perfectly equal distribution, every participant holds the same share of wealth as everyone else. We can represent this situation mathematically with the function \(L(x) = x\), which we will call the Lorenz Curve. (Strictly speaking, this is the Lorenz curve of a perfectly equal society, often called the line of perfect equality.)
Concretely, the lowest 20% of households would hold 20% of the wealth, the lowest 40% would hold 40%, and so on.
| Fifths of Households | Percent of Wealth |
|---|---|
| Lowest fifth | 20 |
| Lowest two-fifths | 40 |
| Lowest three-fifths | 60 |
| Lowest four-fifths | 80 |
| Lowest five-fifths | 100 |
[25]:
percent = [0, 0.2, 0.4, 0.6, 0.8, 1.0]
lorenz = [0, 0.2, 0.4, 0.6, 0.8, 1.0]
[26]:
plt.plot(percent, lorenz, '-o')
plt.title("Perfect Wealth Distribution")
[26]:
Text(0.5, 1.0, 'Perfect Wealth Distribution')
A perfectly equal distribution of wealth is unlikely in any real society, however. For example, the following table gives the cumulative distribution of income in the United States for 1994.
| Fifths of Households | Cumulative Percent of Income |
|---|---|
| Lowest fifth | 4.2 |
| Lowest two-fifths | 14.2 |
| Lowest three-fifths | 29.9 |
| Lowest four-fifths | 53.2 |
| Lowest five-fifths | 100.0 |
[27]:
usa_94 = [0, 0.042, 0.142, 0.299, 0.532, 1.00]
[28]:
plt.figure(figsize = (9, 5))
plt.plot(percent, lorenz, '-o', label = 'Lorenz Curve')
plt.plot(percent, usa_94, '-o', label = 'USA 1994')
plt.title("The Difference between Perfect and Actual Wealth Equality")
plt.legend(loc = 'best', frameon = False)
[28]:
<matplotlib.legend.Legend at 0x1361b7710>
The area between these curves represents the discrepancy between a perfectly equal distribution and the observed one. Further, the ratio of this area to the area under the Lorenz Curve gives the Gini Index.
One big issue remains, however. We don’t want to use rectangles to approximate these regions, but we don’t have equations for the actual distribution of wealth. We introduce two curve fitting techniques using NumPy to address this problem.
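Written as a formula, the ratio described above takes the following form. Here \(f(x)\) is a symbol introduced only for this sketch (it does not appear elsewhere in the notebook) and stands for the actual cumulative share held by the lowest fraction \(x\) of households:

```latex
\[
G \;=\; \frac{\int_0^1 \bigl(x - f(x)\bigr)\,dx}{\int_0^1 x\,dx}
  \;=\; 2\int_0^1 \bigl(x - f(x)\bigr)\,dx
  \;=\; 1 - 2\int_0^1 f(x)\,dx
\]
```

The simplification uses the fact that the area under \(L(x) = x\) on \([0, 1]\) is \(1/2\). A value of \(G = 0\) corresponds to perfect equality, and values closer to 1 indicate greater inequality.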
Quadratic Fit¶
The curve in the figure above representing the actual distribution of wealth in the USA in 1994 can be approximated by a polynomial function. NumPy has a function called polyfit that will fit a polynomial to a set of points. Here, we use polyfit to fit a quadratic function to the points.
[29]:
fit = np.poly1d(np.polyfit(percent, usa_94, 2))
[30]:
plt.figure(figsize = (9, 5))
plt.plot(percent, usa_94, '-o', label = 'Actual USA 1994')
plt.plot(percent, fit(percent), '-o', label = 'Polyfit')
plt.title("Quality of Quadratic Fit")
plt.legend(loc = 'best', frameon = False)
[30]:
<matplotlib.legend.Legend at 0x134e26e90>
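The plot gives a visual sense of the fit; a rough numerical check is to look at the residuals at the six data points. The sketch below repeats the data and the fit from the cells above so that it runs on its own. The names `residuals` and `max_abs_residual` are introduced here for illustration only.

```python
import numpy as np

# Data copied from the cells above
percent = [0, 0.2, 0.4, 0.6, 0.8, 1.0]
usa_94 = [0, 0.042, 0.142, 0.299, 0.532, 1.00]

# Same quadratic least-squares fit as in the notebook
fit = np.poly1d(np.polyfit(percent, usa_94, 2))

# Residual = observed share minus fitted share at each data point
residuals = np.array(usa_94) - fit(np.array(percent))
max_abs_residual = float(np.max(np.abs(residuals)))

print("residuals:", np.round(residuals, 4))
print("largest absolute residual:", round(max_abs_residual, 4))

# A least-squares fit is not forced through the endpoints (0, 0) and (1, 1),
# so it is worth checking how far the fitted curve misses them.
print("fit(0) =", round(float(fit(0)), 4), " fit(1) =", round(float(fit(1)), 4))
```

If the endpoint misses matter for a given application, a constrained fit or a higher-degree polynomial might be worth trying; that is a design choice beyond what the notebook does here.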
Getting the Fit¶
Below, we return to the complete picture where we plot our fitted function and the Lorenz Curve and shade the area that represents the difference in income distribution.
[31]:
plt.figure(figsize = (9, 5))
plt.plot(percent, lorenz, '--o', label = 'Lorenz')
plt.plot(percent, fit(percent), '--o', label = 'Polyfit')
plt.fill_between(percent, lorenz, fit(percent), alpha = 0.3, color = '#bc42f5')
plt.title("Visualizing the Gini Index")
plt.legend(loc = 'best', frameon = False)
[31]:
<matplotlib.legend.Legend at 0x139c2a390>
Now, we want to compute the ratio of the area between the curves to the area under the Lorenz Curve. We can do this easily in SymPy by declaring \(x\) a symbol, substituting it into our fit function, and integrating the result.
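As one possible way to carry out this computation, the sketch below integrates the fitted polynomial with NumPy's `poly1d.integ` rather than SymPy. That swap is an assumption made here to avoid an extra dependency; a SymPy version would integrate the same quadratic symbolically. The helper `area` and the variable names are illustrative, not part of the original notebook.

```python
import numpy as np

# Data and fit repeated from the cells above so the sketch is self-contained
percent = [0, 0.2, 0.4, 0.6, 0.8, 1.0]
usa_94 = [0, 0.042, 0.142, 0.299, 0.532, 1.00]
fit = np.poly1d(np.polyfit(percent, usa_94, 2))

# The line L(x) = x as a polynomial: 1*x + 0
line = np.poly1d([1, 0])

def area(poly, a=0.0, b=1.0):
    """Definite integral of a poly1d from a to b via its antiderivative."""
    antiderivative = poly.integ()
    return antiderivative(b) - antiderivative(a)

area_under_line = area(line)          # area under L(x) = x on [0, 1]
area_between = area(line - fit)       # area between the line and the fit
gini = area_between / area_under_line

print("area under the line:", area_under_line)
print("area between the curves:", area_between)
print("Gini Index estimate:", gini)
```

Because the quadratic is only an approximation to six data points, the result should be read as an estimate for this fit rather than an official figure.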
Inequality through Time¶
Now that we understand how to compute the Gini Index, we want to explore what improving the gap in wealth distribution would mean.
[32]:
x = np.linspace(0, 1, 100)
plt.figure(figsize = (10, 7))
plt.plot(x, x, label = "Perfect Equality")
plt.plot(x, x**2, label = "Country A")
plt.plot(x, x**4, label = "Country B")
plt.plot(x, x**8, label = "Country C")
plt.plot(x, x**16, label = "Country D")
plt.ylabel("Income Percent")
plt.xlabel("Population Fraction")
plt.title("Different Wealth Distributions")
plt.legend(loc = "best", frameon = False)
[32]:
<matplotlib.legend.Legend at 0x1370a8a90>
Which of the above countries do you believe is the most equitable? Why?
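One way to check an answer to the question above is to compute the Gini Index of each curve exactly. For a curve \(x^n\), the area under it on \([0, 1]\) is \(1/(n+1)\), so the index is \(1 - 2/(n+1) = (n-1)/(n+1)\). The sketch below evaluates this with exact fractions; the function name `gini_power_curve` is introduced here for illustration.

```python
from fractions import Fraction

def gini_power_curve(n):
    """Gini Index when the cumulative share is x**n on [0, 1]."""
    area_under_curve = Fraction(1, n + 1)   # integral of x**n from 0 to 1
    return 1 - 2 * area_under_curve         # equals (n - 1) / (n + 1)

# Exponents taken from the plotting cell above
countries = {"Country A": 2, "Country B": 4, "Country C": 8, "Country D": 16}
gini_by_country = {name: gini_power_curve(n) for name, n in countries.items()}

for name, g in gini_by_country.items():
    print(name, g, round(float(g), 3))
```

A smaller index means the curve sits closer to the line of perfect equality.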
Census Bureau Data and Pandas¶
Many organizations use the Gini Index to this day. The OECD, the World Bank, and the US Census Bureau all track Gini Indices. We want to investigate the real data much as we have with our smaller examples. To do so, we will use the Pandas library.
The table below gives distribution data for the years 1970, 1980, 1990, and 2000.
| x | 0.0 | 0.2 | 0.4 | 0.6 | 0.8 | 1.0 |
|---|---|---|---|---|---|---|
| 1970 | 0.000 | 0.041 | 0.149 | 0.323 | 0.568 | 1.000 |
| 1980 | 0.000 | 0.042 | 0.144 | 0.312 | 0.559 | 1.000 |
| 1990 | 0.000 | 0.038 | 0.134 | 0.293 | 0.530 | 1.000 |
| 2000 | 0.000 | 0.036 | 0.125 | 0.273 | 0.503 | 1.000 |
Creating the DataFrame¶
We will begin by creating a table from this data by entering lists with these values and creating a DataFrame from these lists.
[33]:
import pandas as pd
seventies = [0, 0.041, 0.149, 0.323, 0.568, 1.0]
eighties = [0, 0.042, 0.144, 0.312, 0.559, 1.0]
nineties = [0, 0.038, 0.134, 0.293, 0.53, 1.0]
twothou = [0, 0.036, 0.125, 0.273, 0.503, 1.0]
[34]:
df = pd.DataFrame({'1970s': seventies, '1980s': eighties, '1990s': nineties,
                   '2000s': twothou, 'perfect': [0, 0.2, 0.4, 0.6, 0.8, 1.0]})
df.head()
[34]:
| | 1970s | 1980s | 1990s | 2000s | perfect |
|---|---|---|---|---|---|
| 0 | 0.000 | 0.000 | 0.000 | 0.000 | 0.0 |
| 1 | 0.041 | 0.042 | 0.038 | 0.036 | 0.2 |
| 2 | 0.149 | 0.144 | 0.134 | 0.125 | 0.4 |
| 3 | 0.323 | 0.312 | 0.293 | 0.273 | 0.6 |
| 4 | 0.568 | 0.559 | 0.530 | 0.503 | 0.8 |
Plotting from the DataFrame¶
We can plot directly from the DataFrame. The first plot below draws one line for each decade's cumulative distribution, along with the line of perfect equality. Many other plot types are available, however, and we can select them with the kind argument of the DataFrame's plot method, as demonstrated by the density plot that follows. What do these visualizations tell you about equality in the USA based on this data?
[35]:
plt.figure(figsize = (14, 8))
plt.plot(df['1970s'], color = '#bc42f5', label = '1970s')
plt.plot(df['1980s'], color = '#4287f5', label = '1980s')
plt.plot(df['1990s'], color = '#f55142', label = '1990s')
plt.plot(df['2000s'], color = '#234f11', label = '2000s')
plt.plot(df['perfect'], color = 'black', label = 'Perfect')
plt.legend()
plt.grid()
[36]:
df.plot(kind = 'kde', figsize = (15, 7))
plt.grid()
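To put a number on what the plots suggest, the quadratic-fit approach from earlier can be applied to each year in the table. The sketch below repeats the table values in a plain dictionary so that it runs without pandas; the same helper could be fed a DataFrame column converted with `to_numpy()`. The helper name `gini_from_quadratic_fit` and the dictionary `shares` are introduced here for illustration.

```python
import numpy as np

# Cumulative population fractions and shares, copied from the table above
x = np.array([0, 0.2, 0.4, 0.6, 0.8, 1.0])
shares = {
    "1970": [0, 0.041, 0.149, 0.323, 0.568, 1.0],
    "1980": [0, 0.042, 0.144, 0.312, 0.559, 1.0],
    "1990": [0, 0.038, 0.134, 0.293, 0.530, 1.0],
    "2000": [0, 0.036, 0.125, 0.273, 0.503, 1.0],
}

def gini_from_quadratic_fit(x, y):
    """Estimate the Gini Index as 1 - 2 * (area under a quadratic fit)."""
    fit = np.poly1d(np.polyfit(x, y, 2))
    antiderivative = fit.integ()
    area_under_fit = antiderivative(1.0) - antiderivative(0.0)
    return 1.0 - 2.0 * area_under_fit

gini_by_year = {year: gini_from_quadratic_fit(x, y) for year, y in shares.items()}

for year, g in gini_by_year.items():
    print(year, round(float(g), 3))
```

For this table the estimates rise from 1970 to 2000, which matches the way the later curves sag further below the line of perfect equality. These are estimates from a quadratic fit to six points, not official Census Bureau figures.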
Volumes of Revolution¶
Find the volume of the solid formed by rotating the region under the curve \(y = x^2\) around the \(x\)-axis from \(x = 0\) to \(x = 1\).
[37]:
x = np.linspace(0, 1, 1000)
def f(x): return x**2
plt.plot(x, f(x), color = 'black')
plt.fill_between(x, f(x), color = 'orange', alpha = 0.5)
[37]:
<matplotlib.collections.PolyCollection at 0x13163bc50>
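Rotating the shaded region about the \(x\)-axis sweeps out circular disks of radius \(f(x)\), so the disk method gives \(V = \pi \int_0^1 (x^2)^2\,dx = \pi/5\). The sketch below checks that value numerically with a midpoint Riemann sum; the function `disk_volume` and the choice of 10,000 slices are assumptions made for illustration.

```python
import math

def f(x):
    return x**2

def disk_volume(f, a, b, n=10_000):
    """Midpoint-rule approximation of pi * integral of f(x)**2 from a to b."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        midpoint = a + (i + 0.5) * dx
        # Each thin disk has radius f(midpoint) and thickness dx
        total += math.pi * f(midpoint) ** 2 * dx
    return total

approx = disk_volume(f, 0.0, 1.0)
exact = math.pi / 5

print("midpoint approximation:", approx)
print("exact value pi/5:      ", exact)
```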
[38]:
import mpl_toolkits.mplot3d.axes3d as axes3d
[39]:
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, projection = '3d')
x = np.linspace(-1, 2, 70)
v = np.linspace(0, np.pi, 70)
U, V = np.meshgrid(x, v)
Y1 = (U**2 + 1)*np.cos(V)
Z1 = (U**2 + 1)*np.sin(V)
X = U
ax.plot_surface(X, Y1, Z1)
ax.set_xlim(-3,3)
ax.set_ylim(-5,5)
ax.set_zlim(-5,5)
[39]:
(-5, 5)
[40]:
import gif
@gif.frame
def plot_volume(angle):
    fig = plt.figure(figsize = (20, 15))
    ax2 = fig.add_subplot(1, 1, 1, projection = '3d')
    angles = np.linspace(0, 360, 20)
    x = np.linspace(-1, 2, 60)
    v = np.linspace(0, 2*angle, 60)
    U, V = np.meshgrid(x, v)
    Y1 = (U**2 + 1)*np.cos(V)
    Z1 = (U**2 + 1)*np.sin(V)
    Y2 = (U + 3)*np.cos(V)
    Z2 = (U + 3)*np.sin(V)
    X = U
    ax2.plot_surface(X, Y1, Z1, alpha = 0.2, color = 'blue', rstride = 6, cstride = 6)
    ax2.plot_surface(X, Y2, Z2, alpha = 0.2, color = 'red', rstride = 6, cstride = 6)
    ax2.set_xlim(-3,3)
    ax2.set_ylim(-5,5)
    ax2.set_zlim(-5,5)
    ax2.view_init(elev = 50, azim = 30*angle)
    ax2.plot_wireframe(X, Y2, Z2)
    ax2.plot_wireframe(X, Y1, Z1, color = 'black')
    ax2._axis3don = False

frames = []
for i in np.linspace(0, 2*np.pi, 20):
    frame = plot_volume(i)
    frames.append(frame)
gif.save(frames, 'images/vol1.gif', duration = 500)
from IPython.display import Image
Image('images/vol1.gif')
[40]:
<IPython.core.display.Image object>
[41]:
def three_d_plotter(angle, rotate, turn):
    fig = plt.figure(figsize = (13, 6))
    ax = fig.add_subplot(1, 1, 1, projection='3d')
    u = np.linspace(-1, 2, 60)
    v = np.linspace(0, angle, 60)
    U, V = np.meshgrid(u, v)
    X = U
    Y1 = (U**2 + 1)*np.cos(V)
    Z1 = (U**2 + 1)*np.sin(V)
    Y2 = (U + 3)*np.cos(V)
    Z2 = (U + 3)*np.sin(V)
    ax.plot_surface(X, Y1, Z1, alpha=0.3, color='red', rstride=6, cstride=12)
    ax.plot_surface(X, Y2, Z2, alpha=0.3, color='blue', rstride=6, cstride=12)
    ax.plot_wireframe(X, Y2, Z2, alpha=0.3, color='blue', rstride=6, cstride=12)
    ax._axis3don = False
    ax.view_init(elev = rotate, azim = turn)
    plt.show()
[42]:
from ipywidgets import interact
import ipywidgets as widgets
[43]:
interact(three_d_plotter, angle = widgets.FloatSlider(0, min = 0, max = 2*np.pi, step = np.pi/10),
         rotate = widgets.FloatSlider(0, min = 0, max = 360, step = 5),
         turn = widgets.FloatSlider(0, min = 0, max = 500, step = 5))
[43]:
<function __main__.three_d_plotter(angle, rotate, turn)>
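The surfaces drawn in the cells above appear, judging from the plotting code, to come from revolving \(y = x^2 + 1\) and \(y = x + 3\) about the \(x\)-axis for \(-1 \le x \le 2\), where the two curves intersect. Under that reading, the washer method gives the volume between them as \(V = \pi \int_{-1}^{2} \bigl((x + 3)^2 - (x^2 + 1)^2\bigr)\,dx\). The sketch below evaluates the integral exactly; the helper `integrate_polynomial` is hypothetical and introduced only for this example.

```python
from fractions import Fraction
import math

def integrate_polynomial(coeffs, a, b):
    """Exact integral of sum(coeffs[k] * x**k) from a to b."""
    def antiderivative(x):
        return sum(Fraction(c, k + 1) * Fraction(x) ** (k + 1)
                   for k, c in enumerate(coeffs))
    return antiderivative(b) - antiderivative(a)

# (x + 3)**2 - (x**2 + 1)**2 expands to 8 + 6x - x**2 - x**4,
# listed here from the constant term upward
integrand = [8, 6, -1, 0, -1]

exact_over_pi = integrate_polynomial(integrand, -1, 2)   # volume divided by pi
volume = math.pi * float(exact_over_pi)

print("V / pi =", exact_over_pi)
print("V      =", volume)
```

The outer radius is \(x + 3\) and the inner radius is \(x^2 + 1\) because the line lies above the parabola everywhere between the intersection points.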
Practice¶
- Riemann
- Volume