Buy vs. Rent, A Financial Modeling Workflow in Python
Using numpy-financial and monte-carlo simulation to evaluate investments.
Peter Amerkhanian
August 6, 2022
This post goes through the following exercises:
- Using numpy-financial to build a loan amortization calculator for a home mortgage
- Using monte-carlo simulation to compare the long-run wealth outcomes of buying versus renting
At one point in time, numpy, the popular Python numerical analysis library, included 10 specialized functions for financial analysis. Given their specific nature, they were eventually removed from numpy, I think in 2019 (learn about why that is here), and are now available in the separate library, numpy-financial. The library still seems focused on the same 10 core functions, which handle tasks like calculating loan payment amounts given some inputs, and applied financial economics tasks like calculating the time value of money. Cool… Anyways, I’ll use it to create an amortization schedule for a mortgage.
I built this notebook in a Google Colab instance, which seems to include most major Python libraries (more info).
You’ll probably have to download numpy-financial (it’s not included in Anaconda as far as I know), which you can accomplish within any notebook-like environment using the following command:
! pip install numpy-financial
You’ll want to load the usual suspects - pandas, numpy, seaborn, matplotlib. I also run from datetime import datetime since we will be working with ranges of dates, and I run sns.set_style() to get my seaborn plots looking a bit more aesthetically pleasing - read more on themes here.
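The import cell itself isn't shown in this excerpt; a minimal sketch consistent with the calls used throughout the post (I also pull in timedelta, which a few plotting cells below rely on, and alias numpy-financial as npf, since the later code assumes that name):
import pandas as pd
import numpy as np
import numpy_financial as npf
import seaborn as sns
import matplotlib.pyplot as plt
from datetime import datetime, timedelta

sns.set_style()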
I’ll run this as a comparison between buying an apartment that costs $700,000 with a 20% down payment, versus renting a comparable place for $2,600 a month. This is meant to approximate buying versus renting a two-bed, one-bath apartment.
Buying fees are defined at 4% of the purchase price, and the homeowners association (HOA) fees are defined as $700 monthly.
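The cell defining these constants isn't shown; a sketch of what it plausibly looks like (the 6.5% mortgage rate is my assumption, backed out from the interest figures in the amortization table further down - it is not stated in the post):
cost = 700_000                    # purchase price
down_payment = 0.2 * cost         # 20% down payment
principal = cost - down_payment   # $560,000 financed
interest_rate = 0.065             # assumed annual mortgage rate
buying_fees = 0.04 * cost         # 4% buying fees
hoa = 700                         # monthly HOA fee
rent = 2_600                      # monthly rent for the alternative apartment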
npf.pmt() can be used to generate a monthly mortgage payment given those buying constants. Alternatively, we can use npf.ppmt() to see how much of a given payment goes towards the principal, and npf.ipmt() to see how much goes towards interest (see below for applications of those functions).
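As a quick illustration of those three functions (using the constants sketched above):
# Fixed monthly payment on a 30-year loan, and its principal/interest split in month 1
monthly_payment = npf.pmt(interest_rate/12, 12*30, principal)
first_principal = npf.ppmt(interest_rate/12, 1, 12*30, principal)
first_interest = npf.ipmt(interest_rate/12, 1, 12*30, principal)
print(monthly_payment, first_principal + first_interest)  # the two totals should match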
I’ll make the simplifying assumption that both “annual home appreciation” and “annual stock appreciation” are generated from normal distributions. This is a fairly strong assumption, but one that seems to be routinely made, at least with regard to stock market returns, even if there might be better distribution choices out there (more here).
Here’s a look at how we’ll draw from a normal distribution. Given an average annual return, \(\mu = 0.0572\) (\(\mu\), or, mu, is a common variable name for average) and a standard deviation \(\sigma = 0.1042\) (\(\sigma\), or, sigma, is the common variable name for standard deviation), we can draw one sample from a normal distribution as follows:
import yfinance as yf

# Download daily S&P 500 index data and drop missing values
sp500 = yf.download("^GSPC", start="1950-01-01")["Adj Close"].dropna()
from statsmodels.tsa.seasonal import STL
# -- STL DECOMPOSITION --
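# `log_prices` is assumed to be the log of the SF housing index series (SFXRSA) that is
# downloaded with fredapi further down in this post, e.g. log_prices = np.log(sf_home)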
stl = STL(log_prices, period=36)
res = stl.fit()
trend = res.trend
resid = res.resid
# === 1. Plot log prices with trend ===
plt.figure()
plt.plot(log_prices, label='Log(Index)', alpha=0.7)
plt.plot(trend, label='STL Trend', linewidth=2)
plt.title('Log-transformed SF Housing Index with STL Trend')
plt.legend()
plt.tight_layout()
plt.show()
# === 2. Residual diagnostics ===
plt.figure()
plt.plot(resid, label='STL Residuals', color='orange')
plt.axhline(0, linestyle='--', color='gray')
plt.title('STL Residuals (Detrended Log Prices)')
plt.tight_layout()
plt.show()
# === 3. Fit linear model to trend (last N years) ===
from sklearn.linear_model import LinearRegression

# N_years and future_months are assumed to be defined in a cell not shown here (e.g. N_years = 10)
trend_recent = trend.last(f'{N_years}Y')
X = np.arange(len(trend_recent)).reshape(-1, 1)
y = trend_recent.values
model = LinearRegression().fit(X, y)
X_future = np.arange(len(X), len(X) + future_months).reshape(-1, 1)
trend_forecast = model.predict(X_future)
C:\Users\peteramerkhanian\AppData\Local\Temp\ipykernel_66512\1996583068.py:2: FutureWarning: last is deprecated and will be removed in a future version. Please create a mask and filter using `.loc` instead
trend_recent = trend.last(f'{N_years}Y')
C:\Users\peteramerkhanian\AppData\Local\Temp\ipykernel_66512\1996583068.py:2: FutureWarning: 'Y' is deprecated and will be removed in a future version, please use 'YE' instead.
trend_recent = trend.last(f'{N_years}Y')
# Plot extrapolated trend
plt.figure()
plt.plot(trend, label='Observed Trend')
plt.plot(
    pd.date_range(trend.index[-1] + timedelta(days=30), periods=future_months, freq='MS'),
    trend_forecast,
    color='red', label='Forecasted Trend'
)
plt.title('STL Trend + Linear Extrapolation')
plt.legend()
plt.tight_layout()
plt.show()
# === 4. Show example of block bootstrapped residuals ===
def block_bootstrap(residuals, n_periods, block_size=4):
    blocks = []
    n_blocks = int(np.ceil(n_periods / block_size))
    for _ in range(n_blocks):
        start = np.random.randint(0, len(residuals) - block_size)
        blocks.extend(residuals[start:start + block_size])
    return np.array(blocks[:n_periods])
block_size = 4  # assumed value; block_size is not defined in the cells shown here
sample_boot = block_bootstrap(resid.dropna(), future_months, block_size)
plt.figure()
plt.plot(sample_boot, color='purple')
plt.axhline(0, linestyle='--', color='gray')
plt.title('Example: Block Bootstrapped Residuals (Simulated Noise)')
plt.tight_layout()
plt.show()
# Plot extrapolated trend plus the bootstrapped noise
plt.figure()
plt.plot(trend, label='Observed Trend')
plt.plot(
    pd.date_range(trend.index[-1] + timedelta(days=30), periods=future_months, freq='MS'),
    (trend_forecast + sample_boot),
    color='red', label='Forecasted Trend + Noise'
)
plt.title('STL Trend + Linear Extrapolation + Bootstrapped Residuals')
plt.legend()
plt.tight_layout()
plt.show()
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.regime_switching.markov_regression import MarkovRegression
# Step 1: Convert log prices to log returns
returns = np.diff(log_prices)  # first differences of the log series
# Step 2: Fit Markov Switching model (e.g., 2 regimes, AR(1) in each)
model = MarkovRegression(returns, k_regimes=2, trend='c', switching_variance=True)
result = model.fit()
# Step 3: Summary and regime probabilities
print(result.summary())
# Smoothed regime probabilities
smoothed_probs = result.smoothed_marginal_probabilities
plt.plot(smoothed_probs)
plt.show()
                        Markov Switching Model Results
==============================================================================
Dep. Variable:                      y   No. Observations:                  458
Model:               MarkovRegression   Log Likelihood                1485.238
Date:                Sat, 31 May 2025   AIC                          -2958.477
Time:                        12:27:11   BIC                          -2933.715
Sample:                             0   HQIC                         -2948.724
                                - 458
Covariance Type:               approx
                             Regime 0 parameters
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0009      0.001      1.472      0.141      -0.000       0.002
sigma2      2.722e-05   3.31e-06      8.213      0.000    2.07e-05    3.37e-05
                             Regime 1 parameters
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0082      0.001      7.744      0.000       0.006       0.010
sigma2         0.0002   2.52e-05      8.680      0.000       0.000       0.000
                         Regime transition parameters
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
p[0->0]        0.9607      0.014     68.720      0.000       0.933       0.988
p[1->0]        0.0410      0.016      2.640      0.008       0.011       0.071
==============================================================================

Warnings:
[1] Covariance matrix calculated using numerical (complex-step) differentiation.
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error
# 1. Define split point
split_date = '2013-01-01'
log_prices_train = log_prices[:split_date]
log_prices_test = log_prices[split_date:]
n_test = len(log_prices_test)
# 2. STL on training data
stl_train = STL(log_prices_train, period=12).fit()
trend_train = stl_train.trend
resid_train = stl_train.resid.dropna()
# Extrapolate trend linearly from last N years of training trend
trend_recent = trend_train.last('10Y')
X = np.arange(len(trend_recent)).reshape(-1, 1)
y = trend_recent.values
model = LinearRegression().fit(X, y)
X_future = np.arange(len(X), len(X) + n_test).reshape(-1, 1)
trend_forecast = model.predict(X_future)
# Bootstrap residuals and add to trend forecast
sim_resid_stl = block_bootstrap(resid_train, n_test, block_size)
sim_log_stl = trend_forecast + sim_resid_stl
sim_prices_stl = np.exp(sim_log_stl)
# 3. Fit normal distribution to training log returns
log_returns_train = log_prices_train.diff().dropna()
mu, sigma = log_returns_train.mean(), log_returns_train.std()
# Simulate log returns and add to final value of training log prices
sim_log_returns_normal = np.random.normal(mu, sigma, size=n_test)
sim_log_normal = [log_prices_train.iloc[-1]]
for r in sim_log_returns_normal:
    sim_log_normal.append(sim_log_normal[-1] + r)
sim_log_normal = sim_log_normal[1:]
sim_prices_normal = np.exp(sim_log_normal)
# 4. True test log prices
true_log_prices = log_prices_test[:n_test]
# 5. Evaluation metrics
def evaluate(pred, truth, label):
    r2 = r2_score(truth, pred)
    rmse = mean_squared_error(truth, pred, squared=False)
    mae = mean_absolute_error(truth, pred)
    print(f"📊 {label} Forecast Performance")
    print(f"R²: {r2:.4f}")
    print(f"RMSE: {rmse:.4f}")
    print(f"MAE: {mae:.4f}\n")
evaluate(sim_log_stl, true_log_prices, "STL + Bootstrap")
evaluate(sim_log_normal, true_log_prices, "Normal Sim")
# 6. Plot for visual comparison
plt.figure(figsize=(12, 5))
plt.plot(log_prices, label='True Log Prices')
plt.plot(log_prices_test.index, sim_log_stl, label='STL + Bootstrap Forecast', color='red')
plt.plot(log_prices_test.index, sim_log_normal, label='Normal Sim Forecast', color='green')
plt.title('Comparison of Forecast Methods (Log Prices)')
plt.legend()
plt.tight_layout()
plt.show()
C:\Users\peteramerkhanian\AppData\Local\Temp\ipykernel_66512\2205870326.py:15: FutureWarning: last is deprecated and will be removed in a future version. Please create a mask and filter using `.loc` instead
trend_recent = trend_train.last('10Y')
C:\Users\peteramerkhanian\AppData\Local\Temp\ipykernel_66512\2205870326.py:15: FutureWarning: 'Y' is deprecated and will be removed in a future version, please use 'YE' instead.
trend_recent = trend_train.last('10Y')
📊 STL + Bootstrap Forecast Performance
R²: -22.5005
RMSE: 1.1821
MAE: 1.0954
📊 Normal Sim Forecast Performance
R²: -1.2241
RMSE: 0.3636
MAE: 0.3324
from statsmodels.tsa.regime_switching.markov_regression import MarkovRegression
# Markov switching model with 2 regimes: mean & variance change
model = MarkovRegression(log_prices_train, k_regimes=2, trend='c', switching_variance=False)
res = model.fit()
# Forecast future values (in-sample first)
fitted = res.fittedvalues
# Plot in-sample fit
plt.figure(figsize=(12, 4))
plt.plot(log_prices_train, label='True Log Prices')
plt.plot(fitted, label='Fitted (Markov)', linestyle='--')
plt.title("Markov Switching Model Fit")
plt.legend()
plt.show()
# Show smoothed regime probabilities
res.smoothed_marginal_probabilities[0].plot(title="Probability of Regime 0 (e.g. Boom)")
c:\Users\peteramerkhanian\anaconda3\lib\site-packages\statsmodels\tsa\base\tsa_model.py:471: ValueWarning: No frequency information was provided, so inferred frequency MS will be used.
self._init_dates(dates, freq)
import pandas as pd
import numpy as np
import yfinance as yf
from fredapi import Fred
import os
# Initialize FRED API (you need to get your own API key from https://fred.stlouisfed.org/)
fred = Fred(api_key=os.environ.get('fred_api'))
# Download S&P 500 data
sp500 = yf.download('^GSPC', start='2000-01-01', interval='1mo')['Adj Close']
sp500 = sp500.dropna()
# Download San Francisco Home Price Index data
sf_home = fred.get_series('SFXRSA')
sf_home = sf_home.dropna()
# Align dates
data = pd.DataFrame({'SP500': sp500, 'SF_Home': sf_home})
data = data.dropna()
# Calculate monthly returns
returns = data.pct_change().dropna()
# Set a random seed for stability of results
np.random.seed(10)
mean = .0572
standard_deviation = .1042
samples = 1
# Draw the sample
np.random.normal(mean, standard_deviation, samples)
array([0.19595131])
We now simulate market returns for every month by supplying mean and standard deviation values for both home and stock market appreciation and drawing 360 samples (360 months in 30 years). For simplicity, we’ll just use world-wide aggregate values from “The Rate of Return on Everything, 1870-2015”.
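The cell that defines those aggregate parameters isn't shown; a sketch of what it might look like (the standard deviations and home mean match the covariance formula and the example draw above, while the stock mean is my guess, roughly consistent with the simulated sample means printed later):
# Annual return assumptions (world-wide aggregates; the exact stock mean is an assumption)
mu_stock, sigma_stock = 0.107, 0.2267
mu_home, sigma_home = 0.0572, 0.1042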
Given that stock and home appreciation are probably correlated, I’ll have to sample from a bivariate normal distribution using numpy.random.Generator.multivariate_normal - documentation here - rather than the univariate distribution draw shown above. I am going to assume a correlation coefficient, \(\rho_{stock,home}\), of 0.5 - a fairly strong correlation.
In order to use that numpy function, I’ll need to translate my correlation statistic into a covariance statistic, and I’ll use the following formula (source):
\[
\begin{align*}
cov_{stock,home} &= \rho_{stock,home} \times \sigma_{stock} \times \sigma_{home} \\
cov_{stock,home} &= 0.5 \times 0.2267 \times 0.1042
\end{align*}
\]
I calculate covariance and confirm that the covariance and correlations match up below:
cov = 0.5 * sigma_stock * sigma_home
print("Covariance:", cov)
print("Back to correlation:", cov / (sigma_stock * sigma_home))
Covariance: 0.01181107
Back to correlation: 0.5
Now that I have the covariance, I’ll be able to sample from a bivariate normal distribution of the form shown below (source).
\[
\begin{pmatrix} Stock \\ Home \end{pmatrix} \sim \mathcal{N} \left[ \begin{pmatrix} \mu_{s} \\ \mu_{h} \end{pmatrix}, \begin{pmatrix} \sigma_{s}^2 & cov_{s,h} \\ cov_{s,h} & \sigma_{h}^2 \end{pmatrix} \right]
\]
Note, \(s\) is shorthand for stock and \(h\) is shorthand for home.
Now I’ll code that in Python and confirm that the means and standard deviations of our samples match what we expect:
cov_matrix = np.array([[sigma_stock**2, cov],
                       [cov, sigma_home**2]])
returns_df = pd.DataFrame(np.random
                          .default_rng(30)
                          .multivariate_normal([mu_stock, mu_home],
                                               cov_matrix,
                                               360),
                          columns=["Stock_Appreciation", "Home_Appreciation"])
print("Means:", returns_df.mean(axis=0).values)
print("Std. Devs:", returns_df.std(axis=0).values)
returns_df = (returns_df / 12)
Means: [0.10764063 0.05970695]
Std. Devs: [0.22544095 0.10543034]
Plotting the simulated values, we can see that stock market returns are typically higher than home value appreciation.
returns_df.add(1).cumprod().plot(figsize=(7,4))
plt.xlabel("Months")
plt.ylabel("Money Multiplier")
plt.title("Simulated Home/Stock Returns")
sns.despine();
Now we can define two spreadsheet-like dataframes:
- one that shows a mortgage amortization schedule if you were to buy the $700,000 home, along with the home’s appreciation over time.
- one that shows a table of rent payments and the stock market growth of what would have been your down payment (you can invest the down payment since you didn’t end up purchasing a house).
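One note: the home_performance and stock_performance series used in the next two code blocks aren't defined in the visible cells. Presumably they are cumulative growth multipliers built from the simulated monthly returns, something like this reconstruction:
home_performance = returns_df["Home_Appreciation"].add(1).cumprod()
stock_performance = returns_df["Stock_Appreciation"].add(1).cumprod()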
# Buying Table
df_own = pd.DataFrame()
df_own["Period"] = pd.Series(range(12*30)) + 1
df_own["Date"] = pd.date_range(start=datetime.today(),
                               periods=12*30,
                               freq='MS',
                               name="Date")
df_own["Principal_Paid"] = npf.ppmt(interest_rate/12,
                                    df_own["Period"],
                                    12*30,
                                    principal)
df_own["Interest_Paid"] = npf.ipmt(interest_rate/12,
                                   df_own["Period"],
                                   12*30,
                                   principal)
df_own["HOA_Paid"] = hoa
df_own["HOA_Paid"] = df_own["HOA_Paid"].cumsum()
df_own["Balance_Remaining"] = principal + df_own["Principal_Paid"].cumsum()
df_own["Home_Value"] = round(cost * home_performance, 2)
df_own["PropTax_Paid"] = (df_own["Period"]
                          .apply(lambda x:
                                 (cost * 1.02**((x-1)/12) * 0.01)
                                 if (x-1) in list(range(0, 12*1000, 12))
                                 else 0)
                          .cumsum())
df_own["Sale_Fees"] = df_own["Home_Value"] * .07
df_own["Own_Profit"] = (df_own["Home_Value"] -
                        df_own["HOA_Paid"] -
                        df_own["Balance_Remaining"] -
                        (buying_fees + df_own["Sale_Fees"]) -
                        df_own["PropTax_Paid"])
df_own = round(df_own, 2)
Note the PropTax_Paid portion of that code, which is a bit of a monster.
What is happening here is a calculation of property assessed value and property tax according to California’s property assessment/tax regime (more here). We’ll look at this in two pieces, first, assessed value. In California, once you purchase a property, its assessed value is set at the purchase price, then increases annually by the lower of 2% or the rate of inflation according to the California Consumer Price Index (CCPI). You could write out an equation for this as follows:
\[ \begin{align*} AnnualFactor_y = \begin{cases} 1 + CCPI_y, & \text{if } CCPI_y < 0.02 \\ 1.02, & \text{otherwise} \end{cases} \end{align*} \]
\(AnnualFactor\) is the amount that the assessed value of a home will appreciate (expressed as a multiplier) in a given year, \(y\). We define \(y^*\) as the year of initial purchase and \(PurchasePrice\) as the amount that the home was purchased for. Given that, \(AssessedValue\) is defined as follows:
\[ \begin{align*} AssessedValue_y = \begin{cases} PurchasePrice, & \text{if } y = y^* \\ AssessedValue_{y-1} \times AnnualFactor_y, & \text{otherwise } \end{cases} \end{align*} \]
In our code, we will simplify this calculation by excluding the CCPI and just always using 1.02 as our annual factor. Therefore, we get:
\[ AssessedValue_y = PurchasePrice \times 1.02^y \]
and once we factor in taxes (1%), we get:
\[ PropertyTax_y = 0.01(PurchasePrice \times 1.02^y) \]
and finally we look at the cumulative total property tax you’ve paid through a given year \(y\) (counting years since purchase), which is df_own["PropTax_Paid"]:
\[ PropertyTaxPaid_y = \sum_{i=0}^{y} 0.01\,(PurchasePrice \times 1.02^{i}) \]
There are some elements added to the code to convert between years and months, but that equation is the gist of it.
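As a quick sanity check on that formula, here is the property tax charged at the start of each of the first three years of ownership for the $700,000 purchase (the first figure matches the PropTax_Paid column in the table below):
cost = 700_000
for year in range(3):
    assessed = cost * 1.02**year       # assessed value grows 2% per year
    print(year + 1, round(0.01 * assessed, 2))
# 1 7000.0
# 2 7140.0
# 3 7282.8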
We end up with the following table for property ownership:
|     | Period | Date | Principal_Paid | Interest_Paid | HOA_Paid | Balance_Remaining | Home_Value | PropTax_Paid | Sale_Fees | Own_Profit |
|-----|--------|------|----------------|---------------|----------|-------------------|------------|--------------|-----------|------------|
| 0   | 1   | 2025-06-01 08:48:43.587964 | -506.25  | -3033.33 | 700    | 559493.75 | 701405.73  | 7000.000000   | 49098.40  | 62713.58   |
| 1   | 2   | 2025-07-01 08:48:43.587964 | -508.99  | -3030.59 | 1400   | 558984.76 | 707155.41  | 7000.000000   | 49500.88  | 67869.77   |
| 2   | 3   | 2025-08-01 08:48:43.587964 | -511.75  | -3027.83 | 2100   | 558473.02 | 723324.03  | 7000.000000   | 50632.68  | 82718.33   |
| 3   | 4   | 2025-09-01 08:48:43.587964 | -514.52  | -3025.06 | 2800   | 557958.50 | 737245.96  | 7000.000000   | 51607.22  | 95480.25   |
| 4   | 5   | 2025-10-01 08:48:43.587964 | -517.31  | -3022.28 | 3500   | 557441.19 | 745606.08  | 7000.000000   | 52192.43  | 103072.46  |
| ... | ... | ...                        | ...      | ...      | ...    | ...       | ...        | ...           | ...       | ...        |
| 355 | 356 | 2055-01-01 08:48:43.587964 | -3445.26 | -94.33   | 249200 | 13968.65  | 4026844.31 | 283976.554436 | 281879.10 | 3175420.00 |
| 356 | 357 | 2055-02-01 08:48:43.587964 | -3463.92 | -75.66   | 249900 | 10504.74  | 4044808.10 | 283976.554436 | 283136.57 | 3194890.24 |
| 357 | 358 | 2055-03-01 08:48:43.587964 | -3482.68 | -56.90   | 250600 | 7022.06   | 4066823.89 | 283976.554436 | 284677.67 | 3218147.61 |
| 358 | 359 | 2055-04-01 08:48:43.587964 | -3501.54 | -38.04   | 251300 | 3520.51   | 4096809.97 | 283976.554436 | 286776.70 | 3248836.21 |
| 359 | 360 | 2055-05-01 08:48:43.587964 | -3520.51 | -19.07   | 252000 | -0.00     | 4122136.18 | 283976.554436 | 288549.53 | 3275210.09 |

360 rows × 10 columns
This one is a bit simpler, only examining the total rent you’ve paid in a given month and the simulated stock returns at that point.
# Rental Table
df_rent = pd.DataFrame()
df_rent["Period"] = pd.Series(range(12*30)) + 1
df_rent["Date"] = pd.date_range(start=datetime.today(),
                                periods=12*30,
                                freq='MS',
                                name="Date")
df_rent["DownPayment_Invested"] = stock_performance * down_payment
df_rent["Rent_Paid"] = rent * 1.02**(df_rent["Period"].add(1) % 12 == 0).cumsum()
monthly_savings = df_own[["Principal_Paid", "Interest_Paid"]].sum(axis=1).multiply(-1).add(hoa) - df_rent["Rent_Paid"]
df_rent["Total_Rent_Paid"] = df_rent["Rent_Paid"].cumsum()
df_rent["Rent_Profit"] = df_rent["DownPayment_Invested"] - df_rent["Total_Rent_Paid"] + monthly_savings.cumsum()
df_rent = round(df_rent, 2)
At this point, I’ll merge the ownership and rental tables and plot out what happened in this simulation.
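The merge step isn't shown in this excerpt; a reshaping along these lines (melting the two profit columns into long format for seaborn) would produce the merged dataframe used below:
merged = (df_own[["Period", "Own_Profit"]]
          .merge(df_rent[["Period", "Rent_Profit"]], on="Period")
          .melt(id_vars="Period"))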
plt.figure(figsize=(14, 6))
plt.title("Wealth Outcomes for Owning vs. Renting a 2b1br Apt")
sns.lineplot(data=merged, x="Period", y="value", hue="variable")
# for x in range(0, 350, 12):
# if x == 0:
# plt.axvline(x, color="grey", linestyle=":", alpha=1, label="Year")
# else:
# plt.axvline(x, color="grey", linestyle=":", alpha=0.7)
# plt.text(x+1, -100000, str(int(x/12)), alpha=0.8)
plt.axhline(0, color="red", linestyle="--", alpha=0.5, label="Zero")
plt.legend()
sns.despine()
We can quickly see that, in this simulation, ownership clearly builds more wealth in the medium and long run:
years = 5
print(f"Owner after {years} years:", df_own.loc[12*years-1, "Own_Profit"])
print(f"Renter after {years} years:", df_rent.loc[12*years-1, "Rent_Profit"])
Owner after 5 years: 243148.71
Renter after 5 years: 135926.53
However, we can see that, in the unlikely case that the home is sold within the first year or so, the wealth of the renter and the owner is very similar - the renter actually comes out slightly ahead - likely due to the owner contending with buying/selling fees:
years = 1
print(f"Owner after {years} years:", df_own.loc[12*years-1, "Own_Profit"])
print(f"Renter after {years} years:", df_rent.loc[12*years-1, "Rent_Profit"])
Owner after 1 years: 115891.0
Renter after 1 years: 135331.15
A possible takeaway here is that, as long as you can be confident you’ll be able to hold onto the house for more than a year, it’s probably better to purchase it. Uncertainty estimates would be useful here, and could be obtained by running the simulation under a wide variety of randomly generated market conditions.
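As a rough sketch of what that could look like - this assumes the table-building code above has been wrapped into a hypothetical helper build_tables(returns_df) that returns (df_own, df_rent); no such function exists in the post as written:
n_sims = 1000
five_year_results = []
rng = np.random.default_rng(0)
for _ in range(n_sims):
    # Draw a fresh set of 360 correlated monthly returns
    sim_returns = pd.DataFrame(rng.multivariate_normal([mu_stock, mu_home], cov_matrix, 360),
                               columns=["Stock_Appreciation", "Home_Appreciation"]) / 12
    df_own_i, df_rent_i = build_tables(sim_returns)  # hypothetical wrapper around the code above
    five_year_results.append({"Own_Profit": df_own_i.loc[12*5 - 1, "Own_Profit"],
                              "Rent_Profit": df_rent_i.loc[12*5 - 1, "Rent_Profit"]})
print(pd.DataFrame(five_year_results).describe())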
@online{amerkhanian2022,
author = {Amerkhanian, Peter},
title = {Buy Vs. {Rent,} {A} {Financial} {Modeling} {Workflow} in
{Python}},
date = {2022-08-06},
url = {https://peter-amerkhanian.com/posts/rent-vs-buy/},
langid = {en}
}