Stock market portfolio optimisation.¶
The goal of this notebook is to provide an introduction to stock market portfolio optimisation using Python. We will use pandas_datareader module to get data and PyPortfolioOpt for optimisation. We will walk through a simple Python script to retrieve, analyze, and visualize data for predefined portfolio of stocks.
To read and use this notebook clone it from remote repository to your local computer and run in preferred environment.
Permanent link to repository: bitbucket.org/Nekrasovp/stock-market-portfolio-optimisation
You can view this notebook on Jupyter Notebook nbviewer website: nbviewer.jupyter.org/urls/bitbucket.org/Nekrasovp/stock-market-portfolio-optimisation
Table of content:¶
- Project setup
- Pulling data
- Preparing data
- Analyze portfolio
- Calculate the 'Markowitz portfolio', minimising volatility for a given target return.
- Calculate the Sharpe-maximising portfolio for a given volatility (i.e max return for a target risk).
- Visualise portfolio
- Widget that allows you to interactively change the allocation.
- Plot with individual stock's annual return and annual risk, efficient frontier curve and best allocation cases.
Project setup¶
First thing we'll do is import the required dependencies.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interactive, FloatSlider, IntText, HBox, Layout, VBox, interact, Dropdown
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')
plt.style.use('fivethirtyeight')
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
We will use pandas_datareader to get data from the market. And pypfopt for optimisation goals.
from pypfopt.efficient_frontier import EfficientFrontier
from pypfopt import risk_models, expected_returns, discrete_allocation
import pandas_datareader as pdr
Making the same request repeatedly can use a lot of bandwidth, slow down your code and may result in your IP being banned.
pandas_datareader allows you to cache queries using requests_cache
import requests_cache
expire_after = timedelta(days=1)
session = requests_cache.CachedSession(cache_name='cache', backend='sqlite', expire_after=expire_after)
%reload_ext watermark
%watermark -v -m --iversions
Pulling data¶
Define tickers you would like to analyze
securities = ['PRSP',
'GOOGL',
'IBM',
'BTI',
'BA',
'BIDU',
'CMA',
'WDC',
'SLB',
'SKM',
'F',
'EIX',
'FAST',
'NOK',
'DLR',
'INTC',
'CBPO',
'NVDA',
'SHI',
'CAT', ]
sec_data = {}
for security in securities:
sec_df = pdr.DataReader(security,
'yahoo',
start=datetime(2018, 7, 10),
end=datetime(2019, 7, 10),
session=session)
sec_data[security] = sec_df
By looking at keys of resulted dict we can explore which securities we fetched successfully.
sec_data.keys()
sec_data['GOOGL'].head()
Let's look at how the price of each stock has evolved within give time frame. While price of each stock is different there is no meaning to look at prices themselves so we will plot daily returns (percent change compared to the day before). By plotting daily returns instead of actual prices, we can see the stocks' volatility.
plt.figure(figsize=(14, 7))
for security, df in sec_data.items():
df['PctChange'] = df.Close.pct_change()
plt.plot(df.index, df.PctChange, lw=3, alpha=0.8,label=security)
plt.legend(loc='upper left', fontsize=10)
plt.ylabel('daily returns')
It seems that EIX has experienced a real "roller coaster" in the study period.
sec_data['EIX']['Close'].dropna().plot(figsize = (14, 7))
plt.ylabel('price in $')
Now we have a dictionary of dataframes, each containing the historical ohlcv prices data.
Preparing data¶
For analyzing we will use only Close prices from each ohlcv tuple.
def build_pricing_data_from_ohlcv(ohlcv_data: dict):
"""
Build new pricing dict from ohlcv_data 'Close' column
Add base asset column with price equal to '1'.
:param ohlcv_data: Pricing data
:return: Formated pricing data dict
"""
data_c = pd.DataFrame()
for m, ohlcv in ohlcv_data.items():
data_c[m] = ohlcv['Close']
return data_c
prices_df = build_pricing_data_from_ohlcv(sec_data)
pd.set_option('display.max_columns', 10)
prices_df.head()
Analyze portfolio¶
Methods of "portfolio optimization" as a solution to a particular optimization problem is a widely studied area of human knowledge.
Look more here:
As mentioned above we will use PyPortfolioOpt package because its provide clear and uncomplicated access to pandas and scipy optimisation methods used in our case.
Expected returns models:
- mean historical return
- exponentially weighted mean historical return
Risk models:
- sample covariance
- semicovariance
- exponentially weighted covariance
- mininum covariance determinant
shrunk covariance matrices:
- manual shrinkage
- Ledoit Wolf shrinkage
- Oracle Approximating shrinkage
def generate_analysis_model(pricing_data, weight_bounds):
"""Generate efficient frontier model with simple and intuitive estimators
:param pricing_data: Adjusted closing prices of the asset, each row is a date
and each column is a ticker/id.
:param weight_bounds: Minimum and maximum weight of an asset
:return: Efficient Frontier model
"""
# Annualised mean (daily) historical return from input (daily) asset prices
mu = expected_returns.mean_historical_return(pricing_data, frequency=252)
# Annualised sample covariance matrix of (daily) asset returns
s = risk_models.sample_cov(pricing_data, frequency=252)
return EfficientFrontier(mu, s, weight_bounds=weight_bounds, gamma=0)
def make_discrete_allocation(prices_df, cleaned_weights, total_portfolio_value):
latest_prices = discrete_allocation.get_latest_prices(prices_df)
da = discrete_allocation.DiscreteAllocation(weights=cleaned_weights,
latest_prices=latest_prices,
total_portfolio_value=total_portfolio_value
)
allocation, leftover = da.lp_portfolio(verbose=False)
print(allocation)
print("Funds remaining: ${:.2f}".format(leftover))
def draw_cleaned_weights_with_perfomance(cleaned_weights, p_perfomance):
# Filter non-zero values
non_zero_weights = dict(filter(lambda w: w[1] > 0.0, cleaned_weights.items()))
# Plot horizontal bar chart with resulted weights
plt.figure(figsize=(10, 7))
_ = non_zero_weights
ax = plt.barh(range(len(_.keys())),
[float(k)*100 for k in _.values()],
align='center',
height=0.5,
color='#f3ba2f',
edgecolor='black',
tick_label=[k for k in _.keys()])
plt.yticks(fontsize=14)
plt.xlabel('Ticker allocation, %')
plt.ylabel('Portfolio tickers')
plt.title('Expected annual return: {:.2%} Annual volatility: {:.2%} Sharpe Ratio: {:.2f}'.format(
p_perfomance[0],
p_perfomance[1],
p_perfomance[2]))
for rect in ax.patches:
# Get X and Y placement of label from rect.
x_value = rect.get_width()
y_value = rect.get_y() + rect.get_height() / 2
# Number of points between bar and label. Change to your liking.
space = 5
# Vertical alignment for positive values
ha = 'left'
# If value of bar is negative: Place label left of bar
if x_value < 0:
# Invert space to place label to the left
space *= -1
# Horizontally align label at left
ha = 'left'
# Use X value as label and format number with one decimal place
label = "{:.1f}".format(x_value)
# Create annotation
plt.annotate(
label, # Use `label` as label
(x_value, y_value), # Place label at end of the bar
xytext=(space, 0), # Horizontally shift label by `space`
textcoords="offset points", # Interpret `xytext` as offset in points
va='center', # Vertically center label
ha=ha) # Horizontally align label
# plt.savefig("image.png")
plt.show()
EFmodel = generate_analysis_model(prices_df, (0.0, 1.0))
Calculate the 'Markowitz portfolio', minimising volatility for a given target return.¶
EFmodel.efficient_return(target_return=0.2)
cleaned_weights = EFmodel.clean_weights(cutoff=1e-4, rounding=4)
p_perfomance = EFmodel.portfolio_performance(verbose=False)
draw_cleaned_weights_with_perfomance(cleaned_weights, p_perfomance)
make_discrete_allocation(prices_df, cleaned_weights, 10000)
You can do better than 1/N¶
Here is a quote from Portfolio optimisation: lessons learnt:
"A significant body of research suggests that, in light of the aforementioned failures of MVO, we should instead use 1/N portfolios(Optimal Versus Naive Diversification: How Inefficient is the 1/N Portfolio Strategy?) – it is noted that they often beat efficient frontier optimisation significantly in out-of-sample testing.
However, an interesting paper by Kritzman et al. (2010) finds that it is not the case that there is anything special about 1/N diversification: it is just that the expected returns are an incredibly poor estimator and any optimisation scheme which relies on them will likely go astray.
The easiest way to avoid this problem is to not provide the expected returns to the optimiser, and just optimise on the sample covariance matrix instead. Effectively we are saying that although previous returns won’t predict future returns, previous risks might predict future risks. This is intuitively a lot more reasonable – the sample covariance matrix really seems like it should contain a lot of information. Empirical results support this, showing that minimum variance portfolios outperform both standard MVO and 1/N diversification.
In my own work I have found that the standard minimum variance portfolio is a very good starting point, from which you can try a lot of new things:
- Shrinkage estimators on the covariance matrix
- Exponential weighting
- Different historical windows
- Additional cost terms in the objective function (e.g small-weights penalty)
- ..."
Calculate the Sharpe-maximising portfolio for a given volatility (i.e max return for a target risk).¶
EFmodel.efficient_risk(target_risk=0.25)
cleaned_weights = EFmodel.clean_weights(cutoff=1e-4, rounding=4)
p_perfomance = EFmodel.portfolio_performance(verbose=False)
draw_cleaned_weights_with_perfomance(cleaned_weights, p_perfomance)
make_discrete_allocation(prices_df, cleaned_weights, 10000)
Visualise portfolio¶
One of the goal of this notebook is visualising Efficient portfolio. To achive this we will provide:
- Portfolio securities list.
- Maximum target return and minimum volatility cases information.
- Widget where you can choose different portfolio allocations along the Efficient frontier curve.
- Final allocations for presented portfolio value.
Next will plot:
- Individual stock's annual return and annual risk.
- Maximum target return and minimum volatility allocation.
- Efficient frontier curve.
- Target return portfolio perfomance
def generate_portfolio(prices_df, target_return = None, target_risk = None):
"""
:return: Expected annual return(mu), Annual volatility(sigma), Sharpe Ratio
"""
weight_bounds = (0.0, 1.0)
EFmodel = generate_analysis_model(prices_df, weight_bounds)
if target_return is not None:
EFmodel.efficient_return(target_return=target_return)
return EFmodel.portfolio_performance()
if target_risk is not None:
EFmodel.efficient_risk(target_risk=target_risk)
return EFmodel.portfolio_performance()
def draw_portfolio(prices_df, opt_obj, total_v, target_return, target_risk):
weight_bounds = (0.0, 1.0)
EFmodel = generate_analysis_model(prices_df, weight_bounds)
if opt_obj == 'target_return': EFmodel.efficient_return(target_return=target_return)
elif opt_obj == 'target_risk': EFmodel.efficient_risk(target_risk=target_risk)
elif opt_obj == 'max_sharpe': EFmodel.max_sharpe()
elif opt_obj == 'min_volatility': EFmodel.min_volatility()
cleaned_weights = EFmodel.clean_weights(cutoff=1e-4, rounding=4)
p_perfomance = EFmodel.portfolio_performance(verbose=False)
draw_cleaned_weights_with_perfomance(cleaned_weights, p_perfomance)
make_discrete_allocation(prices_df, cleaned_weights, total_v)
p = prices_df.copy()
# Calculate the 'Markowitz portfolio', minimising volatility for a given target_return
max_annual_return_portfolio = generate_portfolio(prices_df=p, target_return=1.0)
# Calculate the Sharpe-maximising portfolio for a given volatility(max return for a target_risk).
min_annual_volatility_portfolio = generate_portfolio(prices_df=p, target_risk=0.0)
print(f"Portfolio tickers:{securities}\n")
print("Minimising volatility for maximum target_return portfolio allocation:\n")
print(f"Annualised Return:{max_annual_return_portfolio[0]:.1%}")
print(f"Annualised Volatility:{max_annual_return_portfolio[1]:.1%}")
print(f"Sharpe Ratio:{max_annual_return_portfolio[1]:.2f}\n")
print("Sharpe-maximising portfolio for a minimum volatility(max return for a target_risk) allocation:\n")
print(f"Annualised Return:{min_annual_volatility_portfolio[0]:.1%}")
print(f"Annualised Volatility:{min_annual_volatility_portfolio[1]:.1%}")
print(f"Sharpe Ratio:{min_annual_volatility_portfolio[1]:.2f}\n")
def f(opt_obj, x, tr, tri):
draw_portfolio(p, opt_obj, x, tr, tri)
opt_obj_widget = Dropdown(options=['target_return', 'target_risk', 'max_sharpe', 'min_volatility'],
value='target_return',
description='Otimisation objective:',
style={'description_width': 'initial'},
disabled=False, )
def update_opt_obj(*args):
if opt_obj_widget.value == 'target_return':
risk_slider_widget.disabled = True
return_slider_widget.disabled = False
elif opt_obj_widget.value == 'target_risk':
risk_slider_widget.disabled = False
return_slider_widget.disabled = True
elif opt_obj_widget.value == 'max_sharpe':
risk_slider_widget.disabled = True
return_slider_widget.disabled = True
elif opt_obj_widget.value == 'min_volatility':
risk_slider_widget.disabled = True
return_slider_widget.disabled = True
opt_obj_widget.observe(update_opt_obj, 'value')
total_v_widget = IntText(value=10000, description='Total value:', disabled=False)
risk_slider_widget = FloatSlider(value=0,
min=min_annual_volatility_portfolio[1],
max=max_annual_return_portfolio[1],
step=0.001,
description='Target risk:',
disabled=True,
continuous_update=False,
layout=Layout(width='50%', height='40px'),
readout=True,
readout_format='.1%',)
return_slider_widget = FloatSlider(value=max_annual_return_portfolio[0]*0.9,
min=0.0,
max=max_annual_return_portfolio[0],
step=0.001,
description='Target return:',
disabled=False,
continuous_update=False,
orientation='horizontal',
style={'description_width': 'initial'},
layout=Layout(width='50%', height='40px'),
readout=True,
readout_format='.1%',)
interactive_plot = interactive(f,
opt_obj=opt_obj_widget,
x=total_v_widget,
tr=return_slider_widget,
tri=risk_slider_widget,)
interactive_plot
Next we can plot each individual stocks on the plot with the corresponding values of each stock's annual return and annual risk. This way we can see and compare how diversification is lowering the risk by optimising the allocation. For example we will take recommended in previous state portfolio by filtering presented dataframe.
individual_assets = ['PRSP', 'GOOGL', 'IBM', 'BTI', 'BA', 'BIDU', 'CMA', 'WDC', 'SLB', 'SKM', 'F', 'EIX', 'FAST', 'NOK', 'DLR', 'INTC', 'CBPO', 'NVDA', 'SHI', 'CAT']
# individual_assets = ['PRSP', 'SKM', 'EIX', 'FAST', 'DLR']
t_return_opt = 0.16
i_stocks_assets_data = prices_df[individual_assets].copy()
max_annual_return_portfolio = generate_portfolio(prices_df=i_stocks_assets_data, target_return=0.5)
min_annual_volatility_portfolio = generate_portfolio(prices_df=i_stocks_assets_data, target_risk=0.0)
max_annual_volatility_portfolio = generate_portfolio(prices_df=i_stocks_assets_data, target_risk=1.0)
opt_portfolio = generate_portfolio(prices_df=i_stocks_assets_data, target_return=t_return_opt)
i_stocks_perfomance = {}
for a in individual_assets:
EFmodel = generate_analysis_model(i_stocks_assets_data[a], (0.0, 1.0))
EFmodel.efficient_return(target_return=1.0)
i_stocks_perfomance[a] = EFmodel.portfolio_performance(verbose=False)
fig, ax = plt.subplots(figsize=(12, 7))
ax.scatter([p[1] for p in i_stocks_perfomance.values()],
[p[0] for p in i_stocks_perfomance.values()],
marker='o',s=200)
for a, p in i_stocks_perfomance.items():
ax.annotate(a, (p[1],p[0]), xytext=(10,0), textcoords='offset points')
target = np.linspace(min_annual_volatility_portfolio[1], max_annual_return_portfolio[1], 30)
ef_curve_portfolios_perfomance = {}
for t in target:
EFmodel = generate_analysis_model(i_stocks_assets_data, (0.0, 1.0))
EFmodel.efficient_risk(target_risk=t)
ef_curve_portfolios_perfomance[t] = EFmodel.portfolio_performance(verbose=False)
target_ret = np.linspace(max_annual_volatility_portfolio[0], max_annual_return_portfolio[0], 30)
for t in target_ret:
EFmodel = generate_analysis_model(i_stocks_assets_data, (0.0, 1.0))
EFmodel.efficient_return(target_return=t)
ef_curve_portfolios_perfomance[t] = EFmodel.portfolio_performance(verbose=False)
ax.plot([p[1] for p in ef_curve_portfolios_perfomance.values()],
[p[0] for p in ef_curve_portfolios_perfomance.values()],
linestyle='-.',
color='black',
label='efficient frontier')
ax.scatter(opt_portfolio[1],
opt_portfolio[0],
marker='p',color='y',s=300, label='Minimum Volatility for target return')
ax.scatter(max_annual_return_portfolio[1],
max_annual_return_portfolio[0],
marker='*',color='r',s=300, label='Minimum volatility for maximum return')
ax.scatter(min_annual_volatility_portfolio[1],
min_annual_volatility_portfolio[0],
marker='*',color='g',s=300, label='Sharpe-maximising portfolio for minimum volatility')
ax.set_title('Portfolio Optimization with Individual Stocks')
ax.set_xlabel('annualised volatility')
ax.set_ylabel('annualised returns')
ax.legend(labelspacing=0.8, loc='best')
# ax.savefig("image.png")
As you can see from the above plot, the stock with the least risk is SKM at around 0.20. But with portfolio optimisation, we can achieve even lower risk at 0.14, and still with a higher return than SKM. Best risk/return allocation are in FAST in our case and its matched with portfolio optimised for minimum volatility for maximum return.
Comments
comments powered by Disqus