The goal of this article is to provide an easy introduction to stock market analysis using Python. In this notebook we will use the pandas_datareader module and walk through a simple Python script to retrieve, analyze, and visualize data from different markets.
Project Setup
Once we've got a blank Jupyter notebook open, the first thing we'll do is import the required dependencies.
import pandas as pd
from datetime import datetime, timedelta
We will use pandas_datareader to get data from the market.
import pandas_datareader as pdr
Making the same request repeatedly can use a lot of bandwidth, slow down your code, and may get your IP banned.
pandas_datareader lets you cache queries by passing it a requests_cache session. Here, cached responses expire after one day.
import requests_cache
expire_after = timedelta(days=1)
session = requests_cache.CachedSession(cache_name='cache', backend='sqlite', expire_after=expire_after)
%reload_ext watermark
%watermark -v -m --iversions
Pulling Data
We will start with the GAZP (Gazprom) ticker on MOEX. You can look up the secid of a security with this link: http://iss.moex.com/iss/securities.xml?q=Gazprom
gazp = pdr.DataReader('GAZP', 'moex', start=datetime(2018, 1, 1), end=datetime(2019, 1, 1), session=session)
gazp['OPEN'].head(10)
We will drop the rows containing NaN with dropna and plot the OPEN price.
gazp['OPEN'].dropna().plot(figsize = (14, 6))
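To make the dropna step concrete, here is a minimal sketch on a small made-up Series. The dates and prices are invented for illustration and are not MOEX data.

```python
import numpy as np
import pandas as pd

# Made-up prices for illustration only; NaN marks days without a quote.
dates = pd.date_range("2018-01-01", periods=5, freq="D")
open_prices = pd.Series(
    [130.0, np.nan, 131.5, np.nan, 132.0], index=dates, name="OPEN"
)

# dropna() returns a new Series without the NaN rows;
# the original Series is left unchanged.
clean = open_prices.dropna()

print(len(open_prices), len(clean))
```

Because dropna returns a copy, gazp itself still holds the NaN rows after plotting.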
We can also use another data provider. Let's try Yahoo Finance and the Chevron Corporation (CVX) ticker.
cvx = pdr.DataReader('CVX', 'yahoo', start=datetime(2018, 1, 1), end=datetime(2019, 1, 1), session=session)
cvx.head()
Next, we'll generate a simple chart as a quick visual verification that the data looks correct.
cvx['Open'].plot(figsize = (14, 6))
Retrieve index data
Now we want to compare stock prices against a wider group of securities.
We will download stock prices for European energy companies as listed at https://www.nasdaq.com/screening/companies-by-industry.aspx?industry=Energy&region=Europe
securities = ['BP', 'TOT', 'EQNR', 'SLB', 'E', 'PHG']
sec_data = {}
for security in securities:
    sec_df = pdr.DataReader(security,
                            'yahoo',
                            start=datetime(2018, 1, 1),
                            end=datetime(2019, 1, 1),
                            session=session)
    # Keep only the daily Open price for each security
    sec_data[security] = sec_df['Open']
sec_data.keys()
Now we have a dictionary of 6 Series, each containing the daily Open prices of one security.
We can preview the TOT Open price to make sure it looks OK.
sec_data['TOT'].plot(figsize = (14, 6))
Correlations in 2018
Calculate the Pearson correlation coefficients of the daily percentage changes for the energy companies in 2018.
sec_data_2018 = pd.DataFrame(sec_data)
sec_data_2018.head()
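As a rough sketch of what pd.DataFrame does with a dictionary of Series: each key becomes a column and the rows are aligned on the date index, with NaN where a Series has no value for a date. The tickers AAA and BBB and their prices are hypothetical.

```python
import pandas as pd

# Hypothetical tickers and prices, for illustration only.
s1 = pd.Series(
    [10.0, 10.5, 11.0],
    index=pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-04"]),
)
s2 = pd.Series(
    [20.0, 21.0],
    index=pd.to_datetime(["2018-01-03", "2018-01-04"]),
)

# Dictionary keys become column names; rows are the union of both indexes.
combined = pd.DataFrame({"AAA": s1, "BBB": s2})
print(combined)
```

Securities that trade on different calendars can therefore leave gaps in the combined table.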
sec_data_2018.pct_change().corr(method='pearson')
These correlation coefficients are all over the place. Coefficients close to 1 or -1 mean that the two series are strongly correlated or inversely correlated, respectively, and coefficients close to zero mean that the values tend to fluctuate independently of each other.
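To see the two extremes, here is a small sketch on synthetic data that uses the same pct_change().corr() chain. The columns A, B, and C are made up: B has the same daily returns as A at a different price level, and C has the opposite returns.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns, made up for illustration (not market data).
rng = np.random.default_rng(0)
dates = pd.date_range("2018-01-01", periods=50, freq="B")
returns_a = pd.Series(rng.normal(0, 0.01, size=50), index=dates)

# Build price series from returns: B moves with A, C moves against A.
prices = pd.DataFrame({
    "A": 100 * (1 + returns_a).cumprod(),
    "B": 50 * (1 + returns_a).cumprod(),   # same returns, different price level
    "C": 100 * (1 - returns_a).cumprod(),  # opposite returns
})

# Daily percentage changes, then Pearson correlation between the columns.
corr = prices.pct_change().corr(method="pearson")
print(corr.round(3))
```

A and B correlate at about 1 even though their prices differ by a factor of two, because the correlation is computed on percentage changes rather than on price levels. A and C correlate at about -1.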
Heatmap Visualization
corr = sec_data_2018.pct_change().corr(method='pearson')
corr.style.background_gradient(cmap='binary', low=0.2, high=0.9)
Here, the darker cells represent stronger correlations (note that each stock is, obviously, perfectly correlated with itself), and the lighter cells mark the lowest coefficients in each column, which are weak or inverse correlations.
In this notebook, we tested the pandas_datareader library, retrieved 2018 stock data for European energy companies, and visualized their correlation matrix as a heatmap.