# Stock Price Analysis based on Pharma Companies with Covid Vaccine (Feb 2021)

Mar 8, 2021

Author: Santiago Rodríguez Trompeta

This blog post is part of the Udacity Data Scientist Nanodegree Program. The detailed analysis, with all required code, is posted in my GitHub repository.

In this project I explore the movements of pharma companies’ stock prices using new variables: RSI, Signal and MACD (Part I), and apply some deep learning techniques to predict the movements of stock prices (Part II). In both cases my objective is purely educational.

It’s important to note the high risk of overfitting in the deep learning exercise you will see later. Ways to avoid it are discussed in the “Improvements” section.

Introduction:

Yahoo Finance is the source I’m going to use to get market data. As input I’ll take daily trading data: opening price (Open), highest price the stock traded at (High), number of shares traded (Volume), closing price (Close) and closing price adjusted for stock splits and dividends (Adjusted Close). I selected pharma companies with a Covid19 vaccine because of my personal interest.

Time series forecasting is an interesting area of machine learning that can be highly profitable when applied to complex problems such as stock price prediction. It is the application of a model to predict future values based on previously observed values.

By definition, a time series is a series of data points indexed in time order. This type of problem is important because there is a variety of prediction problems that involve a time component, and finding the data/time relationship is key to the analysis.

Metrics

First I look at the development of closing stock prices by comparing and visualising different trading parameters: daily returns, cumulative returns, rolling statistics (mean, standard deviation and Bollinger Bands), as well as MACD and RSI.

These parameters show how risky (or volatile) the stock prices are, how profitable they are and what investing logic could be used. Several of these techniques may have some power to predict stock price movements.

In the second part, dedicated to deep learning algorithms, I’ll use root mean squared error (RMSE) and processing time to select the best model.
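As a minimal sketch of the RMSE metric mentioned above, assuming the actual and predicted prices are available as equal-length sequences (the function name `rmse` and the sample values are illustrative, not taken from the repository):

```python
import numpy as np


def rmse(y_true, y_pred) -> float:
    """Root mean squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    # Square the errors, average them, then take the square root
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Made-up values: the errors are 0, 0, 0 and 2, so the mean squared error is 1
actual = [10.0, 11.0, 12.0, 13.0]
predicted = [10.0, 11.0, 12.0, 15.0]
print(rmse(actual, predicted))  # 1.0
```

Because the errors are squared before averaging, a few large misses weigh more than many small ones, which is worth keeping in mind when comparing models on volatile stocks.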

Part I: Technical Indicators

First, let’s take a look at the historical data of four pharma companies with a Covid vaccine to get a feel for what we’re dealing with.

JNJ: Johnson & Johnson

PFE: Pfizer Inc.

MRNA: Moderna, Inc.

AZN: AstraZeneca PLC

Closing Price Evolution Main Four Pharma Companies with Covid19 Vaccine

I’ll focus on how differently Pfizer and AstraZeneca have evolved.

Pfizer vs AstraZeneca

In the next figures you will see how the volatility of these stocks evolves, by means of the rolling mean and Bollinger Bands.

1.1.-Moving Average:

A moving average (MA) is a stock indicator that is commonly used in technical analysis.
The reason for calculating the moving average of a stock is to help smooth out the price data over a specified period of time by creating a constantly updated average price.
A simple moving average (SMA) is a calculation that takes the arithmetic mean of a given set of prices over a specific number of days in the past; for example, over the previous 15, 30, 100, or 200 days. I will use a window of 20 days.

On the graphs you can see the rolling mean (red line) and the Bollinger Bands, placed two standard deviations away from the mean (black lines). We can see that AstraZeneca is much more volatile, particularly from 2018.

AstraZeneca traded below 30 USD during 2012–2014, and the price is now clearly around 50 USD.
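The rolling mean and Bollinger Bands described in this section can be sketched with pandas as follows. This is only an illustration under stated assumptions: the function name `bollinger_bands` is hypothetical, and a synthetic random walk stands in for the closing prices downloaded from Yahoo Finance.

```python
import numpy as np
import pandas as pd


def bollinger_bands(close: pd.Series, window: int = 20, num_std: float = 2.0) -> pd.DataFrame:
    """Rolling mean plus upper and lower bands num_std standard deviations away."""
    rolling_mean = close.rolling(window=window).mean()
    rolling_std = close.rolling(window=window).std()
    return pd.DataFrame(
        {
            "close": close,
            "rolling_mean": rolling_mean,
            "upper_band": rolling_mean + num_std * rolling_std,
            "lower_band": rolling_mean - num_std * rolling_std,
        }
    )


# Synthetic prices (assumption): a random walk around 50 on business days
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=120)
close = pd.Series(50 + rng.normal(0, 1, size=120).cumsum(), index=dates, name="close")

bands = bollinger_bands(close)
print(bands.tail())
```

The first 19 rows of the rolling columns are empty because a 20-day window needs 20 observations. Wider bands mean a larger rolling standard deviation, which is how the charts make volatility visible.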

1.2.-Cumulative Returns

A cumulative return on an investment is the aggregate amount that the investment has gained or lost over time, independent of the amount of time involved.
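A minimal sketch of daily and cumulative returns, assuming the closing prices are held in a pandas Series (the function names and the four sample prices are illustrative):

```python
import pandas as pd


def daily_returns(close: pd.Series) -> pd.Series:
    """Percentage change from one day to the next; the first day is set to 0."""
    return close.pct_change().fillna(0.0)


def cumulative_returns(close: pd.Series) -> pd.Series:
    """Gain or loss relative to the first price in the series."""
    return close / close.iloc[0] - 1.0


# Made-up closing prices
close = pd.Series([100.0, 102.0, 99.0, 105.0])

daily = daily_returns(close)
cumulative = cumulative_returns(close)

# Compounding the daily returns gives the same cumulative figure
compounded = (1.0 + daily).cumprod() - 1.0
print(cumulative.iloc[-1])  # about 0.05, i.e. a 5% gain over the whole period
```

Dividing by the first price makes stocks with very different price levels directly comparable on one chart.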

1.3.-MACD

Now I would like to show how the MACD (Moving Average Convergence Divergence) indicator can help predict whether a stock price is going to rise or fall over the next days. The MACD line is the difference between two exponentially weighted moving averages of the price: one over a short period (I took 12 days) and one over a longer period (I took 26 days), with the second subtracted from the first. The MACD line is then compared with its own exponentially weighted moving average over an even shorter period (I took 9 days), which is called the signal line. A Buy signal occurs when the MACD line is below the signal line, is rising faster than it and crosses it from below. If the MACD line is above the signal line, is falling faster than it and crosses it from above, it is a Sell signal.

Moving Average Convergence Divergence (MACD) is a momentum indicator that shows the relationship between two moving averages of a security’s price. Usually, when the MACD (purple line) rises above the Signal (orange line), it suggests that the stock is on the rise and may keep going up for some time.
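The MACD calculation with the 12, 26 and 9 day periods can be sketched as below. This is a hedged illustration, not the code from the repository: the function name `macd` is hypothetical, the signal line is computed as the exponentially weighted mean of the MACD line (the usual convention), and a synthetic V-shaped price series is used so that a crossover is easy to see.

```python
import numpy as np
import pandas as pd


def macd(close: pd.Series, fast: int = 12, slow: int = 26, signal: int = 9) -> pd.DataFrame:
    """MACD line, signal line and crossover flags for a price series."""
    ema_fast = close.ewm(span=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, adjust=False).mean()
    macd_line = ema_fast - ema_slow
    signal_line = macd_line.ewm(span=signal, adjust=False).mean()

    above = macd_line > signal_line
    prev_above = above.shift(1, fill_value=False)
    return pd.DataFrame(
        {
            "macd": macd_line,
            "signal": signal_line,
            # MACD crosses the signal line from below
            "buy": above & ~prev_above,
            # MACD crosses the signal line from above
            "sell": ~above & prev_above,
        }
    )


# Synthetic prices (assumption): a steady fall followed by a steady rise
close = pd.Series(np.concatenate([np.linspace(100, 50, 60), np.linspace(50, 100, 60)]))

result = macd(close)
print(result[result["buy"] | result["sell"]])
```

On this series the MACD line stays below the signal line during the fall and crosses above it shortly after the turn, which is the Buy pattern described above. Real prices are noisier and produce many more crossovers, some of them false signals.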

1.4.-RSI

Another interesting technique that could help predict stock price movements is the RSI (relative strength index). It indicates how strong the price momentum shift is by comparing average gains and average losses over the previous days.

RSI ranges from 0 to 100. Values over 80 suggest that the stock is overbought, which is read as a signal to sell, and values under 20 suggest that it is oversold, which is read as a signal to buy. Values in the middle are neutral and don’t call for any action. A period of 14 days is commonly used when calculating RSI, although another period could be chosen.
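A minimal RSI sketch with a 14-day period is shown below. It assumes simple rolling averages of gains and losses; Wilder’s exponential smoothing is a common alternative and gives slightly different values. The function name `rsi`, the column names and the synthetic prices are illustrative.

```python
import numpy as np
import pandas as pd


def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """Relative strength index based on simple rolling averages."""
    delta = close.diff()
    gain = delta.clip(lower=0.0)        # positive price changes, else 0
    loss = (-delta).clip(lower=0.0)     # size of negative price changes, else 0
    avg_gain = gain.rolling(period).mean()
    avg_loss = loss.rolling(period).mean()
    # Equivalent to 100 - 100 / (1 + avg_gain / avg_loss), without dividing by avg_loss.
    # The result is NaN when the price did not move at all within the window.
    return 100.0 * avg_gain / (avg_gain + avg_loss)


# Synthetic prices (assumption): a random walk around 50
rng = np.random.default_rng(1)
close = pd.Series(50 + rng.normal(0, 1, size=120).cumsum())

values = rsi(close)
zones = pd.DataFrame(
    {"rsi": values, "above_80": values > 80, "below_20": values < 20}
)
print(zones.dropna().tail())
```

With this formula a window containing only gains yields 100 and a window containing only losses yields 0, which matches the 0 to 100 range described above.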

Part II: Deep Neural Networks

When traders use historical data along with technical indicators to predict stock movement, they look for familiar patterns. Some types of neural networks are great at finding patterns and have a variety of applications in image recognition or text processing.

2.1.-LSTM

First, I tried an LSTM. An LSTM unit is composed of a cell, an input gate, an output gate, and a forget gate. The cell remembers values over arbitrary time intervals, and the three gates regulate the flow of information into and out of the cell.

LSTM structures are able to retain memory over longer time periods than plain RNNs. They mitigate the vanishing gradient problem by introducing gates, such as the input and forget gates, which give better control over the gradient flow and determine what information to preserve and what to forget. The structure is called Long Short-Term Memory because it uses short-term memory processes to create longer memory.

For example, the price this Monday may be influenced by the prices of previous Mondays, or even by the price on the same day last year. A plain RNN may not be able to retain the price information from the same day last year, while an LSTM is, in theory, designed to retain it.
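To make the role of the cell and the three gates concrete, here is a sketch of a single LSTM step written in plain NumPy. It is an illustration of the standard LSTM equations, not the model used for the results in this post: the weights are random, the gate ordering inside `W`, `U` and `b` is an arbitrary choice, and the sizes (1 input feature, 32 hidden dimensions, 20 time steps) are taken from the settings mentioned in the text.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W has shape (4H, D), U has shape (4H, H) and b has shape (4H,).
    Rows are stacked in the order: input gate, forget gate, candidate, output gate.
    """
    hidden = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[:hidden])                  # input gate: how much new information to write
    f = sigmoid(z[hidden:2 * hidden])        # forget gate: how much old cell state to keep
    g = np.tanh(z[2 * hidden:3 * hidden])    # candidate values for the cell state
    o = sigmoid(z[3 * hidden:])              # output gate: how much of the cell to expose
    c = f * c_prev + i * g                   # new cell state (long-term memory)
    h = o * np.tanh(c)                       # new hidden state (short-term output)
    return h, c


rng = np.random.default_rng(0)
input_dim, hidden_dim, lookback = 1, 32, 20

# Random, untrained weights (assumption): only the data flow matters here
W = rng.normal(0, 0.1, (4 * hidden_dim, input_dim))
U = rng.normal(0, 0.1, (4 * hidden_dim, hidden_dim))
b = np.zeros(4 * hidden_dim)

# A window of 20 scaled prices, one feature per day
window = rng.uniform(-1, 1, (lookback, input_dim))

h = np.zeros(hidden_dim)
c = np.zeros(hidden_dim)
for x_t in window:
    h, c = lstm_step(x_t, h, c, W, U, b)

print(h.shape)  # (32,)
```

The line `c = f * c_prev + i * g` is the key one: when the forget gate stays close to 1, the cell state is carried forward almost unchanged, which is what lets information survive over many time steps.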

Results for different hyperparameter selection:

With 32 hidden dimensions, 2 layers and 25 epochs, the model does not appear to fit either the early or the later part of the time series correctly.

2.2.-GRU:

Next, I tried a GRU network. GRU also aims to solve the vanishing gradient problem. Unlike LSTM, GRU has no separate cell state and no output gate, so it has fewer parameters. GRU uses the hidden state to transfer information, and its two gates are called the reset gate and the update gate.
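A GRU step can be sketched in NumPy in the same spirit. The formulation below applies the reset gate to the recurrent term of the candidate, which is one common variant; the function names are hypothetical and the weights are random. The parameter count assumes a single bias vector per gate, so the totals reported by a specific framework may differ, although the 3 to 4 ratio between GRU and LSTM remains.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x, h_prev, W, U, b):
    """One GRU time step.

    W has shape (3H, D), U has shape (3H, H) and b has shape (3H,).
    Rows are stacked in the order: reset gate, update gate, candidate.
    """
    hidden = h_prev.shape[0]
    a = W @ x + b      # input contribution to each block
    u = U @ h_prev     # recurrent contribution to each block
    r = sigmoid(a[:hidden] + u[:hidden])                      # reset gate
    z = sigmoid(a[hidden:2 * hidden] + u[hidden:2 * hidden])  # update gate
    n = np.tanh(a[2 * hidden:] + r * u[2 * hidden:])          # candidate hidden state
    return (1.0 - z) * n + z * h_prev                         # blend old state and candidate


def recurrent_params(num_blocks: int, input_dim: int, hidden_dim: int) -> int:
    """Weights and biases of one recurrent layer with num_blocks gate blocks."""
    return num_blocks * (hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim)


rng = np.random.default_rng(0)
input_dim, hidden_dim = 1, 32
W = rng.normal(0, 0.1, (3 * hidden_dim, input_dim))
U = rng.normal(0, 0.1, (3 * hidden_dim, hidden_dim))
b = np.zeros(3 * hidden_dim)

h = np.zeros(hidden_dim)
for x_t in rng.uniform(-1, 1, (20, input_dim)):
    h = gru_step(x_t, h, W, U, b)

lstm_count = recurrent_params(4, input_dim, hidden_dim)  # LSTM has 4 blocks
gru_count = recurrent_params(3, input_dim, hidden_dim)   # GRU has 3 blocks
print(lstm_count, gru_count)  # 4352 3264
```

With fewer blocks to compute per time step, a GRU layer of the same width does less work than an LSTM layer, which is consistent with a shorter processing time.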

2.3.- Results:

The next table shows the results obtained with different hyperparameter selections. Basically, I varied the number of epochs and the number of hidden dimensions while fixing the number of previous days (20) and the number of layers (2).
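The “number of previous days (20)” refers to the length of the input window fed to the network. A minimal sketch of how such windows can be built, assuming a one-dimensional array of scaled prices and a many-to-one setup where each window predicts the following day (the helper names and the 80/20 split are illustrative):

```python
import numpy as np


def make_windows(series, lookback: int = 20):
    """Turn a 1-D series into (samples, lookback, 1) inputs and next-day targets."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - lookback
    if n_samples <= 0:
        raise ValueError("series must be longer than lookback")
    X = np.stack([series[i:i + lookback] for i in range(n_samples)])
    y = series[lookback:]
    return X[:, :, np.newaxis], y


def chronological_split(X, y, test_fraction: float = 0.2):
    """Keep the most recent samples for testing; never shuffle a time series."""
    split = int(round(len(X) * (1.0 - test_fraction)))
    return X[:split], y[:split], X[split:], y[split:]


# Stand-in data (assumption): 100 consecutive values instead of real prices
prices = np.arange(100.0)

X, y = make_windows(prices, lookback=20)
X_train, y_train, X_test, y_test = chronological_split(X, y)
print(X.shape, X_train.shape, X_test.shape)  # (80, 20, 1) (64, 20, 1) (16, 20, 1)
```

Each sample holds 20 consecutive days, and its target is the value on day 21, so the test set always lies after the training set in time.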

Increasing the number of hidden dimensions with the same number of epochs has a low impact on prediction performance as measured by Root Mean Squared Error (RMSE).

On the other hand, if I fix the number of hidden dimensions at 32 and increase the number of epochs, the impact on RMSE is significant for both the LSTM and GRU structures.

In all cases the train RMSE is lower than the test RMSE, which points to a possible case of overfitting.

Improvements:

To improve the prediction performance of the models, it is important to change the way I standardized the data used to train the model: only the training data should be used to fit the scaler, and the fitted scaler is then used to transform the test input data. The models I’m sharing in this blog probably carry a bias because I standardized the train and test datasets together.
Avoid overfitting. The real issue is that overfitting not only makes your model inefficient, it can also make your predictions very wrong. Deep learning uses the dropout technique to control overfitting. Dropout randomly drops or deactivates some neurons of a layer during each iteration, which is like setting some weights to zero, so in each iteration the model optimizes a slightly different structure of itself.
Repeat the same exercise with Keras in order to consolidate my knowledge of how LSTM networks work.
Try to make predictions not on “Close” but on “Returns”.
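The first improvement in the list above, fitting the scaler on the training data only, can be sketched as follows. The small class below is a hypothetical stand-in written for illustration; scikit-learn’s `MinMaxScaler(feature_range=(-1, 1))` offers the same fit and transform pattern. The price values are synthetic.

```python
import numpy as np


class RangeScaler:
    """Scale values to [-1, 1] using the minimum and maximum seen in fit()."""

    def fit(self, data):
        data = np.asarray(data, dtype=float)
        self.min_ = data.min()
        self.max_ = data.max()
        if self.max_ == self.min_:
            raise ValueError("cannot scale constant data")
        return self

    def transform(self, data):
        data = np.asarray(data, dtype=float)
        return 2.0 * (data - self.min_) / (self.max_ - self.min_) - 1.0

    def inverse_transform(self, scaled):
        scaled = np.asarray(scaled, dtype=float)
        return (scaled + 1.0) / 2.0 * (self.max_ - self.min_) + self.min_


# Synthetic prices (assumption): a steady rise from 30 to 60
prices = np.linspace(30.0, 60.0, 100)
train, test = prices[:80], prices[80:]

scaler = RangeScaler().fit(train)       # statistics come from the training part only
train_scaled = scaler.transform(train)
test_scaled = scaler.transform(test)    # reuses the training min and max

print(train_scaled.min(), train_scaled.max())
print(test_scaled.max() > 1.0)  # True: the test prices exceed the training range
```

Scaled test values can fall outside [-1, 1] when prices move beyond the training range. That is expected, and it is the honest picture of what the model would face on unseen data.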

Conclusions:

In my research I tried to understand stock price movements based on technical indicators (Returns, MACD and Signal, rolling mean and standard deviation, RSI). The objective was to get to know the main trading parameters in the financial environment and to see how they have evolved during these last months for the main pharma companies with a Covid19 vaccine.

On the other hand, I tried to apply some deep learning models to the stock prices of two of these pharma companies, since such models seem to work really well on time series analysis in other fields.

I’ve learnt the importance of normalizing the data before using LSTM or GRU models (many sources recommend a range between -1 and +1), and also that predictions can be made for one day (many to one) or for many days (many to many). In any case, I would need more knowledge of this topic if my goal were to invest my money based on my analysis and models.

Taking all this into account, and based on RMSE and processing time, I obtained the best results with the GRU architecture compared with LSTM.