Multiple Perspectives: Stock Prediction with Multiple Regression
In the world of finance, nothing happens in a vacuum. A stock’s closing price isn’t just a random number; it’s influenced by the opening price, the daily high, the daily low, and a dozen other factors.
In Chapter 8 of Data Analytics for Finance Using Python, we look at how to juggle these multiple factors using a Multiple Regression Model.
The Formula
At its heart, multiple regression is just an equation:
Y = a + b1X1 + b2X2 + ...
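Where Y is what you’re trying to predict (the closing price), X1, X2, and so on are your independent variables (Open, High, Low prices), a is the intercept, and b1, b2, and so on are the coefficients that say how much Y changes for a one-unit change in each X.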
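To make the equation concrete, here is a minimal sketch of it as a Python function. The function name and the coefficient values are made up for illustration; they are not the coefficients from the book.

```python
def predict_close(open_price, high, low, a, b_open, b_high, b_low):
    """Evaluate Y = a + b1*X1 + b2*X2 + b3*X3 for one trading day."""
    return a + b_open * open_price + b_high * high + b_low * low


# Hypothetical intercept and coefficients, chosen only to show the arithmetic.
example = predict_close(
    open_price=100.0, high=105.0, low=98.0,
    a=1.0, b_open=-0.5, b_high=0.8, b_low=0.7,
)
print(round(example, 2))
```

Fitting a regression model is just the process of finding the a and b values that make this function match the historical closing prices as closely as possible.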
Step 1: The Correlation Matrix
Before you start plugging numbers into a formula, you need to know if your variables actually like each other. The authors used a correlation matrix to see how closely the Open, High, and Low prices related to the Close price.
Here’s the thing: they found a positive correlation of over 0.98. In stats-speak, that’s very close to 1.0, the value for a perfect positive linear relationship. It means the Open, High, and Low prices are excellent candidates for predicting the Close price with a regression model.
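A correlation matrix like the one described above can be sketched with NumPy. The prices below are synthetic (a random walk, not the MRF data), and the variable names are assumptions for this example, so the printed numbers will not match the book's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily prices (NOT the MRF data): a random-walk close with
# open/high/low derived from it, purely for illustration.
n = 250
close = 1000 + np.cumsum(rng.normal(0, 10, n))
open_ = close + rng.normal(0, 5, n)
high = np.maximum(open_, close) + np.abs(rng.normal(0, 3, n))
low = np.minimum(open_, close) - np.abs(rng.normal(0, 3, n))

# Each row passed to corrcoef is one variable; corr[i, j] is the
# Pearson correlation between variable i and variable j.
names = ["Open", "High", "Low", "Close"]
corr = np.corrcoef(np.vstack([open_, high, low, close]))

for name, row in zip(names, corr):
    print(f"{name:>5}", np.round(row, 3))
```

The last row (or column) is the one to read: it shows how strongly each of Open, High, and Low moves with Close.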
Step 2: The Results (And they’re good)
The authors ran this model on MRF stock data from 2023 to 2024, and the results were pretty stellar:
- R-Square: 0.99. This is the big one. It means that about 99% of the variance in the closing price is explained by the Open, High, and Low prices.
- P-Values: All independent variables had p-values less than 0.05, meaning they are all statistically significant.
- Durbin-Watson: 1.80. This statistic checks the model’s residuals for autocorrelation. It ranges from 0 to 4, and a value of 2.0 indicates no autocorrelation, so 1.80 is a very solid “passing grade.”
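For readers who want to see where R-Square and Durbin-Watson come from, here is a rough sketch that fits the model by ordinary least squares with NumPy and computes both by hand. It uses synthetic prices rather than the MRF data, so its output will differ from the figures above. P-values are left out because they need a t-distribution; a statistics package such as statsmodels reports them in its OLS summary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic daily prices (NOT the MRF data), for illustration only.
n = 250
close = 1000 + np.cumsum(rng.normal(0, 10, n))
open_ = close + rng.normal(0, 5, n)
high = np.maximum(open_, close) + np.abs(rng.normal(0, 3, n))
low = np.minimum(open_, close) - np.abs(rng.normal(0, 3, n))

# Design matrix with a leading column of ones for the intercept a.
X = np.column_stack([np.ones(n), open_, high, low])
coef, *_ = np.linalg.lstsq(X, close, rcond=None)

fitted = X @ coef
resid = close - fitted

# R-Square: share of the variance in Close explained by the model.
ss_res = np.sum(resid ** 2)
ss_tot = np.sum((close - close.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Durbin-Watson: ranges from 0 to 4; values near 2 suggest no
# first-order autocorrelation in the residuals.
durbin_watson = np.sum(np.diff(resid) ** 2) / ss_res

print("coefficients (a, b_open, b_high, b_low):", np.round(coef, 3))
print("R-Square:", round(float(r_squared), 4))
print("Durbin-Watson:", round(float(durbin_watson), 2))
```

Computing the statistics by hand like this is mainly useful for understanding them; in practice a regression library produces all of them in one summary table.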
Why it matters
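A 99% R-square value tells you that this model fits this specific dataset extremely closely. It gives you a mathematical description of how the closing price moves with the Open, High, and Low prices, though a close fit on its own doesn’t prove cause and effect.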
But here’s the problem: multiple regression assumes a linear relationship. It assumes that if X goes up by one unit, Y changes by a fixed, predictable amount. The stock market isn’t always that polite. Sometimes it’s chaotic, non-linear, and doesn’t care about your R-square value.
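A toy example shows how badly a straight line can miss a non-linear pattern. The data here is deliberately artificial (y is exactly x squared, not stock prices): y is completely determined by x, yet a linear fit explains essentially none of it.

```python
import numpy as np

# Toy data, not stock prices: y depends on x perfectly, but not linearly.
x = np.linspace(-1.0, 1.0, 101)
y = x ** 2

# Fit the straight line y = a + b*x by ordinary least squares.
X = np.column_stack([np.ones_like(x), x])
(a, b), *_ = np.linalg.lstsq(X, y, rcond=None)

resid = y - (a + b * x)
r_squared = 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

print(f"slope={b:.3f}, R-square={r_squared:.3f}")
```

The slope and R-square both come out at essentially zero, even though x fully determines y. A low R-square can mean "no relationship" or "a relationship the line can't see," which is why plotting the residuals is worth the effort.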
And that’s why it matters. Multiple regression is a powerhouse for understanding relationships, but it’s only one piece of the puzzle.