The Branching Path: Investment Management with Decision Trees

We’ve talked about random forests, but sometimes it’s better to look at the individual trees. In Chapter 6 of Data Analytics for Finance Using Python, we dive into the Decision Tree Classifier.

If you’ve ever used a flowchart to make a decision, you already know how a decision tree works. It’s a diagram that shows every possible outcome of a series of related choices. In finance, it’s a way to map out exactly why a model thinks you should buy or sell a stock.

Root Nodes and Leaf Nodes

The book breaks down the anatomy of a tree:

  • Root Node: This is the “mother” node where everything starts. For the authors’ MRF stock model, the first split was based on std_5 (the 5-day standard deviation).
  • Decision Nodes: These are the internal nodes where the data gets split again based on other features like Open-Close or High-Low prices.
  • Leaf Nodes: These are the end of the line, where the tree stops splitting and makes its final call (Buy or Sell). In the book’s tree, each leaf has a Gini impurity of zero, meaning it’s perfectly “pure”: every training sample that lands there belongs to the same class.
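To make the “Gini value” idea concrete, here is a minimal sketch in plain Python. It computes Gini impurity and searches for the single threshold that best separates a handful of labels. The numbers are hypothetical toy data, not the book’s MRF dataset, and `best_threshold` is an illustrative helper rather than anything from the book or a library.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_threshold(values, labels):
    """Try midpoints between sorted unique values.

    Returns (threshold, weighted_gini) for the split with the lowest
    weighted impurity, or (None, parent_gini) if no split helps.
    """
    uniq = sorted(set(values))
    best = (None, gini(labels))
    for lo, hi in zip(uniq, uniq[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical toy data (assumption): six days of 5-day volatility and a label.
std_5 = [0.5, 0.7, 0.9, 1.4, 1.6, 1.8]
signal = ["Buy", "Buy", "Buy", "Sell", "Sell", "Sell"]

root_impurity = gini(signal)          # mixed node: half Buy, half Sell
threshold, impurity = best_threshold(std_5, signal)

print("root impurity:", root_impurity)
print("best split on std_5 at", threshold, "-> weighted impurity", impurity)
```

In this toy case a single split leaves two pure groups, so both children would be leaf nodes with impurity zero. Real price data is rarely that tidy, which is why real trees need many more splits.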

The Reality of the Tree

While decision trees are amazing for visualizing why a decision is made, they can be a bit finicky. Here is how the authors’ tree scored:

  • Accuracy: 42.85% (Wait, what?)
  • Precision (Buy): 48%
  • Precision (Sell): 40%
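If you want to see what those metrics measure, here is a small sketch in plain Python. The labels below are made up for illustration and do not reproduce the book’s results. Accuracy is the share of all predictions that were right, while precision for a class is the share of predictions of that class that were right.

```python
def accuracy(actual, predicted):
    """Share of predictions that match the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

def precision(actual, predicted, label):
    """Of all the times the model said `label`, how often was it right?"""
    actual_when_predicted = [a for a, p in zip(actual, predicted) if p == label]
    if not actual_when_predicted:
        return 0.0
    hits = sum(a == label for a in actual_when_predicted)
    return hits / len(actual_when_predicted)

# Hypothetical labels (assumption): eight trading days, not the book's test set.
actual    = ["Buy", "Sell", "Buy", "Sell", "Buy", "Sell", "Buy", "Sell"]
predicted = ["Buy", "Buy", "Sell", "Sell", "Buy", "Buy", "Sell", "Buy"]

acc = accuracy(actual, predicted)
prec_buy = precision(actual, predicted, "Buy")
prec_sell = precision(actual, predicted, "Sell")

print(f"accuracy: {acc:.2%}")
print(f"precision (Buy): {prec_buy:.2%}")
print(f"precision (Sell): {prec_sell:.2%}")
```

Note that precision is computed per class, which is why the book reports one figure for Buy and another for Sell.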

Why so low?

You might notice these numbers are lower than the Random Forest or the Naive Bayes model. Here’s the thing: individual decision trees are prone to “overfitting.” Left to grow unchecked, a tree keeps splitting until it has memorized the training data, noise included. It gets so caught up in those tiny details that it fails to see the bigger picture when it meets new data.
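Overfitting is easy to see for yourself. The sketch below assumes scikit-learn is installed and uses synthetic, deliberately noisy data as a stand-in for price features; it is not the book’s dataset or the authors’ code. It compares a tree grown without limits against one capped with `max_depth=3`.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (assumption): flip_y=0.3 randomly reassigns about 30% of labels,
# so there is a lot of noise for the tree to memorize.
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=3, flip_y=0.3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# A tree with no limits keeps splitting until every leaf is pure.
deep = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
# A capped tree is forced to stop early.
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

deep_train = deep.score(X_train, y_train)
deep_test = deep.score(X_test, y_test)
shallow_train = shallow.score(X_train, y_train)
shallow_test = shallow.score(X_test, y_test)

print(f"deep tree    (depth {deep.get_depth()}): train {deep_train:.2f}, test {deep_test:.2f}")
print(f"shallow tree (depth {shallow.get_depth()}): train {shallow_train:.2f}, test {shallow_test:.2f}")
```

The gap between the deep tree’s training and test scores is the overfitting. Whether the capped tree does better on the test split depends on the data, so treat `max_depth` as a knob to experiment with, not a guaranteed fix.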

The authors also found that “Sell” was the most common leaf class, with 9 leaf nodes against 7 for “Buy”.
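A leaf tally like that can be pulled out of a fitted scikit-learn tree. The following is a rough sketch on synthetic data, so the counts it prints will not match the book’s 9 and 7. It relies on the fitted estimator’s `tree_` attribute, where leaf nodes are marked by a child index of -1.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic data (assumption), relabelled as Buy/Sell for readability.
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=2, n_redundant=0,
    flip_y=0.1, random_state=1,
)
labels = np.where(y == 1, "Buy", "Sell")

clf = DecisionTreeClassifier(random_state=1).fit(X, labels)
tree = clf.tree_

# Leaves are the nodes with no children.
is_leaf = tree.children_left == -1

# tree.value holds per-node class weights with shape (nodes, outputs, classes);
# the largest entry in each leaf row is that leaf's predicted class.
leaf_values = tree.value[is_leaf][:, 0, :]
leaf_classes = clf.classes_[leaf_values.argmax(axis=1)]

names, counts = np.unique(leaf_classes, return_counts=True)
leaf_counts = dict(zip(names.tolist(), counts.tolist()))
leaf_impurities = tree.impurity[is_leaf]

print("leaves per class:", leaf_counts)
print("max leaf impurity:", leaf_impurities.max())
```

Because this tree is grown without limits, every leaf ends up with a Gini impurity of zero, which matches the “pure” leaves described earlier.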

Why it still matters

Even if the accuracy isn’t hitting 90%, decision trees are invaluable for transparency. In a world of “black box” AI, a decision tree tells you exactly which variable (like that 5-day standard deviation) was the most important factor in the final call.
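Here is a hedged sketch of what that transparency looks like in practice with scikit-learn. The feature names echo the ones mentioned above, but the data is synthetic, so which feature lands at the root here says nothing about the book’s MRF model.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Names borrowed from the article for readability; the values are synthetic (assumption).
feature_names = ["std_5", "open_close", "high_low"]
X, y = make_classification(
    n_samples=300, n_features=3, n_informative=2, n_redundant=0, random_state=2
)

clf = DecisionTreeClassifier(max_depth=3, random_state=2).fit(X, y)

# Importance scores sum to 1; higher means the feature reduced impurity more.
importances = dict(zip(feature_names, clf.feature_importances_))

# Node 0 is the root, so this is the feature used for the very first split.
root_feature = feature_names[clf.tree_.feature[0]]

# A plain-text version of the flowchart, one line per node.
rules = export_text(clf, feature_names=feature_names)

print("root split on:", root_feature)
for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
print(rules)
```

The printed rules read like nested if/else statements, which is the flowchart view that makes a single tree so easy to audit.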

Sometimes understanding how a model thinks is just as important as the final answer it gives you.

