Data Science Foundations Chapter 15: Real Companies Using Data Science Right Now

Theory is nice. But does any of it work in the real world? Chapter 15 of Data Science Foundations by Stephen Mariadas and Ian Huke answers that with five case studies. Real people, real problems. One of them technically failed. And that is part of the point.

Innovation Factory: Listening to Traffic

Anwar got his start on a reality TV show. Not Love Island. Think Dragons' Den or Shark Tank, but at an earlier stage, when inventors are still building their products. His first business used sound detection to alert hearing-impaired people to emergencies. Then he realized sound pattern recognition had way more applications.

He founded Innovation Factory in Birmingham and built Traffic Ear. Small boxes on lamp posts with acoustic sensors, cameras, and pollution detectors. Identify vehicles by sound, estimate pollution from that.

Here’s the thing. Collecting training data was brutal. Nobody wanted to sit by a road for nine months labeling cars. So they automated it. Cameras recorded vehicle details, and those records labeled the sounds. That is how the classification model got built.
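The chapter stays at the story level, but the trick is easy to sketch. Here is a toy version of the camera-labels-the-audio idea, assuming each camera record and audio clip carries a timestamp. All the names and the one-second tolerance are mine for illustration, not Innovation Factory's actual pipeline:

```python
from dataclasses import dataclass

@dataclass
class CameraRecord:
    timestamp: float   # seconds since the recording started
    vehicle_type: str  # e.g. "car", "bus", "lorry"

@dataclass
class AudioClip:
    timestamp: float
    features: list     # e.g. a spectrogram summary

def auto_label(clips, records, tolerance=1.0):
    """Label each audio clip with the vehicle type whose camera
    record is closest in time, within a tolerance window."""
    labelled = []
    for clip in clips:
        nearest = min(records, key=lambda r: abs(r.timestamp - clip.timestamp))
        if abs(nearest.timestamp - clip.timestamp) <= tolerance:
            labelled.append((clip.features, nearest.vehicle_type))
    return labelled
```

Nine months of sitting by a road, replaced by a timestamp join. The labels are only as good as the camera, but that is a much easier system to trust than a bored human.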

But the model only works within the limits of its data. They collected on 40 mph roads, so you cannot use it to predict pollution from a bus doing 60. The data defines what the model can do.

The cool part? Same tech went to railways. Keeping animals off tracks. Deer in the UK, kangaroos in Australia, elephants in India. The system detects an animal, feeds video to a generative AI, identifies species and behavior, then triggers one of 20,000 scare responses. The animal’s reaction gets stored to improve the model.

Smart Container Co: When Failure Is the Result

Steve’s job sounds fun. Open beer kegs, take measurements. But he is not actually into beer, so the glamour was limited.

Smart Container Co monitors containers during shipping. Clients worried about drinks becoming over- or under-carbonated in sealed kegs. Bad carbonation means quality problems or explosion risk. So Steve tested whether ultrasonic readings could measure carbonation in a pressurized keg.

He searched the literature. Found almost nothing. Collected data himself. Tried linear regression, then multivariate and polynomial models. Nothing worked. After a year, he had found no relationship.

Most people would call that a failure. Steve does not. They proved the hypothesis wrong, and that is a valid outcome. They now know how not to measure carbonation.

This is my favorite case study. Not every project finds a signal. Data science is not about always getting a positive result. It is about getting an honest one.

Cognitive Business: Predicting Wind Turbine Failures

Ty calls himself a physicist who does many things, including data science. His company works with wind farms.

In a single farm, every turbine is basically the same machine. Learn something about one, apply it to all. Ty saw the opportunity for predictive maintenance. Figure out when something will break before it does.

First problem: finding data. They found an open dataset from a small farm, built a prototype, showed it to an energy company. That company gave access to real data.

Here is the smart part. They eyeball the data first. Ty’s physics background helps because industrial machines follow physical laws, and those laws connect the data points. Then they apply machine learning.

But reliable machinery means few failure examples to train on. So they modeled “normal” first, then flagged anything that deviated. Anomalies guided engineering investigations. Results fed back into models. They built millions of models because each turbine is affected differently by wind, humidity, and position. All automated.
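The book never shows Cognitive Business's actual models, but the model-normal-first idea fits in a few lines. Assume we have a sensor reading, say bearing temperature, from a healthy turbine; the numbers and the three-sigma threshold here are my invention:

```python
import statistics

def fit_normal(readings):
    """Learn what 'normal' looks like from healthy-turbine data."""
    return statistics.mean(readings), statistics.stdev(readings)

def flag_anomalies(readings, mean, std, threshold=3.0):
    """Flag readings more than `threshold` standard deviations from
    normal -- candidates for an engineering investigation."""
    return [x for x in readings if abs(x - mean) / std > threshold]
```

Anything the model of normal cannot explain becomes a lead for the engineers, and their verdict feeds back into the next round of training. No failure examples required.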

Good With: Financial Inclusion Through Data

Ellie’s team at Good With wants to change how people get assessed for credit. Financial inclusion for people traditional banks ignore or underserve.

Two products: an alternative credit score based on financial capability, and a personalized learning system for financial literacy.

They started with qualitative research and found four financial personas for young people. Then validated with data from about 90 people. Used NLP to classify spending from transaction descriptions. When they fed it into an unsupervised clustering model, it found four groups matching the original personas. Qualitative gut feeling confirmed by quantitative evidence.
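That validation step is the kind of thing scikit-learn makes almost embarrassingly short. A sketch of the idea, with completely made-up spending-share features standing in for Good With's real transaction data:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical features per person: share of spending on
# essentials, discretionary purchases, and savings.
X = np.array([
    [0.80, 0.10, 0.10], [0.75, 0.15, 0.10],   # careful pair
    [0.30, 0.60, 0.10], [0.35, 0.55, 0.10],   # spender pair
    [0.40, 0.20, 0.40], [0.45, 0.15, 0.40],   # saver pair
    [0.50, 0.45, 0.05], [0.55, 0.40, 0.05],   # balanced pair
])

# Unsupervised clustering: the model is told how many groups to
# find, but nothing about who belongs where.
model = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
labels = model.labels_
```

The interesting part is not the code. It is that the cluster count and membership lined up with personas derived from interviews. When two independent methods agree, you can start to trust the result.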

What I respect here is the ethical thinking. They could predict depression from financial data. But they asked: does that help users or feel intrusive? Is it legal given health data regulations? They decided not to go there. Knowing which questions to avoid matters just as much.

SMARTabg: Trading Signals and Data Quality

Dev wants to be a winemaker in Surrey. But first, he built a financial trading software company.

SMARTabg provides trading signals to smaller organizations that cannot afford to process financial data themselves. Many signals are statistical models. Moving averages, trend detection. Combine several and you get a strategic signal. Five out of five indicators pointing up means a higher-probability trade than three out of five.
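The book keeps this at the concept level, but the vote-counting is simple to sketch. Here the indicators are just moving averages of different lengths; the window sizes and the voting rule are mine, not SMARTabg's:

```python
def moving_average(prices, window):
    """Simple moving average of the last `window` prices."""
    return sum(prices[-window:]) / window

def combined_signal(prices, windows=(5, 10, 20)):
    """Toy strategic signal: each moving average 'votes' up if the
    latest price sits above it. More agreeing votes, stronger signal."""
    votes = sum(1 for w in windows if prices[-1] > moving_average(prices, w))
    return votes / len(windows)  # fraction of indicators pointing up
```

A steadily rising price beats all three averages and scores 1.0; a falling one scores 0.0; anything in between is the three-out-of-five territory Dev describes.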

Biggest challenge: data quality. Financial data comes from many sources with different quality and cost. Dev sampled providers, checked for outliers, tracked standard deviations. Built about 200 models using relational databases and NoSQL for unstructured data like central bank statements. When the Bank of England talks, markets move. NLP helps interpret that.
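A provider sanity check in the spirit of what Dev describes might look like this: quote the same instrument from several sources and flag anyone who strays too far from the consensus. The provider names and the 0.5 percent threshold are invented for the sketch:

```python
import statistics

def flag_suspect_quotes(quotes_by_provider, max_dev_pct=0.5):
    """Flag providers whose quote for the same instrument strays from
    the cross-provider median by more than `max_dev_pct` percent."""
    median = statistics.median(quotes_by_provider.values())
    return {
        name: price for name, price in quotes_by_provider.items()
        if abs(price - median) / median * 100 > max_dev_pct
    }
```

Cheap feeds that keep landing in the flagged set are not actually cheap. That is the whole data-quality argument in one dictionary comprehension.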

Dev’s best advice? Sometimes having no domain knowledge is an advantage. You spot patterns without preconceptions.

What Ties These Together

Five different companies. Traffic, beer kegs, wind turbines, credit scoring, trading. But the themes repeat. Data quality matters. Domain knowledge matters. Failure is a valid result. And models are only as good as the data you feed them.

Previous: Chapter 14: Machine Learning and AI Next: Chapter 16: Conclusion
