The Fundamentals of Predictive Analytics – Data Science Wednesday

Data Science Wednesday is produced by Decisive Data, a data analytics consultancy. Learn more about us using the following links. Also, the video transcription is included below.

Video Transcription:
What Is Predictive Analytics?
Hello, and welcome back to Data Science Wednesday. My name is Tessa Jones, and I’m a data scientist with Decisive Data. And today we’re gonna talk about predictive analytics and what it can do for you. Predictive analytics fits into the spectrum of analytics that we’ve talked about before. Starting with descriptive, which is the most basic of the analytics, it’s basically just cleaning, relating, summarizing, and visualizing your data, really getting to the questions about what’s happening in my business. And then there’s diagnostic, which is really getting down to why things are happening. What’s causing my revenue to decline or to increase? How are things related? Things like that. So if you’ve got a good base in both of these, then we’re ready to move into predictive analytics, which is gonna dive into what’s gonna happen in the future, which is super powerful. If you’re a business person and you want to be able to make good business decisions, if you have at least an idea of what might happen in the future, your decisions are already gonna be a little bit better.

So, let’s dive in. Let’s go with an example because that just makes it easier to kind of flow through what’s actually happening here. So let’s pretend that we are grocery store owners. And if we’re already talking about predictive analytics, you should have a pretty good grasp on descriptive and diagnostic analytics. So, you probably already have a decent dashboard that really tells you what’s happening in your business right now. So, something like this where you have, you know, something here that tells you revenue by different departments like foods and pastry, or how your sales change by product over time, things like that. So you have an idea of what’s happening in your business, but now you really wanna know, what’s gonna happen in my business? So one really common question is, how many of a given product am I gonna sell at every store? Because this can really answer questions around how you’re gonna support supply chain processes, or how you’re gonna manage the profits that you’re going to have. Things like that.

So the first thing we need to do is talk about what happened in the past. We really can’t do anything or predict very easily unless we know or at least have an idea of what’s happened in the past. So here we have three lines in black that represent, basically, historical data. Each line here is one year’s worth of sales for a given product. And then the green line here is the current year. And here’s today. And if we build a predictive model, it’s gonna tell us what’s gonna happen for the rest of the year. So if this is all set up and we build a model, basically, we mix this information with all the data that’s really clean and well-organized, we mash it together with a bunch of mathematics and coding, and basically, we pop out some results and it shows up in a visual like this, where these are the sales that we have had and these are the sales that we think we’re going to have. So a business person can look at this chart and say, “Wow, we need to send a lot more product to this store because I see sales are gonna increase.” Or, “Our profit margins are gonna be way higher than we thought so we can start a new program.” Things like that. You can really start to get innovative with your business decisions.
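
To make the "historical years plus the current year so far" idea concrete, here is a minimal sketch in Python. It is not the model described in the video, which is not specified; it assumes a very simple approach where we average the prior years month by month and scale that average by how the current year is tracking so far. The function name `forecast_rest_of_year` and the monthly granularity are assumptions made for illustration.

```python
def forecast_rest_of_year(history, current_ytd):
    """Forecast the remaining months of the current year.

    history: list of prior years, each a list of monthly sales (same length).
    current_ytd: monthly sales observed so far this year.
    Returns a list with one forecast value per remaining month.
    """
    n_months = len(history[0])
    months_done = len(current_ytd)

    # Average sales for each month across the historical years.
    monthly_avg = [
        sum(year[m] for year in history) / len(history)
        for m in range(n_months)
    ]

    # How is this year tracking against the historical average so far?
    baseline_so_far = sum(monthly_avg[:months_done])
    scale = sum(current_ytd) / baseline_so_far if baseline_so_far else 1.0

    # Apply that ratio to the historical average for the months still to come.
    return [monthly_avg[m] * scale for m in range(months_done, n_months)]


# Illustrative numbers only: three flat historical years, six months observed.
history = [[100] * 12, [110] * 12, [120] * 12]
remaining = forecast_rest_of_year(history, [121] * 6)
print(remaining)  # six forecast values, one per remaining month
```

A real model would typically account for trend, promotions, holidays, and store-level differences; this sketch only shows how past years and year-to-date sales can be combined into a rest-of-year estimate.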

So, let’s pretend we’ve built this model and it’s been running for a year. And now we wanna know, how well is this model actually performing? So down here, we have a chart that shows, in black, what we actually sold, and then in green, what we thought we were going to sell. And we see that there’s a couple of pretty big misses. Right here, we sold way more than we thought we would, which leaves us at risk of, you know, running out of inventory. Or, here, we predicted we would sell way more than we did. So both of these are kind of misses. And so we need to go back and look at the data and understand what assumptions we applied that were maybe a little bit wrong, or applied incorrectly, or look at the data, maybe we weren’t accounting for something, and we kind of reorganize that and incorporate it into the model. And then we redeploy it, and then we have a better model.
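
The comparison of actual versus predicted sales can be sketched in a few lines of Python. This is one possible way to find the "big misses", assuming we measure each period's error as a fraction of actual sales and flag anything beyond a chosen cutoff. The function name `flag_big_misses` and the 25% threshold are hypothetical choices, not something stated in the video.

```python
def flag_big_misses(actual, predicted, threshold=0.25):
    """Return the periods where the forecast was off by more than `threshold`.

    actual, predicted: sales per period, in the same order.
    threshold: relative error (as a fraction of actual sales) that counts as a miss.
    Each result is (period_index, kind, relative_error).
    """
    misses = []
    for period, (a, p) in enumerate(zip(actual, predicted)):
        if a == 0:
            continue  # relative error is undefined when nothing was sold
        error = (a - p) / a
        if abs(error) > threshold:
            # Positive error: we sold more than predicted (risk of running out of stock).
            # Negative error: we predicted more than we sold (risk of excess stock).
            kind = "under-forecast" if error > 0 else "over-forecast"
            misses.append((period, kind, round(error, 3)))
    return misses


# Illustrative numbers only.
actual = [100, 200, 100]
predicted = [95, 120, 150]
print(flag_big_misses(actual, predicted))
# [(1, 'under-forecast', 0.4), (2, 'over-forecast', -0.5)]
```

The flagged periods are the ones worth going back to when revisiting the data and the assumptions behind the model.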