Over the last decade, organizations across industries have come to rely on the 2.5 quintillion bytes of data humans generate daily to understand their consumers better, identify patterns in behavior, and make more effective and strategic decisions.
As the technology used to collect and analyze data has continued to advance, these organizations have evolved their data-related practices with it. Data analysts can now achieve a level of insight that extends beyond a description of past behavior and instead use data strategically to look ahead at future possibilities.
“Data analytics today is allowing us for the first time to take the massive amount of data we’ve been assembling for years and use it for predictive purposes rather than in just descriptive ways,” says Thomas Goulding, professor for the Master of Professional Studies in Analytics program within Northeastern’s College of Professional Studies. “Through the use of mathematical modeling and data analytics, data can now tell me something I wouldn’t otherwise have been able to learn. As a result, because of analytics, we can make informed business decisions today that simply were not possible ten years ago.”
Known as predictive analytics, this new application of data analysis has successfully served an array of vital industry needs. Read on to explore what predictive analytics entails, examples of its many uses across sectors, and the skills you need to succeed in this ever-changing field.
What is Predictive Analytics?
Predictive analytics uses mathematical modeling tools to generate predictions about an unknown fact, characteristic, or event. “It’s about taking the data that you know exists and building a mathematical model from that data to help you make predictions about somebody [or something] not yet in that data set,” Goulding explains.
An analyst’s role in predictive analysis is to assemble and organize the data, identify which type of mathematical model applies to the case at hand, and then draw the necessary conclusions from the results. They are often also tasked with communicating those conclusions to stakeholders effectively and engagingly.
Types of Predictive Models
While data analysts are required to make decisions regarding which mathematical model to use in a given situation, they are not actually the ones crunching the data. Statisticians and programmers develop computer programs that carry out these processes, each of which operates using a different mathematical model.
“The tools we’re using for predictive analytics now have improved and become much more sophisticated,” Goulding says, explaining that these advanced models have allowed us to “handle massive amounts of data in ways we couldn’t before.”
The advancement of these tools has also resulted in the use of predictive analytics to identify “unknowns” that previously could not be addressed, leading to an overall need for analysts who can succinctly identify which model best aligns with the type of unknown in each scenario.
Below, we explore five common predictive models and the types of questions they can be best used to answer.
1. Linear Regression
Linear regression is one of the most famous and historic modeling tools, according to Goulding. This model considers all the known data points on a graph and fits a straight line through them. The line is chosen to be the best overall fit: in the most common version, ordinary least squares, it minimizes the sum of the squared vertical distances between the line and each point. A linear regression modeling tool can then base predictions about data it has not yet seen on that line.
A linear regression model would be useful when a doctor wants to predict a new patient’s cholesterol based only on their body mass index (BMI). In this example, the analyst would know to put the data the doctor gathered from 5,000 other patients, including each patient’s BMI and cholesterol level, into the linear regression model. The goal is to predict an unknown value from an existing set of quantifiable data.
The linear regression model would take the data, plot it onto a graph, and fit the line that sits closest to the plotted data points overall. In this scenario, when that new patient arrives knowing only that their BMI is 31, a data analyst will be able to predict the patient’s cholesterol by reading off the cholesterol level the line gives at a BMI of 31.
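The BMI example can be sketched in a few lines of Python. This is a minimal illustration, not a clinical tool: the six BMI and cholesterol pairs are made up to stand in for the doctor’s 5,000 patient records, and the function name `fit_line` is our own. It fits the line by ordinary least squares and then reads off the prediction for a BMI of 31.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up (BMI, cholesterol in mg/dL) records, for illustration only.
bmis = [22.0, 25.0, 27.0, 30.0, 33.0, 36.0]
cholesterol = [170.0, 182.0, 190.0, 205.0, 214.0, 229.0]

slope, intercept = fit_line(bmis, cholesterol)

# Predict the unknown value for a new patient whose BMI is 31.
predicted = intercept + slope * 31.0
print(f"Predicted cholesterol at BMI 31: {predicted:.1f} mg/dL")
```

In practice an analyst would more likely call an existing statistics package than write the fit by hand, but the arithmetic underneath is the same.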
2. Text Mining
Whereas linear regression uses only numeric data, mathematical models can also be used to make predictions about non-numerical factors. Text mining is a perfect example.
“Text mining is part of predictive analytics in the sense that analytics is all about finding the information I previously knew nothing about,” Goulding says. In this scenario, the tool takes data points in the form of text-based words or phrases and searches a giant database for those specific points.
Sound Familiar? The algorithm used by Google or other search engines to bring up relevant links when you search for a specific keyword is an example of text mining.
Although tools like search engines—or even the “find” function you may use when searching for a word in a digital body of text—represent some common examples of text mining, there are also industry-specific instances where this type of predictive analytics comes into play.
Goulding describes another medical application of predictive analytics, explaining how doctors rely on text mining when analyzing patient symptoms and trying to determine the root cause. “If I’m a doctor and I have 50 children in front of me with flu symptoms, my brain can figure out that the next patient to walk in the door [with similar symptoms] also has the flu,” he says. “But if I see an unusual set of symptoms from just one patient, I may need the case history of patients from all over the world to make a correct diagnosis. My brain can’t help me do this; analytics, however, can.”
Especially in complex patient cases, an analyst can use text mining modeling tools to comb databases, locate similar symptoms among patients of the past, and generate a prediction as to what this new patient is “most likely” suffering from based on that data.
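To make the idea concrete, here is a small Python sketch of one simple text mining approach: turning each case note into word counts and ranking past cases by cosine similarity to the new patient’s symptoms. The case notes, the diagnoses, and the function names are all invented for illustration, and real clinical systems are far more sophisticated than this.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_case(query, cases):
    """Return (score, diagnosis) for the past case closest to the query."""
    query_counts = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_counts, Counter(tokenize(notes))), diagnosis)
        for notes, diagnosis in cases
    ]
    return max(scored)


# Invented past case notes paired with their recorded diagnoses.
past_cases = [
    ("fever cough sore throat body aches", "influenza"),
    ("itchy rash blisters mild fever", "chickenpox"),
    ("sneezing runny nose itchy eyes", "seasonal allergies"),
]

score, diagnosis = most_similar_case(
    "child with fever, cough and body aches", past_cases
)
print(f"Most similar past case: {diagnosis} (similarity {score:.2f})")
```

The output is a “most likely” match rather than a diagnosis, which mirrors how the article describes the analyst’s prediction.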
3. Optimal Estimation
Optimal estimation is a modeling technique that is used to make predictions based on observed factors. This model has been used in analytics for over 50 years and has laid the groundwork for many of the other predictive tools used today. According to Goulding, past applications of this method include determining “how to best recalibrate equipment on a manufacturing floor…[and] estimating where a bullet might go when shot,” as well as other uses in the defense industry.
If two planes were flying toward one another, an analyst might use the optimal estimation model to predict if or when they will collide. To do this, the analyst would put a variety of observed factors into the mathematical modeling tool, including the airplanes’ altitude, speed, angle, and more. The mathematical model would then be able to help predict at which point, if any, the planes would meet.
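The airplane scenario can be sketched in Python under some strong simplifying assumptions. Production systems typically use more elaborate estimators such as Kalman filters; this sketch swaps in plain least squares, assumes both planes fly in straight lines at constant speed on a flat two-dimensional map, and ignores altitude. The observations and function names are hypothetical.

```python
import math


def fit_track(times, positions):
    """Least-squares fit of position = start + velocity * time on one axis."""
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(positions) / n
    stt = sum((t - mean_t) ** 2 for t in times)
    stp = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, positions))
    velocity = stp / stt
    start = mean_p - velocity * mean_t
    return start, velocity


def closest_approach(times, track_a, track_b):
    """Estimate when two planes are closest and how far apart they are then.

    track_a and track_b are lists of observed (x, y) positions at `times`.
    Returns (time_of_closest_approach, distance_at_that_time).
    """
    rel_start, rel_vel = [], []
    for axis in range(2):
        start_a, vel_a = fit_track(times, [p[axis] for p in track_a])
        start_b, vel_b = fit_track(times, [p[axis] for p in track_b])
        rel_start.append(start_b - start_a)
        rel_vel.append(vel_b - vel_a)

    speed_sq = sum(v * v for v in rel_vel)
    if speed_sq == 0:
        t_star = 0.0  # identical velocities: the separation never changes
    else:
        # Minimize |rel_start + rel_vel * t|; clamp so we only look forward.
        dot = sum(s * v for s, v in zip(rel_start, rel_vel))
        t_star = max(0.0, -dot / speed_sq)

    distance = math.hypot(*(s + v * t_star for s, v in zip(rel_start, rel_vel)))
    return t_star, distance


# Hypothetical noisy position readings (km) taken one minute apart.
times = [0.0, 1.0, 2.0, 3.0]
plane_a = [(0.0, 0.1), (10.2, -0.1), (19.9, 0.0), (30.1, 0.1)]
plane_b = [(100.1, 5.0), (89.8, 5.1), (80.0, 4.9), (70.2, 5.0)]

t_star, distance = closest_approach(times, plane_a, plane_b)
print(f"Closest approach after {t_star:.1f} min at about {distance:.1f} km")
```

Fitting the track first is what makes this an estimation problem: the individual readings are noisy, so the prediction comes from the smoothed path rather than from any single observation.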
4. Clustering Models
Clustering models are focused on finding different groups with similar qualities or elements within the data. Many mathematical modeling tools fall within this category, including:
- Hierarchical Clustering
- Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
- Gaussian Mixture Models
If a fast-food restaurant wanted to open a location in a new city, the corporate team might work with a data analyst to figure out exactly where it should go. The analyst would start by gathering an array of specific, relevant data about each candidate location (including factors like demographics, where the high-end houses are, and how close the location is to a college), then input all of that data into a clustering model. The model would group areas with similar characteristics, helping the team identify the most strategic location in the city for that restaurant based on the data alone.
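As a rough sketch of how one of these tools works, the Python below implements a bare-bones version of hierarchical clustering: it starts with every point in its own group and repeatedly merges the two closest groups until a chosen number remain. The map coordinates are invented, the choice of three groups is an assumption, and a real analysis would use many more factors than location.

```python
import math


def agglomerative_clusters(points, k):
    """Single-linkage hierarchical clustering: merge closest groups until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Distance between groups = distance of their closest pair.
                d = min(
                    math.dist(a, b) for a in clusters[i] for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def centroid(cluster):
    """Average position of the points in a cluster."""
    xs, ys = zip(*cluster)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


# Invented map coordinates (km) of where likely customers live.
customers = [
    (1.0, 1.0), (1.5, 2.0), (2.0, 1.2),
    (8.0, 8.0), (8.5, 9.0), (9.0, 8.2),
    (1.0, 9.0), (1.5, 8.5),
]

groups = agglomerative_clusters(customers, k=3)
for group in groups:
    cx, cy = centroid(group)
    print(f"{len(group)} customers centered near ({cx:.1f}, {cy:.1f})")
```

Each group’s center is a candidate area for the new restaurant; choosing among them is still a business decision informed by the other data the analyst gathered.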
5. Neural Networks
Neural networks are complex algorithms inspired by the structure of the human brain. They process historical and current data and identify complex relationships within the data to predict the future, similar to how the human brain can spot trends and patterns.
A typical neural network is composed of artificial neurons, called units, arranged in layers. Input units receive the data. Output units, at the opposite end, produce the network’s response to that data. Between the two are hidden layers, in which units apply mathematical functions to the values passed from the previous layer and hand the results on to the next one.
If an e-commerce retailer wants to accurately predict which products its customers are likely to consider purchasing in the future, a data analyst or data scientist might use neural networks to inform the company’s product recommendation algorithm. The analyst will pull purchase data and feed it to the neural network, giving the network real examples to learn from. This data will travel through the neural network through various mathematical functions until the output is produced and a product recommendation populates.
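The way data travels through the layers can be shown with a tiny Python sketch. To keep it short, the weights below are picked by hand rather than learned from purchase data, which is the part a real network would do during training. The network has two input units, one hidden layer of two units, and one output unit, and it answers a toy question: is exactly one of the two inputs switched on?

```python
def relu(x):
    """Common hidden-layer function: pass positives through, zero out negatives."""
    return max(0.0, x)


def identity(x):
    return x


def dense(inputs, weights, biases, activation):
    """One layer: each unit takes a weighted sum of its inputs, adds a bias,
    and applies the activation function."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + bias)
        for unit_weights, bias in zip(weights, biases)
    ]


# Hand-picked weights (an assumption for illustration, not trained values).
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, -2.0]]
OUTPUT_BIASES = [0.0]


def forward(inputs):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = dense(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES, relu)
    return dense(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES, identity)[0]


for pair in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(pair, "->", forward(pair))
```

A recommendation model follows the same pattern at a much larger scale, with inputs describing a customer’s purchase history and outputs scoring candidate products.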
Other Common Predictive Models
Beyond the mathematical models above, data analysts use several others to make predictions, including:
- Decision trees
- Random forests
- Logistic regression
- Bayesian methods
Why Is Predictive Analytics Important?
While organizations have recognized the importance of gathering data as a means of looking back on industry trends for years, business teams have only just started scratching the surface of possibility when it comes to predictive analytics.
“Analytics is getting exciting in every industry because we’re [more] equipped than ever to…use the data in the back room that has been gathering dust…to make better business decisions,” Goulding says.
From insurance to retail to healthcare, organizations are starting to adapt to this model of informed decision-making and are using it to their advantage:
- Today, insurance companies can predict if a new client is a risk based on their age, history, health conditions, etc. They can weigh this data and make an informed decision about whether or not they want to cover that individual.
- Retail organizations can predict how new brands or items might sell in their local market based on consumer demographics. They can then make strategic decisions about how much product to stock.
- Doctors can use predictive data to help determine not only what ailment someone’s conditions point to but also their chances of survival, whether or not they need immediate surgery, and their condition’s expected decline over a certain period of time.
No matter the industry, the recent advancements in mathematical modeling and the overall lean into data as a predictive form of insight have changed the way businesses operate today. Businesses can make data-driven decisions based on predictive models, allowing them to mitigate potential risks and maximize profits. These changes have created an overall trend in decision-making that is sure to continue developing and expanding for years to come.
Prepare for a Career in Predictive Analytics
Those who aspire to work with predictive analytics should consider a career as a data scientist or data analyst, two roles that play very different parts in the predictive analytics process.
In short, Goulding explains that “data scientists…develop the mathematical models [while] most data analysts use the tools…that have already been developed.” This difference in roles requires a particular background for professionals who want to achieve success in each field.
Those hoping to work on the development of the mathematical models vital to the predictive analytics process, for example, should focus primarily on honing their computer programming, mathematical, and statistical skills. Data analysts, on the other hand, are tasked with developing a working understanding of these data science tools on top of practical skills in data analysis.
Pursuing a Master’s Degree in Analytics
To gain the full breadth of knowledge and practical abilities required to succeed as a data analyst, Goulding recommends professionals pursue a master’s degree in analytics from a top university like Northeastern.
“In [this type of program], students will master the tools and techniques required for data analytics,” he says. “They will gain functional competency in statistics, programming languages like R and Python, and in visualization tools so that they can learn how to present their results in a visual way.”
Apart from the tactical skills needed to organize, input, and draw conclusions from data, this added layer of presentation skills is what he considers the most important area of study for aspiring analysts. Students “have to learn how to present their findings in a way that executives can use to make decisions,” he says. “They need to be able to think in terms of the business problem they’re trying to solve.”
While the use of visualization tools and the practice of effectively presenting data are covered at length in Northeastern’s analytics curriculum, Goulding also recognizes that students will do most of their learning in this area outside of the classroom.
“[Students] need real-world experiences,” he says. “They need to practice taking real-world data and solving business problems from that data.”
Northeastern offers countless experiential learning opportunities during which students pursuing their master’s in analytics can gain this real-world exposure. From co-ops to XN projects, aspiring analysts are given the chance to apply their skills from the classroom within the various industries and organizations that make up Northeastern’s global partnership network.
“Crunching data with no correlation to a business problem that needs solving is not useful,” Goulding says. “To be effective, a data analyst has to understand how data is driving strategic decisions in the workplace [and how] that data is helping executives make those strategic decisions.”
Explore all that Northeastern’s Master of Professional Studies in Analytics has to offer and take your first step toward an exciting career in predictive analytics today.
Editor’s note: This article was originally published in October 2019. It has since been updated to match Northeastern’s current style.