Machine Learning Algorithms – What, Why and How?
Machine learning is the practice of learning patterns from data without explicit programming or hand-written rules. It is a subfield of artificial intelligence (AI) and computer science.
Before machine learning became mainstream, programmers wrote rules based on their knowledge of the domain, on a handful of hand-picked examples, and on the business requirements of the task at hand. This legacy approach to delivering business results suffers from some obvious constraints:
- Hand-written rules are limited to the edge cases a programmer can anticipate. This limit is well explained by one of the most cited papers in psychology, “The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information.” Its finding, commonly referred to as Miller’s Law, describes the limited amount of information the human brain can hold at once, which is why rules become unmanageable as the number of variables and dimensions grows.
- Data is dynamic, and has become more so over the past decade as technology has spread through our daily lives. Models driven by pre-written, static rules quickly go stale and give a company little basis for meaningful action. This is where the pattern-mining ability of machine learning algorithms is put to best use.
Consider a fraud detection example. The programmer might write a rule stating that if the transaction amount is greater than $10,000, the transaction location is X, and the transaction is of a particular type, e.g. a bank transfer, then it is flagged as potentially fraudulent.
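A hand-written rule of this kind can be sketched in a few lines of Python. The field names (`amount`, `location`, `channel`) and the literal values "X" and "bank_transfer" are assumptions made for illustration; they do not come from any real fraud system.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float   # transaction amount in dollars
    location: str   # where the transaction originated
    channel: str    # hypothetical field holding the transaction type


def is_potentially_fraudulent(tx: Transaction) -> bool:
    """Hard-coded rule: all three conditions must hold to flag a transaction."""
    return (
        tx.amount > 10_000
        and tx.location == "X"
        and tx.channel == "bank_transfer"
    )


# A large transfer from location X trips the rule ...
flagged = is_potentially_fraudulent(Transaction(12_500, "X", "bank_transfer"))
# ... but the same transfer kept just under the limit does not.
under_limit = is_potentially_fraudulent(Transaction(9_999, "X", "bank_transfer"))
```

The second call shows the weakness of a fixed threshold: anyone who learns the limit can stay just below it.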
It might work like a charm for a while, until bad actors find a clever new way to commit fraud. At that point the conventional hard-coded rules are no longer effective. As the fraudsters’ methods evolve, so must our fraud detection system.
Also, not every software development process is highly collaborative. What if the developer who wrote the original rules is no longer associated with the fraud detection project, and the new developer tasked with upgrading the logic has no understanding of the previous system and is unsure whether recent changes are backward compatible? To sum up, updating a rules-based system is not only tedious but also does not scale.
This is where machine learning algorithms come to our rescue. If the metrics are well defined and well aligned with the business objective, the model keeps learning from new training data and evolves into a sophisticated machine learning system.
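As a toy illustration of learning a rule from data instead of hand-coding it, the sketch below picks the amount cutoff that best separates labeled examples and then relearns it when new examples arrive. The function `learn_threshold` and all the numbers are made up for this example; a real fraud model would use many features and a proper learning algorithm.

```python
def learn_threshold(amounts, labels):
    """Pick the amount cutoff that misclassifies the fewest labeled examples.

    A transaction is predicted fraudulent when its amount is >= the cutoff.
    """
    best_cutoff, best_errors = None, len(labels) + 1
    for cutoff in sorted(set(amounts)):
        errors = sum((a >= cutoff) != bool(y) for a, y in zip(amounts, labels))
        if errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff


# Made-up history: label 1 marks a confirmed fraud, 0 a legitimate transaction.
amounts = [500, 1200, 3000, 11000, 15000, 20000]
labels = [0, 0, 0, 1, 1, 1]
old_cutoff = learn_threshold(amounts, labels)

# Fraudsters adapt and stay under the old limit; the new cases get labeled.
amounts += [9500, 9800, 9900]
labels += [1, 1, 1]
new_cutoff = learn_threshold(amounts, labels)
```

Nobody edits a rule here: retraining on the newly labeled cases moves the cutoff from 11000 to 9500.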
Now we understand what kind of business problems machine learning algorithms are best suited for and what the broad categories are in terms of the statistical formulation of the given use case.
So, the next step is to identify which particular algorithm is best suited for solving a machine learning problem. No rulebook or guide can give you an instant answer, but we’ll discuss the factors experienced data scientists consider when selecting a set of candidate algorithms.
The scikit-learn project has published a flowchart that is a good starting point for understanding which algorithms suit which type of data and problem.
It’s easy to get overwhelmed by the number of advanced algorithms and to start implementing them one after another, hoping to find the right model by trial and error. But time is limited, so you need to narrow the selection to a small set of candidates, say two or three.
We have reduced our algorithm search space to a small set and will now list the pros and cons, limitations and constraints.
Several factors go into choosing the right champion model:
- Accuracy: Does the model’s performance clear the required threshold?
- Data availability: How much data do you have?
- Resources: How long does the model take to train? Is it computationally expensive?
- Latency: How long does the model take to make predictions in real time?
- Explainability: Can we explain the predictions?
- Robustness to outliers: Is the model sensitive to abnormal observations?
- Nonlinear associations: If the data follows a nonlinear pattern, linear models are ruled out.
- Missing values: Can the algorithm handle missing values?
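One way to apply a checklist like this is to record a few measurements per candidate and filter on the hard requirements. The sketch below is a minimal example under stated assumptions: the `Candidate` fields, the thresholds, and the three models with their numbers are all hypothetical, and only some of the factors above are used as filters.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float        # validation accuracy, between 0 and 1
    train_minutes: float   # time to train on the available data
    latency_ms: float      # time to score one record
    explainable: bool      # can individual predictions be explained?


def shortlist(candidates, min_accuracy, max_train_minutes,
              max_latency_ms, need_explainable):
    """Keep candidates that meet every hard requirement, best accuracy first."""
    kept = [
        c for c in candidates
        if c.accuracy >= min_accuracy
        and c.train_minutes <= max_train_minutes
        and c.latency_ms <= max_latency_ms
        and (c.explainable or not need_explainable)
    ]
    return sorted(kept, key=lambda c: c.accuracy, reverse=True)


# Made-up measurements for three hypothetical candidates.
candidates = [
    Candidate("model_a", 0.91, 2.0, 5.0, True),
    Candidate("model_b", 0.94, 45.0, 40.0, False),
    Candidate("model_c", 0.88, 1.0, 3.0, True),
]

champions = shortlist(candidates, min_accuracy=0.90, max_train_minutes=60.0,
                      max_latency_ms=50.0, need_explainable=True)
```

With these numbers only `model_a` survives: `model_b` is the most accurate but cannot be explained, and `model_c` falls below the accuracy threshold.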
The factors listed above give a quick and easy basis for your selection. I will conclude this section by referring to the “No free lunch” theorem, which states that:
“No model works best for every problem. The assumptions of a good model for one situation may not hold for another problem, so it is common in machine learning to try several models and find the one that works best for a particular case”
To understand this theorem, we must first determine what a model does. A model tries to capture the real-world phenomenon by subjecting it to certain assumptions. These assumptions help simplify the modeling environment and focus only on the relevant details. Therefore, a model that works well for one problem may not work well for another.
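The point can be seen informally with two simple models and two small, made-up datasets. This is a sketch of the statement quoted above, not a proof of the formal theorem: a straight-line fit assumes a linear trend, while a 1-nearest-neighbor model assumes only that nearby inputs have similar outputs. The helper names and data are assumptions chosen for this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_nearest_neighbor(xs, ys):
    """1-nearest-neighbor: predict the target of the closest training point."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def mean_abs_error(model, xs, ys):
    """Average absolute difference between predictions and true targets."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


# Problem 1: a linear trend, tested beyond the range seen in training.
lin_x = list(range(10))
lin_y = [2 * x + 1 for x in lin_x]
lin_test_x = [10, 11, 12]
lin_test_y = [2 * x + 1 for x in lin_test_x]

# Problem 2: a curved relationship, tested between the training points.
cur_x = list(range(-5, 6))
cur_y = [x ** 2 for x in cur_x]
cur_test_x = [-2.4, 0.3, 3.4]
cur_test_y = [x ** 2 for x in cur_test_x]

line_err_linear = mean_abs_error(fit_line(lin_x, lin_y), lin_test_x, lin_test_y)
nn_err_linear = mean_abs_error(fit_nearest_neighbor(lin_x, lin_y), lin_test_x, lin_test_y)
line_err_curved = mean_abs_error(fit_line(cur_x, cur_y), cur_test_x, cur_test_y)
nn_err_curved = mean_abs_error(fit_nearest_neighbor(cur_x, cur_y), cur_test_x, cur_test_y)
```

On the first problem the line has the lower error because its linearity assumption holds and lets it extrapolate. On the second problem the same assumption fails and the nearest-neighbor model has the lower error.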
Now that we understand the essential factors in selecting machine learning algorithms, you can refer to this excellent cheat sheet to understand the world of algorithms.
This article highlighted the importance of machine learning algorithms and explained the model selection criteria that help you find the right algorithm for your business problem. Notably, no single algorithm performs best across all use cases. The article ends by sharing cheat sheets that map data types and business problems to suitable algorithms.
Vidhi Chugh is an award-winning AI/ML innovation leader and AI ethicist. She works at the intersection of data science, product and research to deliver business value and insights. She is a champion of data-centric science and a leading expert in data governance with a vision to create trusted AI solutions.