The important considerations when choosing machine learning algorithms:
Type of problem: It is obvious that algorithms are designed to solve specific problems, so it is important to know what type of problem we are dealing with and what kind of algorithm works best for it. I don’t want to go into much detail, but at a high level, machine learning algorithms can be classified into Supervised, Unsupervised, and Reinforcement learning. Supervised learning by itself can be categorized into Regression, Classification, and Anomaly Detection.
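To make that taxonomy concrete, here is a minimal sketch of the three supervised sub-types in scikit-learn; the toy data and estimator choices are mine, purely for illustration:

```python
# A minimal sketch of the three supervised sub-types mentioned above,
# using made-up toy data and standard scikit-learn estimators.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
X = rng.randn(100, 2)

# Regression: predict a continuous target.
y_reg = 3 * X[:, 0] + rng.randn(100) * 0.1
print(LinearRegression().fit(X, y_reg).predict(X[:2]))

# Classification: predict a discrete label.
y_clf = (X[:, 0] + X[:, 1] > 0).astype(int)
print(LogisticRegression().fit(X, y_clf).predict(X[:2]))

# Anomaly detection: flag points that look unlike the rest
# (+1 = normal, -1 = anomaly).
print(IsolationForest(random_state=0).fit(X).predict(X[:2]))
```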
Size of training set: This factor is a big player in our choice of algorithm. For a small training set, high bias/low variance classifiers (e.g., Naive Bayes) have an advantage over low bias/high variance classifiers (e.g., kNN), since the latter will overfit. But low bias/high variance classifiers start to win out as the training set grows (they have lower asymptotic error), since high bias classifiers aren’t powerful enough to provide accurate models [1].
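The following sketch illustrates this trade-off directly; the synthetic dataset and the specific training sizes are arbitrary choices on my part, but the crossover pattern is the point:

```python
# Compare a high-bias classifier (GaussianNB) with a low-bias one (kNN)
# as the training set grows. Dataset and sizes are for illustration only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for n in [50, 200, 1000, len(X_train)]:
    nb = GaussianNB().fit(X_train[:n], y_train[:n])
    knn = KNeighborsClassifier().fit(X_train[:n], y_train[:n])
    print(f"n={n:5d}  NB={nb.score(X_test, y_test):.3f}  "
          f"kNN={knn.score(X_test, y_test):.3f}")
```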
Accuracy: Depending on the application, the required accuracy will differ. Sometimes an approximation is adequate, which may lead to a huge reduction in processing time. In addition, approximate methods tend to be more robust to overfitting (a timing sketch illustrating this trade-off follows the next item).
Training time: Different algorithms have different running times. Training time is normally a function of the size of the dataset and the target accuracy.
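As a rough sketch of the last two points, here is a kernel SVM versus a fast linear approximation (SGDClassifier) on the same data; absolute numbers will vary by machine, and the dataset size is an arbitrary choice, but the relative gap in training time is what matters:

```python
# Accuracy-vs-training-time trade-off: an exact kernel SVM versus a
# cheaper linear approximation on the same synthetic data.
import time
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=10000, n_features=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, clf in [("SVC (RBF)", SVC()),
                  ("SGD (linear)", SGDClassifier(random_state=0))]:
    t0 = time.time()
    clf.fit(X_tr, y_tr)
    print(f"{name:12s} acc={clf.score(X_te, y_te):.3f}  "
          f"train={time.time() - t0:.2f}s")
```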
Linearity: Lots of machine learning algorithms, such as linear regression, logistic regression, and support vector machines, make use of linearity: they assume that classes can be separated by a straight line (or its higher-dimensional analog), or that data follow a straight-line trend. These assumptions aren’t bad for some problems, but on others they bring accuracy down. Despite their dangers, linear algorithms are very popular as a first line of attack. They tend to be algorithmically simple and fast to train.
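A small sketch of this caveat: on data with a curved decision boundary (scikit-learn's make_moons, my choice for illustration), a linear model underfits while an RBF-kernel SVM captures the shape:

```python
# Linear vs nonlinear classifier on data that is not linearly separable.
from sklearn.datasets import make_moons
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=1000, noise=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

print("linear  :", LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te))
print("RBF SVM :", SVC(kernel="rbf").fit(X_tr, y_tr).score(X_te, y_te))
```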
Number of parameters: Parameters are settings that affect the algorithm’s behavior, such as error tolerance or number of iterations. Typically, algorithms with large numbers of parameters require the most trial and error to find a good combination. Even though having many parameters typically provides greater flexibility, training time and accuracy of the algorithm can sometimes be quite sensitive to getting just the right settings.
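In practice the trial and error is usually automated with a grid search; the grid values below are arbitrary illustrative choices:

```python
# Tuning parameters by exhaustive search over a small, hand-picked grid.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, random_state=0)

grid = GridSearchCV(
    SVC(),
    param_grid={"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]},  # 9 combos
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, f"cv accuracy={grid.best_score_:.3f}")
```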
Number of features: The number of features in some datasets can be very large compared to the number of data points. This is often the case with genetic or textual data. A large number of features can bog down some learning algorithms, making training time infeasibly long. Some algorithms, such as Support Vector Machines, are particularly well suited to this case [2,3].
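A sketch of the many-features case: TF-IDF on text easily produces tens of thousands of sparse features, which a linear SVM handles comfortably. The corpus and category choices are mine, using scikit-learn's bundled newsgroups loader:

```python
# Text data: far more features (vocabulary terms) than documents.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

data = fetch_20newsgroups(subset="train",
                          categories=["sci.space", "rec.autos"])
X = TfidfVectorizer().fit_transform(data.data)  # sparse, many columns
print("samples x features:", X.shape)

X_tr, X_te, y_tr, y_te = train_test_split(X, data.target, random_state=0)
print("accuracy:", LinearSVC().fit(X_tr, y_tr).score(X_te, y_te))
```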
Below is an algorithm cheat sheet provided by scikit-learn (it works as a rule of thumb), which I believe implicitly considers all of the above factors when recommending an algorithm. But it doesn’t work for all situations, and we need a deeper understanding of these algorithms to employ the best one for a unique problem.
[Figure: scikit-learn algorithm cheat sheet]