Rationale
SageMaker is the platform we use for developing solutions involving Machine Learning.
The main reasons why we chose it over other alternatives are:
- It integrates with EC2, allowing us to easily provision cloud computing resources. This feature is essential for horizontal autoscaling.
 - It complies with several certifications from ISO and CSA. Many of these certifications focus on ensuring that the provider follows best practices for secure cloud-based environments and information security.
 - It integrates with S3, allowing us to easily store raw data, datasets and training outputs in our S3 bucket.
 - It supports a wide range of EC2 ML-specific machines for training models.
 - It supports EC2 spot machines, allowing us to considerably reduce machine costs.
 - Thanks to its horizontal autoscaling capabilities, it is very easy to implement parallelism by running several models or feature combinations on separate machines, greatly speeding up training.
 - It supports hyperparameter tuning, allowing us to concurrently train several instances of a model using different parameter values. This feature is essential for optimizing our most accurate model.
 - It integrates with IAM, allowing us to keep a least-privilege approach to authentication and authorization.
 - It supports a wide range of frameworks, including scikit-learn, the one that Sorts uses.
 - EC2 worker performance can be monitored via CloudWatch.
 - Logs for training jobs can be monitored via CloudWatch.
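
The parallelism and hyperparameter points in the list above can be illustrated with a minimal sketch. It only enumerates independent job specifications, one per feature combination and parameter set, which could then each run on a separate machine. The feature names, parameter names and values are hypothetical and are not taken from Sorts; nothing here calls SageMaker.

```python
from itertools import combinations, product

# Hypothetical feature names and hyperparameter values, for illustration only.
FEATURES = ["num_commits", "file_age", "num_authors"]
PARAM_GRID = {"learning_rate": [0.01, 0.1], "max_depth": [3, 5]}


def feature_combinations(features, min_size=2):
    """Yield every feature subset with at least `min_size` elements."""
    for size in range(min_size, len(features) + 1):
        yield from combinations(features, size)


def parameter_sets(grid):
    """Yield one dict per point in the Cartesian product of the grid."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def plan_jobs(features, grid):
    """Return one independent job spec per (feature subset, parameter set)."""
    return [
        {"features": list(subset), "hyperparameters": params}
        for subset in feature_combinations(features)
        for params in parameter_sets(grid)
    ]


jobs = plan_jobs(FEATURES, PARAM_GRID)
```

Because each job spec is independent of the others, the list can be split across as many machines as the autoscaling limits allow.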
 
Alternatives
- IBM Watson Studio: It does not integrate with EC2 or S3, increasing overall complexity. Pending review.
 - GCP Vertex AI: It does not integrate with EC2 or S3, increasing overall complexity. Pending review.
 - Azure Machine Learning: It does not integrate with EC2 or S3, increasing overall complexity. Pending review.
 
Usage
- We use SageMaker as the Machine Learning platform for training Sorts, our ML-based software vulnerability scanner.
 - We do not use SageMaker spot instances. Pending implementation.
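
Since spot instances are not yet in use, the following is only a sketch of what enabling them might involve, assuming training jobs are created through the SageMaker `CreateTrainingJob` API. It builds the request arguments without sending anything to AWS. The job name, image URI, role ARN, bucket paths and instance type are hypothetical placeholders, not values from our infrastructure.

```python
def spot_training_request(job_name, image_uri, role_arn, output_s3_uri,
                          checkpoint_s3_uri, instance_type="ml.m5.xlarge",
                          max_runtime_s=3600, max_wait_s=7200):
    """Build CreateTrainingJob arguments with managed spot training enabled.

    Nothing is sent to AWS; the caller decides what to do with the dict.
    """
    # The API requires the total wait time (including time spent waiting
    # for spot capacity) to be at least the maximum training runtime.
    if max_wait_s < max_runtime_s:
        raise ValueError("max_wait_s must be >= max_runtime_s")
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {
            "MaxRuntimeInSeconds": max_runtime_s,
            "MaxWaitTimeInSeconds": max_wait_s,
        },
        # Run on spare capacity at a reduced price; jobs may be interrupted.
        "EnableManagedSpotTraining": True,
        # Checkpoints let an interrupted job resume instead of restarting.
        "CheckpointConfig": {"S3Uri": checkpoint_s3_uri},
    }


# Hypothetical placeholder values, for illustration only.
request = spot_training_request(
    job_name="sorts-training-example",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example:latest",
    role_arn="arn:aws:iam::123456789012:role/example-sagemaker-role",
    output_s3_uri="s3://example-bucket/training-output",
    checkpoint_s3_uri="s3://example-bucket/checkpoints",
)
```

With valid credentials and real values, a dictionary like this could be passed to `boto3.client("sagemaker").create_training_job(**request)`. Because spot capacity can be reclaimed at any time, the training script would also need to save and restore checkpoints for the job to resume cleanly.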