End-to-end ML on AWS — from training and fine-tuning to deployment and monitoring
A collection of repositories covering the full SageMaker AI lifecycle: AutoML with AutoGluon, LLM fine-tuning (QLoRA/Full), sparse embeddings, time-series forecasting, MLOps templates, and SageMaker Python SDK v3 patterns.
Comprehensive Claude Code skill for Amazon SageMaker AI. Covers the full lifecycle: inference endpoints, training (classical ML + LLM fine-tuning), HyperPod, Model Monitor, AutoGluon, and SDK v3 patterns. Consolidates and replaces three earlier skills.
A Kiro Power providing battle-tested guidance for deploying and training ML models on Amazon SageMaker AI. Covers inference endpoints, LLM fine-tuning, HyperPod clusters, Model Monitor, AutoGluon, and SDK v3 patterns. Pure knowledge base: no MCP servers required.
Validated SFT recipes for fine-tuning Qwen 3.5 (4B and 9B Base) on SageMaker Training Jobs. Includes QLoRA and full fine-tuning YAML recipes, an interactive recipe generator, and pinned dependency fixes for the PyTorch 2.9.0 DLC.
Train, evaluate, deploy, and orchestrate AutoGluon models on Amazon SageMaker using SDK v3. Covers tabular classification, time-series forecasting, multimodal text+tabular fusion, and custom Docker images.
Fine-tune SPLADE sparse embedding models with ANCE hard negative mining, self-contained in a single SageMaker training job. Two-phase pipeline on three domains with automatic best-model selection. Deployed via TEI endpoint with SPLADE pooling.
Electricity theft detection comparing three feature engineering strategies: baseline statistics, enhanced temporal features, and Chronos-2 forecast residuals, all classified with XGBoost on SageMaker.
Fork of the official aws-samples repository. Migrated the entire DIY agents workshop to SageMaker Python SDK v3, updating 8 notebooks with new patterns for ModelTrainer, boto3 integration, and modern SDK architecture.
XGBoost training and deployment example using Amazon SageMaker Python SDK v3.
MCP Server that uses SageMaker AI APIs to monitor and manage resources.
CDK/CloudFormation template to automatically shut down Amazon SageMaker Canvas apps on a schedule. Choose daily at 8PM or Fridays at 8PM to save costs on idle Canvas environments.
SageMaker Script Mode examples and patterns.
A SageMaker Projects template to deploy a model from Model Registry with your choice of deployment method: real-time inference, asynchronous inference, or batch transform.
Collection of awesome SageMaker Projects templates for MLOps workflows.
Scheduling a SageMaker Processing job with SageMaker Pipelines and Amazon EventBridge.