Teldar

Scale-to-Zero AI/ML Infrastructure on Google Cloud

Industry:

AI & Machine Learning Consulting

Core Technologies:

Google Cloud Functions, Google Cloud Spot VMs (A3 instances with NVIDIA H100 GPUs), Terraform

Background

Teldar is a Swiss AI consultancy that handles demanding machine learning workloads. Because its projects require high-end hardware like NVIDIA H100 GPUs, keeping that infrastructure running 24/7 is financially impractical. The team needed a way to access peak computing power exactly when a model needs to run, without paying for the hardware while it sits idle.

Challenges

Teldar faced several challenges managing GPU-intensive workloads in the cloud:

Idle GPU Costs: Premium A3 GPU instances incurred significant costs during downtime between workloads.
Manual Provisioning: Spinning up specialized GPU VMs manually slowed down project execution.
Operational Overhead: Developers were spending time managing infrastructure instead of building models.
Deployment Consistency: Ensuring identical environments across runs was difficult without automation.

Solutions

Zazmic designed and implemented an automated, event-driven scale-to-zero architecture on Google Cloud.

Automated GPU Lifecycle Management:
Google Cloud Functions automatically provision and terminate GPU-enabled VMs based on workload demand, ensuring resources exist only when needed.
Spot GPU Optimization:
The platform leverages GCP Spot VMs, enabling access to NVIDIA H100 performance at a significantly reduced cost.
Infrastructure as Code:
Terraform defines all cloud resources, providing repeatable, reliable deployments with minimal manual effort.
Production Handover:
Zazmic delivered detailed runbooks and documentation, enabling Teldar to manage the platform independently.
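
The lifecycle logic described above can be illustrated with a minimal Python sketch. It is not Zazmic's or Teldar's actual code: the `WorkloadState` fields, the function names, the instance name, the zone, and the image path are all hypothetical, and the real Cloud Functions may be triggered and structured quite differently. The sketch shows two ideas: deciding whether the GPU VM should exist at all, and building a Compute Engine `instances.insert` request body that asks for a Spot VM on an A3 machine type (`a3-highgpu-8g` is assumed here as one A3 shape with H100 GPUs).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadState:
    """Hypothetical snapshot of demand, as a function might see it."""
    pending_jobs: int   # jobs waiting for a GPU
    running_jobs: int   # jobs currently executing on the GPU VM
    vm_exists: bool     # whether the GPU VM currently exists


def decide_action(state: WorkloadState) -> str:
    """Return 'create', 'delete', or 'noop' for the GPU VM."""
    has_work = state.pending_jobs > 0 or state.running_jobs > 0
    if has_work and not state.vm_exists:
        return "create"   # demand appeared: provision the VM
    if not has_work and state.vm_exists:
        return "delete"   # nothing left to run: scale to zero
    return "noop"


def spot_instance_body(name: str, zone: str, source_image: str,
                       machine_type: str = "a3-highgpu-8g") -> dict:
    """Build a Compute Engine instances.insert body for a Spot VM.

    The machine type default and all argument values are assumptions
    made for illustration only.
    """
    return {
        "name": name,
        "machineType": f"zones/{zone}/machineTypes/{machine_type}",
        "scheduling": {
            "provisioningModel": "SPOT",            # request Spot capacity
            "instanceTerminationAction": "DELETE",  # remove the VM on preemption
            "onHostMaintenance": "TERMINATE",       # GPU VMs cannot live-migrate
            "automaticRestart": False,
        },
        "disks": [{
            "boot": True,
            "autoDelete": True,  # do not leave a paid disk behind
            "initializeParams": {"sourceImage": source_image},
        }],
        "networkInterfaces": [{"network": "global/networks/default"}],
    }
```

In a setup like this, a function would call `decide_action` on each trigger and send the body to the Compute Engine API only when the result is `"create"`. Setting the termination action to `DELETE` and the boot disk to auto-delete keeps a preempted Spot VM from leaving billable resources behind.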

Outcomes

The new platform transformed how Teldar runs high-demand AI workloads:

Significant Cost Reduction: GPU compute costs are incurred only while models are actively running.
Faster Time to Execution: Automated provisioning removes delays associated with manual setup.
Operational Autonomy: Teldar’s team manages the infrastructure without ongoing external support.
Production-Ready Stability: Consistent, predictable environments reduce operational risk and technical debt.
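
The cost outcome can be made concrete with a small arithmetic sketch. The case study does not publish prices or utilization figures, so every number passed to these functions is a placeholder; `hourly_rate`, `spot_discount`, and the 730-hour month are assumptions chosen only to show the shape of the calculation.

```python
HOURS_PER_MONTH = 730.0  # assumed average month length in hours


def always_on_cost(hourly_rate: float,
                   hours_in_month: float = HOURS_PER_MONTH) -> float:
    """Monthly cost of a GPU VM that never shuts down."""
    return hourly_rate * hours_in_month


def scale_to_zero_cost(hourly_rate: float, active_hours: float,
                       spot_discount: float = 0.0) -> float:
    """Monthly cost when the VM exists only while jobs run.

    spot_discount is the assumed fraction saved by using Spot pricing
    (0.0 means no discount, 0.5 means half price).
    """
    if active_hours < 0:
        raise ValueError("active_hours must be non-negative")
    if not 0.0 <= spot_discount < 1.0:
        raise ValueError("spot_discount must be in [0, 1)")
    return hourly_rate * (1.0 - spot_discount) * active_hours


def savings_fraction(hourly_rate: float, active_hours: float,
                     spot_discount: float = 0.0) -> float:
    """Fraction of the always-on bill avoided by scaling to zero."""
    baseline = always_on_cost(hourly_rate)
    return 1.0 - scale_to_zero_cost(hourly_rate, active_hours, spot_discount) / baseline
```

With a placeholder rate of 10 currency units per hour, 73 active hours in the month, and an assumed 50% Spot discount, the sketch gives 365 units against an always-on baseline of 7,300 units. The savings scale with idle time: the fewer hours models actually run, the larger the gap.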

Conclusion

By shifting to an automated scale-to-zero architecture, Teldar can run demanding AI workloads without the burden of fixed GPU costs. The solution turns high-end infrastructure into an on-demand utility, allowing the team to scale efficiently, control spend, and focus fully on delivering AI innovation.