Senior Site Reliability Engineer/DevOps

RCS TECH • Mexico City, Mexico

Location Mexico City, Mexico City

Job Type Full-time

Posted June 03, 2026

Role Description

What You’ll Do
Reliability & Operations
- Own availability, latency, and scalability across SaaS and AI systems
- Define and enforce SLOs, SLIs, and error budgets
- Participate in a global on-call rotation (~1 week every 4 weeks)
- Lead incident response and drive blameless postmortems with systemic fixes
Platform & Infrastructure
- Architect and operate on-premise and multi-region, multi-cloud environments
- Manage large-scale Kubernetes workloads
- Build and evolve infrastructure using Terraform and Ansible
- Improve system resilience, fault isolation, and capacity planning
AI/ML & Automation
- Build and scale agentic AI systems for triage, anomaly detection, and self-healing
- Ensure reliability of model serving infrastructure
- Operate, optimize, and scale distributed systems
What You Bring
- 5+ years in SRE, Production Engineering, or Platform Engineering
- Strong experience with cloud providers (AWS/GCP/OCI), Kubernetes, and IaC (Ter...
            
