Checkmate (itsacheckmate.com)

Cloud Engineering Manager

Posted 18 Days Ago
Remote
Hiring Remotely in India
Senior level
The Cloud Engineering Manager will oversee the design and operation of AWS infrastructure, ensuring high availability and scalability while mentoring a team in operational excellence and cloud architecture standards.
The summary above was generated by AI

Checkmate builds technology solutions that enable restaurants to drive sales and connect with customers wherever and whenever they order. Our enterprise technology runs on cutting-edge platforms that leverage AI, ML, and LLM technologies, along with integrations into best-in-class tools and platforms, to help restaurants achieve their goals however they choose. From first-party and third-party ordering to loyalty and data analytics, brands have access to the tools, data, and guidance to power, manage, and evolve their digital businesses using Checkmate.

We are looking for a Cloud Engineering Manager with deep expertise in AWS infrastructure, reliability engineering, and operational excellence to serve as the technical leader of our Cloud, SRE, and DevOps functions. This role is ideal for a senior-level engineer who has experience leading team members in designing and operating production systems at scale and is comfortable owning cloud architecture decisions end-to-end.

As a Cloud Engineering Manager, you will define AWS standards, guide platform architecture, and lead initiatives that improve scalability, security, performance, and cost efficiency. You will partner closely with application engineers and other engineering team members to ensure systems are production-ready, observable, and resilient, while managing and mentoring other engineers and raising the overall DevOps and cloud maturity level of the organization.

Essential Job Functions:

AWS Cloud Architecture & Platform Leadership
  • Lead the design and evolution of AWS infrastructure supporting highly available, scalable production systems.
  • Define architectural standards and best practices across AWS services such as VPC, EC2, ECS/EKS, RDS, S3, ALB/NLB, IAM, and CloudFront.
  • Lead cloud-level decision making, including trade-offs around scalability, reliability, cost, and operational complexity.
  • Drive cloud modernization initiatives and guide teams toward resilient, well-architected AWS solutions that scale.
Infrastructure as Code & Automation
  • Own and maintain infrastructure as code using Terraform and/or CloudFormation.
  • Design reusable, modular infrastructure components that enable consistency across environments.
  • Build automation for provisioning, configuration, scaling, and lifecycle management of AWS resources.
  • Eliminate manual operational tasks through scripting, tooling, and platform improvements.
Reliability, Monitoring & Incident Response
  • Define and own monitoring, logging, and alerting standards across AWS and application services.
  • Build observability using tools such as Datadog, CloudWatch, Prometheus, or equivalent.
  • Lead infrastructure-related incident response, including coordination, mitigation, and communication, to meet RTO and RPO objectives.
  • Conduct thorough root-cause analysis and drive long-term reliability improvements.
Database Performance Optimization & Scaling
  • Partner with application engineers to ensure databases are performant, scalable, and cost-effective.
  • Optimize AWS database services such as RDS and Aurora for performance, availability, and growth.
  • Analyze and improve query performance, indexing strategies, connection management, and resource utilization.
  • Design and implement scaling strategies including read replicas, storage scaling, and high-availability configurations.
  • Monitor database performance and proactively address bottlenecks before they impact customers.
  • Support backup, recovery, and disaster-recovery strategies aligned with business requirements.
Security, Compliance & Cost Management
  • Implement AWS security best practices including IAM, network segmentation, encryption, and audit logging.
  • Support SOC 2 and other compliance efforts through secure infrastructure design and change controls.
  • Monitor and optimize AWS costs, driving efficient resource usage without compromising reliability.
  • Ensure infrastructure changes follow defined approval, review, and documentation processes.
Leadership & Cross-Functional Collaboration
  • Act as a technical leader and mentor for DevOps and platform engineers.
  • Set expectations and standards for operational excellence across engineering teams.
  • Communicate architecture decisions, system risks, and operational status clearly to technical and non-technical stakeholders.
  • Take full ownership of cloud initiatives from design through implementation and ongoing operations.

Requirements
  • Bachelor’s degree in Computer Science, Software Engineering, or a related field (or equivalent experience).
  • 8+ years of experience in cloud infrastructure, DevOps, or site reliability engineering roles.
  • Experience leading cloud or platform engineers, DevOps engineers, or site reliability engineers.
  • Deep, hands-on expertise with AWS running production environments.
  • Strong experience with infrastructure as code (Terraform strongly preferred).
  • Strong understanding of AWS networking, IAM, security, and high-availability patterns.
  • Experience optimizing and scaling cloud-managed databases (RDS, Aurora, or similar).
  • Demonstrated leadership in incident response and operational decision-making.
  • Strong scripting and automation skills (Bash, Python, or equivalent).
  • Experience supporting compliance frameworks such as SOC 2.
  • Excellent communication skills and the ability to lead without micromanagement.
  • Highly accountable, proactive, and comfortable owning critical infrastructure systems.
  • Must be comfortable working US hours, until at least 5 pm EST.

Top Skills

Aurora
AWS
Bash
CloudFormation
CloudWatch
Datadog
Prometheus
Python
RDS
Terraform
