Lead the intelligent transformation of operations support functions through AI and data solutions, overseeing hands-on technical delivery and team leadership in AIOps initiatives.
Job Summary:
We are looking for an experienced AIOps Tech Leader (Operations Support) to lead the intelligent transformation of our operations support functions. This senior technical leadership role combines deep hands-on contribution to data & AI solutions that directly enhance operations support processes with strategic leadership in developing AIOps tools/platforms and driving AI technical direction for all operations support initiatives.
The role is explicitly centered on operations support domains, including incident management, major incident response, problem management, change enablement, service desk / Level 1-3 support, monitoring & observability, service reliability, and operational resilience. You will remain actively involved (~40-50%) in hands-on delivery of data solutions that solve real operations support pain points, while leading the design, build, and evolution of AIOps tooling and serving as the principal AI technical authority for operations support transformation programs.
This is a player-coach position in the operations support space: hands-on technical delivery + team leadership + AI architecture governance for operational excellence.
Key Responsibilities
Hands-on Data & AI Solutions for Operations Support
• Actively lead and contribute to high-impact data/AI projects that directly improve operations support outcomes, e.g., real-time incident enrichment, predictive alerting, automated root-cause analysis, change risk scoring, ticket clustering & auto-triage, knowledge mining for support agents, and intelligent runbooks.
• Design and deliver scalable features embedded into operations support workflows and platforms (ServiceNow, Jira Service Management, monitoring tools, ITSM systems, etc.) in collaboration with multidisciplinary competency teams.
• Ensure solutions meet strict operations support SLAs for reliability, low latency, auditability, explainability, and zero-downtime deployment.
• Stay up to date with innovations and research in AIOps tools.
AIOps Tools & Platform Leadership for Operations Support
• Lead the architecture, development, and continuous enhancement of internal AIOps platforms and reusable components that power operations support teams, including integration with ITSM, observability (Prometheus/Grafana/ELK/Dynatrace/Splunk), ticketing, and automation tooling.
• Support MLOps/AIOps best practices specifically for production operations support AI systems: model monitoring in live ops environments, drift & performance degradation detection, rollback mechanisms, and cost control at operational scale.
AI Technical Leadership for Operations Support Initiatives
• Serve as the lead AI technical authority and trusted advisor for all operations support programs, automation initiatives, and AI transformation efforts across service operations, NOC, support desks, infrastructure operations, and reliability engineering.
• Lead technical discussions, architecture reviews, PoCs, vendor evaluations, and solution selection whenever AI is being considered or applied to operations support challenges.
• Identify, prioritize, and drive the highest-ROI AI use cases in operations support, e.g., reducing MTTR/MTTD, automating Level 1 triage, predicting P1 incidents, auto-generating post-mortems, optimizing shift handovers, and enabling proactive operations support.
Team & People Leadership
• Build, mentor, and lead a high-performing squad of AIOps specialists focused on operations support outcomes.
• Foster a culture of rapid experimentation, production-first mindset, and relentless focus on operational impact (reduced toil, faster resolution, higher availability).
• Perform technical coaching, design/code reviews, and career development with emphasis on operations support domain knowledge.
Stakeholder & Cross-Functional Collaboration
• Partner closely with operations support leaders, incident managers, service owners, reliability engineers, ITSM/process teams, and infrastructure groups to align AI initiatives with operational priorities and pain points.
• Collaborate closely with the DS&AI Competency.
Qualifications & Experience
Required
• 10+ years in data engineering, AI/ML engineering, or operations support technology roles, with 4-6+ years in technical leadership positions within operations support / IT operations / service operations environments.
• Proven track record delivering production AI/ML/data solutions that measurably improved operations support KPIs (MTTR, MTTD, ticket deflection, toil reduction, availability).
• Strong hands-on expertise with modern data/AI stacks (Python, Spark, Kafka, Airflow, Databricks, Azure/ADF, cloud data platforms, PyTorch/TensorFlow, LLM frameworks) and integration into operations support ecosystems (ServiceNow, PagerDuty, Splunk, Datadog, Moogsoft, BigPanda, etc.).
• Deep practical experience with AIOps patterns in live operations support settings: event correlation, anomaly detection, automated actions, predictive analytics, GenAI for ops.
• Experience leading development or significant enhancement of AIOps/internal tooling platforms specifically for operations support teams.
Preferred
• Background in ITIL-aligned operations support processes (incident, problem, change, service request, knowledge management).
• Hands-on work with GenAI/LLM applications in operations support (ops copilots, auto-remediation agents, intelligent knowledge search, summarization of alerts/incidents).
• Prior success scaling AIOps capabilities in large-scale operations support / NOC / shared service environments.
Leadership & Soft Skills
• Ability to stay deeply technical while leading people and strategy in a high-velocity operations support context.
• Excellent communication: can explain complex AI concepts to operations support practitioners and translate operational pain into technical roadmaps for executives.
• Strong bias for action, production impact, and reducing operational toil through intelligent automation.
Top Skills
ADF
Airflow
Azure
BigPanda
Cloud Data Platforms
Databricks
Datadog
Kafka
Moogsoft
PagerDuty
Python
PyTorch
ServiceNow
Spark
Splunk
TensorFlow