Sarvam

Embedded Infrastructure Engineer, Chanakya

Posted 10 Days Ago
In-Office
Delhi, New Delhi, Delhi, IND
Senior level
Design, build, and operate terabyte-scale data storage and ingestion platforms for on-premise and air-gapped AI deployments. Implement indexing, partitioning, query optimisation, pipeline orchestration, observability, capacity planning, and infrastructure automation while collaborating with data scientists and product teams.
The summary above was generated by AI
About Sarvam

Sarvam is building the bedrock of Sovereign AI for India. The company is developing India's full-stack sovereign AI platform, building across research, models, infrastructure and applications with a singular focus on making AI genuinely work for India. Sarvam works with leading enterprises and public institutions and is backed by Lightspeed, Peak XV, and Khosla Ventures. Sarvam partners with India's leading brands, including Tata Capital, SBI Life, CRED, IDFC, and LIC.

 

About the Role

Embedded Infrastructure Engineers design, build, and maintain the data infrastructure that underpins AI system deployments at client sites. You work alongside Embedded Data Scientists and Strategic Deployment Engineers to ensure that terabyte-scale datasets can be ingested, stored, queried, and served to AI reasoning engines reliably and with good performance.

This means building and operating data platforms that handle terabyte-scale persistent data stores and sustained large daily ingestion volumes across structured records, documents, imagery, audio, and geospatial data. You will design and maintain the databases, object stores, ingestion pipelines, and processing layers that make this possible.

You will make critical decisions about storage architecture, indexing strategies, pipeline orchestration, and system performance — often in constrained, air-gapped, or operationally sensitive environments where you cannot rely on managed cloud services or standard enterprise tooling. You will own the reliability and performance of the infrastructure layer in your assigned accounts.

 

What You'll Do

•   Design and operate data storage architectures (relational, document, vector, object storage) capable of managing terabyte-scale datasets across multiple modalities

•   Build and maintain ingestion pipelines that reliably process daily data influx — including batch and streaming workloads — with monitoring, error handling, and backpressure management

•   Implement indexing, partitioning, and query optimisation strategies that allow AI systems and data scientists to retrieve and reason over large datasets with acceptable latency

•   Work with Embedded Data Scientists to translate ontologies, schemas, and semantic structures into performant physical data models and storage configurations

•  Deploy and manage database systems, vector stores, and search infrastructure in air-gapped, on-premise, or security-constrained environments

•  Build observability into the data platform: monitor pipeline health, storage utilisation, query performance, and ingestion lag

•  Own capacity planning and scaling decisions for data infrastructure across assigned client deployments

•  Collaborate with product and engineering teams to feed infrastructure learnings back into the core platform and tooling

 

What We're Looking For

•   4–8 years in data infrastructure, data engineering, platform engineering, or site reliability engineering, ideally at organisations operating at significant data scale

•   Direct experience managing multi-terabyte data stores: you have personally built or operated systems handling 10 TB+ of persistent data and sustained high-throughput ingestion

•   Deep working knowledge of at least two of: PostgreSQL, MongoDB, Elasticsearch, ClickHouse, or comparable systems — including tuning, indexing, partitioning, and operational management

•   Experience building production data ingestion pipelines using Apache Kafka, Apache Spark, Airflow, Flink, dbt, or equivalent frameworks

•  Strong proficiency in Python and/or Go, with experience writing production infrastructure tooling and automation

•  Solid understanding of storage systems and formats: object storage (S3/MinIO), columnar formats (Parquet, ORC), and how to choose the right storage layer for the workload

•  Familiarity with containerisation and orchestration (Docker, Kubernetes) in production settings

•  Experience with infrastructure-as-code and deployment automation (Terraform or similar)

 

Bonus Points

•  Experience with vector databases or embedding stores (Milvus, Weaviate, Qdrant, pgvector, or similar)

•  Experience deploying and operating infrastructure in air-gapped, on-premise, or hybrid environments

 

Note: We are looking for people who can own the outcomes described here, not people who match every line of this specification. If this problem excites you and you believe you can do this work, we want to hear from you.

 

Why Sarvam?

Sarvam is a fast-moving, high talent-density team building full-stack AI for India, working on problems that push the frontiers of AI with real population-scale impact.

•  Work alongside researchers, engineers, builders, and business leaders who move fast and hold each other to a very high bar

•  High ownership and high impact, from day one

•  Everything we do is AI-first, from the way we build and ship to the way we think about problems

•  You can work on problems that could change how an entire country learns, works, and communicates

 

If you want to work on problems at the frontier of AI in India, Sarvam is the place to be.

 
