Nanonets

Senior Deep Learning Engineer

Reposted 9 Days Ago
India
Senior level
Nanonets seeks a Senior Deep Learning Engineer with expertise in DL/ML to develop and optimize architectures, focusing on NLP and CV applications.

Location: Bangalore (Hybrid) | $40M+ Funded | Building State-of-the-Art AI

Nanonets is transforming the way businesses work. Our AI platform takes the manual, messy, time-consuming work that bogs down industries like finance, healthcare, and supply chain and turns it into seamless, automated processes. What once took hours of human effort now takes seconds with Nanonets. Our client footprint spans 34% of the Fortune 500, enabling businesses across industries to unlock the potential of AI in automating their business processes.

More than 10,000 businesses trust Nanonets because we don't just promise efficiency; we deliver it with unmatched accuracy and seamless integrations.

Join Nanonets to push the boundaries of what's possible with deep learning. We're not just implementing models – we're setting new benchmarks in document AI, with our open-source models achieving nearly 1 million downloads on Hugging Face and recognition from global AI leaders.

Backed by $40M+ in total funding including our recent $29M Series B from Accel, alongside Elevation Capital and Y Combinator, we're scaling our deep learning capabilities to serve enterprise clients including Toyota, Boston Scientific, and Bill.com. You'll work on genuinely challenging problems at the intersection of computer vision, NLP, and generative AI.

What You'll Build
Core Technical Challenges:
  • Train & Fine-tune SOTA Architectures: Adapt and optimize transformer-based models, vision-language models, and custom architectures for document understanding at scale
  • Production ML Infrastructure: Design high-performance serving systems handling millions of requests daily using frameworks like TorchServe, Triton Inference Server, and vLLM
  • Agentic AI Systems: Build reasoning-capable OCR that goes beyond extraction – models that understand context, chain operations, and provide confidence-grounded outputs
  • Optimization at Scale: Implement quantization, distillation, and hardware acceleration techniques to achieve fast inference while maintaining accuracy
  • Multi-modal Innovation: Tackle alignment challenges between vision and language models, reduce hallucinations, and improve cross-modal understanding using techniques like RLHF and PEFT
Engineering Responsibilities:
  • Design distributed training pipelines for models with billions of parameters using PyTorch FSDP/DeepSpeed
  • Build comprehensive evaluation frameworks benchmarking against GPT-4V, Claude, and specialized document AI models
  • Implement A/B testing infrastructure for gradual model rollouts in production
  • Create reproducible training pipelines with experiment tracking 
  • Optimize inference costs through dynamic batching, model pruning, and selective computation

We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity.

Technical Requirements
Must-Have:
  • 3+ years of hands-on deep learning experience with production deployments
  • Strong PyTorch expertise – ability to implement custom architectures, loss functions, and training loops from scratch
  • Experience with distributed training and large-scale model optimization
  • Proven track record of taking models from research to production
  • Solid understanding of transformer architectures, attention mechanisms, and modern training techniques
  • B.E./B.Tech from a top-tier engineering college
Highly Valued:
  • Experience with model serving frameworks (TorchServe, Triton, Ray Serve, vLLM)
  • Knowledge of efficient inference techniques (ONNX, TensorRT, quantization)
  • Contributions to open-source ML projects
  • Experience with vision-language models and document understanding
  • Familiarity with LLM fine-tuning techniques (LoRA, QLoRA, PEFT)
Why This Role is Exceptional
  • Proven Impact: Our models are approaching 1 million downloads, so your work will have global reach
  • Real Scale: Your models will process millions of documents daily for Fortune 500 companies
  • Well-Funded Innovation: $40M+ in funding means significant GPU resources and freedom to experiment
  • Open Source Leadership: Publish your work and contribute to models already trusted by nearly a million developers
  • Research-Driven Culture: Regular paper reading sessions and collaboration with the research community
  • Rapid Growth: Strong financial backing and Series B momentum mean ambitious projects and fast career progression
Our Recent Achievements
  • Nanonets-OCR model: ~1 million downloads on Hugging Face – one of the most adopted document AI models globally
  • Launched industry-first Automation Benchmark defining new standards for AI reliability
  • Published research recognized by leading AI researchers
  • Built agentic OCR systems that reason and adapt, not just extract
  • Secured $40M+ in total funding from Accel, Elevation Capital, and Y Combinator

Top Skills

Caffe
Jax
Keras
Python
PyTorch
TensorFlow
Theano
Torch
