The role involves developing QA strategies for AI applications, testing models for fairness and performance, and collaborating on automation frameworks.
Join the Future of Supply Chain Intelligence — Powered by Agentic AI
At Resilinc, we’re not just solving supply chain problems — we’re pioneering the intelligent, autonomous systems that will define its future. Our cutting-edge Agentic AI enables global enterprises to predict disruptions, assess impact instantly, and take real-time action — before operations are even touched. Recognized as a Leader in the 2025 Gartner® Magic Quadrant™ for Supply Chain Risk Management, we are trusted by marquee clients across life sciences, aerospace, high tech, and automotive to protect what matters most — from factory floors to patient care.
But the real power behind Resilinc? Our people. We’re a fully remote, mission-led team making sure life-saving products and critical goods get where they’re needed, fast. We offer the chance to do meaningful work in a collaborative, empowering culture—where you can be an agent of change. Join us to tackle critical global challenges through high-impact work that matters.
Resilinc | Innovation with Purpose. Intelligence with Impact.
About The Role
At Resilinc, we build intelligent systems that safeguard the global supply chain. As a pioneer in supply chain risk management, we’re pushing the boundaries of resilience with AI-powered platforms. We are building a team of forward-thinking Agent Hackers (AI SDETs) to join our mission.
What’s an Agent Hacker? It’s not just a title — it’s a mindset. You’re the kind of engineer who goes beyond traditional QA, probing the limits of autonomous agents, reverse-engineering their behavior, and designing smart, self-evolving test frameworks.
In this role, you’ll be at the forefront of testing cutting-edge technologies, including Large Language Models (LLMs), AI agents, and Generative AI systems. You’ll play a critical role in validating the performance, reliability, fairness, and transparency of AI-powered applications—ensuring they meet high standards for both quality and responsible use.
If you think like a tester, code like a developer, and break systems like a hacker — Resilinc is your proving ground.
What You Will Do
- Develop and implement QA strategies for AI-powered applications, focusing on accuracy, bias, fairness, robustness, and performance.
- Design and execute automated and manual test cases to validate AI agents, LLMs, APIs, and data pipelines, applying a good understanding of data integrity and data models.
- Assess AI models using quality metrics such as precision/recall and hallucination detection.
- Test AI models for bias, fairness, explainability (XAI), drift, and adversarial robustness.
- Validate prompt engineering, fine-tuning techniques, and model-generated responses for accuracy and ethical AI considerations.
- Develop supporting services and tools.
- Conduct scalability, latency, and performance testing for AI-driven applications.
- Collaborate with data engineers to validate data pipelines, feature engineering processes, and model outputs.
- Design, develop, and maintain automation scripts using Selenium and Playwright for API and web testing.
- Work closely with cross-functional teams to integrate automation best practices into the development lifecycle.
- Identify, document, and track bugs while conducting detailed regression testing to ensure product quality.
What You Will Bring
- Proven expertise in testing AI models, LLMs, and Generative AI applications, with hands-on experience in AI evaluation metrics, testing tools such as Arize, MAIHEM, and LangTest, and Playwright MCP for automated testing workflows.
- Strong proficiency in Python for writing test scripts and automating model validation, along with a deep understanding of AI bias detection, adversarial testing, model explainability (XAI), and AI robustness.
- Strong SQL expertise for validating data integrity and backend processes, particularly in PostgreSQL and MySQL.
- Strong analytical and problem-solving skills with keen attention to detail, along with excellent communication and documentation abilities to convey complex testing processes and results.
Why You Will Love It Here
- Next-Level QA – Go beyond traditional testing to challenge AI agents, LLMs, and GenAI systems with intelligent, self-evolving test strategies
- Agentic AI Frontier – Be at the forefront of validating autonomous, ethical AI in high-impact applications trusted by global enterprises
- Full-Stack Test Engineering – Combine Python, SQL, and tools like LangTest, Arize, Selenium & Playwright to test everything from APIs to AI fairness
- Purpose-Driven Mission – Join a remote-first team that protects critical supply chains — ensuring vital products reach people when they need them most
What's in it for you?
At Resilinc, we’re fully remote, with plenty of opportunities to connect in person. We provide a culture where ownership, purpose, technical growth and a voice in shaping impactful technology are at our core. Oh, and the perks? Full-stack benefits for health, wealth and wellbeing to keep you thriving. Hit up your talent acquisition contact for a location-specific FAQ.
Curious to know more about us? Dive in at www.resilinc.ai
If you are a person with a disability needing assistance with the application process, please contact [email protected].
Top Skills
Arize
LangTest
MAIHEM
MySQL
Playwright
PostgreSQL
Python
Selenium
SQL