Job Title:  T&T | EAD | CMS | Senior Consultant/Manager | NVIDIA AI Infrastructure Architect | Pan India

Job requisition ID:  99255
Date:  Apr 8, 2026
Location:  Bengaluru
Designation:  Manager
Entity:  Deloitte Touche Tohmatsu India LLP

Your potential, unleashed.

India’s impact on the global economy has increased at an exponential rate, and Deloitte presents an opportunity to unleash and realize your potential amongst cutting-edge leaders and organizations shaping the future of the region and, indeed, the world beyond.

 

At Deloitte, bring your whole self to work, every day. Combine that with our drive to propel with purpose and you have the perfect playground to collaborate, innovate, grow, and make an impact that matters.

 

The Team

Deloitte’s Technology & Transformation practice can help you uncover and unlock the value buried deep inside vast amounts of data. Our global network provides strategic guidance and implementation services to help companies manage data from disparate sources and convert it into accurate, actionable information that can support fact-driven decision-making and generate an insight-driven advantage.

 

Job Description:

Experience – 5+ years overall, including 3+ years of relevant experience

Location – Pan India

Education – Any graduation (B.Tech/B.E., MBA, MCA, BCA)

 

We are seeking a highly experienced Senior NVIDIA AI Infrastructure Architect to lead the design, deployment, and operations of large-scale AI compute environments. This role is anchored in data center engineering, GPU infrastructure, and NVIDIA AI frameworks, with a strong focus on building robust, scalable, production-grade AI platforms.

You will serve as the technical authority for GPU-accelerated infrastructure, driving architecture strategy, reference designs, performance optimization, and end-to-end operational excellence across on-prem and hybrid environments.

 

 

Key Responsibilities

 

 

Architecture, Design & Implementation

 

Architect and deploy large-scale GPU-accelerated AI infrastructure based on NVIDIA platforms (H100/A100 systems, DGX, HGX, OVX).

Lead end-to-end design for AI clusters, including networking (Ethernet/InfiniBand), storage, fabric topology, and high-availability requirements.

Define the architecture for AI factories, high-density GPU clusters, and multi-node training platforms.

Implement and optimize NVIDIA Base Command, NVIDIA AI Enterprise, and NGC stack components.

 

 

NVIDIA AI Frameworks & Platforms

 

 

Integrate and optimize NVIDIA AI frameworks such as:

NVIDIA Triton Inference Server

NVIDIA TensorRT / TensorRT-LLM

NVIDIA CUDA, cuDNN, NCCL

NVIDIA NeMo, Riva, and Clara (as applicable)

Work closely with data science/ML teams to map training and inference needs to GPU platform architectures.

 

 

Data Center Engineering & Operations

 

 

Lead large-scale deployments in enterprise data centers, including rack layout, thermals, power planning, and high-density cooling.

Oversee operational runbooks, monitoring, patching, firmware upgrades, and lifecycle management of GPU servers.

Ensure high availability, resiliency, and scalable expansion of AI compute infrastructure.

Performance Optimization & Reliability

Tune training, inference, and workload orchestration pipelines for maximum GPU utilization and throughput.

Optimize networking for multi-node, multi-GPU systems using RDMA, NVLink, and NVSwitch.

Conduct performance benchmarking using NVIDIA profiling tools.

 

Collaboration & Leadership

 

 

Serve as a senior technical advisor to engineering, platform, and data science teams.

Evaluate new NVIDIA technologies and contribute to long-term AI infrastructure strategy.

Mentor junior architects and engineers; lead cross-functional engineering initiatives.

 

 

Required Qualifications

 

 

8+ years of experience in infrastructure architecture, data center engineering, or platform engineering.

Strong expertise in GPU-based compute systems, NVIDIA DGX/HGX, and high-density server environments.

Hands-on experience with NVIDIA AI software stack (CUDA, NCCL, TensorRT, Triton, NeMo, Base Command, NGC).

Deep understanding of modern data center design principles:

High-bandwidth networking (InfiniBand, RoCE)

Distributed storage for AI workloads

Power, cooling, and rack integration for GPU clusters

Experience deploying and operating Kubernetes-based AI workloads (NVIDIA GPU Operator, MIG, MPS).

Strong familiarity with HPC/AI scheduling tools (Slurm, Kubernetes, Run:AI, or similar).

Proven capability to lead complex, multi-vendor AI infrastructure initiatives.

 

 

Preferred Skills

 

 

Experience with LLM infrastructure, L40S/H100 optimization, and multi-GPU training architectures.

Knowledge of hybrid cloud and cloud GPU services (Azure NC/NV, AWS P5/P4, GCP A2/A3).

Experience with automation frameworks (Ansible/Terraform), CI/CD pipelines, and Infrastructure-as-Code.

Exposure to NVIDIA networking (Mellanox), DPU/BlueField technologies, or NDR/HDR fabric design.

 

How you’ll grow

 

Connect for impact

 

Our exceptional team of professionals across the globe are solving some of the world’s most complex business problems, as well as directly supporting our communities, the planet, and each other. Know more in our Global Impact Report and our India Impact Report.

 

Empower to lead

 

You can be a leader irrespective of your career level. Our colleagues are characterised by their ability to inspire, support, and provide opportunities for people to deliver their best and grow both as professionals and human beings. Know more about Deloitte and our One Young World partnership.

 

Inclusion for all

 

At Deloitte, people are valued and respected for who they are and are trusted to add value to their clients, teams and communities in a way that reflects their own unique capabilities. Know more about everyday steps that you can take to be more inclusive. At Deloitte, we believe in the unique skills, attitude and potential each and every one of us brings to the table to make an impact that matters.

 

 

 

Drive your career

 

At Deloitte, you are encouraged to take ownership of your career. We recognise there is no one-size-fits-all career path, and global, cross-business mobility and upskilling/reskilling are all within the range of possibilities to shape a unique and fulfilling career. Know more about Life at Deloitte.

 

 

Everyone’s welcome… entrust your happiness to us

Our workspaces and initiatives are geared towards your 360-degree happiness. This includes specific needs you may have in terms of accessibility, flexibility, safety and security, and caregiving. Here’s a glimpse of things that are in store for you. 

 

 

Interview tips

 

We want job seekers exploring opportunities at Deloitte to feel prepared, confident and comfortable. To help you with your interview, we suggest that you do your research and know some background about the organisation and the business area you’re applying to. Check out recruiting tips from Deloitte professionals.