Job Title:  T&T | EAD | HCE | Senior Consultant/Manager | NVIDIA AI Infrastructure Architect | Pan India

Job requisition ID:  99255
Date:  May 7, 2026
Location:  Bengaluru
Designation:  Manager
Entity:  Deloitte Touche Tohmatsu India LLP

The team

 

Deloitte’s Technology & Transformation practice can help you uncover and unlock the value buried deep inside vast amounts of data. Our global network provides strategic guidance and implementation services to help companies manage data from disparate sources and convert it into accurate, actionable information that can support fact-driven decision-making and generate an insight-driven advantage. Our practice addresses the continuum of opportunities in business intelligence & visualization, data management, performance management and next-generation analytics and technologies, including big data, cloud, cognitive and machine learning.       

 

Your work profile

 

Experience: 5+ years overall, including 3+ years of relevant experience

 


  • Strong understanding of NVIDIA infrastructure.
  • Experience with GPUs and CUDA.

Key skills required: 

Experience:

  • 6+ years overall, including 3+ years of relevant experience

Languages:

  • Strong knowledge of NVIDIA infrastructure, GPUs, and CUDA.

Technologies:

We are seeking a highly experienced Senior NVIDIA AI Infrastructure Architect to lead the design, deployment, and operations of large-scale AI compute environments. This role is anchored in data center engineering, GPU infrastructure, and NVIDIA AI frameworks, with a strong focus on building robust, scalable, production-grade AI platforms.

You will serve as the technical authority for GPU-accelerated infrastructure, driving architecture strategy, reference designs, performance optimization, and end-to-end operational excellence across on-prem and hybrid environments.

 

Key Responsibilities

Architecture, Design & Implementation

Architect and deploy large-scale GPU-accelerated AI infrastructure based on NVIDIA platforms (H100/A100 systems, DGX, HGX, OVX).

Lead end-to-end design for AI clusters, including networking (Ethernet/InfiniBand), storage, fabric topology, and high-availability requirements.

Define the architecture for AI factories, high-density GPU clusters, and multi-node training platforms.

Implement and optimize NVIDIA Base Command, NVIDIA AI Enterprise, and NGC stack components.

NVIDIA AI Frameworks & Platforms

Integrate and optimize NVIDIA AI frameworks such as:

  • NVIDIA Triton Inference Server
  • NVIDIA TensorRT / TensorRT-LLM
  • NVIDIA CUDA, cuDNN, NCCL
  • NVIDIA NeMo, Riva, and Clara (as applicable)

Work closely with data science/ML teams to map training and inference needs to GPU platform architectures.

 

Data Center Engineering & Operations

Lead large-scale deployments in enterprise data centers, including rack layout, thermals, power planning, and high-density cooling.

Oversee operational runbooks, monitoring, patching, firmware upgrades, and lifecycle management of GPU servers.

Ensure high availability, resiliency, and scalable expansion of AI compute infrastructure.

Performance Optimization & Reliability

Tune training, inference, and workload orchestration pipelines for maximum GPU utilization and throughput.

Optimize networking for multi-node, multi-GPU systems with RDMA, NVLink, NVSwitch.

Conduct performance benchmarking using NVIDIA profiling tools.

 

Collaboration & Leadership

 

 

Serve as a senior technical advisor to engineering, platform, and data science teams.

Evaluate new NVIDIA technologies and contribute to long-term AI infrastructure strategy.

Mentor junior architects and engineers; lead cross-functional engineering initiatives.

 

Required Qualifications

6+ years of experience in infrastructure architecture, data center engineering, or platform engineering.

Strong expertise in GPU-based compute systems, NVIDIA DGX/HGX, and high-density server environments.

Hands-on experience with NVIDIA AI software stack (CUDA, NCCL, TensorRT, Triton, NeMo, Base Command, NGC).

Deep understanding of modern data center design principles:

  • High-bandwidth networking (InfiniBand, RoCE)
  • Distributed storage for AI workloads
  • Power, cooling, and rack integration for GPU clusters

Experience deploying and operating Kubernetes-based AI workloads (NVIDIA GPU Operator, MIG, MPS).

Strong familiarity with HPC/AI scheduling tools (Slurm, Kubernetes, Run:AI, or similar).

Proven capability to lead complex, multi-vendor AI infrastructure initiatives.

 

Preferred Skills

Experience with LLM infrastructure, L40S/H100 optimization, and multi-GPU training architectures.

Knowledge of hybrid cloud and cloud GPU services (Azure NC/NV, AWS P5/P4, GCP A2/A3).

Experience with automation frameworks (Ansible/Terraform), CI/CD pipelines, and Infrastructure-as-Code.

Exposure to NVIDIA networking (Mellanox), DPU/BlueField technologies, or NDR/HDR fabric design.

 

Education

  • Graduate in any discipline (B.Tech./B.E., MBA, MCA, BCA).

 

Location and Way of Working: 

  • Base location: Pan India