Job Title: T&T | EAD | HCE | Senior Consultant/Manager | NVIDIA AI Infrastructure Architect | Pan India

• Job requisition ID: 99255
• Location: Bengaluru
• Entity: Deloitte Touche Tohmatsu India LLP
The team
Deloitte’s Technology & Transformation practice can help you uncover and unlock the value buried deep inside vast amounts of data. Our global network provides strategic guidance and implementation services to help companies manage data from disparate sources and convert it into accurate, actionable information that can support fact-driven decision-making and generate an insight-driven advantage. Our practice addresses the continuum of opportunities in business intelligence & visualization, data management, performance management and next-generation analytics and technologies, including big data, cloud, cognitive and machine learning.
Your work profile
Experience: 5+ years overall, including 3+ years of relevant experience
- Strong understanding of NVIDIA Infrastructure.
- Experience with GPUs and CUDA.
Key skills required:
Experience:
- 6+ years overall, including 3+ years of relevant experience
Languages:
- Strong knowledge of NVIDIA infrastructure, GPUs, and CUDA.
Technologies:
We are seeking a highly experienced Senior NVIDIA AI Infrastructure Architect to lead the design, deployment, and operations of large-scale AI compute environments. This role is anchored in data center engineering, GPU infrastructure, and NVIDIA AI frameworks, with a strong focus on building robust, scalable, production-grade AI platforms.
You will serve as the technical authority for GPU-accelerated infrastructure, driving architecture strategy, reference designs, performance optimization, and end-to-end operational excellence across on-prem and hybrid environments.
Key Responsibilities
Architecture, Design & Implementation
Architect and deploy large-scale GPU-accelerated AI infrastructure based on NVIDIA platforms (H100/A100 systems, DGX, HGX, OVX).
Lead end-to-end design for AI clusters, including networking (Ethernet/InfiniBand), storage, fabric topology, and high-availability requirements.
Define the architecture for AI factories, high-density GPU clusters, and multi-node training platforms.
Implement and optimize NVIDIA Base Command, NVIDIA AI Enterprise, and NGC stack components.
NVIDIA AI Frameworks & Platforms
Integrate and optimize NVIDIA AI frameworks such as:
NVIDIA Triton Inference Server
NVIDIA TensorRT / TensorRT-LLM
NVIDIA CUDA, cuDNN, NCCL
NVIDIA NeMo, Riva, and Clara (as applicable)
Work closely with data science/ML teams to map training and inference needs to GPU platform architectures.
Data Center Engineering & Operations
Lead large-scale deployments in enterprise data centers, including rack layout, thermals, power planning, and high-density cooling.
Oversee operational runbooks, monitoring, patching, firmware upgrades, and lifecycle management of GPU servers.
Ensure high availability, resiliency, and scalable expansion of AI compute infrastructure.
Performance Optimization & Reliability
Tune training, inference, and workload orchestration pipelines for maximum GPU utilization and throughput.
Optimize networking for multi-node, multi-GPU systems using RDMA, NVLink, and NVSwitch.
Conduct performance benchmarking using NVIDIA profiling tools.
Collaboration & Leadership
Serve as a senior technical advisor to engineering, platform, and data science teams.
Evaluate new NVIDIA technologies and contribute to long-term AI infrastructure strategy.
Mentor junior architects and engineers; lead cross-functional engineering initiatives.
Required Qualifications
6+ years of experience in infrastructure architecture, data center engineering, or platform engineering.
Strong expertise in GPU-based compute systems, NVIDIA DGX/HGX, and high-density server environments.
Hands-on experience with NVIDIA AI software stack (CUDA, NCCL, TensorRT, Triton, NeMo, Base Command, NGC).
Deep understanding of modern data center design principles:
High-bandwidth networking (InfiniBand, RoCE)
Distributed storage for AI workloads
Power, cooling, and rack integration for GPU clusters
Experience deploying and operating Kubernetes-based AI workloads (NVIDIA GPU Operator, MIG, MPS).
Strong familiarity with HPC/AI scheduling tools (Slurm, Kubernetes, Run:AI, or similar).
Proven capability to lead complex, multi-vendor AI infrastructure initiatives.
Preferred Skills
Experience with LLM infrastructure, L40S/H100 optimization, and multi-GPU training architectures.
Knowledge of hybrid cloud and cloud GPU services (Azure NC/NV, AWS P5/P4, GCP A2/A3).
Experience with automation frameworks (Ansible/Terraform), CI/CD pipelines, and Infrastructure-as-Code.
Exposure to NVIDIA networking (Mellanox), DPU/BlueField technologies, or NDR/HDR fabric design.
Education
- Any graduation: B.Tech./B.E., MBA, MCA, BCA.
Location and Way of Working:
- Base location: Pan India
