Wenger & Watson Inc.

2.8

based on 29 Reviews

Machine Learning Compiler Engineer - Deep Learning Models (12-15 yrs)

12-15 years

Mumbai

posted 2mon ago

Job Role Insights

Flexible timing

Key skills for the job

Python, Artificial Intelligence, Machine Learning, Generative AI, C++, C

Job Description

Overview :

- Development and support of new AI/ML compiler features/technologies to accelerate deep learning models.

- Work closely with AI hardware accelerator teams and add support for compiler features covering optimization algorithms, code generation, etc., to fully utilize the hardware features for maximum efficiency.

- Be well acquainted with the latest trends in ML models and compiler technologies to build innovative solutions in our products.

Roles and responsibilities :

- Develop an end-to-end ML compiler leveraging standard compiler infrastructures such as TVM, MLIR, and Torch Dynamo/Inductor, exploiting both intra-operator parallelism and graph/pipeline/dataflow parallelism when mapping to the compute/processing elements of a custom AI accelerator

- Implement a low-level parallel programming model for developing and deploying high-performance kernels that fully utilize the hardware capabilities

- Adapt advanced techniques and algorithms for placement, scheduling, and parallelization of model graphs to improve the performance of ML applications, optimizing execution speed and resource utilization

- Generate code with LLVM for the AI accelerator's compute elements/cores, targeting an ISA with custom instructions

- Debug and profile compiler features to identify issues and performance hot spots

- Own responsibility throughout the product lifecycle for solving functional and performance issues during the execution phase

- Work closely with the hardware architecture team on efficient HW/SW co-design of the AI accelerator IP and the overall AI server-class SoC

- Keep up to date with industry trends in compiler frameworks, including feature advancements, design methods, and approaches

Qualifications :

- Should have at least 12 years of relevant experience in AI/ML

- Should have deep practical experience developing an end-to-end ML compiler for any AI accelerator architecture (GPU/ASIC/many-core heterogeneous)

- Should have good experience handling complex hierarchies of compute elements and memories in a hardware accelerator and mapping them in the compiler backend for efficient power/performance

- Should have very good knowledge of at least one compiler framework such as TVM, MLIR, Torch Dynamo/Inductor, or equivalent

- Good knowledge of SOTA generative AI model architectures such as LLMs, and of model optimization techniques, will be helpful

- Good knowledge of popular ML framework ecosystems (PyTorch/TensorFlow/ONNX)

- High proficiency in C/C++, Python, domain-specific languages, and parallel programming languages such as OpenCL/CUDA