Model partitioning (pipelined, tensor, model and data parallelism), tiling, resource allocation, memory management, scheduling and optimization (for latency, bandwidth and throughput).
What you will bring:
Minimum:
Bachelor's degree in Computer Science with 7+ years of relevant industry experience; MSCS preferred, with 5+ years of relevant industry experience.
Ability to deliver production-quality code in modern C++.
Experience in modern compiler infrastructures, for example: LLVM, MLIR.
Experience in machine learning frameworks and interfaces, for example: ONNX, TensorFlow and PyTorch.
Experience in production compiler development.
Preferred:
Algorithm design ability, from high-level conceptual design to actual implementation.
Experience with relevant open-source ML projects such as Torch-MLIR, ONNX-MLIR, Caffe, and TVM.
Passion for thriving in a fast-paced, dynamic startup culture.