AI Inference Engineer

We’re hiring an AI Inference Engineer to help us build reliable, high-performance production systems. You will focus on:

  • Optimizing the latency and throughput of model inference
  • Building reliable production serving systems
  • Accelerating research on scaling test-time compute

Ideal Experience

  • Worked on system optimizations for model serving, such as batching, caching, load balancing, and model parallelism
  • Worked on low-level optimizations for inference, such as GPU kernels and code generation
  • Worked on algorithmic optimizations for inference, such as quantization, distillation, and speculative decoding
  • Worked on large-scale, high-concurrency production serving
  • Worked on testing, benchmarking, and reliability of inference services

Bonus Skills

  • Experience with verifiable inference or proof systems
  • Familiarity with Bittensor or token-incentivized networks
  • Deployment experience on H100s or large-scale GPU clusters using Kubernetes

MANIFOLD LABS

© 2025 Manifold Labs, Inc. All Rights Reserved