Senior Machine Learning Engineer, Foundational Systems

Role Summary

We are looking for a Senior Machine Learning Engineer to lead the technical architecture and scaling of our AI systems. This is not a "plug-and-play" role using off-the-shelf APIs.

You will design, build, and optimize the core engines of our product—ranging from fine-tuning large-scale models to architecting high-throughput, low-latency inference pipelines. You are an engineer first, a scientist second, and a systems thinker always.

Key Responsibilities

1. Model Development & Fine-Tuning

Architect Neural Systems: Select, adapt, and implement state-of-the-art architectures such as Transformers, diffusion models, and state space models (SSMs).
Advanced Tuning: Lead implementation of fine-tuning strategies including LoRA/QLoRA, RLHF, and DPO.
RAG & Retrieval: Build sophisticated Retrieval-Augmented Generation (RAG) systems, optimizing vector DB indexing and hybrid search.

2. Engineering & Infrastructure (Scaling)

Distributed Training: Manage training runs across multi-GPU/TPU clusters using PyTorch FSDP, DeepSpeed, or Ray.
Production Inference: Optimize models through quantization and distillation for sub-second latency and high concurrency.
Data Flywheels: Build automated pipelines for synthetic data generation and gold-standard evaluation sets.

3. Evaluation & Reliability

Quantifying Quality: Build rigorous, product-aligned evaluation harnesses (perplexity, factual accuracy, safety).
Observability: Implement monitoring for model drift and silent failures in production.

Elite Qualifications

Experience

6+ years of software engineering and 4+ years of building and deploying ML models at scale.

The Stack

Deep mastery of PyTorch or JAX. Expert-level Python, plus experience with C++, Rust, or CUDA.

Infrastructure

Cloud-native. Hands-on with K8s, Docker, SageMaker, or Vertex AI.

Mathematical Depth

Strong linear algebra and calculus. Can implement arXiv papers from scratch in 48 hours.

Execution

Deep debugging capability. Can identify bottlenecks in gradient flow or data loaders.

Architect the Engine

We are seeking systems thinkers who view code as infrastructure. Join us in building the most robust AI OS in existence.

// Deployment_Stack

Framework: PyTorch / JAX

Optimization: TRT / ONNX

Orchestration: K8s / Ray

Hardware: H100 Clusters