Distributed Systems Engineer

Together AI · San Francisco, CA

AI Engineering · Senior · Full-time · Remote
$180K-$300K · Posted 1 month ago

About the Role

Together AI is hiring a Distributed Systems Engineer to build the infrastructure for decentralized AI model training and inference. You will design fault-tolerant systems, optimize network communication, and develop scheduling algorithms for GPU clusters.

Requirements

  • 5+ years of distributed systems experience
  • Strong proficiency in Rust, Go, or C++
  • Experience with large-scale cluster management
  • Knowledge of network protocols and optimization
  • Background in systems programming

Nice to Have

  • Experience with model parallelism (tensor, pipeline, data)
  • Background in HPC or supercomputing
  • Familiarity with InfiniBand or NVLink
  • Experience with decentralized computing systems

Benefits

  • Early-stage equity
  • Full health benefits
  • Remote work option
  • Latest hardware provided
  • Conference and learning budget
  • Flexible schedule

Skills

Distributed Systems · Rust · Go · GPU Clusters · Networking · Systems Programming
