About the role
Join our research team: making LLMs run efficiently on constrained hardware, designing agent architectures, and building inference systems that scale across heterogeneous hardware.
Your work feeds directly into MOSAIC compression, Poolnoodle models, ScaiCore language, and ScaiGrid routing.