Neural Magic offers high-performance inference serving solutions for you to deploy leading open-source LLMs on your private CPU and GPU infrastructure.

Machine Learning Research Scientist

Neural Magic is an early-stage AI software company democratizing high performance for deep learning models. Our goal is to reduce the cost and increase the performance of deep learning applications for the end users who deploy them. Building on decades of research at MIT, Neural Magic has developed a software platform that allows developers to sparsify deep learning models to minimize their footprint and run them on CPUs at GPU speeds. Please look through our website and GitHub repos to get a feel for what we are about.

Founded by an award-winning team of computer scientists and researchers out of MIT, we are a venture-backed company headquartered in Davis Square, Somerville, MA. Our investors include Amdocs, Andreessen Horowitz, Comcast Ventures, NEA, and Pillar VC.

We are seeking a machine learning research scientist with a proven publication history related to model compression techniques such as pruning and quantization. This person will work closely with our research team to identify, report on, and create new algorithms within the deep learning field. If you are someone who wants to contribute to solving challenging technical problems at the forefront of deep learning, this is the role for you!


  • Use your deep understanding of machine learning to tackle meaningful technical problems
  • Collaborate with product development teams to translate your ideas into product solutions
  • Perform fundamental research by defining, designing, implementing, and evaluating algorithms
  • Actively engage with the academic community by collaborating with universities, publishing and presenting your work, and attending conferences
