Intermediate ML-Ops engineer

This job is no longer open

Paper is looking for an intermediate ML-Ops engineer to join our growing R&D team. In this role, you will own our end-to-end machine learning platform and core infrastructure, from research to production Machine Learning pipelines. The ultimate mission of this role is to minimize the time needed to roll out Data Science algorithms and Machine Learning models and to maximize the reliability of the production ML system. You should be self-directed and comfortable implementing training and inference pipelines in collaboration with a wide range of stakeholders and cross-functional teams. The ideal candidate has a strong background in machine learning and software development.

This position can be based anywhere in the US or Canada.

Responsibilities:

  • Design and implement large-scale ML systems to support training and serving workloads.
  • Collaborate and share knowledge with our cloud ops team to compress time-to-production for Machine Learning.
  • Build tooling and pipeline abstractions that allow Data Scientists to focus on experimentation while empowering self-service workflows to deploy and serve models reliably and consistently.
  • Help Data Scientists produce clean, reproducible, and highly performant machine learning systems through rigorous code review with a lens on software quality.
  • Advocate for automation and monitoring at all steps of ML system construction, and help to define best practices based on personal industry experience and research across the Machine Learning team.
  • Participate in sprint planning, estimations, and reviews.

Qualifications:

  • 3+ years of software development experience, preferably in Python.
  • Experience maintaining functional, production reference architectures for
    end-to-end Machine Learning in the cloud.
  • Familiarity with ML-Ops tools and platforms such as Vertex AI, MLflow, and DVC.
  • Strong Linux system administration skills.
  • Experience with declarative infrastructure and Kubernetes (GKE) for model serving and scalable inference.
  • Exposure to automated testing and CI/CD in the ML context.
  • Understanding of fundamental ML concepts.
  • A constant desire to grow and learn.
  • Strong cross-team communication and collaboration skills.
  • A desire to see teammates succeed together.

Job perks:

  • Work with a dynamic team that provides support whenever you get stuck.
  • Remote-first environment.
  • Annual company-wide meetup.
  • Opportunity for career development with a fast-growing company.
  • A unique opportunity to make an impact by making education more equitable.
  • Stipend to help support the growth of your home office.
  • Unlimited access to tutoring for children of Paper employees.

