Staff Data Infrastructure Engineer - Data Science Foundations

This job has closed but is shown for context on data science work at GitHub.

The Data Science group’s vision at GitHub is to supercharge our platform and services by leveraging our data to improve the software engineering workflow. We seek to enable more people to engage in the creation, collaboration and consumption of code while taking an active part in the future development of the software supply chain. To that end, we both produce features powered by data science/machine learning techniques and build products that enable data/data science collaboration.

The Data Science Foundations team’s role in this overarching vision is to empower scientists and engineers to collaborate on novel and creative data products by bringing data science / machine learning workflows into parity with modern software development and infrastructure management. In practice, this means serving internal and external data scientists and researchers with improved workflows: building actions to automate pipelines, improving notebook collaboration, and leading the vision for what data collaboration products will look like on the GitHub platform.

Given the broad scope of how Data Science at GitHub interacts with both our platform and business, the leader we're searching for must be able to review, assess and help articulate fit-for-purpose products, services, and infrastructure in a very fast-moving field. At the same time, they will help guide and mentor the team using a well-reasoned and pragmatic approach to creating and delivering data science collaboration products.


  • Develop and maintain scalable internal and external data science collaboration services
  • Build products, tools, and systems that empathetically and pragmatically meet real collaboration needs of GitHub users, with an emphasis on Data Scientists and Researchers
  • Drive infrastructure requirements to support data science collaboration products
  • Use data to understand the availability, reliability, and sustainability of our products, services and infrastructure
  • Cultivate the open source projects developed & contributed to by GitHub and build things you are proud to share
  • Work closely with other teams from across the organization

Minimum Qualifications:

  • Excitement about building, operating and maintaining resilient, scalable services
  • Demonstrated understanding of container networking and security
  • Exposure (either directly or through team) to Data Science and Machine Learning workflows
  • A track record of developing and managing a network of cross-functional partnerships across engineering and product organizations
  • Strong written and verbal communication skills that crystallize and distill complex topics into action

Preferred Qualifications:

  • Professional data science ecosystem experience
  • Deep expertise in data science (inclusive of Machine & Deep Learning) theory, methods and their practical application
  • Experience building highly available services at scale
  • Experience with Kubernetes or other container orchestration systems
  • Experience with Azure and Azure Machine Learning products
  • Technical writing skills
  • History of success in a remote work environment
  • A passion for GitHub's brand, platform and product and a desire to improve it via novel data science products

Who We Are:

GitHub is the developer company. We make it easier for developers to be developers: to work together, to solve challenging problems, and to create the world’s most important technologies. We foster a collaborative community that can come together—as individuals and in teams—to create the future of software and make a difference in the world.

Leadership Principles:

Customer Obsessed - Trust by Default - Ship to Learn - Own the Outcome - Growth Mindset - Global Product, Global Team - Anything is Possible - Practice Kindness

Why You Should Join:

At GitHub, we constantly strive to create an environment that allows our employees (Hubbers) to do the best work of their lives. We've designed one of the coolest workspaces in San Francisco (HQ), where many Hubbers work, snack, and create daily. The rest of our Hubbers work remotely around the globe. Check out an updated list of where we can hire here:

We are also committed to keeping Hubbers healthy, motivated, focused and creative. We've designed our top-notch benefits program with these goals in mind. In a nutshell, we've built a place where we truly love working, and we think you will too.

GitHub is made up of people from a wide variety of backgrounds and lifestyles. We embrace diversity and invite applications from people of all walks of life. We don't discriminate against employees or applicants based on gender identity or expression, sexual orientation, race, religion, age, national origin, citizenship, disability, pregnancy status, veteran status, or any other differences. Also, if you have a disability, please let us know if there's any way we can make the interview process better for you; we're happy to accommodate!

Please note that benefits vary by country. If you have any questions, please don't hesitate to ask your Talent Partner.


