Senior Software Engineer, Data Platform

This job is no longer open

About the role:

Samsara is seeking an experienced senior software engineer to join our Data Platform team. The Data Platform team's goal is to ensure we have a scalable and reliable data platform that meets the data needs of everyone at Samsara: engineers, data scientists, product managers, and more.

Samsara has hundreds of thousands of devices deployed throughout the world and over 20,000 customers using our cloud-based products. This results in a vast amount of data that we need to bring into our Data Platform every day.

Our overall goal on the Data Platform team is to make sure the rest of the company can succeed in accessing and analyzing our rich Samsara product data. This data is used for both internal and direct customer-facing use cases.

The team is responsible for the big data cloud infrastructure that enables Samsara to make critical business decisions and enables us to explore new product directions. This includes:

  • Building pipelines in Go and Spark to replicate product data from primary data stores (e.g., RDS MySQL, S3) to our central data lake
  • Scaling our data lake storage layer (Delta/Parquet on S3)
  • Optimizing our compute layer for reliability and performance (Spark/Databricks on EC2)
  • Building tools (e.g., orchestration frameworks, metadata catalogs) that democratize the use of data for the whole business, including software engineers, data engineers, and data scientists

This team also works on building internal data products & frameworks for other software engineers, data engineers, and data scientists to use.

The team works closely with the following teams:

  • Data Engineering: a team focused on building easy-to-use data assets in our data lake
  • Data Analytics: a team that translates product-related questions from across the company into internal dashboards
  • Product engineers and product managers: teams across the company building features directly into our web and mobile applications. They use the Data Platform in live customer-facing features, as well as for analysis such as understanding how features are being used or investigating outages.

Past projects from the team include:

  • Developing a reliable pipeline for ingesting data from our RDS databases into our Data Lake
  • Building a data pipeline framework that allows engineers to build Spark workflows producing output tables for customer-facing reports
  • Designing an easy-to-use library and corresponding infrastructure for developers to stream large volumes of data from Go microservices into the data lake
  • Deploying the infrastructure for our data cataloging service, Amundsen, and ingesting our Spark data lake metadata to it

Looking ahead, while we have several projects in mind, we want someone interested in identifying our biggest needs and driving our roadmap. Some ideas we have are:

  • Tapping into Spark execution plans to identify query optimizations
  • Data lake storage improvements (e.g., revisiting our partitioning strategy)
  • A better data access controls framework

We’re just starting to scratch the surface of what we can do with Samsara’s rich data. We are looking for someone who is excited to have a big influence on our data platform, both in terms of technology and processes.

In this role, you will: 

  • Develop software to reliably ingest vast amounts of data into our data lake
  • Explore new infrastructure needed to support the growing needs of our data platform
  • Build libraries and data management tooling for other software engineers, product managers, and data scientists to use the data platform effectively
  • Expand our ability to stream data into the data lake to support near-real-time access
  • Be responsible for the uptime, reliability, and monitoring of our data platform
  • Work with stakeholders across the company to use the data platform in new areas
  • Champion, role model, and embed Samsara’s cultural principles (Focus on Customer Success, Build for the Long Term, Adopt a Growth Mindset, Be Inclusive, Win as a Team) as we scale globally and across new offices

Minimum requirements for the role:

  • Bachelor's Degree in Computer Science/Engineering or equivalent practical experience.
  • 4+ years of experience on data-platform-focused teams
  • Strong programming skills (experience with Go, Python, or SQL is a plus)
  • Experience working with Spark or related data processing technologies
  • AWS knowledge and expertise (S3, Lambda, SQS, Kinesis)

An ideal candidate also has:

  • 8+ years of experience on data-platform-focused teams
  • Experience with Databricks
  • AWS knowledge and expertise (S3, Lambda, SQS, Kinesis, Step Functions)
  • Familiarity with Terraform

#LI-Remote


Outer Join is the premier job board for remote jobs in data science, analytics, and engineering.