Senior Software Engineer, Data Infrastructure

This job is no longer open

Reddit's community of users generates 65B analytics events per day, each of which is ingested by the Data Platform team into a data warehouse that sees 55,000+ daily queries.

As a data infrastructure engineer, you will build and maintain the data infrastructure tools used by the entire company to generate, ingest, and access petabytes of raw data. A focus on performance and optimization will enable you to write scalable, fault-tolerant code while collaborating with a team of top engineers, all while learning about and contributing to one of the most powerful streaming event pipelines in the world.

Not only will your work directly impact hundreds of millions of users around the world, but your output will also shape the data culture across all of Reddit!

How you will contribute:

  • Refine and maintain our data infrastructure technologies to support real-time analysis of hundreds of millions of users.
  • Consistently evolve the data model and data schema based on business and engineering requirements.
  • Own the data pipeline that surfaces 65B+ daily events to all teams, and the tools we use to improve data quality.
  • Support warehousing and analytics customers that rely on our data pipeline for analysis, modeling, and reporting.
  • Build data pipelines with distributed streaming tools such as Kafka, Kinesis, Flink, or Spark.
  • Ship quality code to enable scalable, fault-tolerant, and resilient services in a multi-cloud architecture.

Qualifications:

  • 4+ years of coding experience in a production setting writing clean, maintainable, and well-tested code.
  • Experience with object-oriented programming languages such as Scala, Python, Go, or Java.
  • Degree in Computer Science or equivalent technical field highly preferred. 
  • Experience with scaling large production systems.
  • Experience working with any of the following: Terraform, Helm, Prometheus, Docker, Kubernetes, Kafka, Spark, Flink, and CI/CD.
  • Excellent communication skills to collaborate with stakeholders in engineering, data science, machine learning, and product.
