Senior Software Engineer (Kafka), Ads Data Infrastructure

This job is no longer open

"The front page of the internet," Reddit brings over 430 million people together each month through their common interests, inviting them to share, vote, comment, and create across thousands of communities. Come for the cats, stay for the empathy.

As a data engineer, you will build and maintain the data infrastructure tools used by the Ads Monetization Org to generate, ingest, and access petabytes of raw data. A focus on performance and optimization will enable you to write scalable, fault-tolerant code while collaborating with a team of top engineers, all while learning about and contributing to one of the most powerful streaming event pipelines in the world.

Not only will your work directly impact hundreds of millions of users around the world, but your output will also shape the data culture across all of Reddit!

Note: we are open to candidates based in, and authorized to work anywhere in, the United States or Canada.

Responsibilities:

  • Design, build, and maintain streaming data infrastructure systems such as Kafka, along with Kafka consumers built with Flink and Spark, used by all of Reddit’s Ads engineering teams
  • Design alerting and testing systems to ensure the accuracy and timeliness of these pipelines (e.g., improved instrumentation, optimized logging)
  • Debug production issues across services and levels of the stack
  • Plan for the growth of Reddit’s Ads infrastructure
  • Build a great customer experience for developers using your infrastructure
  • Work with teams to build and continually evolve data models and data flows that enable data-driven decision-making
  • Identify the shared data needs across Reddit Ads, understand each team’s specific requirements, and build efficient, scalable data pipelines that enable data-driven decisions across Reddit Ads

Qualifications:

  • A strong engineering background and exposure to data engineering work
  • Experience developing, maintaining, and debugging distributed systems built with open-source tools
  • Experience building infrastructure as a product centered around users’ needs
  • Experience optimizing the end-to-end performance of distributed systems
  • Experience with scaling distributed systems in a rapidly moving environment
  • Experience managing and designing data pipelines
  • Ability to follow the flow of data through various pipelines to debug data issues
  • Familiarity with ETL design (both implementation and maintenance)

Bonus:

  • Experience working on stream processing systems such as Kafka and Flink
  • Experience working on real-time analytics systems such as Druid
  • Experience with Java, Go, Scala, and Python
  • Experience with Lambda Architecture systems
  • Experience with (or desire to learn) Kubernetes

Outer Join is the premier job board for remote jobs in data science, analytics, and engineering.