Description
About O’Reilly Media
O’Reilly’s mission is to change the world by sharing the knowledge of innovators. For over 40 years, we’ve inspired companies and individuals to do new things—and do things better—by providing them with the skills and understanding necessary for success.
At the heart of our business is a unique network of experts and innovators who share their knowledge through us. O’Reilly Learning offers exclusive live training, interactive learning, a certification experience, books, videos, and more, making it easier for our customers to develop the expertise they need to get ahead. And our books have been heralded for decades as the definitive place to learn about the technologies that are shaping the future. Everything we do is to help professionals from a variety of fields learn best practices and discover emerging trends that will shape the future of the tech industry.
Our customers are hungry to build the innovations that propel the world forward. And we help them do just that.
Learn more: https://www.oreilly.com/about/
Diversity
At O’Reilly, we believe that true innovation depends on hearing from, and listening to, people with a variety of perspectives. We want our whole organization to recognize, include, and encourage people of all races, ethnicities, genders, ages, abilities, religions, sexual orientations, and professional roles.
Learn more: https://www.oreilly.com/diversity
About the Team
Our data platform team is dedicated to establishing a robust data infrastructure that provides easy access to quality, reliable, and timely data for reporting, analytics, and actionable insights. We focus on designing and building a sustainable and scalable data architecture, treating data as a core corporate asset. Our efforts also include process improvement, governance enhancement, and addressing application, functional, and reporting needs. We value teammates who are helpful and respectful, communicate openly, and prioritize the best interests of our users. Operating across various cities and time zones in the US, our team fosters collaboration to deliver work that brings pride and fulfillment.
About the Role
We are looking for a thoughtful and experienced data engineer to help grow a suite of systems and tools written primarily in Python. The ideal candidate will have a deep understanding of modern data engineering concepts and will have shipped or supported code and infrastructure serving a user base in the millions and datasets with billions of records. You will routinely implement features, fix bugs, perform maintenance, consult with product managers, and troubleshoot problems. Changes you make will be accompanied by tests that confirm desired behavior. Code reviews, in the form of pull requests reviewed by peers, are also a regular and expected part of the job.
Salary Range: $110,000 - $138,000
What You’ll Do
- Develop data pipelines and features for data ingestion, transformation, and storage using Python and relational databases (e.g., PostgreSQL) or cloud-based data warehouses (e.g., BigQuery)
- Collaborate with product managers to define clear requirements, deliverables, and milestones
- Team up with other groups within O’Reilly (e.g., data science or machine learning) to share expertise and consult on data engineering best practices
- Review pull requests from coworkers and pair on tricky problems
- Provide consistent and reliable estimates to help project managers assess risk
- Learn about new technologies or papers and present them to the team
- Identify opportunities to improve our pipelines through research and proofs of concept
- Help QA and troubleshoot pesky production problems
- Participate in agile processes and scrum ceremonies
What You’ll Have
Required:
- 3+ years of professional data engineering experience (equivalent education and/or experience may be considered)
- 2+ years of experience working in an agile environment
- Proficiency in building highly scalable ETL and streaming data pipelines using Google Cloud Platform services
- Experience building data pipelines using Docker
- Experience building data pipelines with tools such as Talend and Fivetran
- Proficiency with large-scale data platforms and data processing systems such as S3, GCS, Google BigQuery, and Amazon Redshift
- Excellent Python and PostgreSQL development and debugging skills
- Experience building systems to retrieve and aggregate data from event-driven messaging frameworks (e.g., RabbitMQ and Pub/Sub)
- Experience with deployment tools such as Jenkins to build automated CI/CD pipelines
- Strong drive to experiment, learn, and improve your skills
- Respect for the craft—you write self-documenting code with modern techniques
- Great written communication skills—we do a lot of work asynchronously in Slack and Google Docs
- Empathy for our users—a willingness to spend time understanding their needs and difficulties is central to how the team works
- Desire to be part of a compact, fun, and hard-working team
Preferred:
- Experience with Google Cloud Dataflow/Apache Beam
- Experience building RESTful endpoints with Django
- Experience working in a distributed team
- Knowledge and experience with machine learning pipelines
- Contributions to open source projects
- Knack for benchmarking and optimization