Principal Data Engineer

This job is no longer open
What You'll Do:
As a pivotal member of the team, you will lead the design and development of a robust data architecture that guides data modeling, integration, processing, and delivery standards, enabling modern data product development at Scribd.

You will also serve as a data and analytics solution architect, leading architecture initiatives encompassing data warehousing, data pipeline development, data integrations, and data modeling. You will shape Scribd’s data strategy, guiding stakeholders in how they consume and act on data.

We’re looking for someone with proven experience architecting, designing, and developing batch and real-time streaming infrastructure and workloads. Your expertise will help establish standards for data modeling, integration, processing, and delivery, and help translate business requirements into technical specifications.

At Scribd, we leverage deep data insights to inform every aspect of our business, from product development and experimentation to understanding subscriber engagement and tracking key performance indicators. You'll join a data engineering team tackling complex challenges within a rich domain encompassing three distinct brands (Scribd, Everand, and Slideshare), all serving a massive user base with over 200 million monthly visitors and 2 million paying subscribers. You'll have the opportunity to make a real impact: we are investing heavily in improving our core data layer, and this new role puts you at the forefront of that initiative.

Depending on the project, this might involve cross-functional work with Data Science, Analytics, and other Engineering and Business teams to design cohesive data models, database schemas, data storage solutions, and consumption strategies and patterns. Almost everything you work on will aim to increase "customer satisfaction" for internal customers of Scribd data.

Required Skills:
• 7+ years of experience in data strategy, data architecture, modeling, solution design, data engineering, or a similar role
• Hands-on experience with and knowledge of data lake technologies (Databricks, Snowflake, etc.), data storage formats (Parquet, Avro, etc.), and query engines (Athena, Presto, etc.), as well as data schemas, query optimization, and associated concepts for building optimized solutions at scale
• Strong understanding of distributed systems, RESTful APIs, and data consumption patterns
• Proficiency in data modeling, ETL processes, and real-time and batch analytics frameworks.
• Proficient with at least one dialect of SQL.
• Hands-on experience in Scala or Python.

Desired Skills:
• Experience and working knowledge of streaming platforms, typically based around Kafka.
• Strong grasp of AWS data platform services and their strengths/weaknesses.
• Hands-on experience implementing data pipelines for data ingestion and transformation to support analytics and ML pipelines
• Strong experience communicating asynchronously using collaboration tools like Jira, Slack, etc.
• Experience using automation and CI/CD tooling like Git, GitHub, Docker, Jenkins, Terraform, etc.
• Experience developing standards for database design and implementing strategic data architecture initiatives around data quality, data management policies/standards, data governance, privacy, and metadata management
• Working experience integrating with BI frameworks like Qlik, ThoughtSpot, Looker, Tableau, etc.
At Scribd, your base pay is one part of your total compensation package and is determined within a range. Our pay ranges are based on the local cost of labor benchmarks for each specific role, level, and geographic location. San Francisco is our highest geographic market in the United States.

In the state of California, the reasonably expected salary range is between $191,500 [minimum salary in our lowest geographic market within California] and $282,000 [maximum salary in our highest geographic market within California].

In the United States, outside of California, the reasonably expected salary range is between $158,000 [minimum salary in our lowest US geographic market outside of California] and $268,000 [maximum salary in our highest US geographic market outside of California].

In Canada, the reasonably expected salary range is between $198,500 CAD [minimum salary in our lowest geographic market] and $267,000 CAD [maximum salary in our highest geographic market].

We carefully consider a wide range of factors when determining compensation, including but not limited to experience; job-related skill sets; relevant education or training; and other business and organizational needs. The salary range listed is for the level at which this job has been scoped. In the event that you are considered for a different level, a higher or lower pay range would apply. This position is also eligible for competitive equity ownership and a comprehensive and generous benefits package.