YOUR MISSION
As a Staff Data Engineer on the Data Modeling and Analytics team, you will design, develop, and maintain data pipelines that extract, transform, and load data from various sources into our data lakehouse. You will design data models that make accurate metrics available for both internal and in-product reporting. Your role will be crucial in ensuring the accuracy, efficiency, and usability of our data, enabling data-driven decision-making across the organization.
WHAT YOU'LL DO
- Lead and contribute hands-on to major initiatives from inception to delivery
- Design, develop, and maintain data schemas, models, and metadata to enhance the understanding and governance of our data
- Develop robust, scalable ETL pipelines in Databricks that curate data from diverse sources and transform it into highly consistent, accurate data and metrics
- Ensure the reliability and efficiency of ETL workflows to meet the analytical and reporting needs of our stakeholders
- Collaborate with data analysts and stakeholders to understand data requirements, and design appropriate data models for analytical and reporting purposes
- Monitor and fine-tune data pipelines and ETL processes for improved performance and efficiency
- Adhere to data governance and security best practices to safeguard sensitive data and maintain compliance with relevant regulations
- Collaborate with data platform engineers, data analysts, product managers, product engineers, and other business stakeholders to enhance their understanding of data structures and metrics
- Provide technical guidance and mentorship to senior data engineers, fostering a collaborative and high-performance environment
WHAT YOU'LL BRING
- 6+ years of experience in data engineering, with a focus on ETL processes, data modeling, and ML/AI-driven solutions
- Strong background as a tech lead or architect
- Strong proficiency in data modeling techniques (e.g., dimensional modeling, star schema, snowflake schema) to support internal and in-product reporting
- Solid programming skills in Python and SQL, with experience in ETL tooling (e.g., dbt, Airflow/Astronomer)
- Proficiency with Databricks, Redshift, S3, ETL processes, and data integration techniques
- Understanding of data quality, data governance, and data security best practices
- A passion for working with data and turning insights into action
- Knowledge of big data technologies such as Hadoop, Spark, or NoSQL databases
- Proven track record of designing and optimizing data pipelines for efficiency and accuracy
- An undying concern for data quality and a commitment to maintaining high standards
- A willingness to take ownership of projects and see them through to successful completion
- Bachelor's or Master's degree in Computer Science, Computer Engineering, or a related field
- Excellent command of English, both written and verbal
- Ability to work independently from a remote location with a global team
NICE TO HAVE
- Familiarity with data science concepts and tools
COMPENSATION
For roles based in New York City, California, Colorado, and Washington, the base salary for this role ranges from $149,600 to $187,000, plus equity and benefits. Compensation offered will be determined by factors such as location, level, job-related knowledge, skills, and experience. Certain roles may be eligible for variable compensation.