Senior Data Engineer

About the role

Our Data team plays a critical role in this mission by writing code that processes customer-provided data to power both the ordering application and the recommendations within it, allowing us to optimize our grocery partners' ordering and business processes. Our customers give us large amounts of data, and ensuring that our system can process that data accurately, reliably, and at scale is key to our success.

  • Write production code in Python (with Pandas and Spark) and SQL to implement fast, reliable, scalable data pipelines that process billions of historical data points collected from tens of thousands of retail stores across the US
  • Design, build, scale, and deploy our ETL pipelines (built in PySpark on Databricks) and data platform (a combination of Postgres, Snowflake, and Delta Lake) that power our recommendation engine and ordering system
  • Work closely with Afresh internal stakeholders, solution architects, and our customers to understand how to transform customer data, define business logic and ensure data quality
  • Collaborate with an interdisciplinary team of experts in machine learning, data science, design, software engineering, and business operations to drive and implement solutions to ensure data accuracy and quality
  • Independently analyze and find issues in the data and work with internal stakeholders to either clearly communicate high-priority issues to the customer or find novel workarounds to extract the signal we need from customer data
  • Monitor, analyze, and understand the data flowing through our system by adding the necessary visualizations and dashboards

 

Skills and experience

The following are attributes our ideal candidate possesses. We encourage all highly qualified candidates to apply, even if they do not meet all the listed criteria.

  • Ability to identify a problem or area for improvement, design a solution, and see it through to implementation
  • Strong understanding of and experience with various data stores (databases, data warehouses, key/value stores, etc.). Experience with Pandas, Apache Spark, or other Big Data frameworks and cloud infrastructure preferred
  • Experience with the architecture and design of data-driven solutions
  • Strong problem-solving ability and ability to work through ambiguity and incomplete specifications
  • Dedication to code quality, testing, design processes, automation, and operational excellence
  • Excellent written and verbal communication and collaboration skills

 

Outer Join is the premier job board for remote jobs in data science, analytics, and engineering.