Data Engineer: Data Platform Team

About us

Constructor.io powers product search and discovery for some of the largest retailers in the world. We serve billions of requests every week, and you’ve probably seen our results somewhere and used our product without knowing it. We differentiate ourselves by focusing on metrics over features and by reinventing search and discovery from the ground up as a machine learning challenge, with the specific goal of improving metrics like revenue. We’re approximately doubling year over year despite the market slowdown and have customers in every eCommerce vertical. We’re a passionate team of technologists who love solving problems and want to make our customers’ and coworkers’ lives better. We value empathy, openness, curiosity, and continuous improvement, and we are excited by metrics that matter. We believe that empowering everyone in a company to do what they think is best can lead to great things.


Data Platform Team

The Data Platform team within Data Science and Engineering is an integral unit that serves internal stakeholders. It develops a data platform that provides:

  • Data storage and processing, producing the artefacts that the backend team runs in production.
  • Convenient tools for engineers to create, schedule, and run their data workloads.
  • Data quality validation.
  • A real-time Analytical API that serves analytics to our customers.

Data Science and Engineering consists of a mix of data engineers and analysts who own and collaborate on multiple projects. As a Data Platform team member, you will use world-class analytical, engineering, and data processing techniques to build the foundational infrastructure, tooling, and analytical capabilities that enable the business to move forward.


Challenges you will tackle

Constructor integrates with our customers by providing them with client-side libraries that interact with our API. These libraries transmit logs/events (behavioral events) that Constructor uses to

  • Train ML models
  • Make business decisions
  • Conduct AB tests
  • …and so much more

Logs are handled by the Behavioral API, a Python service that collects them and stores them in S3. This piece of infrastructure is currently owned by another team, and we’re about to take ownership of it to control the full data cycle from start to finish.

So your first tasks are:

  • Separate the API and deploy it as a standalone service
  • Integrate it with the Data Platform Team infra

We also have other long-term goals for the team, such as

  • New task scheduler (Airflow/Prefect/etc.)
  • Recommendation systems DB
  • Performance and stability improvements of Analytics Service
  • Spark pipelines cost and performance optimization
  • A smarter data model for the DWH


Requirements

  • You are highly proficient in at least one programming language (Python is preferred) and have backend development experience with any web framework (for Python: Django, Flask, FastAPI, …).
  • You are proficient at SQL (any variant)
  • You have experience working with AWS and have knowledge of its services (EC2, IAM, S3, Lambda, ECS, ECR, …) used for data processing.
  • You have an excellent understanding of data storage, database types and architecture, and you can apply this knowledge to build an effective data infrastructure.
  • You enjoy working with large amounts of data
  • You proactively find opportunities to improve the product and lead this endeavor to success.
  • Bonus points: experience working with ClickHouse
  • Bonus points: advanced knowledge of AWS (CloudFormation, CDK, SNS, SQS)
  • Bonus points: familiarity with the big data stack (Spark, Presto/Athena, Hive)


Benefits

  • Unlimited vacation time: we strongly encourage all of our employees to take at least 3 weeks per year
  • A competitive compensation package including stock options
  • Company-sponsored US health coverage (100% paid for employee)
  • Fully remote team - choose where you live
  • Work from home stipend! We want you to have the resources you need to set up your home office
  • Apple laptops provided for new employees
  • Training and development budget for every employee, refreshed each year
  • Parental leave for qualified employees
  • Work with smart people who will help you grow and make a meaningful impact