We specialize in developing a cutting-edge search and discovery platform, encompassing search, autocomplete, browse, and recommendations, tailored specifically for online retailers, suppliers, and shops. Our goal is to give every online store state-of-the-art discovery powered by AI and modern technologies that optimizes for key e-commerce metrics, and that is what differentiates us from our competitors.
Boasting a robust infrastructure, we handle billions of requests daily and manage a vast data ecosystem: roughly 400 GB of new data ingested every day and approximately 2 PB stored in our data lake.
The Data Platform is the central piece and the foundation of data processing at Constructor. It is the set of tools and infrastructure that every data person in the company uses daily, and some of our public-facing APIs fetch data from it. It is responsible for landing data from external clients (API), serving data to external visualization tools (Cube.js, ClickHouse), storing data in appropriate formats (S3, ClickHouse, Delta, Parquet), data processing (Python, Spark/Databricks, ClickHouse, AWS Athena, AWS Lambda), monitoring (Prometheus, PagerDuty, Sentry, custom internal APIs built with FastAPI), automated testing of pipelines and data quality, cost observability, optimizations, and much more.
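To give a flavor of how those pieces fit together, here is a minimal, hypothetical sketch of a typical landing step: a Spark job reading raw Parquet from S3 and appending it to a Delta table. The bucket, paths, and column names are invented for illustration and are not our actual layout.

```python
# Illustrative sketch only: hypothetical paths and columns, not our real schema.
# A Spark job that lands raw Parquet events from S3 into a Delta table.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("land_raw_events").getOrCreate()

raw = (
    spark.read.parquet("s3://example-bucket/raw/events/")  # hypothetical bucket
    .withColumn("ingested_at", F.current_timestamp())      # track landing time
)

(
    raw.write.format("delta")   # requires the delta-spark package
    .mode("append")
    .partitionBy("event_date")  # assumes the source data carries event_date
    .save("s3://example-bucket/delta/events/")
)
```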
We're hiring a Senior Data Engineer to work on our Data Platform. You'll join a team of engineers that both owns, manages, and develops the cloud infrastructure powering the platform and builds the data pipelines, cubes, services, and other tools that make it work. Dozens of engineers, top management, and external users will depend on your work.
We're looking for a senior engineer with at least 4 years of experience who has solid skills in a programming language (ideally Python), big data engineering, web services, and cloud providers (ideally AWS), and who is willing to build many diverse things to develop the platform.
You will work on building various data platform components with us, listening to user feedback, and proactively changing things for the better. Here are some of the projects you may be involved in:
* Implement a data model for our tables (bronze, silver, gold) to encapsulate parsing logic, lower costs, and provide better abstractions (see the first sketch after this list).
* Power our recommendation system with a fast, global data store.
* Create a framework or service that regularly checks data integrity and quality (see the second sketch after this list).
* Improve the build system in our repositories to speed up test execution and job deployment, thus improving developer productivity.
* Optimize Spark pipelines, ClickHouse tables/queries, and stream processing in AWS Lambda.
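For the bronze/silver/gold project above, a hedged sketch of the idea: one small class owns the table paths and the parsing rules, so downstream jobs consume typed silver data instead of re-parsing raw payloads. Every name, path, and schema here is hypothetical.

```python
# Hypothetical medallion-style (bronze/silver/gold) table model: parsing logic
# is encapsulated in one place so downstream jobs never touch raw JSON payloads.
from pyspark.sql import SparkSession, DataFrame, functions as F

spark = SparkSession.builder.appName("medallion_sketch").getOrCreate()

class EventsTable:
    """Invented example: owns paths plus the bronze -> silver parsing rules."""

    bronze_path = "s3://example-bucket/bronze/events/"  # raw, as landed
    silver_path = "s3://example-bucket/silver/events/"  # parsed and typed

    def bronze(self) -> DataFrame:
        return spark.read.format("delta").load(self.bronze_path)

    def to_silver(self, bronze: DataFrame) -> DataFrame:
        # The single home for parsing logic, instead of copies in every job.
        parsed = F.from_json("raw_payload", "item_id STRING, price DOUBLE")
        return bronze.withColumn("payload", parsed).select(
            "event_id", "event_date", "payload.*"
        )

    def refresh_silver(self) -> None:
        self.to_silver(self.bronze()).write.format("delta").mode(
            "overwrite"
        ).save(self.silver_path)
```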
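And for the data-quality project, a sketch of the shape such a framework could take (all names, columns, and thresholds are invented; a real version might also build on an off-the-shelf tool): each check returns a result that could be exported to Prometheus or routed to PagerDuty on failure.

```python
# Hypothetical data-quality checks: names, columns, and thresholds are invented.
from dataclasses import dataclass
from datetime import datetime, timedelta
from pyspark.sql import DataFrame, functions as F

@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str

def no_nulls(df: DataFrame, column: str) -> CheckResult:
    nulls = df.filter(F.col(column).isNull()).count()
    return CheckResult(f"no_nulls:{column}", nulls == 0, f"{nulls} null rows")

def fresh(df: DataFrame, ts_column: str, max_lag_hours: int) -> CheckResult:
    # Spark returns naive datetimes; this assumes a UTC session timezone.
    latest = df.agg(F.max(ts_column)).first()[0]
    ok = latest is not None and datetime.utcnow() - latest < timedelta(hours=max_lag_hours)
    return CheckResult(f"fresh:{ts_column}", ok, f"latest={latest}")

def run_checks(df: DataFrame, checks: list) -> list[CheckResult]:
    results = [check(df) for check in checks]
    for r in results:
        if not r.passed:
            print(f"FAILED {r.name}: {r.detail}")  # in reality: alert, not print
    return results

# Usage sketch:
# run_checks(events_df, [lambda df: no_nulls(df, "event_id"),
#                        lambda df: fresh(df, "ingested_at", max_lag_hours=6)])
```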