Data Engineer

This job is no longer open

About the role:

DataCamp, a data-driven organization, runs a data lake on GCP BigQuery, with reports created in Metabase, Power BI, and custom Shiny applications. These reports help the company's teams and leadership take action on the data. DataCamp's Airflow cluster runs a thousand tasks each day, from data-ingestion pipeline tasks to data-processing tasks that provide the datasets and data models used by DataCamp's team of data scientists.

To facilitate data processing, we have a highly automated pipeline built with Terraform and Ansible that provisions the infrastructure for all data engineering tooling. This allows DataCamp's data scientists and customers to receive the latest datasets, refreshed daily. Through good documentation and continuous improvement, we want to keep enhancing the data engineering capability at DataCamp.

As part of the cross-functional Infrastructure and Data team, you will work directly with the senior data engineer and the data science team on all of the business's data engineering initiatives. You will learn to maintain and create data pipelines, manage the company-wide shared data resources that support our data architecture, and build on our internal processes, with the creative freedom to shape the processes and roadmap for data engineering at DataCamp.

The team has a strong bias towards providing self-serve systems for deployment and infrastructure provisioning, and aims to support other teams in using these services, keeping them available and functional rather than becoming a central bottleneck in the company. Under the guidance of our senior data engineer, you will play a key part in planning future improvements and owning your day-to-day work.

Besides bringing data engineering skills to DataCamp, you will be adept at writing Python, understand how to author data models, and have a passion for data science as well as for data management, governance, and security on the platform (Python, R, SQL, ..). An evolution towards regional deployment models is envisioned and will be pivotal for the growth of DataCamp and its data engineering capability.

The ideal candidate:

  • Is an experienced infrastructure or DevOps engineer looking to transition into a Data Engineering role
  • Has some experience with data warehousing (e.g., BigQuery or Snowflake) and data engineering tools (e.g., Airflow, Metabase, Fivetran)
  • Has 1+ years of administering/maintaining DevOps related tools (AWS, Docker, CI/CD, K8s)
  • Can develop in Python
  • Has excellent oral and written communication skills
  • Is interested in understanding and scaling complex data pipelines
  • Is interested in monitoring and self healing systems
  • Is highly organized with a flexible, can-do attitude and a willingness/aptitude for learning
  • Improves the team with code reviews, technical discussions and documentation
  • Is able to work collaboratively in teams and develop meaningful relationships to achieve common goals

It's a plus if:

  • You have an entrepreneurial spirit.
  • You have experience with dbt
  • You have experience with infrastructure as code (Terraform, Ansible, etc.)
  • You have experience with API gateways or service meshes (Kong, Istio, etc.)
  • You are passionate about data science and education
Outer Join is the premier job board for remote jobs in data science, analytics, and engineering.