Data Engineer

This job is no longer open

The data.world team is seeking an experienced data engineer to help build and maintain our modern analytics stack: the ELT pipelines, reusable data models, insights, and queries that serve both the company and our customers.

As a member of our Product Science team, you will establish yourself as a key expert and evangelist on our data, working cross-functionally to support data self-service and generate reusable data models and assets. The data assets created by this team make it easy to reliably generate insights about our business in support of key initiatives. You’ll develop customer-facing data products that become a part of the platform itself. You will build and iterate on a modern, SaaS-based event collection, data warehouse, analytics, visualization, and catalog infrastructure stack.

This role is a balance of understanding what makes our business and our customers’ businesses tick, putting data modeling and pipelining skills to work, generating analysis and insights, and telling the data story. We’re looking for someone who is able to effectively communicate and collaborate with others, and is passionate about working with data.

What you’ll do:

  • Collaborate with business groups at data.world to ask thoughtful questions and gather requirements around key insights and reporting needs.
  • Engineer lasting solutions that turn application data and streaming events into self-service reports and insights using AWS Cloud Services and SaaS tools.
  • Develop and maintain a suite of production SQL data models using dbt to transform our streaming and transactional data into a form suitable and efficient for analytics and data science work.
  • Administer and optimize our Snowflake data warehouse and BI infrastructure.
  • Evolve our star schema and data flows to ensure utility and performance. Diagram and document as needed to support understanding and troubleshooting.
  • Write, maintain, and troubleshoot SQL jobs in support of ETL/ELT, BI analysis, reporting, and visualization.
  • Build and maintain an internal data catalog, including data dictionaries, a glossary, and curated datasets, to support easy self-service by the rest of the company.
  • Develop BI data reports, visualizations, and queries to measure our KPIs and support the success of our SaaS business.
  • Implement new productized data and analytics capabilities in support of customers understanding their usage and improving data governance.
  • Conduct evidence-based investigations and draw actionable conclusions in support of company and team goals and overall product success.
  • Be a steward and evangelist for a data-driven culture and data best practices within the company.
  • Be customer zero, using our product and providing feedback as one of the key target users data.world is built for.
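
The star-schema modeling and BI rollups described above can be sketched with a toy fact/dimension pair. This is an illustrative example only: sqlite3 stands in for a cloud warehouse such as Snowflake, and the table and column names (`dim_user`, `fct_query_events`) are hypothetical, not data.world's actual schema.

```python
import sqlite3

# Minimal star-schema sketch: one fact table keyed to one dimension.
# sqlite3 stands in for a cloud warehouse; names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_user (
    user_key INTEGER PRIMARY KEY,
    plan     TEXT NOT NULL
);
CREATE TABLE fct_query_events (
    event_id  INTEGER PRIMARY KEY,
    user_key  INTEGER NOT NULL REFERENCES dim_user (user_key),
    ran_at    TEXT NOT NULL,
    row_count INTEGER NOT NULL
);
""")
conn.executemany("INSERT INTO dim_user VALUES (?, ?)",
                 [(1, "free"), (2, "enterprise")])
conn.executemany("INSERT INTO fct_query_events VALUES (?, ?, ?, ?)",
                 [(10, 1, "2024-01-01", 5),
                  (11, 2, "2024-01-01", 120),
                  (12, 2, "2024-01-02", 80)])

# A typical BI rollup: events and total rows scanned per plan.
rows = conn.execute("""
    SELECT d.plan, COUNT(*) AS events, SUM(f.row_count) AS total_rows
    FROM fct_query_events f
    JOIN dim_user d USING (user_key)
    GROUP BY d.plan
    ORDER BY d.plan
""").fetchall()
print(rows)  # [('enterprise', 2, 200), ('free', 1, 5)]
```

In a real warehouse the fact table would be append-only event data and the dimension tables would be maintained by dbt models; the join-and-aggregate pattern is the same.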

Our data stack:

  • Segment as our customer data platform
  • AWS cloud services and Stitch/Fivetran for extraction and load jobs
  • Dagster for workflow orchestration
  • Snowflake and dbt for in-warehouse transformations
  • data.world for data catalog, governance, and collaboration
  • Tableau as our data serving and BI layer
  • CircleCI for CI/CD
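
The flow through this stack can be sketched as a small dependency graph: extract/load jobs feed the in-warehouse dbt transform, which in turn feeds the catalog and BI layers. Purely illustrative: Python's stdlib `graphlib` stands in for an orchestrator like Dagster, and the step names (`load_segment`, `load_stitch`, `transform_dbt`, etc.) are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical steps mirroring the stack: extract/load jobs feed the dbt
# transform, which feeds the catalog publish and BI refresh steps.
# Each key maps a step to the set of steps it depends on.
dag = {
    "transform_dbt": {"load_segment", "load_stitch"},
    "publish_catalog": {"transform_dbt"},
    "refresh_tableau": {"transform_dbt"},
}

ran = []
for step in TopologicalSorter(dag).static_order():
    ran.append(step)  # a real orchestrator would execute the job here

print(ran)
```

An orchestrator such as Dagster adds scheduling, retries, and observability on top of this same topological-ordering idea.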

Experience and capabilities you have:

  • 2+ years of core data engineering experience writing production-grade ELT jobs using scripting languages such as Python and workflow orchestration tools such as Airflow, Dagster, Prefect, Luigi, etc.
  • 2+ years of experience writing production-grade SQL and working with modern data warehouses such as Snowflake, BigQuery, or Redshift.
  • 2+ years of experience deploying cloud resources using Infrastructure as Code (IaC) tools such as AWS CloudFormation, Terraform, etc.
  • Familiarity with dbt (data build tool) for cloud data warehouse transformations.
  • Strong data modeling experience using star schema or other data modeling patterns.
  • Strong interpersonal skills and experience interfacing with others both inside and outside the company.
  • Good communication and presentation skills with the ability to explain concepts and conclusions around data and insights in a clear, concise, and compelling way.

Big pluses:

  • Hands-on experience running dbt in production, whether in dbt Cloud or self-hosted environments.
  • Experience developing data visualizations and choosing the best way to present information using BI tools such as Tableau, Looker, etc.
  • Experience maintaining production-grade CI/CD pipelines.
  • Experience with Operational Analytics or Reverse ETL tools such as Hightouch, Census, etc.
  • Experience working with streaming data infrastructure such as AWS Kinesis, Kafka, Materialize, etc.
  • Experience working in SaaS or enterprise software companies in the data or analytics space.

Outer Join is the premier job board for remote jobs in data science, analytics, and engineering.