Data Engineer

This job is no longer open

YOUR ROLE AND MISSION:

As a data engineer, you will be responsible for maintaining and operating the data warehouse and connecting Apollo's data sources to it.

DAILY ADVENTURES/RESPONSIBILITIES:

  • Develop and maintain scalable data pipelines and build new integrations to support continuing increases in data volume and complexity.
  • Implement automated monitoring, alerting, and self-healing (restartable, gracefully failing) features while building the consumption pipelines.
  • Implement processes and systems to monitor data quality, ensuring production data is always accurate and available.
  • Write unit and integration tests, contribute to the engineering wiki, and document your work.
  • Define company data models and write jobs to populate data models in our data warehouse.
  • Work closely with all business units and engineering teams to develop a strategy for long-term data platform architecture.

COMPETENCIES:

  • Excellent communication skills to work with engineering, product, and business owners to develop and define key business questions and build data sets that answer those questions.
  • Self-motivated and self-directed
  • Inquisitive, able to ask questions and dig deeper
  • Organized and diligent, with great attention to detail
  • Acts with the utmost integrity
  • Genuinely curious and open; loves learning
  • Critical thinking and proven problem-solving skills required

QUALIFICATIONS:

Required:

  • Bachelor's degree in a quantitative field (Physical / Computer Science, Engineering or Mathematics / Statistics)
  • Experience in data modeling, data warehousing, and building ETL pipelines
  • Deep knowledge of data warehousing with an ability to collaborate cross-functionally

Preferred:

  • 2+ years of experience in data engineering or a similar role
  • Experience using the Python data stack
  • Experience deploying and managing data pipelines in the cloud
  • Experience working with technologies such as Airflow, Hadoop, and Spark
  • Understanding of streaming technologies such as Kafka and Spark Streaming
