Data Engineer (AWS/PySpark)

This job is no longer open

ABOUT THE ROLE:

Data is the driver for our future at Cars. We’re searching for a collaborative, analytical, and innovative engineer to build scalable, highly performant platforms, systems, and tools that enable innovation with data. If you are passionate about building large-scale systems and data-driven products, we want to hear from you.

Responsibilities Include:

  • Build data pipelines and derive insights from the data using advanced analytics, streaming, and machine learning at scale
  • Work within a dynamic, forward-thinking team environment where you will design, develop, and maintain mission-critical, highly visible Big Data and Machine Learning applications
  • Build, deploy, and support data pipelines and ML models in production
  • Work in close partnership with other Engineering teams, including Data Science, and with cross-functional teams such as Product Management and Product Design
  • Mentor others on the team and share your knowledge across the Cars.com organization

Required Skills

  • Ability to develop Spark jobs to cleanse/enrich/process large amounts of data.
  • Ability to develop jobs to read from various source systems such as Kafka, databases, APIs, and files.
  • Experience tuning Spark jobs for efficient performance, including execution time and memory usage.
  • Sound understanding of various file formats and compression techniques.
  • Experience with source code management systems such as GitHub and with developing CI/CD pipelines for data using tools such as Jenkins.
  • Deep understanding of the entire architecture for a major part of the business, with the ability to articulate its scaling and reliability limits; able to design, develop, and debug at an enterprise level, and to design and estimate at a cross-project level.
  • Ability to mentor developers and lead projects of medium to high complexity.
  • Excellent communication and collaboration skills.
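To make the first skills bullets concrete: a cleanse/enrich/deduplicate job of the kind described above might follow the shape sketched below. This is an illustration, not part of the posting; it uses plain Python over rows-as-dicts since a running Spark cluster can't be assumed here, and every field and helper name (`cleanse`, `enrich`, `listing_id`, `price_band`) is hypothetical. In PySpark the same steps would map to `filter`, `withColumn`, and `dropDuplicates` on a DataFrame.

```python
# Illustrative sketch of a cleanse -> enrich -> deduplicate data job.
# Plain Python stands in for Spark DataFrame operations; all names are hypothetical.

def cleanse(rows):
    """Drop rows missing the primary key and normalize string fields."""
    return [
        {k: v.strip().lower() if isinstance(v, str) else v for k, v in row.items()}
        for row in rows
        if row.get("listing_id") is not None
    ]

def enrich(rows, price_bands):
    """Add a derived price_band column from a lookup table
    (in Spark this would typically be a broadcast join)."""
    out = []
    for row in rows:
        band = next(
            (name for name, (lo, hi) in price_bands.items() if lo <= row["price"] < hi),
            "unknown",
        )
        out.append({**row, "price_band": band})
    return out

def deduplicate(rows, key="listing_id"):
    """Keep the first row seen per key (dropDuplicates in Spark)."""
    seen, out = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            out.append(row)
    return out

raw = [
    {"listing_id": 1, "make": "  Honda ", "price": 18000},
    {"listing_id": 1, "make": "Honda", "price": 18000},    # duplicate key
    {"listing_id": None, "make": "Ford", "price": 25000},  # missing key
    {"listing_id": 2, "make": "TOYOTA", "price": 32000},
]
bands = {"economy": (0, 20000), "mid": (20000, 40000)}
result = deduplicate(enrich(cleanse(raw), bands))
```

At scale, each stage would run as a narrow transformation over a partitioned DataFrame, which is where the tuning concerns above (execution time, memory) come in.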

Required Experience

  • Data Engineering | 1-2 years of designing and developing complex applications at enterprise scale, specifically with Python, PySpark, and/or Scala.
  • Big Data Ecosystem | 2 years of hands-on, professional experience with tools and platforms such as Spark, EMR, and Kafka.
  • AWS Cloud | 1+ years of professional experience developing Big Data applications in the cloud, specifically on AWS.
  • Bachelor's degree in Computer Science, Engineering, or a related field.

Preferred:

  • Experience with developing REST APIs.
  • Experience in deploying ML models into production and integrating them into production applications for use.
  • Understanding of Machine Learning products.
  • Experience developing real-time data analytics using Spark Streaming, Kafka, etc.

#LI-KO1 #LI-Remote
