Data Engineer

This job is no longer open

Mission

Speechify is the easiest way to listen to the world’s information. Articles on the web, documents in the cloud, books on your phone. We absorb it all and let you listen to it at your desk, on the go, at your own speed, and with tools that make learning easier, deeper, and faster.

What streaming services have done for audio entertainment, we’re doing for audio information. And whatever we’re doing seems to be working. We’re #1 in our category and experiencing exponential growth.

Overview

As a Data Engineer at Speechify, you will play a crucial role in designing, developing, and maintaining our data infrastructure. You will work closely with cross-functional teams, including data scientists, analysts, and software engineers, to ensure the availability and accessibility of high-quality data for various business needs.

What You’ll Do

  • Design, implement, and maintain scalable data pipelines to ingest, transform, and load data from various sources into our data warehouse.
  • Collaborate with data scientists and analysts to understand their data requirements and ensure data availability and quality.
  • Optimize and tune existing data pipelines for performance and efficiency.
  • Build and maintain ETL (Extract, Transform, Load) processes to ensure data is transformed and loaded accurately and on schedule.
  • Troubleshoot and resolve data pipeline issues and performance bottlenecks.
  • Implement data governance and data quality standards to ensure data accuracy and consistency.
  • Monitor and maintain data storage solutions, such as data warehouses, data lakes, and databases.
  • Stay up-to-date with industry best practices and emerging technologies in data engineering.

An Ideal Candidate Should Have 

  • Bachelor's degree in Computer Science, Information Technology, or a related field.
  • Experience as a Data Engineer or in a similar role in a data-intensive environment.
  • Strong proficiency in SQL for data extraction, transformation, and loading.
  • Experience with data warehousing solutions (e.g., Amazon Redshift, Google BigQuery, Snowflake) and cloud-based data platforms (e.g., AWS, Azure, GCP).
  • Proficiency in programming languages such as Python or Java for building data pipelines.
  • Experience with ETL tools and frameworks (e.g., Apache NiFi, Apache Airflow).
  • Knowledge of data modeling concepts and database design.
  • Familiarity with version control systems (e.g., Git) and collaborative development practices.
  • Excellent problem-solving and communication skills.
  • Ability to work collaboratively in a fast-paced, team-oriented environment.
  • Strong attention to detail and commitment to data accuracy and quality.

Preferred Qualifications

  • Advanced degree in Computer Science, Data Science, or a related field.
  • Experience with big data technologies (e.g., Hadoop, Spark).
  • Knowledge of containerization and orchestration tools (e.g., Docker, Kubernetes).
  • Familiarity with data security and compliance standards (e.g., GDPR, HIPAA).
  • Experience with real-time data processing and streaming technologies (e.g., Kafka, Apache Flink).
  • Certifications in relevant data engineering or cloud platforms.

What We Offer

  • A fast-growing environment where you can help shape the culture
  • An entrepreneurial crew that supports risk, intuition, and hustle
  • A hands-off approach so you can focus and do your best work
  • The opportunity to make an impact in a transformative industry
  • A competitive salary, a collegiate atmosphere, and a commitment to building a great asynchronous culture

Think you’re a good fit for this job? 

Tell us more about yourself and why you're interested in the role when you apply.
And don’t forget to include links to your portfolio and LinkedIn.

Not looking but know someone who would make a great fit? 

Refer them! 

Speechify is committed to a diverse and inclusive workplace. 

Speechify does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.
