Automattic

San Francisco
1,001-5,000 employees
We are the people behind WordPress.com, WooCommerce, Jetpack, Simplenote, Longreads, VaultPress, Akismet, Gravatar, Crowdsignal, Cloudup, Tumblr, and more.

Data Engineer, Openverse

This job is no longer open

Primary responsibilities

Architect, build and maintain the existing open search catalog, including:

  • Ingesting content from new and existing sources of CC-licensed and public domain works, including images, audio, and video.
  • Scaling the catalog to support billions of records and various media types.
  • Implementing resilient, distributed data solutions that operate at web scale.
  • Automating data pipelines and workflows.

Technical Requirements

  • Experience building and deploying data services at scale, including: 
    • database design and modeling
    • ETL processing
    • performance optimization
  • Proficiency with Python
  • Experience with Apache Airflow
  • Proficiency with Apache Spark
  • Experience with AWS or other cloud computing platforms

Nice to have (but not required):

  • Experience with contributing to or maintaining open source software
  • Experience with web crawling
  • Experience with Docker

How to apply?

Does this sound exciting? If yes, click the Apply button below and fill out our application form. In your cover letter, let us know what you can contribute to the team. Proofread! Make sure you spell and capitalize WordPress and Automattic accurately. We are lucky to receive many applications for this position, so try to make your application stand out.

Please include an answer to the following question in your cover letter:

  • Tell us about an interesting technical problem that you’ve worked on. What made it enjoyable?

Applicants who do not include an answer to this question in their cover letter will not be considered.

About Automattic

We are the people behind WordPress.com, WooCommerce, Jetpack, Tumblr, Simplenote, Longreads, and more. We believe in making the web a better place.

We’re a distributed company with more than 1400 Automatticians in 80+ countries speaking 90+ different languages. Our common goal is to democratize publishing so that anyone with a story can tell it, regardless of income, gender, politics, language, or where they live in the world.

We believe in Open Source and the vast majority of our work is available under the GPL.

Diversity, Equity, and Inclusion at Automattic

We’re improving diversity, equity, and inclusion in the tech industry. At Automattic, we want people to love their work and show respect and empathy to all. We welcome differences and strive to increase participation from traditionally underrepresented groups. Our DEI committee involves Automatticians across the company and drives grassroots change. For example, this group has helped facilitate private online spaces for affiliated Automatticians to gather and helps run a monthly DEI People Lab series for further learning. Diversity, equity, and inclusion is a priority at Automattic, and our dedication influences far more than just Automatticians: we make our products freely available, translate them into numerous languages, and offer customer support in those languages. We require unconscious bias training for our hiring teams and ensure our products are accessible across different bandwidths and devices. Learn more about our dedication to diversity, equity, and inclusion and our Employee Resource Groups.
