Data Engineer, Baseball Research & Development

Summary:

The Washington Nationals are looking to hire a software engineer focusing on data engineering and infrastructure to join our Baseball R&D group. The data engineer will help ensure our datasets are well organized and accessible for our analysts and web developers. We are looking for candidates who are passionate about building impactful solutions around data workflows and enthusiastic about working in a baseball front office.

The Washington Nationals Baseball R&D group is responsible for deriving insights from our baseball datasets and building proprietary metrics and data products that inform baseball decision-making throughout our organization.

We prefer candidates who are willing to relocate to the Washington, DC area for in-person or hybrid work at Nationals Park, but we are willing to consider a fully remote option for exceptional candidates.

Essential Duties and Responsibilities:

  • Build robust data imports that pull data from a variety of sources (HTTP APIs, cloud object stores like AWS S3, relational databases) and write to our internal data systems.
  • Develop validation processes to monitor data quality and flag potential sources of error.
  • Help with the deployment, orchestration, and monitoring of our data pipelines. We use Prefect for orchestration, utilizing Docker and AWS ECS.
  • Design and build solutions to make working with our internal datasets easier. This work includes maintaining database tables and views, merging datasets from different sources for easier access, and possibly building internal APIs or other microservices to make data more accessible.
  • Assist with the maintenance of our cloud computing infrastructure: manage and configure servers, databases, and other internal tools.
  • Research and advocate for any new tooling that can aid in timely, accurate, and accessible data delivery.
  • Write documentation.
  • Participate in code reviews.

Requirements:

Minimum Education and Experience Requirements

  • Bachelor’s degree in computer science, computer engineering, information science, or a related field.
  • 3+ years of relevant work experience.

Knowledge, Skills, and Abilities necessary to perform essential functions

  • Fluent in Python and experienced with pandas.
  • Proficient with MySQL, PostgreSQL, or other relational database systems.
  • Experience with Docker.
  • Comfortable working on the command line in a Linux environment.
  • Experience using git for version control.
  • Some experience with R is preferred, but not required.
  • Ability to work independently with close attention to detail.
  • Enthusiastic about working in baseball.
  • Authorized to work in the United States.

Physical/Environmental Requirements

  • Office: Working conditions are normal for an office environment. The role may require occasional weekend and/or evening work.

Our Stack

  • We write most of our imports in Python, using Prefect for orchestrating our data workflows, which are dockerized and deployed on AWS ECS. Our analysts work mostly in R.
  • Our servers run Ubuntu Linux.
  • We utilize several AWS services, primarily EC2, RDS, S3, ECS, Batch, and EFS.
  • We use MySQL, PostgreSQL, and MongoDB for our databases. We also leverage SQLite, DuckDB, and DynamoDB for certain applications.
  • We use Terraform, Ansible, and Packer for managing our infrastructure.
  • We use a self-hosted GitLab instance for our code repositories and for CI/CD.

All applicants for employment at the Washington Nationals are required to be fully vaccinated against COVID-19 prior to commencing employment. Applicants who receive a conditional offer of employment will be required to produce proof of vaccination status prior to their first day of employment. Applicants with qualifying disabilities or bona fide religious objections may be exempted from this requirement or otherwise accommodated if they are unable to be vaccinated.
