Lead (Staff) Data Engineer

This job is no longer open
GoodRx is America’s healthcare marketplace. Each month, millions of people visit goodrx.com to find reliable health information and discounts for their healthcare, and we’ve helped people save $30 billion since 2011. We provide prescription discounts that are accepted at more than 70,000 pharmacies in the U.S., as well as telehealth services including doctor visits and lab tests. Our services have been positively reviewed by Good Morning America, The New York Times, NBC News, AARP, and many others.

Our goal is to help Americans find convenient and affordable healthcare. We offer solutions for consumers, employers, health plans, and anyone else who shares our desire to provide affordable prescriptions to all Americans.

We’re committed to growing and empowering a more inclusive community within our company and industry. That’s why we hire and cultivate diverse teams of the best and brightest from all backgrounds, experiences, and perspectives. We believe that true innovation happens when everyone has a seat at the table and the tools, resources, and opportunities to excel.

With that said, research shows that women and other underrepresented groups apply only if they meet 100% of the criteria. GoodRx is committed to leveling the playing field, and we encourage women, people of color, those in the LGBTQ+ communities, and Veterans to apply for positions even if they don’t check every box outlined in the job description. Please still get in touch - we’d love to connect and see if you could be a good fit for the role!

About the Role
GoodRx is looking for extremely smart and curious data engineers who are deft at working with a wide variety of languages, such as Python and SQL, and a variety of raw data formats, such as Parquet and CSV, in a fast-paced and friendly environment. You will collaborate with teams across GoodRx to build outstanding data pipelines and processes that stitch together complex sets of data stores in order to guide enterprise data decisions.

Responsibilities:

    • Collaborate with product managers, data scientists, data analysts and engineers to define requirements and data specifications.
    • Develop, deploy and maintain data processing pipelines using cloud technology such as AWS, Kubernetes, Airflow, Redshift and EMR.
    • Develop, deploy and maintain serverless data pipelines using Kafka, EventBridge, Kinesis, AWS Lambda, S3 and Glue.
    • Define and manage overall schedule and availability for a variety of data sets.
    • Work closely with other engineers to enhance infrastructure and improve reliability and efficiency.
    • Make smart engineering and product decisions based on data analysis and collaboration.
    • Act as an in-house data expert and make recommendations regarding standards for code quality and timeliness.
    • Lead and architect cloud-based data infrastructure solutions to meet stakeholder needs.

Skills & Qualifications:

    • Bachelor’s degree in analytics, statistics, engineering, math, economics, computer science, information technology or related discipline.
    • 9+ years of professional experience in the big data space.
    • 7+ years of experience in engineering data pipelines using big data technologies (Kafka, Spark, Scala, etc.) on large-scale data sets.
    • Expert knowledge in writing complex SQL and ETL development with experience processing extremely large datasets.
    • Expert in applying slowly changing dimension (SCD) types on an S3 data lake using Delta Lake/Hudi.
    • Demonstrated ability to analyze large data sets to identify gaps and inconsistencies, provide data insights, and advance effective product solutions.
    • Deep familiarity with AWS services (S3, EventBridge, Glue, EMR, Redshift, Lambda).
    • Ability to quickly learn complex domains and new technologies.
    • Innately curious and organized, with the drive to analyze data to identify deliverables, anomalies and gaps and to propose solutions that address these findings.
    • Thrives in a fast-paced startup environment.

Good To Have:

    • Experience with customer data platform tools such as Segment.
    • Experience using Jira, GitHub, Docker, Codefresh and Terraform.
    • Experience contributing to full lifecycle deployments with a focus on testing and quality.
    • Experience with data quality processes, data quality checks, validations, data quality metrics definition and measurement.

GoodRx is America's healthcare marketplace. The company offers the most comprehensive and accurate resource for affordable prescription medications in the U.S., gathering pricing information from thousands of pharmacies coast to coast, as well as a telehealth marketplace for online doctor visits and lab tests. Since 2011, Americans with and without health insurance have saved $30 billion using GoodRx, and millions of consumers visit goodrx.com each month to find discounts and information related to their healthcare. GoodRx is the #1 most downloaded medical app on the iOS and Android app stores. For more information, visit www.goodrx.com.