Data Engineer

This job is no longer open

How you will help
You will support the engineering team’s data endeavors, diving in to fix issues, optimize processes, and automate what you do more than once. You’ll use the best tools for the job, whether modern and revolutionary or time-tested and proven, to deliver elegant, scalable solutions that meet business and technical needs.

What you will do
• Work with internal stakeholders to load data into HealthVerity's data warehouse
• Troubleshoot and resolve issues relating to data integrity
• Help establish procedures and best practices for transforming and storing data
• Lead requirements gathering around data pipeline automation improvements
• Work with some of the most exciting open-source tools, such as Spark, Hadoop, Docker, Airflow, and Zeppelin
• Leverage distributed computing and serverless architecture, such as AWS EMR and AWS Lambda, to develop pipelines for transforming data
• Enjoy the peace that comes with working in a mature software development environment
• Marvel at the speed with which your creation makes it into production
• Research and implement new technologies with a team of developers to execute strategies and implement solutions
• Produce peer-reviewed, quality software
• Solve complex problems related to the real-time discovery of large data

About you
• Experienced in writing scalable applications on distributed architectures
• Data driven, testing and measuring as much as you can
• Eager to both review peer code and have your code reviewed
• Comfortable on the command line and consider it an essential tool
• Confident in SQL: you know it, you write smart queries, and it’s no big deal

Required skills and experience
• 5+ years of work experience
• 3+ years of experience with Python and Scala
• 3+ years of experience with PySpark and Spark-SQL (writing, testing, and debugging Spark routines)
• 1+ years of experience with AWS EMR and AWS S3
• Comfortable using AWS CLI and boto3
• Comfortable working in remote environments
• Comfortable using *nix command line (shell scripting, AWK, SED)
• Experience with MySQL and Postgres

Bonus experience
• Experience with Apache Airflow
• Experience with Apache Zeppelin
• Experience with healthcare data

About HealthVerity
Pharmaceutical manufacturers, payers and government organizations have partnered with HealthVerity to solve some of their most complicated use cases through transformative technologies and real-world data infrastructure. The HealthVerity IPGE platform, based on the foundational elements of Identity, Privacy, Governance and Exchange, enables the discovery of RWD across the broadest healthcare data ecosystem, the building of more complete and accurate patient journeys and the ability to power best-in-class analytics and applications with flexibility and ease. Together with our partners, HealthVerity has built the modern way to data for the health insights economy. To learn more about the HealthVerity IPGE platform, visit www.healthverity.com.

Our company challenges
• Empowering clients with highly rewarding data discovery and licensing tools
• Ingesting and managing billions of healthcare records from a wide variety of partners
• Standardizing on common data models across data types
• Orchestrating an industry-leading HIPAA privacy layer
• Innovating our proprietary de-identification and data science algorithms
• Building a culture that supports rapid iteration and new possibilities

We have big plans
The infrastructure and culture we are building will provide an environment that cultivates innovation. We want to move fast knowing we can fix anything we break along the way. If a new need arises, we want to turn around a solution quickly. We want to solve our challenges in ways that create even more possibilities. We’ve created a platform that will scale to support an ever-growing array of data providers and innovative products and services. You must be able to think big while still delivering on near-term requirements.

HealthVerity is an equal opportunity employer devoted to inclusion in the workplace. We believe incorporating different ideas, perspectives and backgrounds makes us stronger and encourages an environment where ageism, racism, sexism, ableism, homophobia, transphobia or any other form of discrimination is not tolerated. At HealthVerity, we’re working towards an innovative and connected future for healthcare data and believe the future is better together. We can only do that if everyone has a seat at the table. Read our Equity, Inclusion and Diversity Statement.

If you require a reasonable accommodation in completing this application, interviewing, completing any pre-employment testing, or otherwise participating in the employee selection process, please direct your inquiries to careers@healthverity.com.

HealthVerity offers in-office and remote options, so you can work from anywhere within the US!