IT Lead Data Engineer

This job is no longer open

Why Mayo Clinic

Mayo Clinic is top-ranked in more specialties than any other care provider according to U.S. News & World Report. As we work together to put the needs of the patient first, we are also dedicated to our employees, investing in competitive compensation and comprehensive benefit plans to take care of you and your family, now and in the future. And with continuing education and advancement opportunities at every turn, you can build a long, successful career with Mayo Clinic. You’ll thrive in an environment that supports innovation, is committed to ending racism and supporting diversity, equity and inclusion, and provides the resources you need to succeed.


As a member of the Data and Analytics organization, you will be responsible for building and delivering best-in-class clinical data initiatives and solutions. You will collaborate with analytics and business partners from product strategy, program management, IT, data strategy, and predictive analytics teams to develop effective solutions.

  • Lead the design, prototyping, and development of data pipeline architecture.
  • Lead implementation of internal process improvements: automating manual processes, optimizing data delivery, and redesigning infrastructure for greater scalability.
  • Lead root cause analysis on external and internal processes and data to identify opportunities for improvement and answer questions.
  • Apply excellent analytic skills to working with unstructured datasets.
  • Understand the architecture, be a team player, lead technical discussions, and communicate their outcomes.
  • Be a senior individual contributor on the Data or Software Engineering teams.
  • Be part of the Technical Review Board along with the Manager and Principal Engineer.
  • Be a technical liaison between the Manager, Software Engineers, and Principal Engineers.
  • Collaborate with software engineers to analyze, develop, and test functional requirements.
  • Write clean, maintainable code 30% of the time and perform peer code reviews.
  • Mentor and coach engineers.
  • Work with team members to investigate design approaches, prototype new technology, and evaluate technical feasibility.
  • Work in an Agile/SAFe/Scrum environment to deliver high-quality software.
  • Establish architectural principles, select design patterns, and then mentor team members on their appropriate application.
  • Facilitate and drive communication between front-end, back-end, data, and platform engineers.
  • Play a formal engineering lead role in the area of expertise.
  • Keep up to date with industry trends and developments.

Job Responsibilities:

  • Act as Product Owner for data platforms and lead work with the Executive, Product, Data, and Design teams to resolve data-related technical issues and support their data infrastructure needs.
  • Evaluate the full technology stack of services required including PaaS, IaaS, SaaS, DataOps, operations, availability, and automation. 
  • Research, design, and develop Public & Private Data Solutions, including impacts to enterprise architecture
  • Build high-performing clinical data processing frameworks leveraging Google Cloud Platform and GCP Shared Services like Google Healthcare API, BigQuery, and HL7 FHIR store.
  • Participate in evaluation of supporting technologies and industry best practices with our cloud partners and peer teams.
  • Lead Modern Data Warehouse solutions and sizing efforts to create defined plans and work estimates for customer proposals and Statements of Work.
  • Conduct full technical discovery, identifying pain points, business and technical requirements, and “as is” and “to be” scenarios.
  • Design and Develop clinical data pipelines integrating ingestion, harmonization, and consumption frameworks for onboarding clinical data from various data sources formatted in various industry standards (FHIR, C-CDA, HL7 V2, JSON, XML, etc.).
  • Build state-of-the-art data pipelines supporting both batch and real-time streams to enable Clinical data collection, storage, processing, transformation, aggregation, and dissemination through heterogeneous channels.
  • Build design specifications for health care data objects and surrounding data processing logic.
  • Lead innovation and research building proof of concepts for complex transformations, notification engines, analytical engines, and self-service analytics
  • Bring a DevOps mindset to enable big data and batch/real-time analytical solutions that leverage emerging technologies.


Bachelor’s Degree in Computer Science/Engineering or related field with 6 years of experience OR an Associate’s degree in Computer Science/Engineering or related field with 8 years of experience.

Knowledge of professional software engineering practices and best practices for the full software development life cycle (SDLC), including coding standards, code reviews, source control management, build processes, testing, and operations. In-depth knowledge of data engineering and building data pipelines, with a minimum of 5 years of experience in data engineering, data science, or analytical modeling and basic knowledge of related disciplines. Experience working on and leading data engineering teams in a Continuous Integration / Continuous Delivery model. Experience building or leading highly resilient data products. Experience building or leading test automation suites, unit testing coverage, data quality, and monitoring and observability. A minimum of 5 years of experience using relational and NoSQL databases. Experience with cloud platforms such as GCP, Azure, or AWS.

Experience with Continuous Integration using Jenkins, GitHub Actions, or Azure Pipelines. Experience with cloud technologies, development, and deployment. Experience with tools like Jira, GitHub, SharePoint, and Azure Boards. Experience using advanced data processing solutions/capabilities such as Apache Spark, Hive, Airflow, Kafka, and GCP Dataflow. Experience using big data and statistics, and knowledge of data-related aspects of machine learning. Experience with Google BigQuery, FHIR APIs, and Vertex AI. Knowledge of how workflow scheduling solutions such as Apache Airflow and Google Composer relate to data systems. Knowledge of using infrastructure as code (Kubernetes, Docker) in a cloud environment.

  • Hands-on experience in architecture, design, and development of enterprise data applications and analytics solutions within the health care domain
  • Experience in Google Cloud Platform/Shared Services such as Cloud Dataflow, Cloud Storage, Pub/Sub, Cloud Composer, BigQuery, and Healthcare API (FHIR store)
  • Ability to deliver an ingestion framework for relational data sources, understand the layers and rules of a data lake, and carry out all the tasks needed to operationalize data pipelines.
  • Experience in Python, Java, Spark, Airflow, and Kafka development
  • Hands-on experience working with “Big Data” technologies and experience with traditional RDBMS, Python, Unix Shell scripting, JSON, and XML
  • Experience working with tools to automate CI/CD pipelines (e.g., Jenkins, Git)
  • Must have great articulation and communication skills.
  • Comfortable working in a fluid environment, defining and owning priorities that adapt to our larger goals. You can bring clarity to ambiguity while remaining open-minded to new information that might change your mind.
  • Should have a strong understanding of healthcare data, including clinical data in proprietary and industry-standard formats.
  • Participate in architectural discussions and perform system analysis, including review of existing systems and operating methodologies. Participate in the analysis of new technologies and suggest solutions best suited to current requirements that will also simplify future modifications.
  • Design appropriate data models for use in transactional and big data environments as input to machine learning processing.
  • Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability
  • Design and Build the necessary infrastructure for optimal ETL from a variety of data sources to be used on GCP services
  • Collaborate with multiple stakeholders including Product Teams, Data Domain Owners, Infrastructure, Security and Global IT
  • Identify, implement, and continuously enhance the data automation process
  • Develop proper Data Governance and Data Security
  • Demonstrate strategic thinking and strong planning skills to establish long term roadmap and business plan
  • Work with stakeholders to establish and meet data quality requirements, SLAs and SLOs for data ingestion
  • Experience in Self-service Analytics/Visualization tools like PowerBI, Looker, Tableau
  • Proven knowledge in implementing security & IAM requirements
  • Experience building and maintaining a data lake with Delta Lake
  • Experience with ETL/ELT/Data Mesh frameworks
  • Experience with GCP Dataplex (Data Catalog, Clean Rooms)

Authorization to work and remain in the United States, without necessity for Mayo Clinic sponsorship now, or in the future (for example, be a U.S. Citizen, national, or permanent resident, refugee, or asylee). Also, Mayo Clinic does not participate in the F-1 STEM OPT extension program.

Exemption Status


Compensation Detail

$138,236.80 - $200,408.00 / year

Benefits Eligible



Full Time

Hours/Pay Period


Schedule Details

Monday - Friday, 8:00 am - 5:00 pm

Weekend Schedule

As needed

International Assignment


Site Description

Just as our reputation has spread beyond our Minnesota roots, so have our locations. Today, our employees are located at our three major campuses in Phoenix/Scottsdale, Arizona; Jacksonville, Florida; and Rochester, Minnesota; at Mayo Clinic Health System campuses throughout Midwestern communities; and at our international locations. Each Mayo Clinic location is a special place where our employees thrive in both their work and personal lives. Learn more about what each unique Mayo Clinic campus has to offer, and where your best fit is.

Affirmative Action and Equal Opportunity Employer

As an Affirmative Action and Equal Opportunity Employer, Mayo Clinic is committed to creating an inclusive environment that values the diversity of its employees and does not discriminate against any employee or candidate. Women, minorities, veterans, people from the LGBTQ communities, and people with disabilities are strongly encouraged to apply to join our teams. Reasonable accommodations to access job openings or to apply for a job are available.


Miranda Grabner

