Paige

New York, NY

11-50 employees

We're building next-generation computational technology that unlocks insights from each sample for doctors to optimize patient outcomes.

Clinical Data Engineer, Modeling

Data Engineering

This job is no longer open

Paige is a software company helping pathologists and clinicians make faster, more informed diagnostic and treatment decisions by mining decades of data from the world’s experts in cancer care. We are leading a digital transformation in pathology by leveraging advanced Artificial Intelligence (AI) technology to create value for the oncology clinical team.

We are the first company to develop clinical-grade AI tools for the pathologist, which resulted in our receiving FDA breakthrough designation for our first product. Paige has also received FDA clearance for our digital viewer, FullFocus™. We have also established multiple relationships with biopharma, laboratory, and equipment manufacturers that enable Paige to develop an ecosystem ready to help patients receive better diagnoses and treatment.

We’re seeking a Clinical Data Engineer, Modeling who will be a key contributor in developing data management pipelines and processes to enable a new generation of artificial intelligence applications for cancer detection and treatment. Following modern software development practices, you will assist in the design, implementation, and maintenance of tools that extract and manipulate data from various sources, including in-house and external databases.

This is an extraordinary opportunity to be part of a high-performing team and to pursue a life-changing mission with unique technical challenges!

This is a work-from-home position that can be based anywhere in the US.

Responsibilities

Use SparkSQL and dbt to maintain and extend the data warehouse dimensional model
Gather requirements from end users
Clean and transform large, complex, and valuable data sources
Ensure high data quality
Generate documentation and maintain data catalog
Build data subject matter expertise, as you will be working at the intersection of various domains and interacting with business users, data engineers, AI staff, and clinical experts
Create and implement ingestion and SQL ETL pipelines for large amounts of structured and unstructured data from various filesystems and databases
Deliver data insights to end users via Looker dashboards and visualizations as well as ad-hoc SQL queries
Advocate for and pioneer data centralization and standardization across development teams
Handle the challenges that come with managing terabytes of data
Design, develop, and test data-specific software systems
Work independently to produce required functional, technical, and user documentation on assigned projects
Work and collaborate with data engineers, scientists, engineers, IT operations and medical doctors to design and implement data tools that enable AI development within the company
Follow development best practices such as testing, code reviews, and CI/CD

Key Requirements

Bachelor’s degree in computer science, bioinformatics, or a related field, or equivalent years of experience
3+ years of industry experience as a bioinformatics, software, or data engineer
Expertise in SQL
Expertise in best practices for data modeling (star schemas, facts, dimensions)
Expertise in designing and maintaining BI visualization solutions
Experience implementing and testing data ETL and processing pipelines
Experience ingesting and standardizing data into data warehouses (e.g. Amazon Redshift, Microsoft SQL Server, Google BigQuery or Snowflake)
Experience with RDBMS and NoSQL databases (e.g. MongoDB)
Familiarity with Python
Familiarity with modern development practices and DevOps

Desirable

Experience working with healthcare or clinical data sets
