Anomaly

New York
501-1,000 employees
Anomaly is pioneering a new paradigm in healthcare—precision payments—to improve payment accuracy and streamline payments for payers and providers.

Senior Data Engineer, Platform & Data

This job is no longer open

As our Senior Data Engineer, Platform & Data, you will be responsible for developing, maintaining and optimizing the data pipelines, integrations, transformations, databases and data infrastructure that support company objectives. Your goal is to maintain and optimize the data services that provide our organization with high-quality data ingredients that can be used to reveal actionable insights. You will be part of a team that fosters a data-driven company culture.

In this role, you will supply data science, machine learning, and analytics users with optimized data feeds. You will work in concert with the Director of Data Engineering to build data infrastructure and data flows that can be leveraged across the company. You will work closely with and mentor Data Engineers on staff, and work with product designers and managers to identify appropriate data sources and providers for ingestion and storage within the data warehouse. In addition, you will collaborate with software engineers and DevOps professionals to ensure we develop a strong data platform that can serve cross-functional applications and services.

You will report to and work with the Director of Data Engineering. The position can be fully remote within the United States or based out of our New York City headquarters.

What you'll do:

  • Data Integration
    • Establish data pipelines to external and internal data sources
    • Work with both structured and unstructured data to efficiently extract and load into our data warehouse
    • Design and implement data orchestration using tools like Airflow to manage scheduling and data flow across all the components of the data infrastructure
  • Data Infrastructure
    • Test and iterate quickly to establish an agile architecture that can scale as our demands increase
    • Work with data warehouses/lakes to effectively store data that will be a valuable resource for analytics across the organization
    • Build agile systems and processes that decouple compute from storage so the infrastructure can scale more efficiently, for example using query engines such as Trino or Presto and structured storage formats such as Parquet or Apache Iceberg
  • Data Management
    • Capture metadata and preserve the fidelity of the source and data lineage from origin to staging and serving data environments
    • Build data models that can be leveraged across the data ecosystem and work with relational databases that can support a master data store
  • Data Delivery
    • Leverage data transformation tools to transform and craft derivatives that can generate information and cultivate insights
    • Iterate with BI Analysts to package data in a way that can be easily consumed by BI tools such as dashboards
    • Effectively communicate the value of the new data to stakeholders within the data team

What you'll need:

  • Background
    • 4-7+ years of experience developing data integration, transformation and data infrastructure systems using Python and SQL
    • Experience using data integration tools such as Fivetran, Airbyte or Snowpipe
    • Experience with data warehouses/lakes, such as Snowflake, Databricks, Dremio or Redshift
    • Experience with relational databases such as CockroachDB, Postgres RDS or Aurora
    • Experience with distributed query engines like Presto, Trino, Sonar or Athena
    • Ability to collaborate with cross-functional stakeholders and communicate technical requirements and considerations effectively
    • Ability to be flexible within a start-up environment
    • Ability to structure technical work and align with team objectives
  • Technical Skills
    • Expertise in SQL and Python
    • Data integration experience using ETL/ELT tools and ability to build pipelines and manage data within data lake management systems
    • Data integration experience using ETL/ELT tools and ability to build pipelines to internal (e.g., NetSuite, Greenhouse) and external (e.g., CB Insights, Vettd.ai) data providers
    • Data storage experience with data warehouses/lakes and the ability to leverage them
    • Experience with database systems that can scale out using distributed systems
    • Data quality assurance approaches
    • Ability to work with stakeholders within the organization to assess data needs and identify valuable assets that can be used by analysts
    • Knowledge of data management and storage systems, with critical thinking about how to store data and which type of database system is best suited to particular data sources
  • Behavioral Skills
    • Collaborates. Works cooperatively with others across the organization to achieve shared objectives. Represents own interests while being fair to others and their areas. Partners with others to get work done. Credits others for their contributions and accomplishments. Gains trust and support of others. Builds partnerships and collaborates with others to meet shared objectives.
    • Communicates Effectively. Is effective in a variety of communication settings: one-on-one, small and large groups, or among diverse styles and position levels. Attentively listens to others. Adjusts to fit the audience and the message. Provides timely and helpful information to others across the organization. Encourages the open expression of diverse ideas and opinions. Develops and delivers multi-mode communications that convey a clear understanding of the unique needs of different audiences.
    • Action Oriented. Readily takes action on challenges, without unnecessary planning. Identifies and seizes new opportunities. Displays a can-do attitude in good and bad times. Steps up to handle tough issues. Takes on new opportunities and tough challenges with a sense of urgency, high energy and enthusiasm.