Data Engineer, Machine Learning

This job is no longer open
Pachama is looking for a senior data engineer to be responsible for laying critical infrastructure for our mission to map and monitor the planet's forests. You will be working with a diverse group of scientists, machine learning experts, and software engineers who are collectively responsible for developing remote sensing techniques to ensure that forest carbon credits are real, additional, long-lasting, and responsible for a global net reduction in atmospheric CO2. 

In this role, you will design and implement the workflows responsible for ingesting large amounts of raster and vector geospatial data for efficient access and processing. You will work alongside an interdisciplinary team to ensure data is analysis-ready and easy to interact with. The systems you build and the research you enable will underpin the core technology of the company, creating the heartbeat by which our business operates.

We're looking for engineers who find joy in the craft of building and who want to make an impact; who push initiatives forward by asking great questions, cutting through ambiguity, and organizing to win; and who are relentlessly detail-oriented, methodical in their approach to understanding trade-offs, and have a bias for action.

WHO WE ARE

Pachama is a mission-driven company looking to restore nature to help address climate change. Pachama brings the latest technology in remote sensing and AI to the world of forest carbon in order to enable forest conservation and restoration to scale. Pachama’s core technology harnesses satellite imaging with artificial intelligence to measure carbon captured in forests. Through the Pachama marketplace, responsible companies and individuals can connect with carbon credits from projects that are protecting and restoring forests worldwide.

We are backed by mission-aligned investors including Breakthrough Energy Ventures, Amazon Climate Fund, Chris Sacca, Saltwater Ventures, and Paul Graham.

You will:

    • Design and construct robust pipelines for ingesting geospatial data
    • Create a geospatial data warehouse and platform for scientists and machine learning engineers
    • Onboard new geospatial data sources such as satellite imagery, LiDAR, radar, and field plots
    • Implement cutting-edge pre-processing algorithms for sensor data to produce quality features for ML
    • Develop tools and infrastructure to reduce iteration time and improve developer experience
    • Design infrastructure to accelerate research and experimentation
    • Develop plans for cross-team initiatives related to infrastructure and deliver on them

We are looking for strengths in:

    • Building scalable and fault-tolerant distributed systems that process large amounts of data
    • Handling batch and event-driven data for geospatial and machine learning systems
    • Cloud-native solutions
    • Algorithms and data structures, domain-driven design, and software engineering fundamentals
    • Geospatial data sources and storage
    • Interdisciplinary collaboration and deep user empathy
    • Working with ML models in production systems

We expect you to:

    • Approach problems with curiosity and humility
    • Own solutions end-to-end
    • Take part in strategic thinking
    • Leave code better than you found it
    • Communicate well and document better
    • Have fun!!