Data Engineer

This job is no longer open
Labelbox’s mission is to build the best products for humans to advance artificial intelligence. Real breakthroughs in AI rely on the quality of the training data. Our training data platform enables organizations to improve their machine learning models far more quickly and accurately. We are determined to build software that is more open, easier to use, and singularly focused on getting our customers to performant ML faster.

Current Labelbox customers are transforming industries within insurance, retail, manufacturing/robotics, healthcare, and beyond. Our platform is used by Fortune 500 enterprises including Allstate, Black + Decker, Bayer, Warner Brothers and leading AI-focused companies including FLIR Systems and Caption Health. We are backed by leading investors including SoftBank, Andreessen Horowitz, B Capital, Gradient Ventures (Google's AI-focused fund), Databricks Ventures, Snowpoint Ventures and Kleiner Perkins.

About the Role

Labelbox is hiring a Data Engineer to build new data pipelines and scale existing ones. As our company grows, this person will build data infrastructure that brings together tech, product, and operational functions and informs strategic decision-making at the executive level. You will be responsible for transforming raw data in the data warehouse into clean, reliable, organized data models that allow our organization to make informed, data-driven decisions. Our tech stack currently consists of BigQuery, dbt, and Looker, along with other tools to replicate all of our data to our data warehouse.
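
To give a flavor of the day-to-day work, here is a minimal sketch of the kind of dbt model this role would own, assuming a raw application events table already declared as a dbt source; the model, source, and column names are purely illustrative.

    -- models/staging/stg_events.sql (illustrative sketch; names are hypothetical)
    -- Cleans and types a raw events table into a staging model.
    {{ config(materialized='view') }}

    select
        cast(event_id as string)      as event_id,
        cast(user_id as string)       as user_id,
        lower(trim(event_type))       as event_type,
        cast(created_at as timestamp) as created_at
    from {{ source('raw_app', 'events') }}
    where event_id is not null

Looker explores and downstream marts would then build on models like this rather than on raw replicated tables.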


What You'll Do

    • Develop and optimize large-scale batch and real-time data pipelines that ingest structured and unstructured data from a variety of sources using a combination of dbt, Fivetran, and other tools
    • Build, rebuild, and performance-tune data transformation tasks within the central data store
    • Take over and scale our dbt and Looker setup
    • Manage incoming data requests and prioritize the highest value projects in an organized fashion
    • Communicate data-backed findings to a diverse constituency of internal and external stakeholders
    • Help create best practices and standards for data modeling, documentation, and testing
    • Design and implement operationally excellent data interfaces with a high degree of autonomy
    • Rigorously design data warehouse schemas to allow for performant access to digestible datasets
    • Become the analytics infrastructure and tooling expert, supporting business-focused pipelines and data interfaces
    • Own data modeling, data warehouse management, and data orchestration

About You

    • Expert-level SQL skills
    • Experience designing and developing data warehouse and analytics solutions, using techniques such as clustering and partitioning on tables with over 1B rows (see the sketch after this list)
    • Understanding of data architecture design, data modeling, and physical database design and tuning
    • Hands-on experience implementing cloud data warehouses using BigQuery, Postgres, and MySQL
    • Experience using dbt
    • Knowledge of data visualization tools such as Looker
    • Hands-on coding experience in Python
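
To make the clustering and partitioning point above concrete, the sketch below shows one way a large BigQuery fact table might be laid out; the dataset, table, and column names are hypothetical.

    -- Illustrative BigQuery DDL; dataset, table, and column names are hypothetical.
    -- Date partitioning plus clustering limits queries on billion-row tables to
    -- the partitions and blocks they actually need to scan.
    create table if not exists analytics.fct_events
    partition by date(created_at)
    cluster by customer_id, event_type
    as
    select * from analytics.stg_events;

In a dbt-managed warehouse, the same layout would typically be expressed through the model's partition_by and cluster_by configuration rather than hand-written DDL.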

Technology You’ll Use

    • BigQuery/GCS, MySQL, Postgres
    • dbt
    • Fivetran
    • Looker
    • GitHub
    • Jira

Do great work. From anywhere.

We hire great people regardless of where they live. Work wherever you’d like, as reliable internet access is our only requirement. We communicate asynchronously, work autonomously, and take ownership of our work.

#LI-Remote