Sonatype: Data Scientist

Sonatype is the software supply chain security company. We provide the world’s best end-to-end software supply chain security solution, combining the only proactive protection against malicious open source, the only enterprise-grade SBOM management, and the leading open source dependency management platform. This empowers enterprises to create and maintain secure, quality, and innovative software at scale.

As founders of Nexus Repository and stewards of Maven Central, the world’s largest repository of Java open source software, we are software pioneers and our open source expertise is unmatched. We empower innovation with an unparalleled commitment to build faster, safer software and harness AI and data intelligence to mitigate risk, maximize efficiencies, and drive powerful software development.

More than 2,000 organizations, including 70% of the Fortune 100, and 15 million software developers rely on Sonatype to optimize their software supply chains.

We’re looking for a Senior Data Scientist to join our growing AI & Data Science team.

You’ll operate as an internal AI consultant and technical lead, helping multiple teams across Sonatype apply machine learning and generative AI to real-world problems.

You’ll explore complex datasets, design experiments, build models, and collaborate closely with product, engineering, and security experts to turn research ideas into practical, scalable solutions.

This role is ideal for someone who thrives on autonomy, loves translating ambiguous ideas into working systems, and enjoys working across boundaries rather than in a single product lane.

What you'll do:

Lead applied AI projects from concept to impact: Prototype, validate, and help teams deploy practical ML and GenAI solutions.
Collaborate cross-functionally: Partner with product, engineering, and research teams to scope problems, identify opportunities, and co-develop solutions.
Act as an internal consultant: Advise teams on ML/AI best practices, model evaluation, and productive use of generative technologies.
Design robust experiments and establish evaluation pipelines for model reliability, accuracy, and business impact.
Bridge research and production: Package research insights into usable APIs, tools, or workflows for other teams.
Explore new techniques (e.g., LLMs, embedding models, retrieval-augmented generation, agentic workflows) to enhance developer and security experiences.
Share knowledge and mentor peers, helping elevate the organization’s AI literacy and capabilities.

What you bring:

3+ years of experience in applied data science, machine learning, or AI research
Strong Python skills and hands-on experience with ML/AI libraries and platforms such as Databricks, OpenAI API, and Scikit-learn
Comfortable working with large, messy, or unstructured datasets — you know how to turn chaos into features, insights, and beautiful visualizations
Deep familiarity with LLMs and GenAI ecosystems (e.g., OpenAI, Claude, Hugging Face): skilled in prompt engineering, parameter tuning, and evaluating model behavior against ground truth
Experience taking ML or GenAI systems from prototype to production, even if small-scale or incremental
Strong analytical thinking, experimentation skills, and appreciation for trustworthy, data-driven evaluation
Proficiency with Git and collaborative code workflows (GitHub or similar)
A balanced mindset — equally comfortable exploring research ideas and implementing production-ready systems
Proactive and self-directed: you don’t wait for perfect specs; you find meaningful problems and drive them to completion

It's great if you also bring:

Experience with AI-assisted coding tools (Copilot, Claude Code, Codex, etc.)
Familiarity with agentic workflows, Model Context Protocol (MCP), and tool-use integrations
Exposure to cybersecurity, anomaly detection, or code analysis
Understanding of MLOps practices (MLflow, AWS SageMaker, model serving, or monitoring)

At Sonatype, we value diversity and inclusivity. We offer perks such as parental leave, diversity and inclusion working groups, and flexible working practices to allow our employees to show up as their whole selves. We are an equal-opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. If you have a disability or special need that requires accommodation, please do not hesitate to let us know.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
