Associate Data Scientist

This job is no longer open
Sonatype is the software supply chain management company. We're on a mission to change how the world innovates by making software development easier. From running the world's largest repository of Java open-source components (Maven Central), to inventing componentized software development and then software supply chain management, to creating the only solution that stops open-source malware in its tracks, we're constantly leading the industry while helping thousands of customers manage open source every day.

Already used by 15 million developers, we have lofty goals for our technology to be in the hands of every engineering team. And we need you to do that. Join us!

Learn more at www.sonatype.com.


Sonatype’s mission is to enable organizations to better manage their software supply chain. We offer a series of products and services, including the Sonatype Nexus Repository and Sonatype Lifecycle.
 
***This position is 100% remote and candidates must currently live in the United States.***

You’ll be working with one of our advanced research teams to help turn large amounts of data into valuable insights for our customers. We’re building our data science program, so you’ll be helping to build out our standard processes as we grow. We have a large team of versatile data engineers, data scientists, and data analysts, so you can focus on what you do best: building models.

What you'll be doing

    • Interact with product management and data engineers to think through the potential ways to leverage data.
    • Collaborate with senior data scientists and multi-functional teams to understand business requirements and translate them into ML/AI projects.
    • Apply ML/AI techniques to analyze and extract insights from complex and large-scale datasets, delivering valuable solutions to address business challenges.
    • Perform data preprocessing tasks, including data cleaning, feature selection, and feature engineering, to ensure high-quality data inputs for model development.
    • Develop and implement ML/AI models and algorithms, such as regression, classification, clustering, and deep learning, to tackle specific data problems.
    • Collaborate with data engineers to create efficient data pipelines that integrate, transform, and prepare data for ML/AI model training and evaluation.
    • Evaluate the performance of ML/AI models using appropriate metrics and validate their effectiveness through meticulous testing and experimentation.
    • Continuously research and explore new ML/AI techniques, algorithms, and tools to enhance the capabilities and efficiency of data science workflows.
    • Assist in the visualization and interpretation of data insights, presenting findings to technical and non-technical partners in a clear and concise manner.
    • Collaborate with the data governance team to ensure compliance with data privacy regulations and ethical considerations in all data science activities.
    • Stay up to date with the latest advancements and trends in ML/AI techniques and contribute to the growth of the data science practice within the company.
    • Largely drive the direction of your own work; we expect you to be an authority in machine learning who knows best what is possible.

Requirements

    • Strong academic credentials in computer science, data science, statistics, or a related field.
    • 0 to 2 years of experience with advanced programming languages such as Python or R, and experience with relevant ML/AI libraries and frameworks (e.g., scikit-learn, TensorFlow, PyTorch).
    • Solid understanding of machine learning and AI techniques, including supervised and unsupervised learning algorithms.
    • Familiarity with data preprocessing techniques, feature engineering, and model evaluation methods.
    • Basic understanding of statistical analysis concepts and their application in data science.
    • Ability to work with large datasets and perform data manipulations using SQL or other data querying languages.
    • Strong problem-solving skills and a keen interest in finding innovative solutions to sophisticated data challenges.
    • Good communication skills, with the ability to collaborate effectively within a team and present findings to both technical and non-technical partners.
    • Proven ability to learn quickly, adapt to new technologies and tools, and work in a fast-paced and evolving environment.

Preferred Qualifications

    • Experience with data visualization tools (e.g., Tableau, matplotlib) is a plus.
    • Proficiency using Jupyter or Databricks notebooks.
    • Familiarity with Databricks and AWS services (S3, EMR, SageMaker) would be beneficial.
    • Experience with Git (preferably GitHub), PySpark, and MLflow.
    • Our data engineers primarily use Java and Scala. We don't expect you to write production-level Scala code, but some experience and a willingness to learn are an asset.
We are Sonatype, and we have assembled a world-class team of employees, investors, and partners. We are proud to be recognized as a Deloitte Technology Fast 500 company for 2016. With more than 120,000 installations and counting, Nexus products are helping modern development organizations thoughtfully source, manage, assemble, and maintain open source and third-party components, so they can improve the quality, security, and speed of their software supply chains.
We are curious and constantly innovating without fear of failure. We are attacking a huge and emerging market and seeking remarkably dedicated individuals to join us on our journey.

At Sonatype, we value diversity and inclusivity. We offer perks such as parental leave, diversity and inclusion working groups, and flexible working practices to allow our employees to show up as their whole selves. We are an equal-opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. If you have a disability or special need that requires accommodation, please do not hesitate to let us know.

