Senior Data Scientist, AI Data Team

This job is no longer open

DESCRIPTION

Job summary
The AI Data Team in Amazon Web Services (AWS) is looking for a Senior Data Scientist with a passion for developing innovative methods to maximize the power of natural language data. This position is an opportunity to apply your expertise in a challenging but supportive environment. The position may be remote, with a preference for Santa Clara, Seattle, or New York City. We work in distributed teams and strengthen our connections by traveling a total of about one week per quarter for in-person meetings.

The mission of the AI Data Team is to engineer the datasets critical to the success of AWS’s machine learning services. From chatbots to subtitles to search results and beyond, these products support dozens of languages and impact millions of people every day. We are a group of language engineers, linguists, data scientists, data engineers, and program managers, and we partner closely with the science, engineering, and product teams. We are customer obsessed and committed to delivering results with the highest quality and integrity.

As a Senior Data Scientist, you will start by diving deep into a high-profile project to launch a new product offering. You will consult with stakeholders in science, engineering, and product teams to understand machine learning model development plans, and collaborate with language engineers to strategize on data collection. You will determine the appropriate metrics for data analysis and quality checks on data annotations, considering domain gaps, data bias, and data noise in your analysis. You will document and present results to the project team using effective data visualization.

You will then broaden your expertise to understand the AI Data Team’s range of methods and general challenges for producing or acquiring data, and for generating high-quality labels. You will identify opportunities to advance the team’s capabilities, staying up to date on developments in the field of data-centric AI, and experimenting with new techniques for data cleaning, labeling, and augmentation. You will share your results in written documents and presentations to the data team and stakeholders.

This expertise will enable you to organize and lead large data team initiatives to scale these methods for ongoing data collection and annotation. You will collaborate with data engineers and language engineers to design and build technical assets such as data warehousing and analytics tooling, and write Python packages that provide metrics and insights across datasets, ensuring the data is discoverable and reusable.


Key job responsibilities
* Lead the development of metrics and analytics for natural language data collection, to ensure quality and effectiveness of data and annotations for training, testing, and benchmarking machine learning models
* Design and write Python packages for analyzing natural language datasets, including domain gaps, data bias, and data noise
* Lead large initiatives to develop and scale innovative techniques for data cleaning, labeling, and augmentation
* Contribute to data warehousing and analytics tooling, writing code that provides metrics and insights across datasets and ensures the data is discoverable and reusable

BASIC QUALIFICATIONS

* MS or PhD in a quantitative field such as computer science, mathematics, or computational linguistics
* 5+ years of hands-on experience as a data scientist in an industry setting, including statistical modeling and data visualization
* Proficiency in Python
* Experience with natural language data and NLP techniques
* Experience working with stakeholders such as product and engineering teams
* Experience driving complex initiatives

PREFERRED QUALIFICATIONS

* Experience in developing and evaluating data annotation and data quality metrics
* Experience using AWS tools and services
* Strong written and verbal communication skills, with an ability to present complex technical information in a clear and concise manner to a variety of audiences

*The pay range for this position in Colorado is $128,400 - $149,800 per year; however, base pay offered may vary depending on job-related knowledge, skills, and experience. A sign-on bonus and restricted stock units may be provided as part of the compensation package, in addition to a full range of medical, financial, and/or other benefits, dependent on the position offered. This information is provided per the Colorado Equal Pay Act. Base pay information is based on market location. Applicants should apply via Amazon's internal or external careers site.



Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.

Pursuant to the Los Angeles Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Workers in New York City who perform in-person work or interact with the public in the course of business must show proof they have been fully vaccinated against COVID or request and receive approval for a reasonable accommodation, including medical or religious accommodation.

