Data Scientist II, AI Data Team

This job is no longer open

DESCRIPTION

Job summary
The AI Data Team in Amazon Web Services (AWS) is looking for a Data Scientist with a passion for developing innovative methods to maximize the power of natural language data. This position is an opportunity to apply your expertise in a challenging but supportive environment. The position may be remote, with a preference for Santa Clara, Seattle, or New York City. We work in distributed teams and strengthen our connections by traveling approximately five days per quarter for in-person meetings.

The mission of the AI Data Team is to engineer the datasets critical to the success of AWS’s machine learning services. From chatbots to subtitles to search results and beyond, these products support dozens of languages and impact millions of people every day. We are a group of language engineers, linguists, data scientists, data engineers, and program managers, and we partner closely with the science, engineering, and product teams. We are customer obsessed and committed to delivering results with the highest quality and integrity.

As a Data Scientist, you will start by consulting with stakeholders in science, engineering, and product teams to learn the full context of critical projects for Lex, a conversational AI service used for building chatbots. You will determine the appropriate metrics for data analysis and quality checks on data annotations to ensure that the data is optimized for developing models that exceed customer expectations. You will consider domain gaps, data bias, and data noise in your analysis. You will design and write Python packages for these processes, and work with data engineering to implement them in data pipelines built to scale.

You will also work with language engineers to understand the challenges we face in producing or acquiring data, and in generating high-quality labels. You will stay up to date on developments in the field of data-centric AI, and experiment with new techniques for data cleaning, labeling, and augmentation. You will share your results in written documents and presentations to the data team and stakeholders. You will scale these methods for ongoing data collection and annotation, collaborating with data engineering as necessary.

You will then contribute to large data team initiatives to design and build technical assets, such as data warehousing and analytics tooling, that support multiple programs. You will gain an understanding of data collection methods across the data team, and write code to provide metrics and insights across datasets that ensure the data is discoverable and reusable.

Key job responsibilities
* Design and write Python packages for analyzing natural language datasets, including domain gaps, data bias, and data noise
* Develop innovative techniques for data cleaning, labeling, and augmentation, and scale these methods for ongoing data collection and annotation
* Contribute to large initiatives for data warehousing and analytics tooling, writing code to provide metrics and insights across datasets that ensure the data is discoverable and reusable

BASIC QUALIFICATIONS

* Degree in a quantitative field such as computer science, mathematics, or computational linguistics
* 3+ years of hands-on experience as a data scientist in an industry setting, including statistical modeling and data visualization
* Proficiency in Python
* Experience with natural language data and NLP techniques
* Experience working with stakeholders such as product and engineering teams

PREFERRED QUALIFICATIONS

* Experience in developing and evaluating data annotation and data quality metrics
* Experience using AWS tools and services
* Strong written and verbal communication skills, with an ability to present complex technical information in a clear and concise manner to a variety of audiences

* The pay range for this position is $136,000 - $207,500 (yr.); however, base pay offered may vary depending on job-related knowledge, skills, and experience. A sign-on bonus and restricted stock units may be provided as part of the compensation package, in addition to a full range of medical, financial, and/or other benefits, dependent on the position offered.

*This information is provided per the Colorado Equal Pay Act, the Pay Transparency Regulation of Jersey City Municipal Code, and the New York City Human Rights Law. Base pay information is based on market location. Applicants should apply via Amazon's internal or external careers site.


Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.

Pursuant to the Los Angeles Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

Workers in New York City who perform in-person work or interact with the public in the course of business must show proof they have been fully vaccinated against COVID or request and receive approval for a reasonable accommodation, including medical or religious accommodation.
