Data Scientist Applied Research, Content Modeling

This job is no longer open
At Scribd (pronounced “scribbed”), we believe reading is more important than ever. Join our cast of characters as we work to change the way the world reads by building the world’s largest and most fascinating digital library: giving subscribers access to a growing collection of ebooks, audiobooks, magazines, documents, Scribd Originals, and more. In addition to works from major publishers and top authors, our community includes over 1.5M subscribers in nearly every country worldwide.

We’re defining the workforce of the future with Scribd Flex, a program that embraces multiple perspectives while leaning into our belief that no matter where each team member is, we trust them to accomplish our shared business goals. This program lets employees, in partnership with their manager, choose where they work while creating intentional in-person meetings with co-workers that build culture and connect us personally.

Remote employees must have their primary residence in: Arizona, California, Colorado, Delaware, DC, Florida, Hawaii, Iowa, Massachusetts, Michigan, Missouri, Nevada, New Jersey, New York, Ohio, Oregon, Tennessee, Texas, Utah, Vermont, Washington, Ontario (Canada), and British Columbia (Canada).
*This list may not be complete or accurate, and candidates should speak with their recruiter about their specific location for remote work.


Applied Research: Content Modeling is a team of data scientists who extract quantitative understanding from the complex unstructured data in our growing corpus of varied multi-modal content. We are a world-class AI/ML organization whose mission is to break new ground in how our customers discover content.

We are skilled in a diverse range of methods, including metadata extraction, natural language processing, computer vision, and classification. We build the rich semantic connections between our content and our users. We strive to know our users better than they know themselves, enabling us to transform personalized discovery experiences.

We are a full-stack data science team that runs exploratory analyses, sizes business impact, develops models, and creates data pipelines, taking projects from research prototype to production systems that serve product surfaces. We work on Scribd’s unique and massive dataset consisting of hundreds of millions of documents, books, audiobooks, articles, slides, and podcasts.

You will:

• Use SQL to query tables and build training datasets
• Build machine learning models in Python (with some work in Spark)
• Leverage state-of-the-art models using deep learning frameworks such as PyTorch and TensorFlow
• Work with Senior Data Scientists to operationalize data science projects
• Educate stakeholders on the approaches and results of projects through written and verbal communication, including detailed, accurate, and concise project write-ups
• Investigate innovative methods of solving our most challenging problems at Scribd

You Have:

• 1-2 years of experience deploying machine learning models (natural language processing experience preferred)
• Beginner level or greater experience with SQL and possibly Spark
• Intermediate to advanced knowledge of Python
• Intermediate level in at least one of these fields: natural language processing, deep learning, computer vision, or Bayesian or frequentist statistics
• A keen interest in learning what’s necessary to solve a business problem and make a positive business impact
• Bachelor’s or Master’s degree in a relevant quantitative discipline (e.g. Computer Science, Software Engineering, Data Science, Machine Learning, Artificial Intelligence, Computational Linguistics, Mathematics, Statistics)

Benefits, Perks, and Wellbeing at Scribd

• Healthcare Insurance Coverage: Scribd pays for employees’ medical, vision, and dental premiums and a portion of dependent premiums
• 401k/RSP plans provided, plus company matching with no vesting period
• Professional development: generous annual budget for our employees to attend conferences, classes, and other events
• Quarterly Wellness, Connectivity & Comfort Benefit
• Concern mental health digital platform
• Free subscription to Scribd + gift memberships for friends & family
• Leaves: 12 weeks paid parental leave, company-paid short-term/long-term disability plans, and milestone sabbaticals
• Generous Paid Time Off: Paid Holidays, Flexible Sick Time, Volunteer Day + office closure between Christmas Eve and New Year’s Day
• Company-wide Diversity, Equity, & Inclusion programs which include learning & development opportunities, employee resource groups, and hiring best practices.

Want to learn more? Check out our office and meet some of the team at www.linkedin.com/company/scribd/life

Scribd is committed to equal employment opportunity regardless of race, color, religion, national origin, gender, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.

We encourage people of all backgrounds to apply. We believe that a diversity of perspectives and experiences creates a foundation for the best ideas. Come join us in building something meaningful.
