Principal Natural Language Processing (NLP) Data Scientist
WEX is an innovative payments and technology company leading the way in a rapidly changing environment. Our goal is to simplify the business of running a business for our customers and free them to spend more time, with less worry, on the things they love. We are on a journey to build a unified, world-class user experience across our products and services and leverage customer-driven innovation to power our growth and strategic initiatives.
As a Data Scientist specializing in NLP, you will be front and center within the AI organization, delivering AI algorithms and processes to fuel the advancement of AI at WEX. You will work with stakeholders across WEX’s business verticals to identify and define use cases, and build AI solutions using effective text representations to transform natural language into useful features.
The use cases you will work on fall into the following categories: speech recognition, language translation, sentiment analysis, question-and-answer systems, automatic summary generation via LLMs, chatbots / generative and conversational AI, and automatic text classification.
About the team:
You will be part of the AI Organization. The goal of this organization is to embed AI in every aspect of our business and technology. The team comprises Data Scientists, Machine Learning Engineers, and AI Specialists.
What you’ll do:
Design and develop NLP systems according to requirements, implementing algorithms and models that enable computers to understand and process human language. This involves working with large datasets, designing and testing algorithms, and optimizing models for accuracy and efficiency.
Define appropriate datasets for language learning, including preprocessing and cleaning large text datasets through tasks such as tokenization, stemming, lemmatization, and stop-word removal.
Text classification and clustering: Develop algorithms and models that classify and cluster text data, covering tasks such as sentiment analysis, topic modeling, and named entity recognition.
Machine translation: Develop algorithms and models that translate text from one language to another.
Speech recognition and synthesis: Develop algorithms and models that recognize and synthesize human speech.
Work closely with other teams, such as data scientists, software engineers, and product managers, to develop and implement NLP solutions.
Stay up to date with the latest research: NLP is a rapidly evolving field, and you are responsible for keeping current with new research and developments.
Advise on the technical implementation of LLM infrastructure and broader NLP applications
Create networks with key decision makers at the company and potentially be an external spokesperson for the organization.
Engage with stakeholders and leaders across the organization to identify, prioritize, frame, and structure complex and ambiguous challenges; advocate for projects where advanced analytics or tools can have the biggest impact
Identify and communicate the challenges and opportunities that the group should be working on, highlighting areas for improvement and outlining courses of action
Translate analysis results into business recommendations and articulate them to the appropriate stakeholders
Identify critical insights and flag potential risks found in large data sources; interpret and communicate insights and findings to product, service, and business managers to develop a solution
How you’ll engage:
Insights Driven: Clear, hypothesis- and objective-driven analytics that help drive our business decisions and ongoing metrics
Stakeholder Aligned: Understand the needs and audience for deliverables with a succinct and tailored message to maximize impact
Results Focused: Rigorous focus on how analytics drive the end-to-end experiences, with a clear path to production and measurable impact
Dynamic Collaboration: Drive continual improvement of our team's best practices and processes to power collaboration
Quality Mindset: Trust in our findings is critical, so data and analytic quality are understood and accounted for from the beginning
Curiosity and Learning: Learn new technologies, collaborate, and teach others how to use them. Hold training and enablement sessions with key stakeholders as necessary.
Experience you’ll bring:
Roughly 15 years of relevant work experience (data scientist or data and analytics), including experience applying advanced analytics to planning and infrastructure problems.
OR 12+ years of relevant work experience with a Master's degree in STEM or a related field OR 8+ years of relevant work experience with a PhD in a related field.
Outstanding skills in statistical analysis, machine learning methods, and text representation techniques
Experience creating and productionizing NLU, NLP, and LLM solutions by fine-tuning open-source models
Advanced experience with statistical programming languages (e.g., Python) and database languages (e.g., SQL). Experience with libraries and frameworks commonly used in NLP, such as NLTK, spaCy, TensorFlow, or PyTorch, is also beneficial.
Advanced experience building and cleaning data sets, and testing their quality, ahead of modeling
Strong problem-framing, problem-solving, project management, customer service, and communication skills
Deep understanding of modeling and statistical approaches (e.g. logistic regression, linear regression, random forests, etc.)
Experience with end-to-end feature development (owning feature definition, roadmap development, and experimentation)
Experience with LLM and ML operations architecture
Experience distilling informal customer requirements into problem definitions while dealing with ambiguity and competing objectives
Superior verbal and written communication and presentation skills, with the ability to convey rigorous mathematical concepts and considerations to non-experts
Combination of technical and business acumen, with the ability to advocate for technical and scientific solutions with engineering and business audiences
Able to lead multiple data science projects or multiple team members concurrently
Company-wide data science leader providing technical vision to multiple parts of the organization
What would make you stand out:
PhD in a related field
Experience in the payment processing space
Combination of technical and business acumen
Experience in Agile methodologies and understanding of the SDLC