Formstack

Fishers, IN
201-500 employees
Formstack is a secure workplace productivity platform built to produce ingenious solutions to the everyday work that slows organizations down.

Data Science Intern

Formstack improves people’s lives with practical solutions to their everyday work. We are looking for the next Stacker to help us accomplish this mission. 
 
Formstack is a company with team members who live and work across the U.S., Canada, Poland and the globe. We offer more than just a job; we provide a community where you can learn, grow, and thrive your way. Join a dynamic and diverse team that values relationships as much as results. Come build what matters with Formstack.

Formstack is seeking an intern to join our Data Classification team for a 4-month fixed term. This role bridges the gap between data science and software development, offering an opportunity to contribute to the design, training, and deployment of advanced deep learning models. The team’s primary focus is on data classification: categorizing the contents of diverse data sources, including database tables, semi-structured data, and text, to detect data with special security and privacy requirements. The successful applicant will contribute to a central feature of a new workflow automation platform with applications in healthcare, higher education, and beyond.

This is a remote position, with occasional in-person meetings. Formstack has employees around the globe, but the data science team will be primarily based in NYC, so we prefer candidates in the Tri-State (NY/NJ/CT) area.

RESPONSIBILITIES

    • Researching new types of data to add to our classifier.
    • Finding and preparing training data.
    • Helping with data cleanup.
    • Tuning prompts for AI data generation.
    • Developing and analyzing benchmarks measuring accuracy, speed, and consistency of results.
    • Finding new evaluation datasets for the classifier, helping run experiments, and analyzing results.

REQUIRED KNOWLEDGE, SKILLS, AND ABILITIES

    • You’re working toward a Master’s degree in Data Science or a related field, or you have equivalent practical experience.
    • Strong understanding of machine learning, artificial neural networks, and language models.
    • Proficiency in writing object-oriented code in Python.
    • Experience with PyTorch, SciPy, Jupyter Notebooks, and Hugging Face Transformers.
    • Comfort using the Unix command line.
    • Proven research skills and the ability to find reliable information and datasets.
    • Ability to think critically about the contexts in which companies produce and use data.
    • Excellent communication skills for technical concepts.

PREFERRED KNOWLEDGE, SKILLS, AND ABILITIES

    • Experience working in Git repositories with multiple contributors.
    • Familiarity with AWS, Kubernetes, Argo CD, SageMaker, and/or Amazon Bedrock.
    • Knowledge of Java.

$25 - $30 an hour

Don’t meet every single requirement? Studies have shown that women and people of color are less likely to apply to jobs unless they meet every qualification. Formstack is dedicated to building a diverse, inclusive, and authentic workplace. If you’re excited about this role, but your experience doesn’t align perfectly with every qualification in the job description, we encourage you to apply anyway. You may be just the right candidate for this or other roles.

Formstack is an equal-opportunity employer. We are passionately committed to equitable hiring and boldly dedicated to diversity in our work and staff. We do not discriminate in employment opportunities or practices based on actual or perceived race, color, religion, national origin, sex (including pregnancy, childbirth, or related conditions), age, marital status, sexual orientation, gender identity or expression, veteran status, uniformed service member status, disability, or any other characteristic protected by law. Women, people of color, bilingual and bicultural individuals, LGBTQ+ persons, and people with disabilities are encouraged to apply.

All data collected in our application process, from resume collection to application questions, is used for recruitment purposes only. We will store it in our applicant tracking system, Lever, and will not share this data with anyone else. We will keep your data until the role is filled and only continue to store it if we feel you may fit future roles.