Machine Learning Librarian

Here at Hugging Face, we’re on a journey to advance responsible Machine Learning. We create techniques that enable people to develop and critique AI, regardless of their background. We contribute to the development of technology informed not only by science but also by society and people. To this end, we have built the fastest-growing open-source machine learning library in the world, with tools and models downloaded over 100 million times, and powering tech production in over a thousand companies.


About the Role

As Machine Learning systems become increasingly diverse and ubiquitous, we see a growing need to curate documentation and general knowledge about the many models, datasets, and application contexts for this technology. Meeting this need calls for new approaches to managing the vast digital collections of information on these systems, and for a new role, the Machine Learning librarian, to develop and coordinate these approaches.

As our first ML librarian at Hugging Face, you will be in charge of organizing and connecting the growing body of information and documentation on Machine Learning systems and their uses, produced both by our team and by our community. Hugging Face is home to 40,000 ML models and over 5,000 datasets, a multi-part course on using ML systems, extensive technical documentation, interactive demos, a thriving blog, and a community of users who write about their experiences interacting with the technology at every stage of its life cycle.

You will shape our knowledge and documentation strategy to:

  • Organize the information on ML systems available across all Hugging Face libraries
  • Make information on ML systems at HF more accessible and easier to navigate
  • Bring in good practices from library sciences (including library analytics, storage, digital collections) and related fields to help guide the standardization of ML knowledge presentation
  • Make ML information accessible to low-code and less-technical audiences
  • Engage with our community of users to improve and maintain information collaboratively
  • Find ways to incentivize and facilitate good documentation practices


About You

You’ll enjoy working with us in this role if you care about bringing multiple perspectives on new technologies together, building tools that help users gain a deeper understanding of the ML development process, and facilitating information exchanges between many different profiles and disciplines.

You will need proficiency in Python and familiarity with our open-source libraries. Much of the information you will be called upon to process and organize is created directly within the Machine Learning training processes, so we are looking for candidates who have a good high-level prior understanding of the relationships between e.g. ML datasets, models, evaluations, and application domains. You will not have to build ML systems from scratch yourself, but should be able to work with engineers who do!

Beyond that, we also value a range of diverse skills that are relevant to this position, as it is a new role that will be shaped by the applicant’s experience. Examples of tasks that may be within its purview include (but are not limited to):

  • Documenting ML training processes and strategies for medium to large-scale projects
  • Drafting legal documents on derivative data usage terms
  • Mediating data transfer and information exchange between the different entities involved in training a model
  • Establishing and maintaining registration for DOIs for a range of ML artifacts
  • Creating new ML and NLP models to help maintain and annotate a significant digital collection
  • Summarizing content across multiple sources
  • Communicating with the public about the work you’re doing

We especially encourage anyone with a background in library science, digital journalism, or experience in related areas that have a component of organizing information at scale to apply for this position. We strongly believe in learning from other fields to help guide the emergence of better ML practices. Note that the documentation will be written primarily in English.

If you're interested in joining us, but don't tick every box above, we still encourage you to apply! We're building a diverse team whose skills, experiences, and backgrounds complement one another.


More About Hugging Face

We are actively working to build a culture that values diversity, equity, and inclusivity. We are intentionally building a workplace where people feel respected and supported, regardless of who they are or where they come from. We believe this is foundational to building a great company and community. Hugging Face is an equal opportunity employer and we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

We value development. You will work with some of the smartest people in our industry. We are an organization that has a bias for impact and is always challenging itself to grow. We provide all employees with reimbursement for relevant conferences, training, and education.

We care about your well-being. We offer flexible working hours and remote options as well as unlimited PTO. We offer health, dental, and vision benefits for employees and their dependents. We also offer 12 weeks of parental leave (20 for birthing mothers).

We support our employees wherever they are. While we have office spaces in NYC and Paris, we’re very distributed and all remote employees have the opportunity to visit our offices. If needed, we’ll also outfit your workstation to ensure you succeed.

We want our teammates to be shareholders. All employees have company equity as part of their compensation package. If we succeed in becoming a category-defining platform in machine learning and artificial intelligence, everyone enjoys the upside.

We support the community. We believe major scientific advancements are the result of collaboration across the field. Join a team that supports the ML/AI community.
