CollegeVine is looking for a data scientist to help us with a clustering and algorithm analysis project.
The right applicant for this role has proven experience with supervised and unsupervised learning, modern ML frameworks, and data pipeline design.
We're a data-driven guidance company, and we're looking for a talented data scientist to help us better understand our users and how our chancing algorithm behaves for them. Learn more about us at https://www.collegevine.com.
About the project
You'll be working directly with the engineers and founders to analyze the behavior of our chancing algorithm across a broad slice of our users. You're free to use any tools you like, as long as you can explain the solutions you deliver to us at the end of the project.
Deliverables:
- A well-commented program (something like a Jupyter notebook is fine) that generates the "natural" clusters of our students. Ideally, the weighting of individual features should be configurable. The output should be a clustering model we can use on new users.
- A well-commented program (something like a Jupyter notebook is fine) that takes a clustering model and two sets of chancing parameters (A & B) as input. The program must output and visualize the delta between the A & B chancing parameters for each cluster.
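To give a sense of the shape we have in mind for the second deliverable, here is a minimal sketch, not a specification. It assumes the "delta" is the mean change in a student's chance when switching from parameter set A to parameter set B, averaged within each cluster. The `toy_chance` function, the `gpa` and `test` feature names, and the sample values are all hypothetical stand-ins; the real chancing algorithm and data are what you'll get access to on day one.

```python
from collections import defaultdict
from statistics import mean


def toy_chance(profile, params):
    """Hypothetical stand-in for the real chancing algorithm.

    Computes a weighted sum of the profile features and clips it to [0, 1].
    """
    score = sum(params[name] * value for name, value in profile.items())
    return max(0.0, min(1.0, score))


def cluster_deltas(profiles, labels, params_a, params_b, chance=toy_chance):
    """Return {cluster_label: mean(chance under B - chance under A)}."""
    per_cluster = defaultdict(list)
    for profile, label in zip(profiles, labels):
        delta = chance(profile, params_b) - chance(profile, params_a)
        per_cluster[label].append(delta)
    return {label: mean(deltas) for label, deltas in sorted(per_cluster.items())}


# Made-up data: four student profiles already assigned to two clusters.
profiles = [
    {"gpa": 0.9, "test": 0.8},
    {"gpa": 0.7, "test": 0.6},
    {"gpa": 0.4, "test": 0.5},
    {"gpa": 0.3, "test": 0.2},
]
labels = [0, 0, 1, 1]  # cluster label per profile, e.g. from the clustering model
params_a = {"gpa": 0.5, "test": 0.5}
params_b = {"gpa": 0.75, "test": 0.25}

deltas = cluster_deltas(profiles, labels, params_a, params_b)
for label, delta in deltas.items():
    print(f"cluster {label}: mean delta = {delta:+.3f}")
```

In the actual deliverable, the labels would come from the clustering model produced in the first deliverable, and the per-cluster deltas would also be visualized (a bar chart per cluster would be one option).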
Available assets
On day one we'll provide you with:
- A CSV containing the user data to be clustered.
- A CSV containing the current chancing model parameters.
- Access to our internal documentation and APIs showing you how to apply our chancing algorithm to a set of user profiles and model parameters.