Data Engineer II

This job is no longer open

New Visions works to make the public education system in New York City a place where students from every background can graduate high school and successfully transition into their post-secondary future. The School Systems and Data Analytics Department aims to accomplish this mission by creating a comprehensive data management ecosystem to support school and district users in making decisions that maximize their students’ likelihood of graduating and succeeding beyond high school. We currently support over 900,000 students in nearly 1,600 schools, working with teachers and administrators to help students progress towards graduation and beyond into post-secondary success.

The School Systems and Data Analytics department is at the core of supporting New Visions staff and schools in translating data into action. Several times a day, we process data from internal and external source systems into New Visions databases and live tools, providing staff with actionable, timely, and accessible information to make data-informed decisions.

The Data Engineer II plays a crucial role within the unit and the organization, working closely with the data team and portal team to develop and maintain a robust data model, monitor and troubleshoot core data processing pipelines, and manage New Visions’ data tools ecosystem to deliver the right data quickly to key stakeholders. The Data Engineer II is primarily responsible for building and maintaining the infrastructure used to operate and scale the data platforms that the organization supports.

Who You Are

You are excited about public service and the prospect of solving problems that are challenging and affect urban schools everywhere.

You are detail-oriented and enjoy organizing data in a way that will facilitate the work of team members.

You care about creating a data model that is optimized for performance and quality.

You enjoy working with analysts, designers and product managers to determine the best way to grow our data model to accommodate new features and tools.

You love working in teams to solve complex challenges. You thrive in a fast-paced, highly collaborative environment.

What You’ll Do

  • Develop, monitor, and improve the NV data model and its pipelines
  • Use a combination of R, Python, SQL and other tools to manipulate and transfer data
  • Create cross-sectional and longitudinal data sets from raw data files
  • Collaborate with software engineers and product managers to create schemas in MongoDB and deliver data that fulfills feature requests
  • Create systems for assuring data quality and accuracy
  • Create alerts and process-monitoring tools to understand the flow of data
  • Ensure consistency between data analyses, Google-based tools, Tableau dashboards, and the NV portal
  • Manage infrastructure for processing large data sets for use in NV data tools and the NV data portal
  • Maintain repository of R scripts for automated overnight data processing and incorporate new data streams as needed
  • Research, test, and integrate methods to streamline the NV data model
  • Support the integration of additional data sources into New Visions data warehouse and data tools
  • Support the operationalizing of robust data quality assurance within data infrastructure
  • Monitor and evaluate core data processing efforts to identify areas for improvement as well as troubleshoot technical issues
  • Collaborate with internal colleagues to support current and forthcoming features for NV tools and the NV data portal
  • Collaborate with product managers, designers, analysts and software engineers to meet product specifications
  • Communicate best practices in data engineering to the respective teams
  • Deliver data that meets product specifications to the relevant teams
  • Provide data-processing metrics that highlight areas for growth in core data processing efforts
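In miniature, the quality-assurance responsibilities above (assuring data quality, alerting when a load looks wrong) might resemble the following Python sketch. The names here (`QualityReport`, `run_quality_checks`, the column names) are illustrative assumptions, not New Visions code:

```python
from dataclasses import dataclass, field

@dataclass
class QualityReport:
    """Outcome of a batch of checks on one dataset (illustrative)."""
    row_count: int
    failures: list = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return not self.failures

def run_quality_checks(rows, required_columns, max_null_rate=0.05):
    """Flag missing columns and columns whose null/blank rate exceeds a threshold."""
    report = QualityReport(row_count=len(rows))
    if not rows:
        report.failures.append("dataset is empty")
        return report
    for col in required_columns:
        if col not in rows[0]:
            report.failures.append(f"missing column: {col}")
            continue
        nulls = sum(1 for r in rows if r.get(col) in (None, ""))
        rate = nulls / len(rows)
        if rate > max_null_rate:
            report.failures.append(
                f"{col}: null rate {rate:.0%} exceeds {max_null_rate:.0%}"
            )
    return report
```

A report whose `passed` flag is false could then feed the alerting and process-monitoring tools mentioned above, rather than letting a bad extract reach dashboards or the portal.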

Required Knowledge and Skills

  • A minimum of 4 years of experience in data engineering, software engineering, or advanced data analytics
  • Proficiency in R and/or Python required
  • Proficiency with SQL databases (Redshift, Postgres, etc.)
  • Proficiency with NoSQL databases (MongoDB, Cassandra, etc.)
  • Proficiency with ETL development
  • Expertise managing data pipelines to support continuing increases in data volume and complexity
  • Demonstrated ability to manage Airflow instances and develop best practices for scheduling jobs
  • Demonstrated ability in SQL, as well as common practices in schema design and data storage
  • Exceptional strategic, analytical, and critical thinking skills
  • Strong project management and organizational skills
  • Excellent written and oral communication skills
  • Close attention to detail
  • Demonstrated ability to prioritize, multi-task, work under pressure and meet deadlines
  • Demonstrated persistence and independence in learning technical subject matter and in solving technical problems
  • Demonstrated ability to identify problems and suggest solutions for discussion
  • Demonstrated ability to identify problems and lead improvements to existing codebases and processes
  • Knowledge of public education data in New York State

Desired Knowledge and Skills

  • Expertise in Git and GitHub - including branching, merging, diffs, and hotfixes
  • Expertise in Python, R and SQL
  • Experience building scalable solutions with AWS Redshift, MongoDB, and PostgreSQL
  • Experience using big data technologies (Hadoop, Spark, etc.)
  • Experience building data pipeline tools to manage data quality, ensuring production data is always accurate and available for key stakeholders and business processes that depend on it

Our Technology Stack

  • Data/Database Layer 
    • AWS (Redshift, S3, RDS), MongoDB 
  • Code 
    • Linux, R and Python
  • Orchestration Layer
    • Apache Airflow, Docker, AWS ECS
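The orchestration layer above could be sketched, assuming an Airflow 2.x deployment, as a minimal DAG for the overnight processing the role describes. The `dag_id`, task names, and the `extract`/`load` callables are hypothetical placeholders, not New Visions’ actual pipeline:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    ...  # placeholder: e.g., pull nightly source files into S3

def load():
    ...  # placeholder: e.g., load the extracted files into Redshift

with DAG(
    dag_id="nightly_student_data",       # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",          # overnight batch cadence
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task            # load runs only after extract succeeds
```

In practice each task would run in Docker containers on AWS ECS, per the stack above, with retries and alerting configured on the DAG.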