Cornell

Ithaca, NY
1,001-5,000 employees
Cornell University is a private research university that provides an exceptional education for undergraduates and graduate and professional students.

Machine Learning Data Engineer

This job is no longer open

   

Machine Learning Data Engineer (Remote)

*No Visa Sponsorship is available for this position.


About the Cornell Lab of Ornithology

The Cornell Lab of Ornithology is a major, globally focused institution for research, training, and public communication relating to birds and biodiversity. The Center for Avian Population Studies (CAPS) is one of the programmatic units that carry out the Lab’s mission to interpret and conserve the earth’s biological diversity through research, education, and citizen science focused on birds. One of the major projects in CAPS is eBird, which collects information about the distribution and abundance of birds, taking advantage of the enormous popularity of watching birds to create a global network of volunteers who submit bird observations via the web to a central data repository. With these data, scientists at CAPS use machine learning and statistical methods to combine observations of birds with habitat information from satellites to make predictions of bird occurrence, abundance, and trends at a high spatiotemporal resolution. This project, known as eBird Status and Trends, represents a valuable resource for informing conservation decision making and resource management.

The Opportunity

While position responsibilities vary, every member of our community is expected to foster a culture of belonging and a psychologically healthy work environment by communicating across differences; being cooperative, collaborative, open, and welcoming; showing respect, compassion, and empathy; engaging and supporting others regardless of background or perspective; speaking up when others are being excluded or treated inappropriately; and supporting work/life integration of oneself and others.

The Machine Learning Data Engineer will work on the eBird Status and Trends project and collaborate with computer scientists, statisticians, and ecologists to prepare data for and run large analysis processes on a variety of Linux high-performance computing (HPC) platforms, including local HPC systems and large HPC systems available through the ACCESS program of the National Science Foundation. This will include designing, managing, updating, and running machine learning and deep learning workflows and scripts that perform big data spatiotemporal analyses, using tools like Slurm for job submission, monitoring, and control. The Data Engineer will develop, code, test, and maintain data resources for data-intensive analysis processes while working in a collaborative development environment. This includes using software development tools like R, Python, team code repositories, and open source libraries. An important goal is to develop software and data products that ensure the data collected by eBird are available to, and appropriately used by, conservationists and researchers to the maximum extent possible. Software produced by the Information Science and Technology Program is generally written in R and Python, with SQLite, Parquet, and PostgreSQL backends, all running on Linux operating systems.

What We Need

Required Qualifications:

  • Bachelor’s degree with 3-5 years relevant experience or equivalent combination of education and experience.
  • Must have experience developing, managing, and automating software in a data science or scientific project setting.
  • Must have experience with scientific computing languages R and/or Python, Linux/Unix scripting, and command-line tools.
  • Demonstrable skills in problem solving, critical thinking, and written and verbal communication. Experience designing, implementing, and documenting software.
  • Ability to manage and automate big data analysis workflows within high performance computing (HPC) or cloud computing environments.
  • Experience with Slurm.
  • Experience with relational databases, such as SQLite and PostgreSQL.
  • Experience using collaborative code source versioning with Git.
  • Ability to work effectively both independently and on a team, and to learn technical material quickly.
  • Ability to establish realistic goals and deliver work on schedule. A keen eye for data quality and computational efficiency.
  • Experience in and/or demonstrated commitment to supporting diversity, equity, access, inclusion, and wellbeing.
  • Passionate about working in an organization that values and promotes diversity, equity, inclusion, anti-racism, and wellbeing.

If you have all those things, great! We have a few more things that we would prefer you to have, but it’s ok if you don’t.

Preferred Qualifications:

  • 3-5 years of experience developing, automating, and running R and/or Python statistical or scientific applications within an HPC environment or on scalable big data engines, such as Spark.
  • Experience building distributed, reliable data pipelines that ingest and process data at scale, with tools like Flyte.
  • Experience with Singularity, Docker, Kubernetes, AWS, and/or other cloud technologies.
  • Experience collaborating with data science and/or programming teams on software projects.
  • Experience with architecting analysis workflows that are either memory-limited or use GPUs for deep learning.
  • Basic knowledge of GIS functionality and core libraries, such as GDAL.
  • Experience managing and processing large volumes of remote sensing data, such as MODIS products.
  • Experience managing and developing SQLite, PostgreSQL, or cloud databases, including advanced SQL knowledge and stored procedures.
  • Prior experience managing priorities and working on a diverse set of application development projects.
  • Experience managing large volumes of data for replication and archival storage, including AWS S3.

What We Offer

Rewards and Benefits

  • This position is based in Ithaca, New York; however, the successful applicant may perform this role remotely anywhere within the United States. Employees who work remotely may receive multiple W-2 Forms depending on their work location.
  • The New York Convenience of the Employer guidelines require New York State individual tax reporting and withholding for this position. Additional individual state income tax filings may also be required if working temporarily outside of New York State.
  • Cornell receives national recognition as an award-winning workplace for our health, wellbeing, sustainability, and diversity initiatives.
  • Our benefits programs include comprehensive health care options, generous retirement contributions, access to wellness programs, and employee discounts with local and national retail brands. We invite you to follow this link to get more information about our benefits: https://hr.cornell.edu/benefits-pay.
  • Follow this link to learn more about the Total Rewards of Working at Cornell: https://hr.cornell.edu/jobs/your-total-rewards.
  • Our leave provisions include health and personal leave, three weeks of vacation and 13 holidays: Martin Luther King, Jr. Day, Memorial Day, Juneteenth, Independence Day, Labor Day, Thanksgiving and the day after, and an end of the year winter break from December 25-January 1.
  • Cornell's impressive educational benefits include tuition-free Extramural Study and Employee Degree Program, tuition aid for external education, and Cornell Children's Tuition Assistance Program.

   

Familiarize yourself with Cornell's COVID-19 workplace guidance as well as the university's COVID-19 services and information.

  

University Job Title: Software Engineer III
Job Family: Information Technology
Level: F
Pay Rate Type: Salary
Pay Range: $90,579.00 - $105,267.00
Remote Option Availability: Fully Remote
Company: Contract College
Contact Name: Maria Avila

Job Titles and Pay Ranges:

Non-Union Positions

Noted pay ranges reflect the potential pay opportunity for each job profile. The hiring rate of pay for the successful candidate will be determined considering the following criteria:

  • Prior relevant work or industry experience
  • Education level to the extent education is relevant to the position
  • Unique applicable skills
  • Academic discipline (faculty pay ranges reflect a 9-month annual salary)

To learn more about Cornell’s non-union staff job titles and pay ranges, see Career Navigator.

Union Positions

The hiring rate of pay for the successful candidate will be determined in accordance with the rates in the respective collective bargaining agreement. To learn more about Cornell’s union wages, see Union Pay Rates.

Current Employees:

If you currently work at Cornell University, please exit this website and log in to Workday using your Net ID and password. Select the Career icon on your Home dashboard to view jobs at Cornell.

Online Submission Guidelines:

Most positions at Cornell will require you to apply online and submit both a resume/CV and cover letter. You can upload documents either by “dragging and dropping” them into the dropbox or by using the “upload” icon on the application page. For more detailed instructions on how to apply to a job at Cornell, visit How We Hire on the HR website.

Employment Assistance:

For general questions about the position or the application process, please contact the Recruiter listed in the job posting or email mycareer@cornell.edu.

If you require an accommodation for a disability in order to complete an employment application or to participate in the recruiting process, you are encouraged to contact Cornell University's Office of Institutional Equity and Title IX at voice (607) 255-2242, or email at equity@cornell.edu.

Applicants who do not have internet access are encouraged to visit their local library or local Department of Labor. They may also request an appointment to use a dedicated workstation in the Office of Talent Attraction and Recruitment, at the Ithaca campus, by emailing mycareer@cornell.edu.

Notice to Applicants:

Please read the required Notice to Applicants statement by clicking here. This notice contains important information about applying for a position at Cornell as well as some of your rights and responsibilities as an applicant.

EEO Statement:

Diversity and Inclusion are a part of Cornell University’s heritage. We are a recognized employer and educator valuing AA/EEO, and we do not tolerate discrimination based on any protected characteristic, including race, ethnic or national origin, citizenship and immigration status, color, sex/gender, pregnancy or pregnancy-related conditions, age, creed, religion, actual or perceived disability (including persons associated with such a person), arrest and/or conviction record, military or veteran status, sexual orientation, gender expression and/or identity, an individual’s genetic information, domestic violence victim status, familial status, marital status, or any other characteristic protected by applicable federal, state, or local law. We also recognize a lawful preference in employment practices for Native Americans living on or near Indian reservations in accordance with applicable law. 

Cornell University embraces diversity and seeks candidates who will contribute to a climate that supports students, faculty, and staff of all identities and backgrounds. We encourage individuals from underrepresented and/or marginalized identities to apply.

2023-12-12