Spokeo

Pasadena, CA
51-200 employees
People search engine and free white pages finds phone, address, email, and photos. Find people by name, email, address, and phone for free.

Software Developer, ETL

This job is no longer open

Note: Contractors (C2C, C2H) that directly apply will not be considered. Individual applicants only.


Spokeo is a people search engine that both enlightens and empowers our customers. With over 12 billion records and 18 million visitors monthly, we reconnect friends, reunite families, prevent fraud, and more.


As a Software Developer, ETL at Spokeo, you will be responsible for implementing and maintaining ETL processes using various data sources. This can include locating and analyzing source data, creating data flows to extract, profile, and store ingested data, defining and implementing data cleansing, mapping data to a common schema, transforming data to satisfy business rules, and validating content in AWS using Pentaho, Python, Spark, S3, EMR, etc.


Responsibilities (with the estimated share of an average week spent on each item; subject to change):

  • 25% - Collaborating with stakeholders to define business logic for source-to-target data mappings and integration workflows.

  • 25% - Developing source-to-target data mappings and transformations to support business requirements.

  • 25% - Collaborating with Data Engineers to enhance and optimize new components in the ETL pipeline.

  • 15% - Leading data analysis, ad-hoc investigations into data anomalies as needed, and maintaining technical documentation.

  • 10% - Applying best practices for data governance, quality, cleansing, and other ETL-related activities.

Requirements:

  • 3+ years of professional experience in ETL and Data Engineering, preferably in Big Data, with ETL tools such as Pentaho AEL or equivalents (e.g., Talend, DataStage, or Informatica).

  • Advanced SQL coding skills for data transformations, profiling, and query tasks.

  • Required skills: Python, SQL, ETL, and cloud experience. Preferred: PySpark.

  • Preference for development experience in highly scalable, distributed systems and cluster architectures (e.g., AWS, EMR, Spark).

  • Prior experience working with large data sets (10M+ records).

  • Experience in agile environments such as Scrum and Kanban.

  • B.S. preferred in Computer Science, Information Systems, or related fields (foreign education equivalent accepted)


Spokeo extends written offers to candidates who successfully complete the selection process. Spokeo’s offers include a base salary, participation in a company bonus program, stock options, and comprehensive benefits. A final offer will depend on several factors, including, but not limited to, marketplace competition, job leveling, and the candidate’s experience and skills.


Privacy Notice for Candidates: https://www.spokeo.com/recruiting-policy


Spokeo is an equal opportunity employer. Applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, or protected veteran status. Spokeo fosters a business culture where ideas and decisions from all people help us grow, innovate, create the best products, and be relevant in a rapidly changing world.


Recruiters or staffing agencies: Spokeo is not obligated to compensate any external recruiter or search firm who presents a candidate or their resume or profile to a Spokeo employee without 1) a current, fully-executed agreement on file, and 2) being assigned to the open position (as a search) via our applicant tracking solution.


#LI-REMOTE

