Data Engineer, Web Scraping

This job is no longer open
Lightcast is looking for a talented Data Engineer to join the Web Scraping team. The primary responsibility of this role is to design, implement, and maintain data pipelines that transform data and add value to it in the process. A successful candidate will be committed to our values: working with gratitude, operating as a team, and partnering with our customers.

Core Responsibilities

    • Create and improve processes, tools, workflows, and resilient data architecture to scrape web content.  
    • Manage data accuracy and quality. 
    • Identify and fix scraper breakages, and scale scrapers as needed.

Organizational Relationships

    • This position reports to the Senior Director of Data Innovation.

Knowledge, Skills, and Abilities

    • Web scraping experience (Scrapy or similar technologies)
    • Solid understanding of web technologies (HTML, JavaScript, CSS, XPath, JSON, etc.)
    • Experience with ETL and creating data pipelines
    • Basic Linux and Git experience
    • Python
    • Scrapy Python library
    • AWS/AWS Batch
    • Familiarity with data processing tools (pandas, regex, SQL)

Traits

    • Attention to detail
    • Self-starter
    • Curious
    • Customer-oriented
    • Organized
    • Problem-solving

Credentials and Experience

    • 2+ years of Python experience preferred
    • Professional experience with web scraping preferred
    • Bachelor’s degree preferred

Lightcast is proud to be an equal opportunity workplace and is committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or Veteran status. Lightcast has always been, and always will be, committed to diversity, equity, and inclusion. We seek dynamic professionals from all backgrounds to join our team, and we encourage our employees to bring their authentic, original, and best selves to work.

#LI-Remote