We are seeking a highly skilled and motivated Data Engineer to join our team. As a Data Engineer, you will design, develop, and maintain data pipelines that ingest, transform, and enrich data from various sources into our data platform and feature stores, using Python, SQL, and cloud-native data engineering services to support analytics and machine learning model development. The ideal candidate will have a solid background in data engineering and experience building cloud-based data platforms. Experience with AWS and/or GCP is preferred.
In this role, you’ll have the opportunity to:
Build and enhance our cloud data platform and feature store
Help the Data Science team build and deploy Machine Learning models
Continually learn, grow, and expand your knowledge, while also supporting others’ learning experiences
Collaborate with our talented Product, Data Science and Engineering teams as well as other parts of the RealPage business to deliver great products
Utilize best practices for architecture, implementation, testing, monitoring, logging, and deployments
Take on ad hoc projects as they arise and support your team wherever you are needed
Communicate and exchange accurate information with others via telephone or internet video applications
Primary Responsibilities:
Contribute to the detailed design and architecture of the data platform, ensuring consistency, efficiency, and reusability of data components and processes
Perform data cleansing and validation using SQL/Python and other tools and frameworks to remove or correct erroneous, incomplete, or inconsistent data
Apply data transformations and business logic to enhance, enrich, or standardize data
Handle large-scale and complex data sets using distributed systems and parallel computing to improve the performance of the pipelines
Monitor, troubleshoot, and optimize the performance and cost of data pipelines
Research and evaluate tools and technologies to improve the data platform capabilities
Participate in design discussions and code reviews
Work in an Agile environment with daily stand-ups and two-week sprints
Partner with DevOps, Data Science, Engineering, Product, and other internal team members
Basic Qualifications:
3+ years of data engineering experience
3+ years of experience writing complex SQL queries
3+ years of experience building data pipelines using Python
Experience building or working with cloud data platforms such as Redshift, Snowflake, or BigQuery
Strong attention to detail, performance and quality
Experience working in a fast-paced, Agile environment
Strong communication skills
Nice to haves:
Experience working with cloud platforms like AWS and/or GCP
Experience working with Machine Learning Pipelines
Experience with prompt engineering for GenAI
Experience working with distributed programming frameworks such as Apache Spark, Flink, or Storm