Data Engineer

Overview

The Data Engineer will be responsible for designing, developing, and optimizing large-scale data pipelines using Databricks, Spark, and Python. This is a short-term, high-priority onsite engagement (6–9 months) supporting critical client data engineering initiatives. The engineer will work directly at the client location in Pennsylvania and collaborate closely with technical teams to deliver scalable and high-performing data solutions.

Job Description

Key responsibilities
Design, build, and maintain Databricks-based ETL/ELT pipelines.
Develop high-performance Spark (PySpark) workflows for data processing.
Work with large-scale data in Lakehouse/Data Lake environments.
Optimize and troubleshoot existing Databricks jobs and clusters.
Collaborate with business and technical stakeholders to understand data requirements.
Implement data quality checks, validation rules, and monitoring processes.
Work with orchestration tools (ADF or equivalent) to schedule and automate workflows.
Ensure best practices in version control, CI/CD, and documentation.
Support production pipelines and resolve data-related issues proactively.
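To illustrate the data quality checks and validation rules mentioned above, here is a minimal sketch in plain Python. The field names and rules (`id`, `amount`) are hypothetical; in a real Databricks pipeline, checks like these would typically be expressed as PySpark filters or Delta Live Tables expectations.

```python
# Minimal sketch of record-level data quality checks (hypothetical rules).
# A production pipeline would run equivalent logic at scale in Spark.

def validate_record(record: dict) -> list[str]:
    """Return a list of validation failures for one record (empty = valid)."""
    failures = []
    if not record.get("id"):
        failures.append("missing id")
    amount = record.get("amount")
    if amount is None or amount < 0:
        failures.append("amount must be a non-negative number")
    return failures

def split_valid_invalid(records):
    """Partition records into (valid, invalid) for loading vs. quarantine."""
    valid, invalid = [], []
    for rec in records:
        (invalid if validate_record(rec) else valid).append(rec)
    return valid, invalid
```

Quarantining invalid rows rather than failing the whole job is one common design choice; it keeps the pipeline running while surfacing bad data for monitoring.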

Key competencies
Strong analytical and problem-solving mindset.
Ability to work independently in a fast-paced, onsite environment.
Excellent communication and cross-functional collaboration skills.
Strong ownership and accountability for deliverables.
Adaptability to dynamic project needs.

Skills & Requirements

Databricks (advanced, hands-on)
Python
ETL/ELT pipeline development
Spark (SQL/PySpark)
Data Engineering
Databricks Lakehouse
Data Lake architecture
Big Data processing
Azure Data Factory (ADF)
Workflow orchestration
Data pipeline optimization
Data quality and validation
CI/CD practices
Version control (Git)
Performance tuning
Data troubleshooting and monitoring
