This position is part of the Enterprise Data & Analytics Capability team under the Global Technology Organization. In this role, you will lead the design, development, and optimization of large-scale data solutions on the Databricks platform.
Key responsibilities
• Design, build, and maintain scalable data pipelines on Databricks using Spark and Delta Lake (a brief illustrative sketch follows this list)
• Write clean, efficient, and maintainable PySpark or SQL code for data transformation
• Design robust data models for analytics and reporting
• Ensure data quality, consistency, and governance
• Handle batch and streaming data workflows
• Provide architectural guidance and support teams in their use of the platform
• Drive best practices in data engineering across the team
• Monitor and optimize performance of Spark jobs and cluster usage
• Ensure compliance with security and data privacy standards
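To illustrate the kind of pipeline work described above, here is a minimal PySpark sketch of a batch job that reads raw data, applies simple cleansing, and writes a Delta table. It is an illustrative example only, not part of the role description; the source path, column names, and table name are hypothetical placeholders.

# Minimal illustrative PySpark batch pipeline (hypothetical names throughout)
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example_pipeline").getOrCreate()

# Read raw source data (hypothetical landing path)
orders = spark.read.format("json").load("/mnt/raw/orders/")

# Basic cleansing and transformation
cleaned = (
    orders
    .dropDuplicates(["order_id"])
    .withColumn("order_date", F.to_date("order_timestamp"))
    .filter(F.col("amount") > 0)
)

# Write to a Delta table for downstream analytics (hypothetical table name)
(
    cleaned.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("analytics.orders_clean")
)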
Skills and qualifications
• Bachelor’s degree in Computer Science, Engineering, or a related field
• Minimum of 5 years of programming experience, including at least one year working with a big data platform; experience in the data engineering domain, Python, SQL, and cloud platforms such as Azure
• Familiarity with relevant systems, tools, languages, and the business domain, including Data Lakehouse principles and relational and Kimball (dimensional) data models (required)
• Experience with CI/CD pipelines and version control tools (required)
• Knowledge of data visualization tools and BI platforms (preferred)
• Certification in Databricks or relevant cloud platforms (preferred)
• Strong verbal and written communication skills
• Experience managing client stakeholders
Keywords: Data engineering, Databricks, Data Factory, Azure Services, Apache Spark, Delta Lake, PySpark, SQL, Python, Big data platforms, Azure cloud platforms, Data Lakehouse principles, Relational data models, Kimball data models, CI/CD pipelines, Version control tools, Data visualization tools, BI platforms, Data governance, Batch data processing, Streaming data workflows, Spark performance optimization, Stakeholder management, Communication skills