Overview
We are looking for an experienced Senior Data Architect to design and lead the implementation of scalable, secure, and high-performance data platforms. The ideal candidate will have strong experience with Databricks, modern data architectures (Lakehouse), cloud platforms (Azure/AWS/GCP), and enterprise data strategy.
This role will drive data architecture decisions, enable advanced analytics and AI use cases, and ensure alignment with business objectives.
Job Description
Key Responsibilities
1. Data Architecture & Strategy
- Define and implement enterprise data architecture roadmap
- Design modern data platforms (Lakehouse architecture) using Databricks
- Establish data modeling standards (e.g., dimensional, medallion architecture)
- Lead architecture for data ingestion, transformation, storage, and consumption layers
- Design the Databricks workspace architecture, including cluster sizing, workload-based compute strategy, and workspace-level governance (Unity Catalog, access isolation)
- Architect job orchestration and scheduling frameworks (Workflows / external schedulers)
- Enable integration with Azure Key Vault and other secrets management systems, external APIs and enterprise systems, and AI/LLM services for advanced analytics use cases
- Perform capacity planning, sizing, and cost estimation for Databricks workloads
2. Databricks Platform Leadership
- Architect and optimize solutions using Databricks (Delta Lake, Unity Catalog, Workflows)
- Design ETL/ELT pipelines using PySpark, SQL, and notebooks
- Implement Delta Lake features (ACID transactions, time travel, schema evolution)
- Drive adoption of Databricks best practices and performance optimization
- Design and validate the cloud landing zone architecture for the data platform
- Drive infrastructure sizing and cost optimization across compute, storage, and networking
- Ensure secure data access patterns and network isolation for enterprise data platforms
3. Cloud Data Engineering
- Design scalable solutions in Azure (preferred), AWS, or GCP
- Integrate Databricks with services such as Azure Data Factory / AWS Glue and ADLS / S3 / BigQuery
- Build real-time and batch data processing pipelines
4. Data Governance & Security
- Implement data governance frameworks using Unity Catalog or similar tools
- Ensure data quality, lineage, cataloging, and compliance
- Define and enforce data security and access management
5. Stakeholder Management
- Collaborate with business stakeholders, data scientists, and engineering teams
- Translate business requirements into scalable technical solutions
- Provide architectural leadership across multiple projects
6. Performance & Optimization
- Optimize data pipelines, storage, and compute costs
- Implement monitoring, logging, and performance tuning mechanisms
- Ensure high availability and scalability of data systems
7. Team Leadership
- Mentor and guide data engineers and architects
- Establish coding standards, best practices, and reusable frameworks
- Lead design reviews and architecture governance forums
Required Qualifications
- Bachelor’s/Master’s degree in Computer Science, Engineering, or a related field
- 10+ years of experience in data engineering / architecture
- 4+ years of hands-on Databricks experience
- Strong expertise in PySpark, SQL, Python, Delta Lake and Lakehouse architecture, and distributed data processing
Technical Skills
Must-Have
- Hands-on knowledge of Databricks, including Delta Lake, Unity Catalog, Lakeflow, Lakebase, Databricks Apps, Workflows, job scheduling, and ETL/ELT pipeline orchestration
- Strong understanding of Databricks architecture (control plane vs data plane, workspace model)
- Experience with cluster sizing, autoscaling, workload-based optimization, cost estimation, performance tuning, and FinOps
- Experience building CI/CD pipelines and implementing Declarative Automation Bundle (DAB) deployments
- Exposure to AI/ML/LLM integrations within Databricks ecosystem (MLflow, external LLM APIs)
- Strong understanding of cloud landing zone concepts (subscription design, governance, policies)
- Understanding of enterprise security architecture for data platforms
- Cloud platforms: Azure / AWS / GCP
- Data modeling: Dimensional, Data Vault
- Data warehousing
- API and data integration patterns
Good-to-Have
- Understanding of cloud network topology for secure data platforms: VNet injection, private endpoints, subnet isolation, NSGs, and secure access to storage, Databricks, and external systems
- Knowledge of hybrid connectivity models: on-premises integration (VPN, ExpressRoute) and cross-cloud architecture (AWS/Azure/GCP interoperability)
- Identity and access management and secret management
- Knowledge of and hands-on experience with data governance tools (Collibra, Alation, or equivalent)
- Real-time streaming design and implementation
Skills & Requirements
DevOps, Cloud Network, MLOps, IAM, Cross-cloud architecture, Secret management, Real-time, AWS, Data Governance, DAB, AI, Lakebase, FinOps, Declarative Automation Bundle, Data ingestion, Lakehouse, GCP, Lakeflow, Data integration, Databricks Apps, CI/CD, MLflow, Databricks, Data Modeling, ETL, Data warehouse, ML