Achievements

  • Leadership & Team Collaboration: Drove the formation and growth of various tech teams across multiple regions, fostering a culture of collaboration and innovation in the cloud domain.
  • Strategic Planning: Masterminded the roadmap and OKRs for tech teams, aligning efforts with organizational goals and ensuring progress towards strategic objectives.
  • Cybersecurity & Policy Development: Led the SOC2 compliance initiative by developing and implementing comprehensive cybersecurity policies and controls, significantly enhancing the organization's security posture.
  • Performance Metrics: Implemented comprehensive KPIs for tracking team performance and cloud cost optimization, improving financial efficiency and resource management.
  • Cloud Infrastructure Management: Led the transition from legacy systems to state-of-the-art cloud technologies using Kubernetes, Terraform, and other cloud-native tools.
  • CI/CD & Observability: Developed a cloud-native CI/CD platform and an in-house observability platform, improving deployment times and system monitoring capabilities.
  • Disaster Recovery Planning: Designed and executed cloud-based disaster recovery strategies, enhancing business resilience and data protection.

Work experience

Smarter AI DMCC

2024-03 to Present

Director of Infrastructure and Platform Operations

  • Cloud Migration: Spearheaded the migration from Azure to Google Cloud Platform (GCP), ensuring seamless transition and minimal downtime.
  • Cybersecurity Compliance: Led the cybersecurity compliance initiative for SOC2, achieving compliance and strengthening the organization's security posture.
  • Access Management: Secured access to all environments by implementing Role-Based Access Control (RBAC), improving security and operational efficiency.
  • Infrastructure Automation: Automated the provisioning and management of cloud infrastructure using Terraform, reducing manual efforts and increasing efficiency.
  • Cost Optimization: Developed and implemented strategies for cloud cost management and optimization, resulting in significant savings for the organization.
  • Disaster Recovery Planning: Designed and implemented disaster recovery strategies on GCP, ensuring business continuity and data protection.
  • Monitoring and Observability: Enhanced system monitoring and observability by integrating tools such as Prometheus and Grafana, leading to improved incident response and resolution times.
  • Continuous Integration/Continuous Deployment (CI/CD): Improved the CI/CD pipeline by integrating tools such as Jenkins and GitLab, accelerating the software delivery process.
  • Collaboration and Communication: Facilitated cross-functional collaboration with development, security, and operations teams to drive projects and initiatives to successful completion.
  • Security Enhancements: Implemented advanced security measures such as network segmentation, VPNs, and firewalls to protect the organization’s digital assets.
  • Performance Optimization: Conducted performance tuning and optimization of cloud environments, ensuring high availability and scalability of services.
  • Policy Development: Developed and enforced cloud governance policies to ensure compliance with industry standards and best practices.
  • Incident Management: Led incident response and root cause analysis for critical infrastructure issues, implementing corrective actions to prevent recurrence.

Cafu App DMCC

2022-07 to 2024-02

Senior DevOps Manager

  • Leadership and Strategy: Successfully led and managed the DevOps, IT Ops, and TechOps teams. Designed the strategic roadmap and OKRs for each quarter, and conducted periodic performance reviews. Ensured team members' professional growth through continuous guidance and mentorship.
  • Process and Workflow Optimization: Introduced ITSM, SLA guidelines, Bug Hunt and Reporting process, and an on-call emergency reporting system. Managed scrum ceremonies, fostering collaboration and continuous improvement.
  • Infrastructure Monitoring and Incident Management: Upgraded system monitoring and reporting, implemented a comprehensive support ticketing system, and led the creation of systematic Root Cause Analysis and bug reports. These actions improved system visibility, incident management, and reduced downtime.
  • Automation and Efficiency: Developed multiple automation scripts integrated with Slack, reducing manual efforts and improving productivity. Initiated testing and deployment of new CI/CD systems to accelerate software delivery cycles.
  • Financial Management: Collaborated with the CTO to prepare the fiscal budget for the technical departments, ensuring financial efficiency and the sustainable allocation of resources.
  • Team Building and Policy Development: Built in-house TechOps and DevOps teams from scratch. Drafted and rolled out the organization's first comprehensive IT policy to establish a clear framework for operations and data security.
  • Security and Access Management: Streamlined access management through the organization's Single Sign-On (SSO) system and led the process to select, onboard, and integrate email and endpoint security systems, bolstering the company's cybersecurity infrastructure.
  • Infrastructure Design and Implementation: Designed and architected a new, geographically distributed, and highly redundant cloud infrastructure using Terraform and Atlantis. This enhanced the scalability, resilience, and performance of the company's digital assets.

Talabat, Dubai, U.A.E

2020-08 to 2022-06

Senior SRE Manager

  • Automation and Observability: Introduced an automation culture to the team and the organization and led the observability drive across the company. This increased system visibility and optimized workflow efficiency.
  • Team Development and Mentorship: Encouraged teams to learn and grow their skillsets, providing technical guidance and mentorship. Instituted a 'learning hour' activity for the SRE team to consistently grow their technical skills. Conducted yearly and mid-yearly reviews and developed individual growth plans.
  • Strategic Planning and Operation: Designed and implemented OKRs for each quarter along with their measurement metrics. Conducted scrum ceremonies to ensure effective team collaboration and timely support.
  • Infrastructure Migration: Proposed and led a migration project from on-premises to cloud infrastructure, increasing scalability and operational flexibility.
  • Cost Management: Developed cost management dashboards for most third-party vendors, revealing expenditure patterns and areas where costs could be reduced.
  • Service Management: Implemented an automated service request system, streamlining IT service management and improving response times.
  • Collaborative Compliance Achievement: Partnered with the cybersecurity team to achieve PCI DSS certification for our cloud infrastructure, demonstrating a commitment to secure and reliable handling of sensitive customer data, enhancing customer trust and business reputation.

Dubizzle FZ LLC, Dubai, U.A.E

2016-02 to 2020-06

DevOps Manager

  • Team Building and Management: Built DevOps teams for various regions from scratch. Implemented weekly one-on-one catch-ups with team members and conducted mid-year & yearly reviews, providing feedback and guiding individual growth and improvement plans.
  • Strategic Planning and Operations: Developed strategies and OKRs for the SRE team to measure progress, managed day-to-day operations, and led weekly planning meetings to ensure efficiency and alignment with the company's goals.
  • Infrastructure Migration and Automation: Led the migration of legacy infrastructure to Kubernetes using Infrastructure as Code (IaC) with tools like Terraform, Helm, and Ansible. Built an in-house framework for CI/CD pipelines, significantly reducing deployment times.
  • Disaster Recovery: Set up disaster recovery (DR) policies and infrastructure using AWS Backup with multi-region and multi-retention levels to ensure data integrity and availability.
  • Security and Logging Infrastructure: Designed and implemented a secret management system using a modified version of HashiCorp Vault. Additionally, designed a robust logging infrastructure capable of handling approximately half a billion log lines every 15 minutes, enhancing system monitoring and debugging capabilities.

Palantir, Abu Dhabi, U.A.E

2015-02 to 2015-12

Site Reliability Engineer

  • Server Management and Security: Developed utilities to maintain server uptime and security, ensuring optimal performance and safe operations.
  • Infrastructure Automation: Automated server provisioning through the creation of Puppet modules, increasing the efficiency and consistency of infrastructure deployment.
  • System Monitoring: Monitored servers and services for alerts, prioritizing critical systems to ensure maximum uptime and swift issue resolution.
  • Cloud Infrastructure Management: Authored specialized scripts for maintaining AWS infrastructure, enhancing the reliability and manageability of cloud resources.
  • Cross-Team Collaboration: Assisted various teams in troubleshooting server or service-related issues, fostering inter-departmental collaboration and prompt issue resolution.

Dubizzle FZ LLC, Dubai, U.A.E

2012-12 to 2015-01

Linux System Administrator

  • Deployment Automation: Authored and maintained Chef recipes for automating all deployments, enhancing consistency and reducing manual intervention in deployment processes.
  • Local Testing Environment: Maintained an OpenStack-based local testing environment, ensuring reliable and effective software testing processes.
  • Database Implementation: Implemented PostgreSQL with Streaming Replication and High Availability, complete with Chef recipes, strengthening data integrity and availability.
  • Disaster Recovery Setup: Completed the setup for a disaster recovery site on AWS, enhancing business continuity capabilities.
  • Monitoring Services: Managed monitoring services such as Nagios and Munin, keeping them up to date to ensure real-time insights into system performance and health.
  • System Administration Automation: Authored scripts in Python and Bash to automate numerous system administration tasks, increasing operational efficiency and reducing error-prone manual tasks.

Advanced Research Projects and Technologies, Karachi, Pakistan

2011-02 to 2012-12

Team Lead System Administrator

  • Network Design and Implementation: Designed and implemented a high-availability network based on Cisco Catalyst 3560, improving network resilience and uptime.
  • System Administration Automation: Authored scripts in Bash, Perl, and PHP to automate system administration tasks, enhancing operational efficiency and reducing manual effort.
  • Central Authentication Management: Managed a central OpenLDAP server with distinct OUs for VPN, PAM, and SUDO for different DMZs, improving security and user management.
  • Virtual Machine Management: Oversaw and monitored over 300 VMs, with separate authentication using OpenLDAP server for UAT and production environments, ensuring effective and secure access control.
  • Disaster Recovery Plan: Designed and implemented a Disaster Recovery (DR) plan with remote ESX and backup, enhancing data protection and business continuity capabilities.

Skills

Strategic Leadership

Proven record of building teams from scratch and fostering a healthy learning environment with career growth planning.

Technical Expertise

Showcased deep technical proficiency in cloud and infrastructure design, automation, and scripting languages.

Mentorship

Proven ability to guide and foster professional growth within teams, ensuring their skill development aligns with the company's strategic goals.

Designing and Planning OKRs & KPIs

Directly involved in setting OKRs for yearly goals and monitoring them quarterly, and in setting up KPIs to measure business performance.

Operational Excellence

Implemented efficient processes, from performance reviews to scrum ceremonies, optimizing team workflows and overall operational efficiency.

Security & Access Management

Led advanced security initiatives, streamlined access management, and ensured company-wide compliance with IT policies.

System Development Life Cycle

In-depth understanding of the SDLC and hands-on experience applying the SDLC framework across different projects and teams.

Budgeting & Cost Management

Experienced in preparing fiscal budgets for technical departments and in cloud cost management and optimization, and an advocate of automation and DevOps culture.

Migration & Disaster Recovery Planning

Led critical infrastructure migration projects and designed robust disaster recovery systems, ensuring business continuity.

Education

Bahria University, Karachi, Pakistan

2007-02 to 2011-01

BBA: Management Information Systems

Certifications

  • RHCE (Red Hat Certified Engineer)
  • CKA (Certified Kubernetes Administrator)
  • CCNA (Cisco Certified Network Associate)
  • AWS SA (Amazon Web Services Solutions Architect)

Personal Interests

  • Reading books
  • Swimming
  • Road trips

References