Sunil Kumar Thertala Veetil

Boxhill

Summary

IT professional with 15+ years of expertise in AWS Cloud, DevOps, and Site Reliability Engineering (SRE), driving enterprise-scale infrastructure automation, observability, and operations. Skilled in designing, automating, and managing platforms using AWS, Kubernetes, Docker, Pulumi, Terraform, Ansible, and CloudFormation, with a strong focus on CI/CD pipelines, security, and scalability. Experienced in AWS cost optimization, performance tuning, and cloud resource efficiency, delivering significant OPEX savings across environments. Hands-on experience with Databricks for data engineering workloads, Kafka for event streaming, and the Kibana/ELK stack for log analytics and monitoring. Proven success in large-scale datacenter and cloud migrations, high-availability design, and resilient architecture leveraging VMware, Oracle RAC, and Red Hat clusters. Expertise in observability and monitoring using Prometheus, Grafana, Zabbix, ELK, and CloudWatch, integrated with automation and alerting pipelines.

Strong background in Linux (RHEL), storage/SAN administration, and server builds, combined with automation using Python & Bash scripting.

Overview

20 years of professional experience
1 Certification

Work History

Cloud/DevOps/SRE Engineer

Bigtincan Mobile Pty Ltd
11.2021 - Current

Cloud/DevOps Engineer

Firstwave Cloud Technology Ltd
10.2019 - 11.2021

Senior Engineer IT - SME (Cloud/Linux)

Cisco Systems Pvt. Ltd
11.2008 - 09.2019

DTS Engineer - L3 for HP-UNIX/Linux

Hewlett Packard Global Soft Ltd.
02.2006 - 11.2008

HP Unix System Administrator

Net connect Pvt. Ltd.
07.2005 - 02.2006

Customer Engineer

HCL Infosystems Ltd.

Education

Master of Science - MCA

Sikkim Manipal

Skills

  • Cloud Administration
  • Cloud Automation
  • Production Escalations
  • Cluster Management
  • Python Programming
  • Automation
  • CI/CD Planning
  • Problem Management
  • SRE Initiatives
  • Virtualization Technologies
  • On-call Support
  • Docker, OpenShift Enterprise, Kubernetes

Certification

Certified in Red Hat OpenStack, Red Hat OpenShift, CCNA, HP CSA, HP-CSE, EMC SAN Foundations, ITIL Certified Professional, VMware 5 Certified, SAFe Engineer

SRE & DevOps

  • Infrastructure as Code (IaC): Migrated legacy CloudFormation templates to Pulumi (TypeScript) for improved modularity, versioning, and automation. Automated AWS provisioning using Ansible, reducing environment setup time by 70%.
  • Automation & Configuration: Fully automated OS customization, application upgrades, and VMware provisioning using Ansible, eliminating repetitive manual tasks and improving consistency.
  • Scalability & Reliability: Designed and deployed AWS Auto Scaling integrated with EventBridge, Lambda, and Ansible, improving system resilience and lowering infrastructure costs.
  • Monitoring & Observability: Expanded monitoring with Zabbix, Prometheus, Grafana, CloudWatch, and Kibana, adding custom metrics, synthetic tests (Selenium), and log analytics for proactive issue detection.
  • Application & Cloud Modernization: Migrated workloads from SaltStack to Ansible and built CI/CD pipelines with Jenkins and GitLab for streamlined deployments.
  • Containers & Orchestration: Managed workloads on Kubernetes and Docker, including deployment, scaling (using KEDA and Karpenter), and cluster troubleshooting.
  • Programming & Automation: Developed Python-based automation for AWS, OpenStack, and Cisco ESA, improving operational efficiency and visibility.
  • Cost Optimization: Implemented AWS resource right-sizing, tagging, and automation to optimize compute and storage usage, achieving significant monthly savings.
  • Data & Streaming: Worked with Databricks for data engineering automation and Kafka for event streaming pipelines.

Cloud Platform Design & Operations


  • Deployed and managed Kafka & RabbitMQ pipelines for real-time streaming and messaging workloads.
  • Implemented enterprise-grade monitoring using Nagios, Prometheus, Grafana, ELK, and check_mk, including synthetic and custom application monitoring.
  • Automated operational workflows using Python SDKs, Bash scripting, and REST APIs.
  • Designed, customized, and managed OpenStack environments (Nova, Swift, Glance, Keystone, Neutron, Ceph), including complex deployments and troubleshooting.
  • Delivered infrastructure automation using Ansible, StackStorm, and GitLab pipelines.
  • Supported hybrid environments with VMware, OpenShift, Docker (Swarm and standalone), and ACI integrations.

Linux System Administration & Architecture

  • Managed 18,000+ Linux (RHEL) systems across physical and virtual environments.
  • Led datacenter migrations (including production workloads) with zero major downtime.
  • Administered large Oracle RAC clusters (4-, 8-, and 12-node) on UCS bare-metal and VMs, including ASM, NetApp/EMC SAN storage, SRDF, SnapMirror, and FlexClone LUNs.
  • Oversaw the system patching lifecycle with Red Hat Satellite; performed performance tuning, RCA, and critical incident resolution.
  • Designed cross-datacenter redundancy and resiliency using VMware vMotion, clustering, and SAN replication.
  • Automated OS, database, and storage operations with Python & Bash, reducing manual workload and improving reliability.