dev-opsss/README.md

👋 Hi, I'm Devashish N

Senior DevOps / SRE | Kubernetes | Terraform | Data & ML Platforms

LinkedIn Email


🚀 About Me

Senior Cloud Infrastructure / SRE engineer with 10+ years building and operating mission‑critical data and ML platforms on AWS. Focused on Kubernetes (EKS), Terraform, GitOps, and observability for large‑scale Hadoop/Spark/Kafka/Solr systems supporting AI/ML workloads.

  • Uptime: 99.9%+ on shared data/ML platforms
  • Migrations: bare‑metal Cloudera → AWS/EKS with zero downtime
  • Scale: petabyte‑scale HDFS, tens of millions of Kafka messages per day

What I Work On

  • 🧱 Kubernetes platform engineering (EKS admin, upgrades, autoscaling, Helm, Kustomize)
  • ☁️ Infrastructure‑as‑Code with Terraform (multi‑account VPC, IAM, EKS, RDS, DR)
  • 📊 Observability for data/ML systems (Prometheus, Grafana, Splunk, OpenTelemetry)
  • 🔄 Data & streaming platforms (Hadoop/Cloudera, Spark, Kafka, Solr, Trino, Iceberg, Airflow)
  • 🔐 Security & access control (Kerberos, Ranger, IAM/RBAC, Okta/Cognito SSO)
  • 🤝 Enabling data and ML teams with self‑service platforms and GitOps workflows

🛠️ Tech Stack

Cloud & Infrastructure

AWS Azure Kubernetes Docker Terraform Helm Kustomize

Primary: AWS (EKS, EC2, S3, RDS, CloudFormation, Route53, ALB/NLB).
Azure: hybrid IaC with Terraform/Ansible from earlier roles.

Data & Streaming / ML Infra

Hadoop Spark Kafka Airflow Trino Iceberg Solr

Observability & SRE

Prometheus Grafana Splunk OpenTelemetry PagerDuty

  • SLO/SLI design • error budgets • incident response • DR drills • on‑call rotation

CI/CD & GitOps

Jenkins GitHub Actions GitLab CI Spinnaker ArgoCD FluxCD

Security & Access

Ranger Okta Cognito Kerberos

  • Zero Trust • RBAC • SSO (Okta/Cognito) • Kerberos • Ranger policies • least‑privilege IAM

Service Mesh

Istio Envoy

Programming

Python Go Bash SQL Java


📊 GitHub Stats

⚠️ If these cards show “empty” stats, it usually means most work is in private repos or the stats service is rate‑limited. That’s normal for enterprise work.

GitHub Stats

GitHub Streak

Top Languages


🎯 Expertise (What I’m Good At)

Data & ML Infrastructure

  • Migrating on‑prem Cloudera/Hadoop to AWS (EKS + S3 + RDS) with zero downtime
  • Running large‑scale Spark, Kafka, Solr, Trino, Iceberg platforms for ML workloads
  • Building self‑service data platforms with Terraform modules and GitOps

Kubernetes & Platform Engineering

  • EKS upgrades (control plane + nodes), AL2023 AMIs, RBAC, HPA/VPA, cluster autoscaler
  • Helm + Kustomize for multi‑env deployments (Dev/QA/Prod)
  • Istio for canary rollouts, traffic splitting, and mTLS between services

Observability & SRE

  • Prometheus + Grafana dashboards, SLO/error budget tracking, OpenTelemetry traces
  • Splunk‑based log analytics for data pipelines and microservices
  • PagerDuty on‑call design (L1/L2/L3) and incident management playbooks

Security & Compliance

  • Kerberos + Ranger for Hadoop security and fine‑grained access
  • Okta SAML SSO + AWS Cognito OAuth2/OIDC for internal services
  • SOC 2 / HIPAA / GDPR controls on data platforms

🏆 Selected Achievements

  • Zero‑downtime migration of shared AML data platforms from bare metal to AWS
  • 45%+ runtime reduction and 30% cost reduction for critical Spark pipelines
  • 40%+ MTTD/MTTR improvement via unified metrics, logs, and PagerDuty workflows
  • Built Terraform‑based self‑service platform used by 5+ internal teams to provision data/ML environments in under an hour

🎓 Certifications & Learning

  • ☁️ AWS Cloud Practitioner
  • ⎈ Kubernetes Application Developer (CKAD)
  • 🏗️ HashiCorp Terraform (training/cert)
  • 🤖 AI & Machine Learning for Business
  • 📊 AWS: Design and Implement Systems
  • 🧠 OCI 2025 Certified AI Foundations Associate

💬 Let’s Connect

Interested in data/ML platform engineering, Kubernetes on AWS, observability & SRE, and GitOps‑driven infra.

LinkedIn Email


💡 “Production is where good ML becomes great ML.”

⭐ If you find any of my repositories useful, please consider starring them.

Pinned

  1. Python-Essentials-for-MLOps (Public)

    Jupyter Notebook 1

  2. DevOps-Projects (Public)

    Forked from NotHarshhaa/DevOps-Projects

    𝑫𝒆𝒗𝑶𝒑𝒔 𝑹𝒆𝒂𝒍 𝑾𝒐𝒓𝒍𝒅 𝑷𝒓𝒐𝒋𝒆𝒄𝒕𝒔 𝒇𝒐𝒓 𝑨𝒔𝒑𝒊𝒓𝒊𝒏𝒈 𝑫𝒆𝒗𝑶𝒑𝒔 𝑬𝒏𝒈𝒊𝒏𝒆𝒆𝒓𝒔 [𝑩𝒆𝒈𝒊𝒏𝒏𝒆𝒓 𝒕𝒐 𝑨𝒅𝒗𝒂𝒏𝒄𝒆𝒅]

    Java 1

  3. MLOps (Public)

    Jupyter Notebook

  4. MLOps-CI (Public)

    Python

  5. Decode a JWT via command line
    # will not work in all cases, see https://gist.github.com/angelo-v/e0208a18d455e2e6ea3c40ad637aac53#gistcomment-3439904

    # Decode the header and payload (first two dot-separated fields) of a JWT.
    function jwt-decode() {
      sed 's/\./\n/g' <<< "$(cut -d. -f1,2 <<< "$1")" | base64 --decode | jq .
    }
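
One likely reason `jwt-decode` "will not work in all cases" is that JWT segments are base64url-encoded with the `=` padding stripped, which `base64 --decode` may reject. The sketch below is one way to handle that, not the gist author's method: the names `b64url_decode` and `jwt_decode_padded` are hypothetical, and it assumes bash plus a `base64` that accepts `--decode`. It prints raw JSON and leaves pretty-printing to the caller.

```shell
#!/usr/bin/env bash

# Hypothetical helper: decode a single base64url segment.
b64url_decode() {
  local seg
  # Map the URL-safe alphabet ('-' and '_') back to standard base64 ('+' and '/').
  seg="$(printf '%s' "$1" | tr '_-' '/+')"
  # Restore the '=' padding that JWT encoding strips.
  while (( ${#seg} % 4 != 0 )); do
    seg+="="
  done
  printf '%s' "$seg" | base64 --decode
}

# Hypothetical variant of jwt-decode: prints the header JSON, then the payload JSON,
# one per line. The signature (third field) is ignored and NOT verified.
jwt_decode_padded() {
  local header payload
  IFS=. read -r header payload _ <<< "$1"
  b64url_decode "$header"; echo
  b64url_decode "$payload"; echo
}
```

Usage: `jwt_decode_padded "$TOKEN" | jq .` pretty-prints both objects. This only decodes the token; it does not validate the signature, so the output should not be trusted for authorization decisions.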