Alexa Griffith

agriffith96@gmail.com | New York, NY
LinkTree | LinkedIn | Website

Summary

Software engineer specializing in large-scale AI inference systems and cloud-native infrastructure, with a focus on building safe, observable, and evolvable AI systems in hybrid on-prem and cloud environments. I design and operate production GenAI platforms on Kubernetes, with experience in KServe, Envoy AI Gateway, and multi-cluster reliability. Active open-source contributor and CNCF community leader with multiple KubeCon and other public talks and committee roles. Host and producer of Alexa's Input (AI), a podcast featuring engineers, founders, and open-source leaders.

Professional Experience

Bloomberg New York, NY

Senior Software Engineer - Inference Platform Team January 2022 - Present

  • Lead engineer on Bloomberg's internal GenAI inference platform built on Kubernetes, KServe, and Envoy AI Gateway, powering production ML and LLM workloads across multiple clusters and environments.
  • Designed and implemented core transformation logic in Envoy AI Gateway spanning multiple providers and multimodal, streaming, and tool-calling use cases, enabling a single consistent API.
  • Defined and implemented SLOs/SLIs for the platform, deployment system, and user services, and built the Grafana dashboards and alerts used for ongoing reliability reviews and incident response. Led our Inference Platform Stability strategy.
  • Owned multiple DR-1 disaster-recovery test cycles for inference, including rebuilding clusters, documenting procedures, and coordinating cross-team response. Served as the representative for my team.
  • Designed and rolled out GPU support and configurations for inference services, including scale-to-zero.
  • Set up AWS EKS-based inference clusters (the team's first clusters in the cloud), configured them for compatibility with on-prem, onboarded the team to Terraform-based pipelines, and designed OPA workflows and policies.
  • Led multiple KServe upgrade and rollout efforts (0.9–0.15) across shared and dedicated clusters.
  • Authored upstream KServe features and documentation for GenAI runtimes, model latency metrics, and canary rollouts; maintained KServe implementation and website contributions.
  • Designed and built the backend platform service for deployment and debugging and integrated it with our UI, adding new APIs for pod status, health, and information that significantly improved debuggability for internal users.
  • Regularly drove cross-team initiatives with SRE, product, and other platform teams.

Bluecore New York, NY

Software Engineer 1.2 - Data Science & Analytics Infrastructure Teams November 2019 - January 2022

  • Decreased analytics UI response time by 95% (from ~2 minutes to under a second) by leading a multi-quarter project to design and develop a new gRPC-based Go API; served as main on-call for new and legacy systems.
  • Initiated the design of, and led cross-team discussions on, a standardized authentication and authorization service for Bluecore's GKE-based microservices.
  • Contributed crucial features, such as database lock handling and a data generator service, to our high-throughput, low-latency streaming architecture written in Python/Java.
  • Created a SQL parser tool with a UI (d3, Python, JavaScript) to visualize similarities in complex queries, helping consolidate 50% of the analytics endpoints.
  • Managed Airflow infrastructure, added new features, and created optimized DAGs for ETL workflows.
  • Designed and led the Looker integration for Bluecore's UI, used by 600+ clients, and created self-service features for users.

Open Source, Leadership & Activities

See full list at alexagriffith.com

Open Source

  • Core contributor to KServe (GenAI/LLM support, metrics, Helm, docs) and Envoy AI Gateway (multi-provider schema mapping, streaming, tools, reference architecture).
  • Active participant in Kubernetes, CNCF, and AI-platform working groups.

Speaking & Writing

CNCF & Industry Roles

  • Technical Advisory Group (TAG) Infrastructure – Tech Lead (CNCF).
  • Program Committee Member: Cloud Native + Kubernetes AI Day EU 2026 & NA 2025, EnvoyCon (Virtual).
  • Lead SME for Cloud Native Platform Associate Advisory Council; Advisory Panel Member for Cloud Native Platform Engineer Advisory Council.

Internal Culture & Mentorship

  • Mentored interns and engineers interested in platform/inference work; ran debugging workshops and KServe training sessions across offices.
  • Represent Bloomberg at recruitment events, Kubernetes meetups, and open-source showcases.

Technical Skills

  • Languages: Go, Python
  • Cloud & Infra: Kubernetes, KServe, Envoy AI Gateway, Istio, Prometheus, Grafana, Helm, Terraform, Docker, Knative
  • Cloud Providers: AWS (EKS, Bedrock), GCP (GKE, Pub/Sub, BigQuery, Bigtable, CloudSQL, load balancers)
  • Data & Streaming: BigQuery/Bigtable, PostgreSQL, Kafka

Education

University of Tennessee, Knoxville 2014 - 2019

B.S. in Honors Chemistry, American Chemical Society certified
Minor in Hispanic Studies
Magna Cum Laude