linkedin.com/in/sergey-vlassiev

Sergey Vlassiev

Stockholm, Sweden

I’ve been writing software since I was seven — I was born into a family of engineers and it was simply what people around me did. One of the programs I’m still proud of from those years is a two-player game where my brother and I could fight as triangle-shaped knights together.

I trained as a mathematician — and graduated as both mathematician and teacher — which shapes how I approach problems: I want to understand how something actually works before touching it, and I tend to explain things to people along the way. That was true when I was running algebra seminars at university in St. Petersburg and leading three-week expeditions above the Arctic Circle — and I still work that way.

We’ve moved as a family three times: to Stockholm when my wife was offered a role here, to Amsterdam when I joined AWS, and back to Sweden when we wanted the best conditions for our kids. We’ve kept both careers throughout without either of us disappearing into the other’s story. Parenting has been the central theme of my life in recent years — it makes me more deliberate about what I spend my time on and why.

I build the systems, processes, and guardrails that let people work safely and with some fun: on-call structures, alerts, documentation, automation, deployment pipelines. I’m at Epidemic Sound in part because the domain matters to me. My brother is a musician, and working on infrastructure that affects millions of users means I’m thinking about people like him — not abstractions. Right now my focus is on agentic readiness: making the platform infrastructure consumable by AI agents. I took a philosophy of technology course in 2021 because what engineers build shapes how people feel, create, and connect — and I’d rather that be deliberate.

Experience


Stockholm, Sweden
Software Engineer Explore 2026—present
Agentic RAG system — Vertex AI + Gemini multimodal and text RAG over a 20-year family photo archive and a private working journal
Main technologies: Python (FastAPI, asyncio), Vertex AI (now Gemini Enterprise Agent Platform), Gemini 2.5 Pro / Flash, Cloud Run, GKE, Firestore, Firebase Auth, GCS, Server-Sent Events, grafana-app-sdk
key achievements
  • Multimodal RAG over the family archive — 6,004 personal photos embedded into a shared text / image space (multimodalembedding@001, 1408-dim). Natural-language questions are answered with cited images, and EXIF / folder-name date filters are parsed straight out of the query (“summer 2017”). A second corpus does text RAG over a private working journal with text-embedding-005.
  • Two-stage retrieval with vision reranking — Cosine top-k over an in-memory index, then the top 20 images fanned out to four parallel Gemini 2.5 Flash batches that rerank under a structured schema; the best 10 go to Gemini 2.5 Pro for the written answer. A SHA-keyed byte map means Pro never re-downloads what Flash already saw, and a 15 s timeout falls back to similarity order without tearing the response.
  • Streamed answers over Server-Sent Events — The ask endpoint emits citations first (~5–10 s in) so photos render while Pro is still writing, then the answer (~10–15 s), then refreshed quota. A soft-failure wrapper (asyncio timeout in a worker thread) turns a 60 s generation timeout into an inline notice rather than a 500 — citations always stay on screen.
  • Cost engineering as a first principle — Rejected Vertex AI Vector Search (~$360/mo) in favour of in-memory NumPy cosine similarity: 6k vectors × 1408-dim ≈ 35 MB fits in memory trivially, and the same code path scales to managed search at 100k+ items. Cloud Run scales to zero, so idle cost is $0.
  • Auth as the only server-side trust boundary — Firebase ID tokens cryptographically verified (signature, audience, expiry) against Google’s public keys, plus an email allow-list; the entire client flow can be broken without compromising it. Anonymous visitors query the public photo corpus (20/day globally); allow-listed users also reach the private journal corpus (50/day each).
  • Fail-closed rate limiting — Atomic per-day Firestore counters that avoid read-modify-write races under concurrency; corpus authorisation checked twice per request as defence-in-depth; if the counter store is unreachable the request fails closed rather than leaking quota.
  • Private deployment topology — Browser → GKE Ingress (TLS) → a small nginx-proxy pod → Cloud Run, keeping the internal run.app URL off the public surface at ~30 ms p50 overhead. The proxy runs with buffering off and a 130 s read timeout so SSE frames flush in real time. Packaged as a multi-package uv repository, deployed via Artifact Registry.
  • Declarative control plane (sibling project) — Modelled Explore’s configuration as Grafana App Platform resources with grafana-app-sdk: ModelRoute and Corpus kinds in CUE, code-generated into a Kubernetes-style apiserver, with a minimal reconciler that fills an index status. Built as a faithful model and a reproduction harness for two open upstream grafana-app-sdk issues.
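
A minimal sketch of the two-stage retrieval described above, under stated assumptions: plain Python lists stand in for the NumPy index, `rerank` is a hypothetical stand-in for the Gemini Flash reranking call, and the names `cosine`, `top_k` and `retrieve` are illustrative rather than taken from the project. It shows cosine top-k followed by a rerank step that falls back to similarity order when the timeout expires.

```python
import asyncio
import math
from typing import Awaitable, Callable, Sequence

Vector = Sequence[float]
Reranker = Callable[[list[int]], Awaitable[list[int]]]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def top_k(index: Sequence[Vector], query: Vector, k: int) -> list[int]:
    """Stage 1: row numbers of the k vectors most similar to the query."""
    order = sorted(
        range(len(index)),
        key=lambda i: cosine(index[i], query),
        reverse=True,
    )
    return order[:k]


async def retrieve(
    index: Sequence[Vector],
    query: Vector,
    rerank: Reranker,  # hypothetical: an async call that reorders candidates
    k: int = 20,
    keep: int = 10,
    timeout_s: float = 15.0,
) -> list[int]:
    """Stage 2: rerank the candidates, keeping similarity order on timeout."""
    candidates = top_k(index, query, k)
    try:
        ranked = await asyncio.wait_for(rerank(candidates), timeout=timeout_s)
    except asyncio.TimeoutError:
        ranked = candidates  # fall back to stage-1 similarity order
    return ranked[:keep]
```

The defaults mirror the numbers in the bullets (top 20 candidates, best 10 kept, 15 s timeout). At the stated scale the index itself is small: 6,004 vectors × 1408 dimensions × 4 bytes per float32 is roughly 34 MB.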
Stockholm, Sweden
Software Engineer Epidemic Sound Apr 2024—present
Enabling development teams to build backend services safely and at high velocity
Main technologies: Go, Python, Keycloak, OAuth2 / OIDC, Fastly CDN, GCP
key achievements
  • API Platform & agentic readiness — Building the API platform’s client generation pipeline: automated TypeScript and Go client libraries from OpenAPI specs, published to internal registries, with agentic metadata embedded so AI agents like Claude Code, Cursor and Gemini Enterprise can consume services without external documentation. Currently driving adoption across the engineering organisation.
  • GitHub Access Control — Identified compliance risks in the company’s GitHub organisation setup. Ran the threat analysis, prepared stakeholder slides, coordinated with Security and Internal IT. Result: 35% fewer super-admins, all non-human owners removed, admin access made traceable. Directly supported IPO readiness.
  • Email verification simplification — Mapped the entire flow across Keycloak, multiple services, mobile, and CRM. Found the inconsistencies that had been blocking the legacy user management service migration for months. Designed a cleaner flow where Keycloak fully owns verification, unblocking the migration and making verification reliable for conversion flows.
  • Unified logout & session invalidation — Investigated inconsistencies across web, mobile, plugins, and frontend. Designed Keycloak backchannel logout to close a real security gap. Scoped and planned as part of the External Publishing project.
  • MCP integration authentication — Designed auth infrastructure enabling MCP integration with the platform, including session invalidation mechanisms to control agentic users and ensure secure token lifecycle management.
  • Enterprise SSO self-service — Implemented the supporting flows, including domain verification and session handling, so customers can onboard themselves with less manual operational overhead. Unlocked the small and medium business market, previously blocked by manual onboarding and projected to bring in more than 10,000 new customers in the first year.
  • Observability pilot — Part of the pilot team connecting OpenTelemetry metrics and tracing with Grafana’s Loki and Tempo; helped shape the rollout for the broader organisation.
Amsterdam, Netherlands
Software Engineer Amazon Web Services Sep 2022—Dec 2023
Cloud solution for managed development environments
Main technologies: AWS, Go, CDK, Lambda, Serverless, Docker, JavaScript
key achievements
  • Salesforce Code Builder — Designed, scoped, and implemented the Docker image reservation feature enabling Salesforce to ship Code Builder on schedule. Also introduced an activity-tracking feature that idled inactive environments, cutting unnecessary runtime and reducing customer costs.
  • Pipeline modernisation — Reworked release pipelines to incorporate automated testing, reducing vulnerabilities and ensuring stable deployments for AWS CloudShell, AWS CodeCatalyst, and downstream teams.
  • Cross-team coordination — Proactively collaborated with 8 independent teams, submitting PRs to prevent outages and ensure seamless integration of internal changes.
  • Billing launch blocker — Led resolution of a multi-team blocking issue through deep investigation and extensive communication. A critical billing module was delivered on time for AWS re:Invent 2022.
Stockholm, Sweden
Software Engineer PingPong AB May 2019—Jul 2022
Two products: Baloo Learning (internal division, sold Mar 2021 — now Skillhabit) and the PING PONG LMS for schools and public authorities
Main technologies: Kotlin, Java, JSP, Vue.js, Vuetify, PWA, PostgreSQL, Dropwizard, Kubernetes, AWS, MongoDB, Prometheus, Grafana
key achievements
  • 15% AWS cost reduction (Baloo) — Through infrastructure optimisation, better caching, and tuned event handling. Also increased event processing throughput.
  • Zero-downtime ownership migration (Baloo) — Performed technical due diligence and managed the platform’s transition to a new owner in March 2021 without any customer-facing downtime.
  • Organised and configured activity and availability monitors and alerts with Prometheus and Grafana (Baloo), making the team more confident in their operations.
  • Designed and implemented features that improved system compatibility with other LMS products and opened the public procurement market (Baloo): Microsoft Graph AD, BankID, and Microsoft Teams integrations.
  • Started a data analysis project using AI practices to help content creators build more attractive courses (Baloo).
  • 100× performance improvement (PingPong) — Found and resolved a critical backend bottleneck affecting users with exceptionally heavy loads. Response time dropped from seconds to milliseconds.
  • Security hardening (PingPong) — Initiated a full vulnerability audit, identified and fixed dozens of XSS vulnerabilities and other security concerns.
  • Accessibility (PingPong) — Championed web accessibility improvements. Collaborated with Axess Labs for external testing and drove follow-up remediation.
  • Implemented new UI solutions including interactive charts and a Vuetify-based interface to qualify the product for public procurement (PingPong). Updated the outdated Android app to a Trusted Web Activity and published it on the Play Store, restoring mobile access for thousands of users.
Stockholm, Sweden
Software Engineer Tradedoubler Dec 2017—May 2019
High load tracking system for affiliate advertisements
Main technologies: Java, Dropwizard, Docker, Kubernetes, GKE, Prometheus
key achievements
  • Modularised the system by containerising components with Docker — shortened the development cycle from several days to several minutes.
  • Introduced Kubernetes orchestration and Prometheus metrics with alerts for monitoring.
  • Transferred a large portion of the product to GKE, cutting operational costs severalfold.
Saint Petersburg, Russia; Stockholm, Sweden
Software Engineer Positive Technologies Jun 2015—Dec 2017
Source code analysis tool for detecting Java web application vulnerabilities — PT Application Inspector
Main technologies: Java, static analysis, compilers, code protection, web application security
key achievements
  • Designed and implemented Java 8 language level support, significantly expanding the product’s addressable market.
  • Contributed two merged upstream fixes to JavaParser, the widely-used open-source Java AST library, while building the analyzer’s Java parsing — stopping a visitor from mutating an unmodifiable collection (#315) and correcting traversal to descend into array initialisers (#322).
  • Added support for major enterprise web technologies: JavaServer Faces, Apache Struts, JAX-RS, JSP tag libraries.
  • Implemented code protection mechanisms — both hardware dongle-based and software-based — to prevent licence circumvention.
Saint Petersburg, Russia
Lead Software Engineer SMTDP Dec 2014—May 2015
Machine vision web service for detecting document manipulations
Main technologies: Java, image processing, REST
details
  • Designed and implemented a web service that inspects documents for modifications made with any graphics editor.
  • Led a small team as technical lead. The startup ran out of runway, but the experience of owning a product end-to-end under real constraints stayed with me.
Saint Petersburg, Russia
Software Engineer Oracle Jun 2012—Nov 2014
Java ME Embedded runtime development and porting to different hardware boards
Main technologies: Java, C, embedded systems, Raspberry Pi, RTX
details
  • Significantly decreased runtime memory footprint by introducing a new application management system.
  • Implemented a Java ME Embedded properties tool for managing VM and runtime properties from both the device and an external host.
  • Implemented mbed64 board auto-detection for the Java ME Embedded SDK.
  • Ported the DIO API to RTX and Raspberry Pi boards. Improved the Inter-MIDlet Communication protocol.
Saint Petersburg, Russia
Software Engineer EMC Jul 2010—Jun 2012
Automated testing for RecoverPoint — continuous replication across Symmetrix and CLARiiON disk arrays
Main technologies: Java, VMware, enterprise storage systems, replication
details
  • Developed and maintained the automated testing system for RecoverPoint. Worked across disk arrays (Symmetrix, CLARiiON), VMware virtualisation, network connectivity, and the replication system itself.

Education


Mathematician & Teacher — Saint Petersburg State University
Mathematics and Mechanics Department — Specialist degree (MSc equivalent); thesis on Lutz filtration in local fields. Also published in number theory
2005—2010
Software Engineering — Academy of Modern Software Development
Java, algorithms, data structures, JVM internals — taught by engineers who went on to shape the wider software industry
2008—2009
Philosophy of Technology and Design — University of Twente
2021

Other experience