David Lee

Software engineer focused on large-scale distributed systems and machine learning.

Currently working on Google Cloud Pub/Sub. Previously worked on infrastructure at Amazon.

Experience

Google - Machine Learning Engineer

Present

Working broadly at the intersection of ML systems, agentic AI, and reinforcement learning

Google - Software Engineer

Jul 2025 - Present

Worked on Google Cloud Pub/Sub, a globally distributed, highly available messaging service handling ~4 TB/s of throughput
Worked on the AI inference SMT (Single Message Transform) end-to-end, including implementation, testing, metrics, and SLOs
Served on-call for production systems, debugging issues and supporting customers such as Anthropic

Amazon - Software Engineer Intern

May 2024 - Aug 2024

Engineered an AWS Serverless automation pipeline (CDK, Lambda, SQS) that reduced weekly on-call burden by 30% and eliminated a class of high-severity tickets across multiple service teams
Contributed to a Ruby on Rails and React frontend; designed a new frontend proof of concept (POC) with CI/CD
Delivered the core intern project one month early, leaving time to complete two stretch goals on infrastructure and frontend CI/CD improvements

Lawliet - Cofounder

2024

Built a multi-agent RAG system that automates data analysis, code execution, and database queries
Created a built-in Excel frontend for seamless user interaction
Selected for Y Combinator interview (top 5% of applicants)

UC Berkeley - BA Computer Science

May 2025

GPT from Scratch

Implemented a transformer language model from scratch using a modern LLaMA-style architecture with SwiGLU activation, multi-head self-attention, RoPE positional embeddings, and FlashAttention
Built a training pipeline with pretraining, supervised fine-tuning, and DPO-based preference optimization
Implemented multi-GPU training using PyTorch DDP and distributed data loading
Evaluated KL regularization, reward model design, and optimization stability in small-model preference learning

AlphaGo Chess

Implemented an AlphaZero-style reinforcement learning system with ResNet-based policy/value networks and MCTS using PUCT selection
Built a self-play data generation pipeline with asynchronous workers, experience replay, and temperature-based action sampling
Analyzed policy improvement dynamics, value bootstrapping behavior, and the interaction between neural priors and search depth

Anime Recommendation System

Built a multi-stage recommendation system with two-tower retrieval, ANN-based candidate generation, cross-encoder ranking, and diversity-aware reranking
Implemented bi-encoder embeddings trained with contrastive learning and HNSW-based approximate nearest neighbor search
Optimized for sub-100ms latency using model distillation, quantization, and batched inference under production-style serving constraints

Lightweight Pub/Sub

Built a distributed publish-subscribe system using Raft consensus for leader election and replicated log-based fault tolerance
Implemented configurable delivery semantics (at-least-once, exactly-once), message ordering guarantees, and dead-letter queues
Designed partition-based scaling, consumer group coordination, and offset management for high-throughput workloads

Educational Operating System (Pintos)

Implemented core OS subsystems including thread scheduling, synchronization primitives, virtual memory, and filesystem support
Designed and debugged preemptive scheduler behavior, race-condition avoidance, and priority inversion mitigation under concurrent workloads
Built process loading, user-kernel transitions, and page fault handling with careful memory allocation tradeoffs

Secure File Storage System

Built a secure, end-to-end encrypted file storage client in Go, applying cryptographic primitives for confidentiality, integrity, and access control
Designed authentication, file upload/download, efficient appending, secure sharing, and controlled revocation under adversarial threat assumptions
Reasoned about key management, data organization, and tamper-resistant storage across untrusted server APIs

Voice Language Learning Agent

Built a voice-based language learning agent focused on personalized, interactive feedback rather than static content delivery
Designed agent control flow, explicit state tracking, and response adaptation based on user proficiency and interaction history
Addressed reliability challenges including turn-taking errors, speech recognition failures, and conversational drift

AI Trading System

Implemented an automated trading system combining fundamental signal extraction with agent-driven decision logic and position management
Designed a clear separation between signal generation, execution engine, and risk controls to prevent feedback loops and overfitting
Analyzed failure modes including noisy data pipelines, delayed signals, regime shifts, and agent behavior leading to unstable outcomes