ML Researcher × LLM Systems Engineer

Research
that ships.

I'm Shawn Liu. I publish AI/ML research at venues like WACV, NeurIPS, and ISLPED, and I build LLM-powered systems with enough engineering rigor to be trusted in production.

CS · UC Irvine '26 · Incoming MSCS · Columbia · Open to ML / Research-Eng / SWE
Published · Cross-modal event-encoder attention · WACV 2026

Cross-modal attention learned by our WACV 2026 event-encoder, one of six peer-reviewed papers.

Published at WACV 2026 · NeurIPS 2025 · ISLPED 2026 · Frontiers in AI · IEEE TAI ×2 (under review)
6
Peer-reviewed papers across top venues
4×
Lower FHE bootstrapping overhead at >90% encrypted-inference accuracy
+19%
Zero-shot gain over prior SOTA · WACV 2026
65%
Fewer simulated casualties · U.S. Navy UAV landing model

01 Flagship · LLM systems

Loop

Interview prep, scheduled around your real life. Loop takes a career goal, your weekly availability, and progress signals, turns them into a validated study plan, and drafts it as a real week on your calendar. Nothing touches Google Calendar until you approve it, and every write is verified, with a rollback path if something looks off. I built it for my own daily use, and it's deployed live.

LLMs propose. Deterministic infrastructure disposes.

Four LLM nodes write the plans and the prose. Everything that can actually touch your calendar is deterministic, validated, and waits for your approval.

Propose · LLM, one isolated package
Strategist: syllabus with source-claim citations · Opus 4.8
Planner: structured task plan · Sonnet 5
Reflection · Explanation: prose only, never parsed · Sonnet 5

The LLM SDK can't even be imported outside this package. import-linter fails the build if you try.
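
As a rough illustration of that boundary, the sketch below uses Python's ast module to flag SDK imports outside one allowed package. It is a stand-in for the idea, not import-linter itself and not Loop's actual configuration. The module name anthropic and the package name loop.llm are assumptions made for the example.

```python
import ast

# Assumed names for illustration only: "anthropic" as the SDK module and
# "loop.llm" as the single package allowed to import it.
SDK_MODULE = "anthropic"
ALLOWED_PACKAGE = "loop.llm"


def imports_sdk(source: str) -> bool:
    """Return True if the source text imports the SDK module anywhere."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            names = [node.module or ""]
        else:
            continue
        if any(n == SDK_MODULE or n.startswith(SDK_MODULE + ".") for n in names):
            return True
    return False


def find_violations(modules: dict) -> list:
    """List module names outside the allowed package that import the SDK."""
    return sorted(
        name
        for name, source in modules.items()
        if not (name == ALLOWED_PACKAGE or name.startswith(ALLOWED_PACKAGE + "."))
        and imports_sdk(source)
    )


# Hypothetical module sources, keyed by dotted module name.
modules = {
    "loop.llm.client": "import anthropic\n",
    "loop.scheduler": "from anthropic import Anthropic\n",
    "loop.validation": "import json\n",
}
violations = find_violations(modules)
print(violations)
```

A build step could fail whenever the returned list is non-empty, which is the same effect a forbidden-import contract gives.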

Dispose · deterministic
Validation layer: five checks (schema, graph, coverage, user-fit, scheduling). Failures go back to the LLM as typed repairs, twice at most.
Greedy scheduler: draft-only · no write access
Human approval gate: nothing lands on the calendar without your say-so
Calendar Write Manager: the only code that writes. It rechecks the approval and payload hash, dry-runs, catches duplicates, verifies after writing, and offers rollback / retry / keep.
Google Calendar
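
The bounded repair loop in the validation layer can be sketched roughly as follows. This is a minimal sketch under assumptions, not Loop's code: the Repair type, the two toy checks, and the plan shape are all hypothetical, and only the "twice at most" limit comes from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAX_REPAIRS = 2  # "twice at most", per the description above


@dataclass(frozen=True)
class Repair:
    """A typed repair request: which check failed and what to fix."""
    check: str
    message: str


def validate(plan: dict) -> list:
    """Two toy checks standing in for the five named above."""
    repairs = []
    if not isinstance(plan.get("tasks"), list):
        repairs.append(Repair("schema", "plan.tasks must be a list"))
    elif sum(t.get("minutes", 0) for t in plan["tasks"]) > plan.get("budget_minutes", 0):
        repairs.append(Repair("user-fit", "tasks exceed the weekly time budget"))
    return repairs


def plan_with_repairs(propose: Callable[[list], dict]) -> Optional[dict]:
    """Ask for a plan, then send back typed repairs at most MAX_REPAIRS times."""
    repairs = []
    for _attempt in range(1 + MAX_REPAIRS):
        plan = propose(repairs)
        repairs = validate(plan)
        if not repairs:
            return plan
    return None  # still invalid after the repair budget; the caller must handle it


calls = []


def fake_llm(repairs):
    # Hypothetical stand-in for the planner node: over budget on the first
    # try, within budget once it has received a repair.
    calls.append(list(repairs))
    minutes = 90 if repairs else 600
    return {"tasks": [{"name": "graphs", "minutes": minutes}], "budget_minutes": 120}


result = plan_with_repairs(fake_llm)
print(result, len(calls))
```

The point of the hard cap is that the loop always terminates: a model that never produces a valid plan costs three calls and then returns control to deterministic code.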
2,691
Backend tests, plus 81 on the frontend. All green in CI
23
Written axioms + 8 ADRs governing every design decision
4
LLM nodes. Everything else is deterministic
$1.70
Expected monthly cost per user, worked out in a written cost axiom. Hard cap: $8
Done

Loop engineering

I closed the dead ends you actually feel in a tool like this. A failed calendar write now offers rollback, retry, or keep. A needed replan says so and offers recovery modes. Check-ins can be answered right in the app.
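
To make the write path concrete, here is a minimal sketch of an approval bound to a payload hash, with duplicate skipping, verification after writing, and a rollback helper. It is an illustration under assumptions: FakeCalendar is an in-memory stand-in rather than the Google Calendar API, the event fields are hypothetical, and the dry-run step is left out for brevity.

```python
import hashlib
import json
from dataclasses import dataclass, field


def payload_hash(events: list) -> str:
    """Stable SHA-256 over a canonical JSON encoding of the draft events."""
    canonical = json.dumps(events, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class FakeCalendar:
    """In-memory stand-in for a calendar; not the Google Calendar API."""
    events: dict = field(default_factory=dict)
    drop_writes: bool = False  # simulate a write that silently fails

    def insert(self, key: str, event: dict) -> None:
        if not self.drop_writes:
            self.events[key] = event


def write_week(calendar, events, approved_hash):
    """Return (status, written_keys): 'ok', 'rejected', or 'needs_decision'."""
    if payload_hash(events) != approved_hash:
        return "rejected", []  # payload changed after the user approved it
    written = []
    for event in events:
        key = event["id"]
        if key in calendar.events:
            continue  # duplicate: already on the calendar, so skip it
        calendar.insert(key, event)
        written.append(key)
    # Verify after writing: every event we tried to write must be readable.
    missing = [k for k in written if k not in calendar.events]
    if missing:
        return "needs_decision", written  # caller offers rollback / retry / keep
    return "ok", written


def rollback(calendar, written):
    """Remove only the events this write created."""
    for key in written:
        calendar.events.pop(key, None)


draft = [{"id": "t1", "start": "2026-07-06T09:00", "minutes": 60}]
approved = payload_hash(draft)

cal = FakeCalendar()
print(write_week(cal, draft, approved))                             # normal write
print(write_week(cal, draft + [{"id": "t2"}], approved))            # changed after approval
print(write_week(FakeCalendar(drop_writes=True), draft, approved))  # write did not land
```

Binding the approval to a hash means the thing written is byte-for-byte the thing approved; any edit after approval forces a fresh approval instead of slipping through.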

Done, one step left

Harness engineering

Timeouts, backoff, and a typed taxonomy for provider errors, plus a live-capture tool, a CI eval gate, and call-log readers. All shipped. The one thing left is recording the first real-prompt baseline.
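
A typed error taxonomy with backoff might look something like the sketch below. The error kinds, the status-code mapping, and the retry parameters are illustrative assumptions, not Loop's actual taxonomy or the provider's documented behavior.

```python
import random
import time
from enum import Enum


class ErrorKind(Enum):
    RATE_LIMITED = "rate_limited"  # retry after backing off
    OVERLOADED = "overloaded"      # retry after backing off
    TIMEOUT = "timeout"            # retry after backing off
    BAD_REQUEST = "bad_request"    # never retry: the request itself is wrong


RETRYABLE = {ErrorKind.RATE_LIMITED, ErrorKind.OVERLOADED, ErrorKind.TIMEOUT}


class ProviderError(Exception):
    def __init__(self, kind: ErrorKind, detail: str = ""):
        super().__init__(f"{kind.value}: {detail}")
        self.kind = kind


def classify(status_code: int) -> ErrorKind:
    """Map an HTTP status code to a kind. The mapping is illustrative."""
    if status_code == 429:
        return ErrorKind.RATE_LIMITED
    if status_code in (408, 504):
        return ErrorKind.TIMEOUT
    if status_code >= 500:
        return ErrorKind.OVERLOADED
    return ErrorKind.BAD_REQUEST


def call_with_backoff(call, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Retry retryable ProviderErrors with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except ProviderError as err:
            last_attempt = attempt == max_attempts - 1
            if err.kind not in RETRYABLE or last_attempt:
                raise
            sleep(random.uniform(0, base_delay * 2 ** attempt))


attempts = []


def flaky():
    # Hypothetical provider call: rate-limited twice, then succeeds.
    attempts.append(1)
    if len(attempts) < 3:
        raise ProviderError(classify(429))
    return "ok"


delays = []
result = call_with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, len(attempts), len(delays))
```

Passing sleep in as a parameter keeps the retry logic testable without waiting on real time.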

In progress

Prompt engineering

Few-shot exemplars, unified repair messages, and voice specs for the prose you'd actually read. All specified in detail, none of it shipped yet.

Partial

Context engineering

Prompt caching is live. Source-claim curation, reflection history, and replans that remember the previous plan are specified, not yet built.

The eval harness

Every LLM call lands in a SQLite call log with tokens, cost, and latency. A capture tool records real model outputs into committed recordings, and CI re-grades them deterministically: schema validity, repair recovery, plan-quality metrics, plus an offline LLM judge for the prose. Prompt and model changes ship with before/after deltas in the commit message. Live API calls never run in CI, and prompt bytes are version-pinned by hash, so an unmeasured prompt change fails the build.

Full honesty: the gate currently runs on fixture recordings, one of which deliberately fails so I know the harness actually catches failures. Recording the first real-prompt baseline is the next step.
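
Pinning prompt bytes by hash can be illustrated with a few lines of standard-library Python. This is a sketch of the general technique, assuming SHA-256 and a simple name-to-digest mapping; the prompt names and texts are made up for the example.

```python
import hashlib


def prompt_digest(prompt: str) -> str:
    """SHA-256 of the exact prompt bytes."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def check_pins(prompts: dict, pins: dict) -> list:
    """Return the names of prompts whose bytes no longer match their pin."""
    return sorted(
        name for name, text in prompts.items()
        if pins.get(name) != prompt_digest(text)
    )


# Hypothetical prompts; in practice the pins would be committed alongside
# the eval results measured for those exact bytes.
prompts = {
    "planner": "Return a structured task plan.",
    "strategist": "Draft a syllabus with citations.",
}
pins = {name: prompt_digest(text) for name, text in prompts.items()}

edited = dict(prompts, planner="Return a structured task plan, briefly.")
print(check_pins(prompts, pins))  # nothing drifted
print(check_pins(edited, pins))   # the edited prompt is flagged
```

A CI step that fails on a non-empty result forces every prompt edit to arrive together with an updated pin, and therefore with a fresh measurement.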

Python 3.11 · Pydantic v2 · FastAPI · SQLite + WAL · React + TS + Vite · Anthropic Messages API · Google Calendar OAuth · Fly.io

Privacy by design: Loop never stores raw calendar event titles or descriptions.

01 / Research

Published & peer-reviewed

I work on neurosymbolic AI, hyperdimensional computing, and secure ML inference (FHE/CKKS). I'm lead author on work under review at IEEE TAI, with papers at WACV, NeurIPS, and ISLPED.

Read the papers
02 / LLM systems

Engineered to be trusted

Loop, my LLM-powered scheduler, plans real weeks on a real calendar. Every write sits behind deterministic validation, a human approval gate, and a recordings-based eval harness. It's the same discipline behind a live e-commerce platform and a U.S. Navy CV collaboration.

See Loop
03 / Bio

Off the clock

I live with Coconut and Kumquat, listen to way too much D'Angelo, and spend my free time shooting hoops, snowboarding, or gaming. The Spotify feed is live (yes, it's mostly D'Angelo), and there's a wall of cat photos because I take way too many. Enjoy.

Meet the person

02 Selected research

Publications

Six peer-reviewed papers across neurosymbolic AI, hyperdimensional computing, and secure ML inference. The ones marked lead are where I'm first author. Open any of them for the full abstract, key results, figures, and exactly what I worked on.

Lead author IEEE TAI 2026 · under review

Brain-Inspired Reasoning under Homomorphic Encryption

A privacy-preserving neurosymbolic framework that runs inference entirely under CKKS-FHE while keeping HDC-based reasoning robust. It holds >90% accuracy on encrypted graph inference with a 4× reduction in bootstrapping overhead, thanks to noise-adaptive scheduling.

FHE · HDC · Neurosymbolic AI · Privacy-Preserving ML

Lead & corresponding author · rebuttal completed
End-to-end neuro-symbolic FHE pipeline with distributed bootstrapping and symbolic decoding
WACV 2026

Cross-Modal Event Encoder: Bridging Image–Text Knowledge to Event Streams

Extends CLIP's zero-shot power to event-based vision, aligning asynchronous event data with image–text space across five modalities (image, event, text, sound, depth), for a +19% zero-shot accuracy gain over prior event methods.

Event-based Vision · CLIP · Cross-Modality · Zero-Shot

Attention heatmap from the cross-modal event encoder attending to salient regions
NeurIPS 2025 · NeurReps

Geometric Priors for Generalizable World Models via VSA

Uses a Vector Symbolic Architecture to build generalizable world models with learned group structure, reaching 87.5% zero-shot accuracy and 4× noise robustness over an MLP baseline.

VSA · World Models · Generalization

OpenReview ↗
FHRR (VSA): Grid-like structured embeddings preserve spatial relationships
MLP: Unstructured embeddings with no clear geometric pattern
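
For readers new to FHRR, the sketch below shows the standard binding operation and why it gives translation-like structure: positions are encoded as unit phasors, and binding two encodings adds their phases. It uses fixed random phases and a one-dimensional position, both assumptions made for illustration; it is not the learned structure or the model from the paper.

```python
import cmath
import math
import random

DIM = 2048
rng = random.Random(0)
# Fixed random base phases: an assumption for this sketch, not learned values.
PHASES = [rng.uniform(-math.pi, math.pi) for _ in range(DIM)]


def encode(x: float) -> list:
    """Encode a scalar position as unit phasors exp(i * phase * x)."""
    return [cmath.exp(1j * p * x) for p in PHASES]


def bind(a, b):
    """FHRR binding: element-wise complex multiplication, so phases add."""
    return [u * v for u, v in zip(a, b)]


def similarity(a, b) -> float:
    """Mean real part of a * conj(b); 1.0 for identical phasor vectors."""
    return sum((u * v.conjugate()).real for u, v in zip(a, b)) / len(a)


moved = bind(encode(2.0), encode(3.0))  # position 2 shifted by 3
same = similarity(moved, encode(5.0))   # compared with position 5
other = similarity(moved, encode(9.0))  # compared with an unrelated position
print(round(same, 3), round(other, 3))
```

Binding the encoding of 2 with the encoding of 3 lands on the encoding of 5, while unrelated positions stay near zero similarity. That additive behavior is the kind of spatial regularity an unstructured embedding has no reason to show.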
Frontiers in AI

Optimal Hyperdimensional Representation for Learning & Cognitive Computation

The first universal HDC encoding that adapts between learning and cognition, reaching 95% learning accuracy with correlated encodings and 100% decoding under exclusive encodings.

HDC · Cognitive Computation · Neural-Symbolic

2026
IEEE TAI 2026 · under review

HyperEncrypt: Homomorphic Hyperdimensional Computing for Efficient & Secure Learning

Positions HDC as an alternative to encrypted deep learning: shallow, noise-resilient algebra that fits FHE, using up to an order-of-magnitude fewer bootstrapping ops at near-clean accuracy.

HDC · Kernel Methods · CKKS · Privacy-Preserving ML

2026
ISLPED 2026

Integrating Symbolic & Neural Mechanisms for Adversarially Robust HDC

Fuses Vision-Transformer features with classical texture/shape descriptors through HDC for graceful degradation under FGSM and Genetic attacks, with +17–26 pp recovery from partial adversarial retraining.

HDC · Neurosymbolic AI · Adversarial Robustness · ViT

2026

03 Engineering

Things I've shipped

Applied-ML systems up top: a U.S. Navy computer-vision collaboration, crash anticipation, and honest healthcare-ML evaluation. Below them, the production full-stack products real customers use every day. Real stacks, real outcomes.

Visuals restricted: CUI · U.S. Navy collaboration
Applied research · BiasLab @ UCI · U.S. Navy

Safe UAV Landing for the U.S. Navy

A custom pose-estimation + symbolic-reasoning system for autonomous UAV landing in adverse weather, replacing brittle fixed-pattern optical markers. The reasoning module holds the landing whenever crew or obstacles are detected on the deck.

Computer Vision · Pose Estimation · Symbolic Reasoning · PyTorch · CUI dataset
Safety layer that holds landings until the deck is clear, cutting casualty risk.
Computer vision · autonomous driving

Crash Anticipation

A VideoMAE model that predicts vehicle crashes before they happen, with real-time inference. Next: a harder evaluation, model compression for embedded deployment, and a neurosymbolic module that issues direct avoidance commands.

VideoMAE · PyTorch · Real-time · Neurosymbolic
Saturates the benchmark. Next up: a harder eval.
code ↗
Healthcare ML · manuscript in prep

Generalizable Arrhythmia Detection

Shows how beat-wise splits leak patient identity and inflate ECG-classification accuracy, then introduces an optimal patient-wise split search for honest, generalizable evaluation on MIT-BIH.

CNN · LSTM-AE · MIT-BIH · Patient-wise eval
Exposes leakage behind "SOTA-looking" numbers.
repo ↗
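
The leakage in the arrhythmia project is easy to see on synthetic data. The sketch below compares a beat-wise shuffle with a plain random patient-wise split; it assumes made-up patient IDs and does not implement the optimal split search from the manuscript.

```python
import random
from collections import defaultdict


def beat_wise_split(beats, test_fraction=0.25, seed=0):
    """Shuffle individual beats: the same patient can land on both sides."""
    shuffled = beats[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def patient_wise_split(beats, test_fraction=0.25, seed=0):
    """Shuffle patients, then assign every beat of a patient to one side."""
    by_patient = defaultdict(list)
    for beat in beats:
        by_patient[beat["patient"]].append(beat)
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)
    cut = int(len(patients) * (1 - test_fraction))
    train = [b for p in patients[:cut] for b in by_patient[p]]
    test = [b for p in patients[cut:] for b in by_patient[p]]
    return train, test


def shared_patients(train, test):
    """Patients that appear in both the training and the test set."""
    return {b["patient"] for b in train} & {b["patient"] for b in test}


# Synthetic records: 8 hypothetical patients with 50 beats each.
beats = [{"patient": f"p{i}", "beat": j} for i in range(8) for j in range(50)]

leaky = shared_patients(*beat_wise_split(beats))
clean = shared_patients(*patient_wise_split(beats))
print(len(leaky), len(clean))
```

With a beat-wise shuffle the test set contains beats from patients the model has already seen, so accuracy partly measures patient recognition. The patient-wise split has no shared patients by construction.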
Computational biology · ongoing

Structure-Aware Antimicrobial Peptide Prediction

An ML pipeline that combines biochemical descriptors with structure-aware features from ESMFold-predicted conformations, comparing SVM, MLP and graph neural networks across QSAR, geometric and residue-graph representations.

ESMFold · SVM / MLP / GNN · QSAR · BioIntelligence Lab
Ongoing, exploring the activity vs. hemolysis trade-off.
Crash Anticipation demo (six clips)
Shipped products · full-stack, real people use them
E-commerce · solo build ● Live

AdamsFoods Wholesale

Full-stack wholesale platform: React and Node/Express, signed-URL media on S3, JWT auth with role-guarded admin routes. Serving real customers today.

Nonprofit · CTC @ UCI

Feeding Pets of the Homeless

End-to-end donation-management platform for a national nonprofit: role-based access for coordinators, donors, and admins across regional chapters.

Internal tool

AdamsFoods Inventory

Back-office inventory system: item CRUD, search and filters, low-stock alerts, CSV export, and a 3D map of warehouse storage rooms.

code ↗

04 Stack

Tools of the trade

LLM Engineering

  • Anthropic API
  • Eval harnesses · CI gates
  • Structured outputs
  • Bounded repair loops
  • Prompt versioning
  • LLM observability

Languages

  • Python
  • TypeScript / JavaScript
  • C / C++
  • SQL

ML / AI

  • PyTorch
  • CLIP · ViT · VideoMAE
  • Hyperdimensional Computing
  • CKKS-FHE (SEAL)
  • ESMFold · GNNs

Full-Stack

  • React / TypeScript
  • Node / Express
  • FastAPI · Pydantic v2
  • PostgreSQL · SQLite
  • Firebase

Infra & Tools

  • AWS (S3)
  • Fly.io · Vercel
  • Git · CI
  • JWT Auth · OAuth
  • Linux · CUDA

05 News

Recent updates

Jul 2026
Loop is deployed live: an LLM-powered interview-prep scheduler with deterministic validation, human approval gates, and a recordings-based eval harness. I built it for my own daily use. [Case study]
May 22, 2026
Integrating Symbolic and Neural Mechanisms for Adversarially Robust Hyperdimensional Computing was accepted to ISLPED 2026.
Apr 5, 2026
Live Spotify Stats: feel free to stalk my recent listening history and see how our music tastes match up :)
Jan 19, 2026
Optimal Hyperdimensional Representation for Learning and Cognitive Computation: third author, accepted to Frontiers in Artificial Intelligence. [Paper]
Dec 15, 2025
Started a new position at BioIntelligence Lab with Dr. Haleh Alimohamadi, working on peptide research: building AMP vs. non-AMP classifiers from geometric features and QSAR descriptors, with structures predicted by ESMFold.
Nov 10, 2025
Cross-Modal Event Encoder: Bridging Image–Text Knowledge to Event Streams was accepted to WACV 2026.
Sep 23, 2025
Geometric Priors for Generalizable World Models via Vector Symbolic Architecture was accepted to the NeurReps workshop at NeurIPS 2025.

06 The person

Beyond the résumé

I'm an undergraduate researcher in Computer Science at UC Irvine ('26) and an incoming M.S. student at Columbia. I work on neurosymbolic AI, brain-inspired learning (HDC), multimodal models, and secure inference (CKKS-FHE), and I'm grateful to research with Prof. Mohsen Imani and Dr. Haleh Alimohamadi.

Current labs: BiasLab @ UCI · BioIntelligence Lab @ UCI
Hobbies: Basketball, snowboarding, music, gaming
On repeat: D'Angelo · Dijon · Mkgee
Playing: Cyberpunk 2077
Reading: Designing Machine Learning Systems by Chip Huyen

Favorite albums & live listening

D'Angelo - Voodoo · Dijon - Absolutely · Mkgee - Two Star & the Dream Police

07 Teaching & service

Teaching

Learning Assistant, ICS 33

Intermediate programming with Python · UC Irvine, Spring 2025.

Pro bono

Web dev for nonprofits

Built & maintained tooling for Feeding Pets of the Homeless, free of charge.

Service

Youth In Action counselor

Student counselor mentoring youth through the YIA program.

Contact

Let's build something that ships.