Work

Projects

Cross-backend differential testing CLI for LLM inference correctness: quantization sweeps, determinism verification, and serving-layer faithfulness across mlx-lm, vllm-mlx, and llama.cpp.

Python · MLX · vLLM · llama.cpp · +1

Active

Native macOS app that adds three-finger trackpad gestures for middle-click and middle-drag. Distributed via Homebrew and MacPorts.

Swift · macOS · MultitouchSupport · Accessibility API · +2

Complete

Confidence-based filter decoding system that routes LLM inference requests between small and large models to reduce compute costs while maintaining output quality. BayLearn 2024 poster at Apple.

Python · PyTorch · LLaMA · Speculative Decoding · +1

Projects

infer-check

MiddleDrag

CALID