Tools
The stack, infrastructure, and daily tools that enable my workflows.
Providers
Anthropic for primary work — Claude is the model behind every project on this site. Google's Gemini for multi-modal tasks and long-context analysis. OpenAI for ecosystem integrations. Groq when latency matters. OpenRouter to evaluate models I haven't committed to yet, and Together for hosted open weights at scale.
-
Anthropic
Claude API & Claude Code
Primary AI provider. Claude powers multi-agent systems, RAG pipelines, code generation, and agentic development workflows. The context window and instruction-following quality make it the right choice end to end.
-
Google
Gemini API & AI Studio
Gemini models for multi-modal tasks, long-context analysis, and cross-validation in multi-agent workflows. AI Studio for rapid prototyping.
-
OpenAI
GPT & Embeddings
GPT-4 for specific use cases, embedding models for vector search pipelines. The ecosystem integration is mature.
-
Groq
LPU Inference
Ultra-low-latency inference for speed-sensitive workflows. LPU hardware delivers sub-second responses on open-weight models.
-
Nvidia
NIM & GPU Infrastructure
NIM microservices and GPU-accelerated inference. The hardware substrate underneath most of the AI stack.
-
OpenRouter
Model Router
Unified API across dozens of model providers. Route to the best model for each task without managing individual API keys.
-
DeepSeek
Open-Weight Reasoning
Where Claude and GPT-4 strain the budget, DeepSeek closes the gap for reasoning-heavy batch jobs — code synthesis, long-form analysis, research summarization. Downloadable weights for on-prem deployment.
-
Mistral
European AI Provider
The option when clients need EU data residency for inference. Strong multilingual performance at lower cost than US providers, and the weights are downloadable when compliance demands it.
-
Cohere
Embeddings & Retrieval
The Rerank API specifically — when a RAG pipeline's first retrieval pass returns 50 candidates and I need to reorder them by true relevance before passing to the LLM. Often replaces a second embedding pass.
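The rerank-then-truncate step is simple to sketch. This is a toy illustration, not the Cohere API: a naive lexical-overlap scorer stands in for the Rerank model, and the function name and signature are hypothetical.

```python
# Sketch of rerank-then-truncate. The toy lexical-overlap scorer below
# is a stand-in for Cohere's Rerank model, which scores true relevance.

def rerank(query: str, candidates: list[str], top_n: int = 5) -> list[str]:
    """Reorder first-pass retrieval candidates by relevance to the query."""
    q_terms = set(query.lower().split())

    def score(doc: str) -> float:
        # Toy relevance: fraction of query terms that appear in the doc.
        d_terms = set(doc.lower().split())
        return len(q_terms & d_terms) / len(q_terms)

    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[:top_n]  # only the top_n survivors reach the LLM prompt
```

The shape is the point: 50 candidates in, a handful out, and the LLM only ever sees the survivors.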
-
Together AI
Open-Weight Hosting
Where I run Llama and Mixtral when a project needs open-weight models without standing up GPU infrastructure. Fine-tuning workflows and dedicated endpoints when the inference volume justifies them.
AI Frameworks & Local Inference
The composition layer between models and applications. LangChain and LangGraph for orchestration; LlamaIndex when retrieval is the work. CrewAI for role-based multi-agent workflows. DSPy when prompts deserve to be programs. Local inference via Ollama, LM Studio, and vLLM — the runtime layer underneath UPS hardware builds.
-
LangChain
LLM Application Framework
Composable building blocks for LLM applications — chains, retrievers, agents, memory. The standard framework for wiring models into production pipelines.
-
LangGraph
Agent Orchestration
Stateful, multi-agent orchestration built on LangChain. Graph-based control flow for complex agent workflows with human-in-the-loop and persistence.
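Graph-based control flow reduces to a small idea: nodes transform state, edges pick the next node from that state. A minimal sketch of the pattern, with hypothetical node names — this is the concept, not the LangGraph API.

```python
# Toy state-graph runner: nodes are state -> state functions, edges are
# routing functions that inspect the updated state. Not the LangGraph API.
END = "__end__"

def run_graph(nodes, edges, state, entry):
    """Step through nodes until an edge routes to END."""
    current = entry
    while current != END:
        state = nodes[current](state)
        current = edges[current](state)  # conditional routing on new state
    return state

nodes = {
    "draft":  lambda s: {**s, "text": s["text"] + " draft"},
    "review": lambda s: {**s, "approved": len(s["text"]) > 5},
}
edges = {
    "draft":  lambda s: "review",
    "review": lambda s: END if s["approved"] else "draft",
}
```

The review edge looping back to draft is the same shape as a human-in-the-loop or retry cycle in a real agent graph.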
-
LlamaIndex
RAG Framework
The standard framework for building retrieval-augmented generation pipelines. Data connectors, indexing strategies, and query engines purpose-built for connecting LLMs to data.
-
CrewAI
Multi-Agent Framework
Role-based multi-agent orchestration. Define agents with specific roles, goals, and tools — then let them collaborate on complex tasks autonomously.
-
DSPy
Programmatic Prompting
Stanford's framework for programming — not prompting — language models. Compiles declarative modules into optimized prompt chains. The academic frontier of LLM application design.
-
Vercel AI SDK
Streaming & Tool Use
TypeScript SDK for building AI applications with streaming responses, tool calling, and multi-provider support. The abstraction layer for AI-powered UIs.
-
Weights & Biases
Experiment Tracking
MLOps platform for tracking experiments, versioning models, and visualizing training runs. The standard for reproducible machine learning development.
-
MLflow
ML Lifecycle
Open-source platform for managing the ML lifecycle — experiment tracking, model registry, and deployment. Enterprise MLOps infrastructure.
-
Ollama
Local Model Runtime
Run open-weight models locally with a single command. The runtime layer for UPS hardware builds — persistent local inference on commodity hardware.
-
LM Studio
Local Model GUI
Desktop application for discovering, downloading, and running local LLMs. Model evaluation and testing without cloud dependencies.
-
vLLM
Production Inference Server
High-throughput, memory-efficient inference engine for LLMs. PagedAttention for serving models at scale. The production backend for self-hosted model deployments.
Services
Specialized AI services and operational infrastructure. Midjourney and PixelLab for generative imagery (the Limner pipeline). Elicit for academic research synthesis. ElevenLabs for voice. Hugging Face as the default first stop for any new model evaluation. Perplexity replaces the Google-then-read-ten-tabs research workflow.
-
Midjourney
Image Generation
Concept exploration and initial asset generation in the Limner pipeline. The aesthetic control and style consistency are unmatched for creative direction.
-
PixelLab
Pixel Art Generation
Specialized pixel art generation integrated into the Limner: Pixel pipeline. Purpose-built for the medium rather than adapted from photorealistic models.
-
Recraft
Design-Grade Generation
Vector and raster generation with strong design sensibility. Useful for UI assets, icons, and illustrations that need to feel designed rather than generated.
-
Elicit
Research Assistant
AI-powered research tool for finding, analyzing, and synthesizing academic papers. Used extensively during the AI Embassy thesis research.
-
ElevenLabs
Voice & Audio AI
Voice synthesis and audio generation. The quality ceiling for text-to-speech and voice cloning in production applications.
-
Hugging Face
Model Hub & Inference
Model discovery, hosted inference, and the open-source ML ecosystem. The default starting point for evaluating new models and architectures.
-
Perplexity
AI Search
Citation-backed AI search for research and fact-checking. Replaces the Google-then-read-ten-tabs workflow with direct answers and sources.
-
Brave
Privacy-First Browser & Search
Privacy-respecting browser and search API. The Search API powers tool-use pipelines where you need web results without tracking overhead.
-
Stripe
Payments
Payment processing for any project that needs to take money. Webhook-driven event handling, Customer Portal for self-service subscriptions, and Stripe CLI for local testing. The default revenue layer for client work.
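Webhook-driven handling stands or falls on signature verification. A stdlib sketch of the check Stripe's SDK performs for you (`t=`/`v1=` header, HMAC-SHA256 over `timestamp.payload`) — in practice you'd call the official library rather than roll this.

```python
import hashlib
import hmac
import time

def verify_stripe_signature(payload: bytes, sig_header: str,
                            secret: str, tolerance: int = 300) -> bool:
    """Check a Stripe-Signature header against the raw request body."""
    parts = dict(p.split("=", 1) for p in sig_header.split(","))
    timestamp, received = parts["t"], parts["v1"]
    # Reject stale events to limit replay attacks.
    if abs(time.time() - int(timestamp)) > tolerance:
        return False
    signed_payload = f"{timestamp}.".encode() + payload
    expected = hmac.new(secret.encode(), signed_payload,
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received)
```

`compare_digest` matters: a plain `==` leaks timing information about how many leading characters matched.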
-
Resend
Transactional Email
Transactional email for projects where SendGrid is overkill and SMTP is underkill. React Email templates, simple API, predictable pricing. The default email provider for any new build.
-
Zapier
Workflow Automation
When an Asana task needs to fan out to Slack, calendar, and a doc at the same time. The glue layer for personal and client ops that don't justify writing a Worker.
-
n8n
Open-Source Automation
When Zapier would cost more than a small VPS, or when the workflow handles data that shouldn't leave my infrastructure. Self-hosted, with JavaScript code nodes for when no-code isn't enough.
Infrastructure & DevOps
The substrate underneath everything. AWS for enterprise-scale work; Cloudflare's serverless edge for everything else. Docker and Kubernetes when the deployment target requires them. Terraform for declarative infrastructure. Tailscale to make distributed dev environments behave like one network.
-
AWS
Cloud Platform
S3, Lambda, Bedrock, SageMaker — the enterprise cloud ecosystem. Certified AI Practitioner with hands-on experience across compute, storage, and AI services.
-
Docker
Containerization
Container runtime for reproducible builds and deployments. The standard abstraction layer between development environments and production infrastructure.
-
Kubernetes
Container Orchestration
Container orchestration for production workloads. Deployment, scaling, and management of containerized applications across clusters.
-
Terraform
Infrastructure as Code
Declarative infrastructure management for Cloudflare resources. Version-controlled, reviewable infrastructure changes through HCL configurations.
-
GitHub Actions
CI/CD Pipelines
Automated build, test, and deployment workflows. Powers the CI/CD pipeline from push to Cloudflare Pages deployment.
-
Airflow
Workflow Orchestration
Programmatic workflow scheduling and monitoring. DAG-based pipeline orchestration for data processing, ML training, and batch inference jobs.
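The core of DAG-based orchestration is topological ordering: no task runs before its upstream dependencies. Python's stdlib can sketch the scheduling half directly; the pipeline names here are hypothetical.

```python
from graphlib import TopologicalSorter

# Map each task to its upstream dependencies.
# Hypothetical pipeline: extract -> transform -> {train, report}.
dag = {
    "transform": {"extract"},
    "train":     {"transform"},
    "report":    {"transform"},
}

def run_order(dag: dict[str, set[str]]) -> list[str]:
    """Resolve a valid execution order; raises CycleError on cycles."""
    return list(TopologicalSorter(dag).static_order())
```

Airflow adds scheduling, retries, and monitoring on top, but the dependency resolution is exactly this.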
-
Vercel
Frontend & Serverless
Deployment target for Next.js and SvelteKit projects outside the Cloudflare stack. AI SDK integrations, preview deployments, and edge functions.
-
Tailscale
Mesh VPN
Zero-config mesh networking across devices. Makes the Embassy PC, development machines, and cloud services behave like they're on the same LAN.
-
Wrangler
Cloudflare CLI
Cloudflare's command-line tool for developing and deploying Workers, Pages, D1, R2, and KV. The single entry point for the entire Cloudflare developer platform.
-
Git
Version Control
Distributed version control — branching, rebasing, cherry-picking, bisecting. The foundation underneath GitHub, not a synonym for it.
-
pnpm
Package Manager
Fast, disk-efficient package manager. Content-addressable storage and strict dependency resolution. The default for all monorepo and workspace projects.
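Content-addressable storage is why pnpm is disk-efficient: identical package bytes are stored once, keyed by hash, and every project just links to them. A toy in-memory sketch of the idea (pnpm's real store is hard links on disk):

```python
import hashlib

class ContentStore:
    """Toy content-addressable store: same bytes stored once, linked many times."""
    def __init__(self):
        self.blobs: dict[str, bytes] = {}  # hash -> content (stored once)
        self.links: dict[str, str] = {}    # project path -> hash (cheap link)

    def add(self, path: str, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # dedupe across projects
        self.links[path] = digest
        return digest
```

Ten workspaces depending on the same package cost one copy of the bytes plus ten cheap links.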
-
LaunchDarkly
Feature Flags
Feature flag management for progressive rollouts, A/B testing, and kill switches. Ship continuously without shipping risk.
Data & Observability
Storage, retrieval, and the systems that make production AI legible. Qdrant powers the Permanent Record RAG pipeline. PostgreSQL when D1's SQLite isn't enough. Redis for sub-millisecond data access. BigQuery for warehouse-scale analysis.
-
Qdrant
Vector Database
High-performance vector search with payload filtering and named vectors. Rust-based engine with clean HTTP and gRPC APIs. Powers the Permanent Record RAG pipeline.
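Filter-then-rank is the shape of a Qdrant query: narrow by payload, then order the survivors by vector similarity. A pure-Python sketch of that flow — the point structure loosely mirrors Qdrant's id/vector/payload shape, but this is a conceptual stand-in, not the client API.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def search(points, query_vec, payload_filter=None, limit=3):
    """Filter by payload first, then rank survivors by cosine similarity."""
    hits = [p for p in points
            if payload_filter is None
            or all(p["payload"].get(k) == v for k, v in payload_filter.items())]
    hits.sort(key=lambda p: cosine(p["vector"], query_vec), reverse=True)
    return hits[:limit]
```

The real engine does this over millions of points with HNSW indexes instead of a linear scan, but the query semantics are the same.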
-
PostgreSQL
Relational Database
The industry-standard relational database. Full SQL, JSON support, extensions ecosystem. The foundation when projects need more than D1 SQLite at the edge.
-
Redis
In-Memory Data Store
High-performance caching, pub/sub messaging, and session storage. The speed layer for applications that need sub-millisecond data access.
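The workhorse pattern here is a TTL cache: values expire, and stale reads miss. A minimal sketch of per-key expiry with lazy eviction on read, the same semantics as Redis `SETEX`/`GET` (the clock parameter exists only to make the sketch testable).

```python
import time

class TTLCache:
    """Minimal cache with per-key expiry, evicted lazily on read."""
    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def set(self, key, value, ttl: float):
        self._store[key] = (value, self._clock() + ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires = entry
        if self._clock() >= expires:
            del self._store[key]  # expired: drop on read, like Redis
            return default
        return value
```

Redis adds the parts that matter in production — shared across processes, sub-millisecond over the network, active expiry — but the contract is this.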
-
BigQuery
Data Warehouse
Serverless data warehouse for analytics at scale. SQL-based analysis over petabyte-scale datasets without infrastructure management.
-
Databricks
Lakehouse Platform
Unified analytics and AI platform. Delta Lake, MLflow integration, and collaborative notebooks for data engineering and machine learning at scale.
-
dbt
Data Transformation
SQL-based data transformation framework. Version-controlled, tested, documented transformations that turn raw data into analytics-ready models.
-
Prometheus
Metrics & Monitoring
Pull-based metrics collection and time-series database. The standard data source for Grafana dashboards and alerting rules.
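Pull-based means every service just exposes its current numbers as text and Prometheus comes to read them. A sketch of rendering one counter in the text exposition format (the metric name and labels are illustrative):

```python
def render_counter(name: str, help_text: str, samples) -> str:
    """Emit one counter in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines)
```

Serve that string at `/metrics` and any Prometheus server can scrape it — no client-side push, no agent.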
-
Grafana
Observability Platform
Dashboards and visualization for metrics, logs, and traces. The single pane of glass for monitoring deployed infrastructure and AI pipeline health.
-
Datadog
APM & Monitoring
Enterprise application performance monitoring. Traces, metrics, logs, and real user monitoring in a unified platform. The enterprise counterpart to Grafana.
-
Loki
Log Aggregation
Horizontally scalable log aggregation designed for Grafana. Label-based indexing makes it cost-effective for high-volume log streams.
-
Vector
Data Pipeline
High-performance observability data router. Collects, transforms, and routes logs and metrics between services with minimal resource overhead.
-
Sentry
Error Tracking
Real-time error tracking and performance monitoring. Stack traces, breadcrumbs, and release tracking for production applications.
-
Tableau
Data Visualization
Interactive data visualization for business intelligence. Drag-and-drop dashboards that make complex data accessible to non-technical stakeholders.
-
Looker
BI & Analytics
Business intelligence platform with LookML modeling layer. Semantic data models that ensure consistent metrics across the organization.
Evaluation & Benchmarking
The discipline of knowing whether your AI actually works. Promptfoo for prompt regression testing. Inspect AI for structured evaluation harnesses. OpenAI Evals as the canonical reference. LangSmith when LangChain is in the loop.
-
Promptfoo
Prompt Regression Testing
Snapshot prompts, define assertions, catch regressions before they ship. Open source, runs in CI. The test framework LLM applications were missing.
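The mechanism is ordinary test assertions pointed at model output. A sketch of the idea in Python — promptfoo itself is configured in YAML and run from Node, so the assertion types here are modeled on its contains/regex checks, not its actual API:

```python
import re

def run_assertions(output: str, assertions: list[dict]) -> list[str]:
    """Return failure messages; an empty list means the prompt passed."""
    failures = []
    for a in assertions:
        if a["type"] == "contains" and a["value"] not in output:
            failures.append(f"missing substring: {a['value']!r}")
        elif a["type"] == "regex" and not re.search(a["value"], output):
            failures.append(f"no match for pattern: {a['value']!r}")
        elif a["type"] == "not-contains" and a["value"] in output:
            failures.append(f"forbidden substring: {a['value']!r}")
    return failures
```

Run the suite against stored outputs on every prompt change and a regression becomes a red CI check instead of a user report.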
-
Inspect AI
Evaluation Harness
UK AI Safety Institute's structured evaluation framework. Designed for capability, safety, and alignment testing with reproducible scoring and dataset versioning.
-
OpenAI Evals
Evaluation Reference
The canonical reference for LLM evaluation patterns. Templates, registries, and the evaluation vocabulary the rest of the field borrows from.
-
LangSmith
LLM Observability
Tracing, evaluation, and monitoring for LangChain applications. The observability layer when LangChain is in the loop and you need to see what each chain step is doing.
Memory & Persistence
The Memory Architecture layer — how agents accumulate state across sessions and substrates. pgvector when embeddings live alongside relational data. Letta (formerly MemGPT) for tiered agent memory. Mem0 for memory-as-a-service patterns. The substrate beneath the UPS Cartridge work.
-
pgvector
Postgres Vector Extension
Vector similarity search as a Postgres extension. The default when embeddings need to live alongside relational data without a separate vector database.
-
Letta
Tiered Agent Memory
Formerly MemGPT. Tiered memory system — working memory, long-term memory, archival storage — for agents that need to persist across sessions with structured recall.
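The tiering logic itself is small: working memory is bounded, overflow spills to a cheaper tier, and recall searches both. A toy sketch of that shape — class and method names are mine, not Letta's, and real archival storage would be a vector index, not a list:

```python
from collections import deque

class TieredMemory:
    """Toy Letta-style tiers: bounded working memory spills to an archive."""
    def __init__(self, working_capacity: int = 4):
        self.working: deque[str] = deque()
        self.capacity = working_capacity
        self.archive: list[str] = []

    def remember(self, fact: str):
        self.working.append(fact)
        while len(self.working) > self.capacity:
            self.archive.append(self.working.popleft())  # evict oldest

    def recall(self, term: str) -> list[str]:
        # Working memory is always in-context; the archive needs retrieval.
        return [f for f in list(self.working) + self.archive if term in f]
```

The capacity bound is the whole trick: it maps the working tier onto a fixed context-window budget while nothing is ever truly forgotten.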
-
Mem0
Memory-as-a-Service
Hosted persistent memory layer with API-first design. Drops into any agent application without standing up vector DB or retrieval infrastructure yourself.
-
Zep
Long-Term Context
Long-term memory and context management for LLM apps. Auto-extraction of facts and entities from conversation history — the memory graph under the conversation.
Projects
Adjacent projects in the personal-AI-on-personal-hardware design space — the same territory the UPS Cartridge work operates in. OpenClaw for action-taking, ZeroClaw for local-first privacy, OpenJarvis for on-device architecture. LobeHub as the chat substrate for multi-agent sessions.
-
OpenClaw
Agentic Personal Assistant
Agentic personal AI assistant that takes actions across platforms. Open-source, action-oriented architecture — the AI that actually does things, not just answers questions.
-
ZeroClaw
Local-First Private AI
Private AI assistant that runs 100% locally. Multi-platform (Telegram, Discord, WhatsApp) with no cloud dependency — your data never leaves your machine.
-
LobeHub
AI Chat Framework
Open-source chat framework powering multi-agent conversation interfaces. Used for the AI Peer Review collaborative synthesis environment.
-
OpenJarvis
On-Device Personal AI
Personal AI that runs on personal devices. Architecture for on-device assistants from Stanford's Scaling Intelligence Lab — privacy and latency both solved at the substrate.
Web
The full stack that ships jimvinson.com and client projects. TypeScript end-to-end. Astro for content, SvelteKit and React for interactive work. Tailwind v4 for design tokens. Drizzle + D1 for edge-native data. Better Auth when the project needs sessions.
-
Python
Language
The lingua franca of AI/ML development. Data pipelines, model training, scripting, and API services. Every AI project touches Python somewhere.
-
TypeScript
Language
The language underneath the entire stack. Type safety from schema validation to API boundaries to UI components. Every project is TypeScript-first.
-
Astro
Static Site Generator
Content-first framework that ships zero JS by default. Powers this site and most web projects. Island architecture for interactive components without the full bundle tax.
-
Next.js
React Framework
Full-stack React framework for projects in the Vercel ecosystem. Server components, API routes, and middleware for applications that need SSR or ISR.
-
Cloudflare
Edge Infrastructure
Workers, Pages, D1, R2, Vectorize, KV — the full edge stack. Single wrangler.toml, one billing dashboard, global deployment with sub-50ms cold starts.
-
Tailwind CSS v4
CSS Framework
v4's CSS-first config and oklch color system are significant upgrades. The @theme block in global.css replaces the entire config file.
-
Drizzle ORM
Database ORM
Lightweight, type-safe ORM that speaks D1 natively. SQL-first philosophy means the abstractions don't fight you.
-
Better Auth
Authentication
Type-safe auth library built for the modern TypeScript stack. Edge-native with session management and OAuth providers.
-
SvelteKit
Full-Stack Framework
Used for projects where the reactivity model matters. Compiles away the framework — ships minimal runtime JavaScript.
-
React 19
UI Islands
Astro island components for interactive UI. React 19's concurrent features and improved hydration make the island pattern practical.
-
Zod / Valibot
Schema Validation
Zod for general validation, Valibot for edge-optimized contexts. Both enforce type safety at runtime boundaries.
-
Playwright
E2E Testing
Cross-browser end-to-end testing and automation. Clean API that doubles as an agent tool — browser control, scraping, and UI validation in one framework.
-
Supabase
Backend as a Service
Open-source Firebase alternative. Postgres database, auth, storage, and real-time subscriptions with a generous free tier and full SQL access.
-
Clerk
Auth as a Service
Drop-in authentication with pre-built UI components. User management, organizations, and SSO without building auth from scratch.
-
Storybook
Component Development
Isolated component development and documentation. Build, test, and showcase UI components outside of the application context.
-
Chromatic
Visual Testing
Visual regression testing for UI components. Catches unintended visual changes in pull requests before they reach production.
-
Retool
Internal Tools
Low-code platform for building internal tools. Connect to any database or API and build admin panels, dashboards, and workflows in hours.
-
Serena
Semantic Code Navigation
AI-native code navigation that understands symbol relationships, not just text matches. Semantic search across codebases.
Desktop & Creative
The applications that run on the machine and the tools that shape the work. Cursor for AI-native coding, BBEdit as the text editor of last resort. Obsidian for PKM. Affinity for design when the situation demands real design tools. Raycast to tie the macOS surface together.
-
Cursor
AI Code Editor
AI-native code editor built on VS Code. Inline completions, multi-file edits, and codebase-aware chat. The IDE for AI-assisted development workflows.
-
VS Code
Code Editor
The universal editor. Extensions ecosystem, integrated terminal, and debugger. The baseline environment when Cursor or BBEdit aren't the right tool.
-
BBEdit
Text Editor
The reliable workhorse. Pattern matching, regex, multi-file grep, disk browser. Not glamorous, but it's been the right tool for 30 years.
-
Warp
Terminal
AI-native terminal with built-in command completion and workflow blocks. Session history and shareable runbooks for development workflows.
-
Figma
Design Collaboration
Collaborative interface design. Prototyping, design systems, and developer handoff. The shared surface where design and engineering meet.
-
Obsidian
Knowledge Management
Local-first markdown PKM. Project notes, AI research, architecture decisions. Plain markdown files are the real value — no lock-in.
-
Affinity Suite
Design & Publishing
Designer, Photo, and Publisher — the full creative suite. Professional-grade design tools without the subscription model. Used for presentations, diagrams, and visual assets.
-
Raycast
macOS Launcher
Extensible launcher replacing Spotlight. Script commands, clipboard history, window management, and AI chat — the productivity layer that ties everything together on macOS.
-
WebCatalog / Singlebox
Web App Container
Turns web apps into native desktop applications with isolated sessions. Keeps AI tools, dashboards, and services organized as discrete windows.
-
VLC
Media Player
Plays anything. The universal media player for reviewing audio/video assets, testing media pipelines, and verifying output formats.
Collaboration & Communication
Where work happens with other humans. Slack and Discord for real-time; Google Workspace for co-editing. Asana for task tracking at the project level, Linear for issue tracking at the engineering level. Notion and NotebookLM for durable docs and research synthesis.
-
Slack
Team Messaging
Primary channel for team communication, integrations, and automated notifications. Workflow Builder and API integrations tie into project pipelines.
-
Discord
Community & Voice
Community channels, voice collaboration, and bot integrations. Where technical communities live — from Midjourney to open-source projects.
-
Google Workspace
Productivity Suite
Docs, Sheets, Slides, Drive, Meet — the collaboration backbone. Real-time co-editing and deeply integrated search across all content.
-
NotebookLM
AI Research Notebook
Google's AI-powered research tool. Source-grounded conversation over uploaded documents. Strong for synthesizing multiple papers and reports.
-
GitHub
Version Control
Source of truth for all projects. Issues and PRs for project coordination. The collaboration layer on top of git.
-
Asana
Project Management
Task tracking and project coordination across workstreams. Portfolio-level views for managing multiple concurrent projects.
-
Linear
Issue Tracking
Modern issue tracking built for engineering velocity. Keyboard-first interface, cycles, and project views. The tool engineering teams actually want to use.
-
Atlassian Suite
Jira, Confluence, Bitbucket
Enterprise project management and documentation. Jira for structured workflows, Confluence for technical documentation, Bitbucket for enterprise git.
-
Miro
Whiteboarding
Collaborative whiteboard for diagramming, brainstorming, and visual planning. The default surface for architecture discussions and workshop facilitation.
-
Notion
Documentation & Wikis
Flexible workspace for documentation, databases, and project wikis. The block-based editor handles structured and unstructured content equally well.
-
Jupyter Notebook
Interactive Computing
Interactive notebooks for data analysis, model experimentation, and documentation. The standard environment for exploratory AI/ML work.