Favorite Tools
The stack, infrastructure, and daily tools that enable my workflows.
Model Providers
Anthropic for primary work - Claude is the model behind every project on this site. Google's Gemini for multi-modal tasks and long-context analysis. OpenAI for ecosystem integrations. Groq when latency matters. OpenRouter to evaluate models I haven't committed to yet.
-
Anthropic
Claude API & Claude Code
Primary AI provider. Claude powers multi-agent systems, RAG pipelines, code generation, and agentic development workflows. The context window and instruction-following quality make it the right choice end to end.
-
Google
Gemini API & AI Studio
Gemini models for multi-modal tasks, long-context analysis, and cross-validation in multi-agent workflows. AI Studio for rapid prototyping.
-
OpenAI
GPT & Embeddings
GPT-4 for specific use cases, embedding models for vector search pipelines. The ecosystem integration is mature.
-
Groq
LPU Inference
Ultra-low-latency inference for speed-sensitive workflows. LPU hardware delivers sub-second responses on open-weight models.
-
Nvidia
NIM & GPU Infrastructure
NIM microservices and GPU-accelerated inference. The hardware substrate underneath most of the AI stack.
-
OpenRouter
Model Router
Unified API across dozens of model providers. Route to the best model for each task without managing individual API keys.
-
DeepSeek
Open-Weight Reasoning
Where Claude and GPT-4 strain the budget, DeepSeek closes the gap for reasoning-heavy batch jobs - code synthesis, long-form analysis, research summarization. Downloadable weights for on-prem deployment.
-
Mistral
European AI Provider
The option when clients need EU data residency for inference. Strong multilingual performance at lower cost than US providers, and the weights are downloadable when compliance demands it.
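The "route to the best model for each task" idea from the OpenRouter entry can be sketched as a small lookup with a fallback. This is only an illustration: the task names, the model identifiers, and the `pick_model` helper are hypothetical placeholders, not OpenRouter's API or real model names.

```python
from dataclasses import dataclass

# Hypothetical task-to-model table. The identifiers are placeholders,
# not real provider model names.
ROUTES = {
    "reasoning": "provider-a/large-model",
    "low_latency": "provider-b/fast-model",
    "multimodal": "provider-c/vision-model",
}
DEFAULT_MODEL = "provider-a/large-model"


@dataclass(frozen=True)
class Request:
    task: str
    prompt: str


def pick_model(request: Request) -> str:
    """Return the model id for a task, falling back to the default."""
    return ROUTES.get(request.task, DEFAULT_MODEL)


if __name__ == "__main__":
    print(pick_model(Request(task="low_latency", prompt="Summarize this log line.")))
```

Keeping the table as plain data means a new provider can be trialed by editing one entry rather than touching call sites.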
AI Frameworks & Local Inference
The composition layer between models and applications. LangChain and LangGraph for orchestration; LlamaIndex when retrieval is the work. CrewAI for role-based multi-agent workflows. DSPy when prompts deserve to be programs. Local inference via Ollama, LM Studio, and vLLM - the runtime layer for personal-AI experimentation on commodity hardware.
-
LangChain
LLM Application Framework
Composable building blocks for LLM applications - chains, retrievers, agents, memory. The standard framework for wiring models into production pipelines.
-
LangGraph
Agent Orchestration
Stateful, multi-agent orchestration built on LangChain. Graph-based control flow for complex agent workflows with human-in-the-loop and persistence.
-
LlamaIndex
RAG Framework
The standard framework for building retrieval-augmented generation pipelines. Data connectors, indexing strategies, and query engines purpose-built for connecting LLMs to data.
-
CrewAI
Multi-Agent Framework
Role-based multi-agent orchestration. Define agents with specific roles, goals, and tools - then let them collaborate on complex tasks autonomously.
-
Vercel AI SDK
Streaming & Tool Use
TypeScript SDK for building AI applications with streaming responses, tool calling, and multi-provider support. The abstraction layer for AI-powered UIs.
-
Weights & Biases
Experiment Tracking
MLOps platform for tracking experiments, versioning models, and visualizing training runs. The standard for reproducible machine learning development.
-
Ollama
Local Model Runtime
Run open-weight models locally with a single command. Persistent local inference on commodity hardware - the runtime layer for personal-AI experimentation.
-
LM Studio
Local Model GUI
Desktop application for discovering, downloading, and running local LLMs. Model evaluation and testing without cloud dependencies.
-
vLLM
Production Inference Server
High-throughput, memory-efficient inference engine for LLMs. PagedAttention for serving models at scale. The production backend for self-hosted model deployments.
-
Maestro
Agent Orchestration
AI agent orchestration platform for building and deploying multi-step agent workflows. Visual pipeline builder with production monitoring.
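Graph-based control flow, as described in the LangGraph entry, comes down to nodes that transform shared state and edges that choose the next node. The sketch below uses plain Python rather than LangGraph's own API. The node names, the `run` loop, and the approve-on-second-attempt rule are assumptions made up for illustration.

```python
from typing import Any, Callable, Dict, Optional

State = Dict[str, Any]


def draft(state: State) -> State:
    # Stand-in for a model call that writes a draft.
    attempts = state.get("attempts", 0) + 1
    return {**state, "draft": f"draft of {state['topic']}", "attempts": attempts}


def review(state: State) -> State:
    # Hypothetical rule: approve on the second attempt to exercise the loop.
    return {**state, "approved": state["attempts"] >= 2}


def route_after_review(state: State) -> Optional[str]:
    # Conditional edge: stop when approved, otherwise loop back to drafting.
    return None if state["approved"] else "draft"


NODES: Dict[str, Callable[[State], State]] = {"draft": draft, "review": review}
EDGES: Dict[str, Callable[[State], Optional[str]]] = {
    "draft": lambda state: "review",
    "review": route_after_review,
}


def run(start: str, state: State, max_steps: int = 10) -> State:
    """Walk the graph until an edge returns None or max_steps is reached."""
    current: Optional[str] = start
    for _ in range(max_steps):
        if current is None:
            break
        state = NODES[current](state)
        current = EDGES[current](state)
    return state


if __name__ == "__main__":
    print(run("draft", {"topic": "edge caching"}))
```

Because the whole state is an ordinary dictionary passed between steps, it could be saved between steps or paused for a human check, which is roughly what persistence and human-in-the-loop amount to.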
Services
Specialized AI services and operational infrastructure. Midjourney and PixelLab for generative imagery (the Limner pipeline). Elicit for academic research synthesis. ElevenLabs for voice. Hugging Face as the default first stop for any new model evaluation. Perplexity replaces the Google-then-read-ten-tabs research workflow.
-
Midjourney
Image Generation
Concept exploration and initial asset generation in the Limner pipeline. The aesthetic control and style consistency are unmatched for creative direction.
-
PixelLab
Pixel Art Generation
Specialized pixel art generation integrated into the Limner: Pixel pipeline. Purpose-built for the medium rather than adapted from photorealistic models.
-
ReCraft
Design-Grade Generation
Vector and raster generation with strong design sensibility. Useful for UI assets, icons, and illustrations that need to feel designed rather than generated.
-
Elicit
Research Assistant
AI-powered research tool for finding, analyzing, and synthesizing academic papers. Useful when a project needs literature review before a design decision.
-
ElevenLabs
Voice & Audio AI
Voice synthesis and audio generation. The quality ceiling for text-to-speech and voice cloning in production applications.
-
Hugging Face
Model Hub & Inference
Model discovery, hosted inference, and the open-source ML ecosystem. The default starting point for evaluating new models and architectures.
-
Perplexity
AI Search
Citation-backed AI search for research and fact-checking. Replaces the Google-then-read-ten-tabs workflow with direct answers and sources.
-
Brave
Privacy-First Browser & Search
Privacy-respecting browser and search API. The Search API powers tool-use pipelines where you need web results without tracking overhead.
Infrastructure & DevOps
The substrate underneath everything. AWS for enterprise-scale work; Cloudflare's serverless edge for everything else. Docker and Kubernetes when the deployment target requires them. Terraform for declarative infrastructure. Tailscale to make distributed dev environments behave like one network.
-
AWS
Cloud Platform
S3, Lambda, Bedrock, SageMaker - the enterprise cloud ecosystem. I'm a Certified AI Practitioner with hands-on experience across compute, storage, and AI services.
-
Docker
Containerization
Container runtime for reproducible builds and deployments. The standard abstraction layer between development environments and production infrastructure.
-
Kubernetes
Container Orchestration
Container orchestration for production workloads. Deployment, scaling, and management of containerized applications across clusters.
-
Terraform
Infrastructure as Code
Declarative infrastructure management for Cloudflare resources. Version-controlled, reviewable infrastructure changes through HCL configurations.
-
GitHub Actions
CI/CD Pipelines
Automated build, test, and deployment workflows. Powers the CI/CD pipeline from push to Cloudflare Pages deployment.
-
Vercel
Frontend & Serverless
Deployment target for Next.js and SvelteKit projects outside the Cloudflare stack. AI SDK integrations, preview deployments, and edge functions.
-
Tailscale
Mesh VPN
Zero-config mesh networking across devices. Makes the lab PC, development machines, and cloud services behave like they're on the same LAN.
-
Wrangler
Cloudflare CLI
Cloudflare's command-line tool for developing and deploying Workers, Pages, D1, R2, and KV. The single entry point for the entire Cloudflare developer platform.
Data & Observability
Storage, retrieval, and the systems that make production AI legible. PostgreSQL when D1's SQLite isn't enough. Redis for sub-millisecond data access. BigQuery for warehouse-scale analysis.
-
PostgreSQL
Relational Database
The industry-standard relational database. Full SQL, JSON support, and a deep extension ecosystem. The foundation when projects need more than D1's SQLite at the edge.
-
Redis
In-Memory Data Store
High-performance caching, pub/sub messaging, and session storage. The speed layer for applications that need sub-millisecond data access.
-
BigQuery
Data Warehouse
Serverless data warehouse for analytics at scale. SQL-based analysis over petabyte-scale datasets without infrastructure management.
-
Prometheus
Metrics & Monitoring
Pull-based metrics collection and time-series database. The standard data source for Grafana dashboards and alerting rules.
-
Grafana
Observability Platform
Dashboards and visualization for metrics, logs, and traces. The single pane of glass for monitoring deployed infrastructure and AI pipeline health.
-
Loki
Log Aggregation
Horizontally scalable log aggregation designed for Grafana. Label-based indexing makes it cost-effective for high-volume log streams.
-
Vector
Data Pipeline
High-performance observability data router. Collects, transforms, and routes logs and metrics between services with minimal resource overhead.
-
Sentry
Error Tracking
Real-time error tracking and performance monitoring. Stack traces, breadcrumbs, and release tracking for production applications.
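The label-based indexing mentioned in the Loki entry can be pictured as follows: only the label sets are indexed, and line contents are scanned after a stream has been selected. This is a toy in-memory model with made-up class and method names, not Loki's storage format or query language.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

LabelSet = FrozenSet[Tuple[str, str]]


class LabelIndexedLogs:
    """Toy log store: streams are keyed by their label set, lines are not indexed."""

    def __init__(self) -> None:
        self._streams: Dict[LabelSet, List[str]] = defaultdict(list)

    def push(self, labels: Dict[str, str], line: str) -> None:
        # Each distinct label set identifies one stream.
        self._streams[frozenset(labels.items())].append(line)

    def query(self, selector: Dict[str, str], contains: str = "") -> List[str]:
        # Step 1: pick streams whose labels include every selector pair.
        # Step 2: filter the lines of those streams by substring.
        wanted = set(selector.items())
        hits: List[str] = []
        for labels, lines in self._streams.items():
            if wanted <= labels:
                hits.extend(line for line in lines if contains in line)
        return hits


if __name__ == "__main__":
    logs = LabelIndexedLogs()
    logs.push({"app": "api", "env": "prod"}, "timeout calling upstream")
    logs.push({"app": "api", "env": "dev"}, "ok")
    logs.push({"app": "web", "env": "prod"}, "timeout rendering")
    print(logs.query({"env": "prod"}, contains="timeout"))
```

The index grows with the number of distinct label sets rather than with log volume, which is the intuition behind the cost claim.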
Evaluation & Benchmarking
The discipline of knowing whether your AI actually works. Promptfoo for prompt regression testing. Inspect AI for structured evaluation harnesses. OpenAI Evals as the canonical reference. LangSmith when LangChain is in the loop.
-
Promptfoo
Prompt Regression Testing
Snapshot prompts, define assertions, catch regressions before they ship. Open source, runs in CI. The test framework LLM applications were missing.
-
Inspect AI
Evaluation Harness
UK AI Safety Institute's structured evaluation framework. Designed for capability, safety, and alignment testing with reproducible scoring and dataset versioning.
-
OpenAI Evals
Evaluation Reference
The canonical reference for LLM evaluation patterns. Templates, registries, and the evaluation vocabulary the rest of the field borrows from.
-
LangSmith
LLM Observability
Tracing, evaluation, and monitoring for LangChain applications. The observability layer when LangChain is in the loop and you need to see what each chain step is doing.
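The assertion-based regression testing described in the Promptfoo entry follows a simple pattern: fix a prompt template, run it over named cases, and fail when an expected property of the output goes missing. The sketch below shows that pattern in plain Python. It is not Promptfoo's configuration format, and `fake_model`, `Case`, and `run_cases` are hypothetical stand-ins.

```python
from typing import Callable, Dict, List, NamedTuple


class Case(NamedTuple):
    name: str
    variables: Dict[str, str]
    must_contain: List[str]


PROMPT_TEMPLATE = "Summarize the following text in one sentence: {text}"


def fake_model(prompt: str) -> str:
    # Hypothetical stand-in for a model call: echoes the text after the colon.
    return "Summary: " + prompt.split(": ", 1)[1]


def run_cases(model: Callable[[str], str], cases: List[Case]) -> List[str]:
    """Return one failure message per missing expected substring."""
    failures: List[str] = []
    for case in cases:
        output = model(PROMPT_TEMPLATE.format(**case.variables))
        for needle in case.must_contain:
            if needle not in output:
                failures.append(f"{case.name}: missing {needle!r}")
    return failures


if __name__ == "__main__":
    cases = [
        Case("edge", {"text": "Workers run at the edge"}, ["edge"]),
        Case("storage", {"text": "Workers run at the edge"}, ["database"]),
    ]
    for failure in run_cases(fake_model, cases):
        print(failure)
```

In CI, a non-empty failure list would fail the build, so a prompt edit that drops an expected behavior is caught before it ships.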
Memory & Persistence
The Memory Architecture layer - how agents accumulate state across sessions and substrates. pgvector when embeddings live alongside relational data. Letta (formerly MemGPT) for tiered agent memory. Mem0 for memory-as-a-service patterns. The substrate beneath persistent agent memory work.
-
pgvector
Postgres Vector Extension
Vector similarity search as a Postgres extension. The default when embeddings need to live alongside relational data without a separate vector database.
-
Letta
Tiered Agent Memory
Formerly MemGPT. Tiered memory system - working memory, long-term memory, archival storage - for agents that need to persist across sessions with structured recall.
-
Mem0
Memory-as-a-Service
Hosted persistent memory layer with API-first design. Drops into any agent application without standing up vector DB or retrieval infrastructure yourself.
-
Zep
Long-Term Context
Long-term memory and context management for LLM apps. Auto-extraction of facts and entities from conversation history - the memory graph under the conversation.
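The vector similarity search in the pgvector entry ranks stored embeddings by their distance to a query embedding. The sketch below computes cosine distance in pure Python over a tiny in-memory table. The row ids and two-dimensional vectors are invented for illustration, and the code assumes non-zero vectors. In Postgres, pgvector does the same ranking with its `<=>` cosine-distance operator.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(
    query: Sequence[float],
    rows: Dict[str, Sequence[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k rows closest to the query, nearest first."""
    scored = [(row_id, cosine_distance(query, vec)) for row_id, vec in rows.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


if __name__ == "__main__":
    # Hypothetical embeddings keyed by row id.
    rows = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    for row_id, distance in nearest([1.0, 0.0], rows):
        print(row_id, round(distance, 4))
```

A real table would use an index rather than a full scan, but the ordering logic is the same.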
Projects
Adjacent projects in the personal-AI-on-personal-hardware design space. OpenClaw for action-taking, ZeroClaw for local-first privacy, OpenJarvis for on-device architecture. LobeHub as the chat substrate for multi-agent sessions.
-
OpenClaw
Agentic Personal Assistant
Agentic personal AI assistant that takes actions across platforms. Open-source, action-oriented architecture - the AI that actually does things, not just answers questions.
-
ZeroClaw
Local-First Private AI
Private AI assistant that runs 100% locally. Multi-platform (Telegram, Discord, WhatsApp) with no cloud dependency - your data never leaves your machine.
-
LobeHub
AI Chat Framework
Open-source chat framework powering multi-agent conversation interfaces. Used for the AI Peer Review collaborative synthesis environment.
-
OpenJarvis
On-Device Personal AI
Personal AI that runs on personal devices. Architecture for on-device assistants from Stanford's Scaling Intelligence Lab - privacy and latency both solved at the substrate.
-
AnythingLLM
All-in-One Desktop AI
Desktop AI app with built-in document ingestion, RAG, and multi-provider model support. Drag in files, connect any LLM, get answers grounded in your data - no infrastructure to manage.
-
LibreChat
Multi-Model Chat
Open-source ChatGPT alternative supporting Claude, GPT, Gemini, and local models in one interface. Conversations, presets, and plugins without vendor lock-in.
-
TypingMind
Premium Chat UI
Enhanced chat interface for Claude, GPT, and Gemini. Custom personas, prompt library, and conversation management - the UI layer when the native apps aren't enough.
-
Open WebUI
Local LLM Interface
Open-source web UI for Ollama and other local models. RAG pipeline, model management, and collaborative features. The browser-based frontend for local inference.
Web
The full stack that ships jimvinson.com and client projects. TypeScript end-to-end. Astro for content, SvelteKit and React for interactive work. Tailwind v4 for design tokens. Drizzle + D1 for edge-native data.
-
Python
Language
The lingua franca of AI/ML development. Data pipelines, model training, scripting, and API services. Every AI project touches Python somewhere.
-
TypeScript
Language
The language underneath the entire stack. Type safety from schema validation to API boundaries to UI components. Every project is TypeScript-first.
-
Astro
Static Site Generator
Content-first framework that ships zero JS by default. Powers this site and most web projects. Island architecture for interactive components without the full bundle tax.
-
Next.js
React Framework
Full-stack React framework for projects in the Vercel ecosystem. Server components, API routes, and middleware for applications that need SSR or ISR.
-
Cloudflare
Edge Infrastructure
Workers, Pages, D1, R2, Vectorize, KV - the full edge stack. Single wrangler.toml, one billing dashboard, global deployment with sub-50ms cold starts.
-
Tailwind CSS v4
CSS Framework
v4's CSS-first config and oklch color system are significant upgrades. The @theme block in global.css replaces the entire config file.
-
Drizzle ORM
Database ORM
Lightweight, type-safe ORM that speaks D1 natively. SQL-first philosophy means the abstractions don't fight you.
-
SvelteKit
Full-Stack Framework
Used for projects where the reactivity model matters. Compiles away the framework - ships minimal runtime JavaScript.
-
React 19
UI Islands
Astro island components for interactive UI. React 19's concurrent features and improved hydration make the island pattern practical.
-
Zod / Valibot
Schema Validation
Zod for general validation, Valibot for edge-optimized contexts. Both enforce type safety at runtime boundaries.
-
Playwright
E2E Testing
Cross-browser end-to-end testing and automation. Clean API that doubles as an agent tool - browser control, scraping, and UI validation in one framework.
-
Serena
Semantic Code Navigation
AI-native code navigation that understands symbol relationships, not just text matches. Semantic search across codebases.
Desktop & Creative
The applications that run on the machine and the tools that shape the work. BBEdit as the text editor of choice. Warp for AI-native terminal sessions. Obsidian for PKM. Affinity for design when the situation demands real design tools.
-
BBEdit
Text Editor
The reliable workhorse. Pattern matching, regex, multi-file grep, disk browser. Not glamorous, but it's been the right tool for 30 years.
-
Warp
Terminal
AI-native terminal with built-in command completion and workflow blocks. Session history and shareable runbooks for development workflows.
-
Obsidian
Knowledge Management
Local-first markdown PKM. Project notes, AI research, architecture decisions. Plain markdown files are the real value - no lock-in.
-
Affinity Suite
Design & Publishing
Designer, Photo, and Publisher - the full creative suite. Professional-grade design tools without the subscription model. Used for presentations, diagrams, and visual assets.
Collaboration & Communication
Where work happens with other humans. Discord for real-time; Google Workspace for co-editing. Asana for task tracking at the project level, Linear for issue tracking at the engineering level. Notion and NotebookLM for durable docs and research synthesis.
-
Discord
Community & Voice
Community channels, voice collaboration, and bot integrations. Where technical communities live - from Midjourney to open-source projects.
-
Google Workspace
Productivity Suite
Docs, Sheets, Slides, Drive, Meet - the collaboration backbone. Real-time co-editing and deeply integrated search across all content.
-
NotebookLM
AI Research Notebook
Google's AI-powered research tool. Source-grounded conversation over uploaded documents. Strong for synthesizing multiple papers and reports.
-
GitHub
Version Control
Source of truth for all projects. Issues and PRs for project coordination. The collaboration layer on top of git.
-
Asana
Project Management
Task tracking and project coordination across workstreams. Portfolio-level views for managing multiple concurrent projects.
-
Linear
Issue Tracking
Modern issue tracking built for engineering velocity. Keyboard-first interface, cycles, and project views. The tool engineering teams actually want to use.
-
Atlassian Suite
Jira, Confluence, Bitbucket
Enterprise project management and documentation. Jira for structured workflows, Confluence for technical documentation, Bitbucket for enterprise git.
-
Miro
Whiteboarding
Collaborative whiteboard for diagramming, brainstorming, and visual planning. The default surface for architecture discussions and workshop facilitation.
-
Notion
Documentation & Wikis
Flexible workspace for documentation, databases, and project wikis. The block-based editor handles structured and unstructured content equally well.
-
Jupyter Notebook
Interactive Computing
Interactive notebooks for data analysis, model experimentation, and documentation. The standard environment for exploratory AI/ML work.