Tools
The stack, infrastructure, and daily tools that enable my workflows.
Providers
Anthropic for primary work — Claude is the model behind every project on this site. Google's Gemini for multi-modal tasks and long-context analysis. OpenAI for ecosystem integrations. Groq when latency matters. OpenRouter to evaluate models I haven't committed to yet, and Together for hosted open weights at scale.
-
Anthropic
Claude API & Claude Code
Primary AI provider. Claude powers multi-agent systems, RAG pipelines, code generation, and agentic development workflows. The context window and instruction-following quality make it the right choice end to end.
-
Google
Gemini API & AI Studio
Gemini models for multi-modal tasks, long-context analysis, and cross-validation in multi-agent workflows. AI Studio for rapid prototyping.
-
OpenAI
GPT & Embeddings
GPT-4 for specific use cases, embedding models for vector search pipelines. The ecosystem integration is mature.
-
Groq
LPU Inference
Ultra-low-latency inference for speed-sensitive workflows. LPU hardware delivers sub-second responses on open-weight models.
-
Nvidia
NIM & GPU Infrastructure
NIM microservices and GPU-accelerated inference. The hardware substrate underneath most of the AI stack.
-
OpenRouter
Model Router
Unified API across dozens of model providers. Route to the best model for each task without managing individual API keys.
-
DeepSeek
Open-Weight Reasoning
Where Claude and GPT-4 strain the budget, DeepSeek closes the gap for reasoning-heavy batch jobs — code synthesis, long-form analysis, research summarization. Downloadable weights for on-prem deployment.
-
Mistral
European AI Provider
The option when clients need EU data residency for inference. Strong multilingual performance at lower cost than US providers, and the weights are downloadable when compliance demands it.
-
Cohere
Embeddings & Retrieval
The Rerank API specifically — when a RAG pipeline's first retrieval pass returns 50 candidates and I need to reorder them by true relevance before passing to the LLM. Often replaces a second embedding pass.
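The rerank-then-truncate step is simple to sketch. This is a toy illustration, not the Cohere API: a naive lexical-overlap scorer stands in for the Rerank model, and the function name and signature are hypothetical.

```python
# Sketch of rerank-then-truncate. The toy lexical-overlap scorer below
# is a stand-in for Cohere's Rerank model, which scores true relevance.

def rerank(query: str, candidates: list[str], top_n: int = 5) -> list[str]:
    """Reorder first-pass retrieval candidates by relevance to the query."""
    q_terms = set(query.lower().split())

    def score(doc: str) -> float:
        # Toy relevance: fraction of query terms that appear in the doc.
        d_terms = set(doc.lower().split())
        return len(q_terms & d_terms) / len(q_terms)

    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[:top_n]  # only the top_n survivors reach the LLM prompt
```

The shape is the point: 50 candidates in, a handful out, and the LLM only ever sees the survivors.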
-
Together AI
Open-Weight Hosting
Where I run Llama and Mixtral when a project needs open-weight models without standing up GPU infrastructure. Fine-tuning workflows and dedicated endpoints when the inference volume justifies them.
AI Frameworks & Local Inference
The composition layer between models and applications. LangChain and LangGraph for orchestration; LlamaIndex when retrieval is the work. CrewAI for role-based multi-agent workflows. DSPy when prompts deserve to be programs. Local inference via Ollama, LM Studio, and vLLM — the runtime layer underneath UPS hardware builds.
-
LangChain
LLM Application Framework
Composable building blocks for LLM applications — chains, retrievers, agents, memory. The standard framework for wiring models into production pipelines.
-
LangGraph
Agent Orchestration
Stateful, multi-agent orchestration built on LangChain. Graph-based control flow for complex agent workflows with human-in-the-loop and persistence.
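Graph-based control flow reduces to a small idea: nodes transform state, edges pick the next node from that state. A minimal sketch of the pattern, with hypothetical node names — this is the concept, not the LangGraph API.

```python
# Toy state-graph runner: nodes are state -> state functions, edges are
# routing functions that inspect the updated state. Not the LangGraph API.
END = "__end__"

def run_graph(nodes, edges, state, entry):
    """Step through nodes until an edge routes to END."""
    current = entry
    while current != END:
        state = nodes[current](state)
        current = edges[current](state)  # conditional routing on new state
    return state

nodes = {
    "draft":  lambda s: {**s, "text": s["text"] + " draft"},
    "review": lambda s: {**s, "approved": len(s["text"]) > 5},
}
edges = {
    "draft":  lambda s: "review",
    "review": lambda s: END if s["approved"] else "draft",
}
```

The review edge looping back to draft is the same shape as a human-in-the-loop or retry cycle in a real agent graph.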
-
LlamaIndex
RAG Framework
The standard framework for building retrieval-augmented generation pipelines. Data connectors, indexing strategies, and query engines purpose-built for connecting LLMs to data.
-
CrewAI
Multi-Agent Framework
Role-based multi-agent orchestration. Define agents with specific roles, goals, and tools — then let them collaborate on complex tasks autonomously.
-
DSPy
Programmatic Prompting
Stanford's framework for programming — not prompting — language models. Compiles declarative modules into optimized prompt chains. The academic frontier of LLM application design.
-
Vercel AI SDK
Streaming & Tool Use
TypeScript SDK for building AI applications with streaming responses, tool calling, and multi-provider support. The abstraction layer for AI-powered UIs.
-
Weights & Biases
Experiment Tracking
MLOps platform for tracking experiments, versioning models, and visualizing training runs. The standard for reproducible machine learning development.
-
MLflow
ML Lifecycle
Open-source platform for managing the ML lifecycle — experiment tracking, model registry, and deployment. Enterprise MLOps infrastructure.
-
Ollama
Local Model Runtime
Run open-weight models locally with a single command. The runtime layer for UPS hardware builds — persistent local inference on commodity hardware.
-
LM Studio
Local Model GUI
Desktop application for discovering, downloading, and running local LLMs. Model evaluation and testing without cloud dependencies.
-
vLLM
Production Inference Server
High-throughput, memory-efficient inference engine for LLMs. PagedAttention for serving models at scale. The production backend for self-hosted model deployments.
Services
Specialized AI services and operational infrastructure. Midjourney and PixelLab for generative imagery (the Limner pipeline). Elicit for academic research synthesis. ElevenLabs for voice. Hugging Face as the default first stop for any new model evaluation. Perplexity replaces the Google-then-read-ten-tabs research workflow.
-
Midjourney
Image Generation
Concept exploration and initial asset generation in the Limner pipeline. The aesthetic control and style consistency are unmatched for creative direction.
-
PixelLab
Pixel Art Generation
Specialized pixel art generation integrated into the Limner: Pixel pipeline. Purpose-built for the medium rather than adapted from photorealistic models.
-
Recraft
Design-Grade Generation
Vector and raster generation with strong design sensibility. Useful for UI assets, icons, and illustrations that need to feel designed rather than generated.
-
Elicit
Research Assistant
AI-powered research tool for finding, analyzing, and synthesizing academic papers. Used extensively during the AI Embassy thesis research.
-
ElevenLabs
Voice & Audio AI
Voice synthesis and audio generation. The quality ceiling for text-to-speech and voice cloning in production applications.
-
Hugging Face
Model Hub & Inference
Model discovery, hosted inference, and the open-source ML ecosystem. The default starting point for evaluating new models and architectures.
-
Perplexity
AI Search
Citation-backed AI search for research and fact-checking. Replaces the Google-then-read-ten-tabs workflow with direct answers and sources.
-
Brave
Privacy-First Browser & Search
Privacy-respecting browser and search API. The Search API powers tool-use pipelines where you need web results without tracking overhead.
-
Stripe
Payments
Payment processing for any project that needs to take money. Webhook-driven event handling, Customer Portal for self-service subscriptions, and Stripe CLI for local testing. The default revenue layer for client work.
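Webhook-driven handling stands or falls on signature verification. A stdlib sketch of the check Stripe's SDK performs for you (`t=`/`v1=` header, HMAC-SHA256 over `timestamp.payload`) — in practice you'd call the official library rather than roll this.

```python
import hashlib
import hmac
import time

def verify_stripe_signature(payload: bytes, sig_header: str,
                            secret: str, tolerance: int = 300) -> bool:
    """Check a Stripe-Signature header against the raw request body."""
    parts = dict(p.split("=", 1) for p in sig_header.split(","))
    timestamp, received = parts["t"], parts["v1"]
    # Reject stale events to limit replay attacks.
    if abs(time.time() - int(timestamp)) > tolerance:
        return False
    signed_payload = f"{timestamp}.".encode() + payload
    expected = hmac.new(secret.encode(), signed_payload,
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, received)
```

`compare_digest` matters: a plain `==` leaks timing information about how many leading characters matched.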
-
Resend
Transactional Email
Transactional email for projects where SendGrid is overkill and SMTP is underkill. React Email templates, simple API, predictable pricing. The default email provider for any new build.
-
Zapier
Workflow Automation
When an Asana task needs to fan out to Slack, calendar, and a doc at the same time. The glue layer for personal and client ops that don't justify writing a Worker.
-
n8n
Open-Source Automation
When Zapier would cost more than a small VPS, or when the workflow handles data that shouldn't leave my infrastructure. Self-hosted, with JavaScript code nodes for when no-code isn't enough.
Infrastructure & DevOps
The substrate underneath everything. AWS for enterprise-scale work; Cloudflare's serverless edge for everything else. Docker and Kubernetes when the deployment target requires them. Terraform for declarative infrastructure. Tailscale to make distributed dev environments behave like one network.
-
AWS
Cloud Platform
S3, Lambda, Bedrock, SageMaker — the enterprise cloud ecosystem. Certified AI Practitioner with hands-on experience across compute, storage, and AI services.
-
Docker
Containerization
Container runtime for reproducible builds and deployments. The standard abstraction layer between development environments and production infrastructure.
-
Kubernetes
Container Orchestration
Container orchestration for production workloads. Deployment, scaling, and management of containerized applications across clusters.
-
Terraform
Infrastructure as Code
Declarative infrastructure management for Cloudflare resources. Version-controlled, reviewable infrastructure changes through HCL configurations.
-
GitHub Actions
CI/CD Pipelines
Automated build, test, and deployment workflows. Powers the CI/CD pipeline from push to Cloudflare Pages deployment.
-
Airflow
Workflow Orchestration
Programmatic workflow scheduling and monitoring. DAG-based pipeline orchestration for data processing, ML training, and batch inference jobs.
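The core of DAG-based orchestration is topological ordering: no task runs before its upstream dependencies. Python's stdlib can sketch the scheduling half directly; the pipeline names here are hypothetical.

```python
from graphlib import TopologicalSorter

# Map each task to its upstream dependencies.
# Hypothetical pipeline: extract -> transform -> {train, report}.
dag = {
    "transform": {"extract"},
    "train":     {"transform"},
    "report":    {"transform"},
}

def run_order(dag: dict[str, set[str]]) -> list[str]:
    """Resolve a valid execution order; raises CycleError on cycles."""
    return list(TopologicalSorter(dag).static_order())
```

Airflow adds scheduling, retries, and monitoring on top, but the dependency resolution is exactly this.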
-
Vercel
Frontend & Serverless
Deployment target for Next.js and SvelteKit projects outside the Cloudflare stack. AI SDK integrations, preview deployments, and edge functions.
-
Tailscale
Mesh VPN
Zero-config mesh networking across devices. Makes the Embassy PC, development machines, and cloud services behave like they're on the same LAN.
-
Wrangler
Cloudflare CLI
Cloudflare's command-line tool for developing and deploying Workers, Pages, D1, R2, and KV. The single entry point for the entire Cloudflare developer platform.
-
Git
Version Control
Distributed version control — branching, rebasing, cherry-picking, bisecting. The foundation underneath GitHub, not a synonym for it.
-
pnpm
Package Manager
Fast, disk-efficient package manager. Content-addressable storage and strict dependency resolution. The default for all monorepo and workspace projects.
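Content-addressable storage is why pnpm is disk-efficient: identical package bytes are stored once, keyed by hash, and every project just links to them. A toy in-memory sketch of the idea (pnpm's real store is hard links on disk):

```python
import hashlib

class ContentStore:
    """Toy content-addressable store: same bytes stored once, linked many times."""
    def __init__(self):
        self.blobs: dict[str, bytes] = {}  # hash -> content (stored once)
        self.links: dict[str, str] = {}    # project path -> hash (cheap link)

    def add(self, path: str, content: bytes) -> str:
        digest = hashlib.sha256(content).hexdigest()
        self.blobs.setdefault(digest, content)  # dedupe across projects
        self.links[path] = digest
        return digest
```

Ten workspaces depending on the same package cost one copy of the bytes plus ten cheap links.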
-
LaunchDarkly
Feature Flags
Feature flag management for progressive rollouts, A/B testing, and kill switches. Ship continuously without shipping risk.
Data & Observability
Storage, retrieval, and the systems that make production AI legible. Qdrant powers the Permanent Record RAG pipeline. PostgreSQL when D1's SQLite isn't enough. Redis for sub-millisecond data access. BigQuery for warehouse-scale analysis.
-
Qdrant
Vector Database
High-performance vector search with payload filtering and named vectors. Rust-based engine with clean HTTP and gRPC APIs. Powers the Permanent Record RAG pipeline.
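Filter-then-rank is the shape of a Qdrant query: narrow by payload, then order the survivors by vector similarity. A pure-Python sketch of that flow — the point structure loosely mirrors Qdrant's id/vector/payload shape, but this is a conceptual stand-in, not the client API.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def search(points, query_vec, payload_filter=None, limit=3):
    """Filter by payload first, then rank survivors by cosine similarity."""
    hits = [p for p in points
            if payload_filter is None
            or all(p["payload"].get(k) == v for k, v in payload_filter.items())]
    hits.sort(key=lambda p: cosine(p["vector"], query_vec), reverse=True)
    return hits[:limit]
```

The real engine does this over millions of points with HNSW indexes instead of a linear scan, but the query semantics are the same.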
-
PostgreSQL
Relational Database
The industry-standard relational database. Full SQL, JSON support, extensions ecosystem. The foundation when projects need more than D1 SQLite at the edge.
-
Redis
In-Memory Data Store
High-performance caching, pub/sub messaging, and session storage. The speed layer for applications that need sub-millisecond data access.
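The workhorse pattern here is a TTL cache: values expire, and stale reads miss. A minimal sketch of per-key expiry with lazy eviction on read, the same semantics as Redis `SETEX`/`GET` (the clock parameter exists only to make the sketch testable).

```python
import time

class TTLCache:
    """Minimal cache with per-key expiry, evicted lazily on read."""
    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def set(self, key, value, ttl: float):
        self._store[key] = (value, self._clock() + ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires = entry
        if self._clock() >= expires:
            del self._store[key]  # expired: drop on read, like Redis
            return default
        return value
```

Redis adds the parts that matter in production — shared across processes, sub-millisecond over the network, active expiry — but the contract is this.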
-
BigQuery
Data Warehouse
Serverless data warehouse for analytics at scale. SQL-based analysis over petabyte-scale datasets without infrastructure management.
-
Databricks
Lakehouse Platform
Unified analytics and AI platform. Delta Lake, MLflow integration, and collaborative notebooks for data engineering and machine learning at scale.
-
dbt
Data Transformation
SQL-based data transformation framework. Version-controlled, tested, documented transformations that turn raw data into analytics-ready models.
-
Prometheus
Metrics & Monitoring
Pull-based metrics collection and time-series database. The standard data source for Grafana dashboards and alerting rules.
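Pull-based means every service just exposes its current numbers as text and Prometheus comes to read them. A sketch of rendering one counter in the text exposition format (the metric name and labels are illustrative):

```python
def render_counter(name: str, help_text: str, samples) -> str:
    """Emit one counter in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines)
```

Serve that string at `/metrics` and any Prometheus server can scrape it — no client-side push, no agent.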
-
Grafana
Observability Platform
Dashboards and visualization for metrics, logs, and traces. The single pane of glass for monitoring deployed infrastructure and AI pipeline health.
-
Datadog
APM & Monitoring
Enterprise application performance monitoring. Traces, metrics, logs, and real user monitoring in a unified platform. The enterprise counterpart to Grafana.
-
Loki
Log Aggregation
Horizontally scalable log aggregation designed for Grafana. Label-based indexing makes it cost-effective for high-volume log streams.
-
Vector
Data Pipeline
High-performance observability data router. Collects, transforms, and routes logs and metrics between services with minimal resource overhead.
-
Sentry
Error Tracking
Real-time error tracking and performance monitoring. Stack traces, breadcrumbs, and release tracking for production applications.
-
Tableau
Data Visualization
Interactive data visualization for business intelligence. Drag-and-drop dashboards that make complex data accessible to non-technical stakeholders.
-
Looker
BI & Analytics
Business intelligence platform with LookML modeling layer. Semantic data models that ensure consistent metrics across the organization.
Evaluation & Benchmarking
The discipline of knowing whether your AI actually works. Promptfoo for prompt regression testing. Inspect AI for structured evaluation harnesses. OpenAI Evals as the canonical reference. LangSmith when LangChain is in the loop.
-
Promptfoo
Prompt Regression Testing
Snapshot prompts, define assertions, catch regressions before they ship. Open source, runs in CI. The test framework LLM applications were missing.
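The mechanism is ordinary test assertions pointed at model output. A sketch of the idea in Python — promptfoo itself is configured in YAML and run from Node, so the assertion types here are modeled on its contains/regex checks, not its actual API:

```python
import re

def run_assertions(output: str, assertions: list[dict]) -> list[str]:
    """Return failure messages; an empty list means the prompt passed."""
    failures = []
    for a in assertions:
        if a["type"] == "contains" and a["value"] not in output:
            failures.append(f"missing substring: {a['value']!r}")
        elif a["type"] == "regex" and not re.search(a["value"], output):
            failures.append(f"no match for pattern: {a['value']!r}")
        elif a["type"] == "not-contains" and a["value"] in output:
            failures.append(f"forbidden substring: {a['value']!r}")
    return failures
```

Run the suite against stored outputs on every prompt change and a regression becomes a red CI check instead of a user report.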
-
Inspect AI
Evaluation Harness
UK AI Safety Institute's structured evaluation framework. Designed for capability, safety, and alignment testing with reproducible scoring and dataset versioning.
-
OpenAI Evals
Evaluation Reference
The canonical reference for LLM evaluation patterns. Templates, registries, and the evaluation vocabulary the rest of the field borrows from.
-
LangSmith
LLM Observability
Tracing, evaluation, and monitoring for LangChain applications. The observability layer when LangChain is in the loop and you need to see what each chain step is doing.
Memory & Persistence
The Memory Architecture layer — how agents accumulate state across sessions and substrates. pgvector when embeddings live alongside relational data. Letta (formerly MemGPT) for tiered agent memory. Mem0 for memory-as-a-service patterns. The substrate beneath the UPS Cartridge work.
-
pgvector
Postgres Vector Extension
Vector similarity search as a Postgres extension. The default when embeddings need to live alongside relational data without a separate vector database.
-
Letta
Tiered Agent Memory
Formerly MemGPT. Tiered memory system — working memory, long-term memory, archival storage — for agents that need to persist across sessions with structured recall.
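The tiering logic itself is small: working memory is bounded, overflow spills to a cheaper tier, and recall searches both. A toy sketch of that shape — class and method names are mine, not Letta's, and real archival storage would be a vector index, not a list:

```python
from collections import deque

class TieredMemory:
    """Toy Letta-style tiers: bounded working memory spills to an archive."""
    def __init__(self, working_capacity: int = 4):
        self.working: deque[str] = deque()
        self.capacity = working_capacity
        self.archive: list[str] = []

    def remember(self, fact: str):
        self.working.append(fact)
        while len(self.working) > self.capacity:
            self.archive.append(self.working.popleft())  # evict oldest

    def recall(self, term: str) -> list[str]:
        # Working memory is always in-context; the archive needs retrieval.
        return [f for f in list(self.working) + self.archive if term in f]
```

The capacity bound is the whole trick: it maps the working tier onto a fixed context-window budget while nothing is ever truly forgotten.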
-
Mem0
Memory-as-a-Service
Hosted persistent memory layer with API-first design. Drops into any agent application without standing up vector DB or retrieval infrastructure yourself.
-
Zep
Long-Term Context
Long-term memory and context management for LLM apps. Auto-extraction of facts and entities from conversation history — the memory graph under the conversation.
Projects
Adjacent projects in the personal-AI-on-personal-hardware design space — the same territory the UPS Cartridge work operates in. OpenClaw for action-taking, ZeroClaw for local-first privacy, OpenJarvis for on-device architecture. LobeHub as the chat substrate for multi-agent sessions.
-
OpenClaw
Agentic Personal Assistant
Agentic personal AI assistant that takes actions across platforms. Open-source, action-oriented architecture — the AI that actually does things, not just answers questions.
-
ZeroClaw
Local-First Private AI
Private AI assistant that runs 100% locally. Multi-platform (Telegram, Discord, WhatsApp) with no cloud dependency — your data never leaves your machine.
-
LobeHub
AI Chat Framework
Open-source chat framework powering multi-agent conversation interfaces. Used for the AI Peer Review collaborative synthesis environment.
-
OpenJarvis
On-Device Personal AI
Personal AI that runs on personal devices. Architecture for on-device assistants from Stanford's Scaling Intelligence Lab — privacy and latency both solved at the substrate.
Web
The full stack that ships jimvinson.com and client projects. TypeScript end-to-end. Astro for content, SvelteKit and React for interactive work. Tailwind v4 for design tokens. Drizzle + D1 for edge-native data. Better Auth when the project needs sessions.
-
Python
Language
The lingua franca of AI/ML development. Data pipelines, model training, scripting, and API services. Every AI project touches Python somewhere.
-
TypeScript
Language
The language underneath the entire stack. Type safety from schema validation to API boundaries to UI components. Every project is TypeScript-first.
-
Astro
Static Site Generator
Content-first framework that ships zero JS by default. Powers this site and most web projects. Island architecture for interactive components without the full bundle tax.
-
Next.js
React Framework
Full-stack React framework for projects in the Vercel ecosystem. Server components, API routes, and middleware for applications that need SSR or ISR.
-
Cloudflare
Edge Infrastructure
Workers, Pages, D1, R2, Vectorize, KV — the full edge stack. Single wrangler.toml, one billing dashboard, global deployment with sub-50ms cold starts.
-
Tailwind CSS v4
CSS Framework
v4's CSS-first config and oklch color system are significant upgrades. The @theme block in global.css replaces the entire config file.
-
Drizzle ORM
Database ORM
Lightweight, type-safe ORM that speaks D1 natively. SQL-first philosophy means the abstractions don't fight you.
-
Better Auth
Authentication
Type-safe auth library built for the modern TypeScript stack. Edge-native with session management and OAuth providers.
-
SvelteKit
Full-Stack Framework
Used for projects where the reactivity model matters. Compiles away the framework — ships minimal runtime JavaScript.
-
React 19
UI Islands
Astro island components for interactive UI. React 19's concurrent features and improved hydration make the island pattern practical.
-
Zod / Valibot
Schema Validation
Zod for general validation, Valibot for edge-optimized contexts. Both enforce type safety at runtime boundaries.
-
Playwright
E2E Testing
Cross-browser end-to-end testing and automation. Clean API that doubles as an agent tool — browser control, scraping, and UI validation in one framework.
-
Supabase
Backend as a Service
Open-source Firebase alternative. Postgres database, auth, storage, and real-time subscriptions with a generous free tier and full SQL access.
-
Clerk
Auth as a Service
Drop-in authentication with pre-built UI components. User management, organizations, and SSO without building auth from scratch.
-
Storybook
Component Development
Isolated component development and documentation. Build, test, and showcase UI components outside of the application context.
-
Chromatic
Visual Testing
Visual regression testing for UI components. Catches unintended visual changes in pull requests before they reach production.
-
Retool
Internal Tools
Low-code platform for building internal tools. Connect to any database or API and build admin panels, dashboards, and workflows in hours.
-
Serena
Semantic Code Navigation
AI-native code navigation that understands symbol relationships, not just text matches. Semantic search across codebases.
Desktop & Creative
The applications that run on the machine and the tools that shape the work. Cursor for AI-native coding, BBEdit as the text editor of last resort. Obsidian for PKM. Affinity for design when the situation demands real design tools. Raycast to tie the macOS surface together.
-
Cursor
AI Code Editor
AI-native code editor built on VS Code. Inline completions, multi-file edits, and codebase-aware chat. The IDE for AI-assisted development workflows.
-
VS Code
Code Editor
The universal editor. Extensions ecosystem, integrated terminal, and debugger. The baseline environment when Cursor or BBEdit aren't the right tool.
-
BBEdit
Text Editor
The reliable workhorse. Pattern matching, regex, multi-file grep, disk browser. Not glamorous, but it's been the right tool for 30 years.
-
Warp
Terminal
AI-native terminal with built-in command completion and workflow blocks. Session history and shareable runbooks for development workflows.
-
Figma
Design Collaboration
Collaborative interface design. Prototyping, design systems, and developer handoff. The shared surface where design and engineering meet.
-
Obsidian
Knowledge Management
Local-first markdown PKM. Project notes, AI research, architecture decisions. Plain markdown files are the real value — no lock-in.
-
Affinity Suite
Design & Publishing
Designer, Photo, and Publisher — the full creative suite. Professional-grade design tools without the subscription model. Used for presentations, diagrams, and visual assets.
-
Raycast
macOS Launcher
Extensible launcher replacing Spotlight. Script commands, clipboard history, window management, and AI chat — the productivity layer that ties everything together on macOS.
-
WebCatalog / Singlebox
Web App Container
Turns web apps into native desktop applications with isolated sessions. Keeps AI tools, dashboards, and services organized as discrete windows.
-
VLC
Media Player
Plays anything. The universal media player for reviewing audio/video assets, testing media pipelines, and verifying output formats.
Collaboration & Communication
Where work happens with other humans. Slack and Discord for real-time; Google Workspace for co-editing. Asana for task tracking at the project level, Linear for issue tracking at the engineering level. Notion and NotebookLM for durable docs and research synthesis.
-
Slack
Team Messaging
Primary channel for team communication, integrations, and automated notifications. Workflow Builder and API integrations tie into project pipelines.
-
Discord
Community & Voice
Community channels, voice collaboration, and bot integrations. Where technical communities live — from Midjourney to open-source projects.
-
Google Workspace
Productivity Suite
Docs, Sheets, Slides, Drive, Meet — the collaboration backbone. Real-time co-editing and deeply integrated search across all content.
-
NotebookLM
AI Research Notebook
Google's AI-powered research tool. Source-grounded conversation over uploaded documents. Strong for synthesizing multiple papers and reports.
-
GitHub
Version Control
Source of truth for all projects. Issues and PRs for project coordination. The collaboration layer on top of git.
-
Asana
Project Management
Task tracking and project coordination across workstreams. Portfolio-level views for managing multiple concurrent projects.
-
Linear
Issue Tracking
Modern issue tracking built for engineering velocity. Keyboard-first interface, cycles, and project views. The tool engineering teams actually want to use.
-
Atlassian Suite
Jira, Confluence, Bitbucket
Enterprise project management and documentation. Jira for structured workflows, Confluence for technical documentation, Bitbucket for enterprise git.
-
Miro
Whiteboarding
Collaborative whiteboard for diagramming, brainstorming, and visual planning. The default surface for architecture discussions and workshop facilitation.
-
Notion
Documentation & Wikis
Flexible workspace for documentation, databases, and project wikis. The block-based editor handles structured and unstructured content equally well.
-
Jupyter Notebook
Interactive Computing
Interactive notebooks for data analysis, model experimentation, and documentation. The standard environment for exploratory AI/ML work.