Writing on software development, containers & Kubernetes, and AI.

Here's all of my longer-form writing from around the web, collected in chronological order.

The state of open source AI models in 2025

A look at the open source AI model landscape in 2025, from DeepSeek and Qwen to gpt-oss and small language models, and how to run them on your own hardware.

Run containerized AI models locally with RamaLama

Learn how to run AI models locally with RamaLama, an open source project that simplifies running AI models in containers for local inference, serving, and RAG.

3 MCP servers you should be using (safely)

Explore three of the most useful Model Context Protocol (MCP) servers for developers: Kubernetes, Context7, and GitHub, along with essential safety guardrails.

What is llm-d and why do we need it?

Discover llm-d, an open source Kubernetes-native framework for distributed AI inference that improves performance and reduces costs through disaggregation and intelligent scheduling.

Alignment tuning and RAG: What you should know

Explore alignment tuning and retrieval-augmented generation (RAG), two strategies for customizing large language models for enterprise use cases.

How to make generative AI more consumable

Learn about the three stages of AI adoption (utilization, adoption, and customization) and how open frameworks make generative AI accessible for enterprises.

Intro to Podman

Podman is a daemonless alternative to the Docker command-line interface for running standalone containers. See examples of how easy it is to use Podman.

Getting started with Buildah

Use Buildah to create a working Open Container Initiative container image from scratch, or from a pre-existing Dockerfile, before running it with Podman.