Weave 0.9.14: Native Ollama and Other Updates

This release introduces Ollama as a fully integrated on-device AI backend with an 18-model catalog, plus support for pulling models from Ollama and Hugging Face. It also adds support for any OpenAI-compatible API as an external provider with encrypted API key storage, and makes chat file attachments viewable in the regular chat. You can now export and import Creations, embedding generation has moved from MLX to Ollama for a simpler single-server architecture, and your configured context length is now actually applied to local models. Several stability fixes address React 19 re-render loops, production permission errors, and UI quirks.


Mar 6, 2026



New Features

On-Device Ollama Backend

Ollama is now a first-class on-device AI backend. You can run Ollama models alongside or instead of MLX; only one backend runs at a time, selected automatically based on your active model. Full model management is built in: browse the catalog, pull and delete models, and monitor download progress directly from the app. Eighteen new models have been added to the catalog, including Qwen 3.5, Gemma 3, GPT-OSS, LFM 2, GLM 4.7 Flash, and Ministral 3.

Generic OpenAI Provider

Connect any OpenAI-compatible API (OpenRouter, Together AI, vLLM, or your own endpoint) as an external LLM provider. API keys are stored encrypted.
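
What makes this possible is that every OpenAI-compatible endpoint accepts the same chat-completions request shape, so one request builder covers them all. The interface and function below are a rough illustration of the idea, not Weave's actual schema:

```typescript
// Illustrative provider config; field names are assumptions, not Weave's schema.
interface OpenAICompatibleProvider {
  baseUrl: string; // e.g. "https://openrouter.ai/api/v1" or a local vLLM server
  apiKey: string;  // the app stores this encrypted; shown in plain form only for illustration
  model: string;
}

// Any compatible server accepts this same body at POST {baseUrl}/chat/completions.
function buildChatRequest(provider: OpenAICompatibleProvider, userMessage: string) {
  return {
    url: `${provider.baseUrl}/chat/completions`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${provider.apiKey}`,
    },
    body: {
      model: provider.model,
      messages: [{ role: "user", content: userMessage }],
    },
  };
}
```

Swapping providers then amounts to changing `baseUrl` and `model`; nothing else in the request changes.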

File Attachments in Chat

Images, documents, and text files you send in chat are now clickable. Click any attachment to open it in the side viewer panel. Office documents (DOCX, PPTX, RTF, ODT, ODP, ODS) are automatically converted to PDF for in-app viewing. Excel files render as interactive tables. Converted files are cached for fast re-opens and cleaned up automatically after one hour.
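
The one-hour cleanup can be pictured as a TTL cache keyed by the source file. This is a minimal sketch of that behavior, not Weave's actual implementation; all names are assumptions:

```typescript
// Illustrative converted-file cache with a one-hour time-to-live.
const TTL_MS = 60 * 60 * 1000; // one hour

interface CacheEntry {
  pdfPath: string;
  createdAt: number; // epoch milliseconds
}

class ConversionCache {
  private entries = new Map<string, CacheEntry>();

  // Record a conversion result, e.g. "report.docx" -> "report.pdf".
  put(sourcePath: string, pdfPath: string, now = Date.now()): void {
    this.entries.set(sourcePath, { pdfPath, createdAt: now });
  }

  // Return the cached PDF if still fresh; evict and miss if it has expired.
  get(sourcePath: string, now = Date.now()): string | undefined {
    const entry = this.entries.get(sourcePath);
    if (!entry) return undefined;
    if (now - entry.createdAt > TTL_MS) {
      this.entries.delete(sourcePath);
      return undefined;
    }
    return entry.pdfPath;
  }
}
```

Re-opening an attachment within the hour hits the cache and skips the conversion; after that, the next open converts the file again.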

Creation Export & Upload

Export your creations as HTML files with a single click from the Play page. A new "Upload Creation" tile on the Create page lets you import HTML files via click-to-browse or drag-and-drop (up to 50 MB), loading them straight into the play view with live preview.
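
The 50 MB limit amounts to a simple guard before a dropped file is loaded. A minimal sketch, assuming uploads are validated by extension and size (the helper name and the .html-only check are illustrative):

```typescript
// 50 MB upload cap, per the release notes. The validation helper itself
// is a hypothetical sketch, not Weave's actual code.
const MAX_UPLOAD_BYTES = 50 * 1024 * 1024;

function canUploadCreation(file: { name: string; size: number }): boolean {
  const isHtml = file.name.toLowerCase().endsWith(".html");
  return isHtml && file.size <= MAX_UPLOAD_BYTES;
}
```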


Embedding Migration

Embedding generation has been migrated from the MLX server to Ollama, still using Embedding Gemma 300M. On first launch, the app automatically detects the old embedding model, removes it, and pulls the replacement in the background; no action is needed on your part.


Improvements

  • Context length is now respected. Your configured context length for Ollama models is applied on every server start; previously it always defaulted to 4096 tokens regardless of your setting. 75% of the configured value is used for input, reserving 25% for output.

  • PDF viewer theming. The PDF viewer background now correctly matches your app theme in both light and dark mode.

  • Image handling in chat. External images in AI responses render inline. If an image fails to load, it gracefully falls back to a clickable "View" link that opens in the side viewer or your browser.

  • Updated iconography. New creator logos for Meta, Microsoft, and Liquid. Creator icons are auto-detected from model names.

  • Search intent relaxed. Search is now less restrictive when classifying user intent, and a hint is shown suggesting .md files for long responses.
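
The 75/25 split from the context-length item above is simple arithmetic; for a 4096-token context it yields 3072 input tokens and 1024 output tokens. Sketched as a hypothetical helper:

```typescript
// Split a configured context length into input and output token budgets,
// per the 75/25 rule described in the improvements list above.
function splitContext(contextLength: number): { input: number; output: number } {
  const input = Math.floor(contextLength * 0.75);
  return { input, output: contextLength - input };
}
```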


Bug Fixes

  • Fixed eco model loading banner. The "Eco Model Loading" indicator no longer appears when the on-device model is disabled in settings.

  • Fixed skill deletion. Deleting a skill no longer fails due to the alert dialog's portal click being swallowed by the popover's outside-click handler.