New features, improvements, and fixes for Solair AI.
Version 2.2Latest
May 2026
Scroll-to-Bottom Button
When you scroll up in a conversation, a glass button now appears to jump back down to the latest message. Tapping it during a response also re-enables auto-scroll, so you can keep following along as the AI generates.
Storage Reclaimed Properly
Deleting a downloaded model now immediately frees up your disk space — no need to restart the app. We also added a "Delete All Downloaded Models" button in Settings to quickly recover storage in one tap.
Combined Memory & Context Gauge
The device memory and context window indicators are now unified into one gauge. The RAM pie chart sits at the center, with a ring around it showing how much of the context window you've used. Tap it to see the full breakdown — model weights, system memory, and tokens used (e.g. 2.4K / 32K) — all in one popover.
Send Button Redesigned
The send button now renders as true liquid glass with its own independent glass context. As an option, you can also transform it into a procedural gold plasma ring with specular highlights and a soft glow. A small detail — but hey, it's ok to have fun.
Version 2.1
May 2026
Device Memory Gauge
New memory breakdown shows model weights, KV cache, and system usage at a glance, with localized labels.
Date & Time Awareness
Models now know today's date and time, for more accurate, context-aware responses.
Apple Intelligence on LAN Server
Expose Apple's on-device foundation model as an endpoint on the LAN Inference Server, alongside your MLX models.
Stability & Performance
Improved stability — much less likely to crash during long conversations or when using multiple features together (web search, images, health tools)
Better memory management — the AI adapts to your device's available memory in real time, preventing out-of-memory crashes
Cancel button works reliably — tapping stop during a response now works consistently
Importing files no longer freezes the app — documents load smoothly in the background
Faster generation stays stable — speculative decoding (2× speed mode) no longer crashes on longer conversations
Startup crash prevention — in the rare case of a corrupted database, the app recovers automatically instead of getting stuck in a crash loop
Bug Fixes
Fixed input field being locked when only Apple Intelligence was loaded with no MLX model present
Fixed stuck streaming indicator after a crash, and prevented repeated crash loops on relaunch
Reduced prefill step size to 512 to prevent out-of-memory crashes during prompt prefill with Gemma 4 MoE models
Version 2.0.1
April 2026
Download Server Mirror
New setting to choose download server: Auto, Global, or China Mirror (hf-mirror.com)
Auto-detects your region for the fastest downloads
Version 2.0
April 2026
LAN Inference Server
Turn your iPhone/iPad into an AI server. Load any model in Solair, flip the switch, and every device on your Wi-Fi can use it — just like OpenAI's API, but running entirely on your device.
See full details
How it works
Enable the server in Settings > LAN Inference Server
Any app that supports OpenAI or Ollama APIs can connect (Cursor, VS Code, Open WebUI, Python scripts, and more)
Streaming responses, just like a cloud API
Compatible with
OpenAI API — /v1/chat/completions, /v1/models
Ollama API — /api/chat, /api/tags
Security
Optional API key — generate a random key with one tap, or run without authentication on trusted networks
Rate limiting — automatic protection against request flooding
Connection limits — max 20 simultaneous connections
DNS rebinding protection — blocks cross-origin attacks from malicious websites
Credentials stored in Keychain — never in plain text
Setup guide built in
Includes connection instructions, code examples for Python and curl, and app-specific tips for Cursor, Open WebUI, and VS Code.
Good to know
Server pauses when Solair goes to the background — keep the app open while serving
Bonjour auto-discovery lets compatible apps find your server automatically
Works over Tailscale for remote access
Add Models from Files App
You can now add MLX models directly through the iOS Files app. Place a model folder into the Solair models directory, restart the app, and it appears automatically in Your Models. Supports both author--model-name and author/model-name folder formats.
Version 1.9
April 2026
Smarter Model Selection for Your Device
Solair now automatically picks the best AI smart models based on your iPhone's memory:
12GB devices (iPhone 17 Pro, iPhone Air): Qwen3 4B + Qwen3 VL 4B for maximum quality
Beautiful aurora glow effect around the input field
Animated suggestions cycle through helpful prompts
Improvements
Sidebar opens more easily with lighter swipe
Better download management with queued models
Improved local model sharing — better reliability and transfer speed
Camera improvements
Fixed memory leaks with remote server connections
Better Apple Watch voice playback
Improved translations across supported languages
Version 1.8
April 2026
Apple Watch App
Ask Solair from your wrist. Tap the mic, speak your question, and hear the answer. When Solair is running on your iPhone, queries are processed by your loaded local model. If the app is closed or your phone is locked, it falls back to Apple Intelligence seamlessly.
Local Model Sharing
Transfer AI models between your devices over Wi-Fi or Bluetooth — no internet needed. Great for setting up a new device without re-downloading gigabytes of models.
iCloud Backup Control
Option to exclude AI models from iCloud backup to save storage space.
Code Block Improvements
Auto-scroll while AI generates code
Line numbers for easier reference
Syntax highlighting in edit mode
Accessibility
Improved VoiceOver accessibility for chat messages and settings.
Version 1.7.2
April 2026
Improvements & Fixes
Health Tools now work better across all AI models, including Chinese/Japanese/Korean
Unified Smart+Vision — use one model (like Gemma 4) for both, no reloading
Fixed tool recognition for Qwen3 and other models
Version 1.7.1
April 2026
Bug Fix
Improved stability when using web search with vision models (Gemma 4)
Version 1.7
April 2026
Gemma 4 Support
Added new Gemma 4 family models (vision-language model) with full image understanding and tool calling.
New code preview with live rendering for HTML, JavaScript, p5.js, Chart.js, Three.js, D3.js, Mermaid diagrams, SVG, CSS, and Canvas
"Ask AI to Fix" button for code errors
Syntax highlighting in code blocks
Save and persist edited code
New Models
Added Qwen2.5-Coder models (1.5B, 3B, 7B)
Added LFM2.5 350M model — a tiny, reliable data extraction and tool use model
Model family logos in the All Models list
Other Improvements
Faster model downloads with accurate progress tracking
KV Cache Quantization — new setting to reduce memory usage by up to 75% during long conversations, letting you chat longer before running out of memory
Enhanced tool calling for Health Intelligence
More improvements under the hood
Version 1.6
April 2026
10 Languages Supported
Solair is now available in Spanish, Chinese (Simplified & Traditional), Japanese, French, German, Korean, Portuguese (Brazil), and Italian on top of English.
Wikipedia in Web Search
Web search now includes Wikipedia as a knowledge source as an option with Grokipedia.
Version 1.5
March 2026
Web Search Improvements
Complete overhaul of web search. The app now intelligently rewrites your questions into better search queries, handles complex multi-part questions by searching multiple times in parallel, and shows you exactly which sources were used in a new collapsible card.
35% Faster Text Generation
Under-the-hood performance improvements for Qwen 3.5 models. The MLX engine now processes tokens more efficiently on Apple Silicon.
Thinking Mode Toggle
New thinking mode button lets you enable deep reasoning for models that support it, like Qwen 3.5. Only in manual mode.
Qwen 3.5 & Nemotron Models
Qwen 3.5 is back in Auto Mode with proper thinking controls. New Nemotron model support added for even more choices.
Better Tool Calling
Fixed issues with AI calling multiple tools at once and improved handling of complex tool parameters. Health queries and other tool-based features now work more reliably.
Smarter Memory Extraction
Choose between Smart (AI-powered) or Fast (instant) methods for remembering facts about you. Smart mode understands context better, while Fast mode offers instant results.
Advanced Generation Settings
Fine-tune responses with new parameters: Top-K, Min-P, Presence Penalty, and Frequency Penalty. Try the Qwen 3.5 preset for optimal settings.
Bug Fixes
Fixed Voice Mode over Bluetooth connections
Fixed Shortcuts integration issues
Version 1.4
March 2026
Personas
Chat with AI personalities tailored to your mood. Choose from built-in personas or create your own.
Friends — Sam and Julia offer casual, supportive conversation like texting a real friend
Historical Figures — Pick the brain of Einstein, Tesla, Da Vinci, Socrates, or Benjamin Franklin. Each speaks authentically from their era with unique insights
Create Custom Personas — Create your own characters with custom names, personalities, and conversation styles. Built-in with a powerful AI creation tool
Features iMessage-style chat bubbles, unique voice for each persona, and a beautiful golden selector in the sidebar
Ask Solair AI questions directly from Siri: "Hey Siri, ask Solair AI..."
15+ Shortcuts actions: Ask questions, translate, summarize, explain code, proofread, generate ideas, and more
Works seamlessly with iOS Shortcuts app for custom automations
New Siri & Shortcuts section in Settings
Voice Mode Improvements
Faster AI response timing — reduced silence detection from 2.7s to 1.2s
Thinking blocks now stripped from spoken responses
Now 15 voices available (requires redownloading Kokoro)
Remote Server
Added support for public HTTPS servers (OpenWebUI, etc.)
Speculative Decoding
Uses the fast model to speed up the smart model. Requires 2 models from the same family (e.g. Llama 3.2 1B and 3B).
Other Improvements
Newly designed settings menu
File size limit increased to 20 MB (from 5 MB)
XLSX files now supported
Better memory management for 8GB devices
New option to enable web search by default
Image results from web search
Onboarding now lets you choose models or use defaults
New Conversation Gesture — swipe left anywhere on the chat screen to instantly create a new conversation
Bug Fixes
Fixed health tools appearing when HealthKit isn't set up
Version 1.3
February 2026
Health Intelligence
A groundbreaking feature: ask about your steps, sleep, heart rate, workouts, and more — all processed on-device.
9 data types: Exercise Time, Standing Hours, VO2 Max, Heart Rate Recovery, Walking Steadiness, Blood Pressure, and Menstrual Cycle with calendar visualization
Weekly reports and trend analysis factoring in all available metrics
All data stays on your device — never uploaded
Note: Health Intelligence is for informational purposes only and not medical advice.
Private Space
New prompt stack for personal conversations: emotional support, anxiety help, private journaling, relationship advice, and a safe space to vent. Everything stays completely on-device.
Remote Server
For power users: connect to your own LLM servers via Tailscale VPN. Supports Ollama, vLLM, and OpenAI-compatible APIs with auto-discovery and secure credential storage.
Expanded File Import
PDF, TXT, CSV, JSON, Markdown, HTML, and 25+ programming languages including Swift, Python, JavaScript, and more
Mac & iPad Improvements
Native Mac Catalyst support for better performance
Optimized memory management on all platforms
Improved layout and UI
Version 1.0 to 1.2
February 2026
Initial Release
The first versions of Solair AI — a private AI assistant that runs entirely on your iPhone and iPad. No servers, no accounts, no data collection. Chat with local LLMs, attach images and files, talk with Voice Mode, and get intelligent responses without ever going online. Super fast, built to be the best and most polished local AI app.