Now available for Apple Silicon

Run local AI models on your PC.

Localy AI makes it easy to discover, install, and use local AI models — without giving up speed, control, or privacy.

Free to start · macOS 13+ · Apple Silicon & Intel · Requires Ollama

Localy AI chat workspace on macOS
Llama 3.2
Mistral 7B
Qwen 2.5
Gemma 3
DeepSeek R1
Phi-4
Code Llama
StarCoder2
Mixtral 8x7B
LLaVA
Command-R
Yi
Codestral
GPT-OSS
TinyLlama
Orca Mini

Everything you need

A complete home for local AI.

From discovery to inference, Localy AI handles every step so you can focus on what you're building.

50+ · Models to use
Unlimited · Custom skills
2 · Live integrations
Free · All of it

Local model management

Browse, organize, and switch between your installed AI models from a single, beautifully designed library.

Install & track status

One-click installs with live download progress, version tracking, and disk usage at a glance.

Clean chat workspace

A focused, distraction-free chat interface built for thinking — with history, threads, and prompts.

Mac-first design

Native feel on macOS. Optimized for Apple Silicon, with system theming, gestures, and shortcuts.

Privacy-first

Models run entirely on your machine. No cloud round-trips, no telemetry, and no data leaving your PC.

Built for speed

Metal-accelerated inference and smart caching so even large models stay snappy.

Model Library

Discover and install in seconds.

Search a curated catalog of open-source models. See size, capabilities, and benchmarks before you install — then track downloads live.

Localy AI model library
Localy AI skills editor

Workspace

A chat built for thinking.

Threaded conversations, saved prompts, markdown and code rendering — all running on the model of your choice, fully offline.

Model Library

50+ local models, one app.

From tiny chat models to 70B powerhouses, plus vision and coding specialists. Localy AI checks your Mac and tells you which ones it can run.

Every model below is 100% free right now
Tiny · 7 models

Ultra-small models for instant replies on modest hardware.

llama3.2:1b
Llama · 1B · 0.8 GB
Free

Very light and fast for tiny local tasks.

tinyllama:1.1b
Llama · 1.1B · 0.7 GB
Free

Ultra-small model for the fastest possible local replies.

gemma2:2b
Gemma · 2B · 1.3 GB
Free

Very small Gemma for quick drafts and short replies.

stablelm:2b
StableLM · 2B · 1.4 GB
Free

Compact model for lightweight everyday prompts.

qwen2.5:1.5b
Qwen · 1.5B · 1 GB
Free

Small and sharp for quick local Q&A.

smollm2:1.7b
SmolLM · 1.7B · 1 GB
Free

Tiny model optimized for speed and low resource usage.

qwen2.5:0.5b
Qwen · 0.5B · 0.4 GB
Free

Ultra-light model for instant local responses.

Light · 5 models

Lightweight daily drivers for everyday chat.

orca-mini:3b
Orca · 3B · 1.9 GB
Free

Small and responsive model for lightweight prompts.

llama3.2:3b
Llama · 3B · 2 GB
Free

Best lightweight default for daily local chat.

replit-code:3b
Replit · 3B · 2.1 GB
Free

Code model tuned for quick developer assistance.

phi3:mini
Phi · 3.8B · 2.2 GB
Free

Microsoft's compact model for efficient local use.

phi3.5:mini
Phi · 3.8B · 2.3 GB
Free

Tiny and efficient model for quick local responses.

Balanced · 20 models

All-rounders for chat, coding, and reasoning.

qwen3:4b
Qwen · 4B · 2.7 GB
Free

A fast, newer everyday model for chat and reasoning.

gemma3:4b
Gemma · 4B · 2.8 GB
Free · Vision

Strong local quality with good speed and vision support.

yi:6b
Yi · 6B · 3.9 GB
Free

Strong bilingual model with good performance.

deepseek-coder:6.7b
DeepSeek · 6.7B · 4.1 GB
Free

Specialized coding model for programming assistance.

qwen2.5:7b
Qwen · 7B · 4.6 GB
Free

Nice for coding and structured output.

mistral:7b
Mistral · 7B · 4.7 GB
Free

Great all-rounder if you have enough RAM.

codellama:7b
Llama · 7B · 4.7 GB
Free

Llama fine-tuned for code generation and understanding.

command-r7b
Cohere · 7B · 4.8 GB
Free

Smaller instruction model for fast everyday use.

openhermes:7b
Mistral · 7B · 4.5 GB
Free

Instruction-tuned model with a chatty, helpful style.

dolphin-mistral:7b
Mistral · 7B · 4.6 GB
Free

Good general-purpose model with a playful assistant tone.

zephyr:7b
Mistral · 7B · 4.4 GB
Free

Friendly instruction model that keeps answers concise.

wizardcoder:7b
WizardCoder · 7B · 4.8 GB
Free

Coding model for writing and refactoring code.

wizardlm:7b
WizardLM · 7B · 4.6 GB
Free

Strong instruction model for structured responses.

falcon:7b
Falcon · 7B · 4.7 GB
Free

Older but solid general-purpose model.

stablelm:7b
StableLM · 7B · 4.5 GB
Free

Bigger StableLM for more capable local chat.

hermes3:8b
Nous · 8B · 5 GB
Free

Polished instruction model with strong general responses.

qwen3:8b
Qwen · 8B · 5 GB
Free

Great all-around local model with strong reasoning.

llama3.1:8b
Llama · 8B · 5.1 GB
Free

Improved Llama 3 with better instruction following.

dolphin-llama3:8b
Llama · 8B · 5.2 GB
Free

Fine-tuned for instruction following and casual chat.

gemma2:9b
Gemma · 9B · 5.8 GB
Free

Google's improved Gemma with better reasoning.

Advanced · 21 models

Larger, heavier models for powerful Macs.

llava:7b
Llava · 7B · 4.6 GB
Free · Vision

Popular vision model, but may be buggy in some cases.

deepseek-r1:8b
DeepSeek · 8B · 5.2 GB
Free

Heavier reasoning model for stronger Macs.

qwen2.5vl
Qwen · VL · 5.3 GB
Free · Vision

Vision model that can describe and reason about images.

llama3.2:11b-vision
Llama · 11B · 7.6 GB
Free · Vision

Vision-capable Llama for image understanding and chat.

gemma3:12b
Gemma · 12B · 8 GB
Free

Larger Gemma variant for better quality and deeper reasoning.

mistral-nemo:12b
Mistral · 12B · 7.9 GB
Free

Newer Mistral variant with a strong balance of speed and quality.

phi4:14b
Phi · 14B · 9.2 GB
Free

A stronger compact model for reasoning and structured tasks.

starcoder2:15b
StarCoder · 15B · 9.8 GB
Free

Dedicated code model for editing, generation, and explanation.

deepseek-coder-v2:16b
DeepSeek · 16B · 10.2 GB
Free

Stronger coding model for larger local development tasks.

gpt-oss:20b
OpenAI · 20B · 12.8 GB
Free

A larger general-purpose model for powerful Macs.

codestral:22b
Mistral · 22B · 13.6 GB
Free

Code-focused model for more advanced software work.

mistral-small:22b
Mistral · 22B · 13.8 GB
Free

Strong larger model for higher-quality local responses.

falcon:40b
Falcon · 40B · 24 GB
Free

Heavy Falcon variant for powerful machines.

llama3.1:70b
Llama · 70B · 40 GB
Free

Powerful large model for demanding workloads.

llama3.3:70b
Llama · 70B · 41 GB
Free

Very large Llama variant for serious local rigs.

mixtral:8x7b
Mistral · 8x7B · 26 GB
Free

Mixture of experts model for advanced tasks.

command-r:35b
Cohere · 35B · 21 GB
Free

Strong instruction model for long-form and tool-like tasks.

vicuna:13b
Vicuna · 13B · 8.4 GB
Free

Classic instruction model for broad local use.

orca2:13b
Orca · 13B · 8.2 GB
Free

Instruction-tuned model with solid general reasoning.

nous-hermes2:10.7b
Nous · 10.7B · 7.1 GB
Free

Focused on helpful, thoughtful instruction following.

aya:23b
Cohere · 23B · 14.2 GB
Free

Multilingual model for better cross-language prompts.

Localy AI checks your Mac's chip and memory in-app and tells you which models it can run.

Plans

Free for everyone. Plus for the curious.

Localy AI is free. Apply for Plus to unlock exclusive downloads and early access. Applications are reviewed by AI, with up to 3 approvals per day.

Free

Always

Everything you need to run local AI on your PC.

  • Run local AI models on your PC
  • Public downloads & resources
  • Community support
  • Privacy-first by default
Get Localy AI

Plus

Apply · Free

For makers, researchers, and creators with a real use case.

  • Everything in Free
  • Exclusive Plus-only downloads
  • Early access to new builds
  • Priority resources & tools
Request Plus
“Why not apply? It's free!”

FAQ

Questions, answered.

What's included, what stays on your PC, and what's coming next.

Bring AI home to your desktop.

Download Localy AI and start running powerful local models in minutes.

macOS 13+ · Apple Silicon & Intel · Windows 10/11