
Choosing a model

A practical decision guide: what to pick for daily chat, romance, hard reasoning, long context, minimal cost, or no cost at all.

The model picker is two clicks away, and switching mid-chat is free. So most users settle on a default and then occasionally swap for a specific scene. This page is the cheat sheet for that.

Source: The data-driven model picker →

The short answer

If you want…	Use
A fast, cheap daily driver	DeepSeek V4 Flash (0.3×)
Solid roleplay quality at low cost	DeepSeek V3.2 (0.5×)
Premium quality for important scenes	GLM 5 (1.3×) or GLM 5.1 (2.0×)
Maximum reasoning depth	DeepSeek R1 (1.0×, always-reasoning)
Million-token context	Gemini 3 Flash Preview (1.0×)
Zero cost (no credits at all)	Llama 3.1 8B (free)
Free model with image understanding	Qwen3 VL 30B (free)

DeepSeek V4 Flash is the platform default for a reason. It is cheap (0.3×), has a 163K context window (longer than most need), and offers optional reasoning (off by default for speed). It is DeepSeek's fast, affordable chat model, and most users never need to switch.

Where it falls short: deep emotional scenes feel a little flatter than on GLM 5 and above. For most daily roleplay it's plenty.

When to upgrade to DeepSeek V3.2

DeepSeek V3.2 is the "I want a bit more, still cheap" choice: 0.5× multiplier, 163K context, no reasoning. Its voice is more grounded and slower-paced than DeepSeek V4 Flash's, which suits slow-burn romance, thoughtful dialogue, or characters that need more "presence."

When to use GLM 4.7

GLM 4.7 has a 1.0× multiplier. The voice is balanced, neither flowery nor blunt. Optional reasoning makes it good for both casual chat and thoughtful scenes.

This is a great "middle" choice: a clear step up from the 0.3× and 0.5× models, and not as expensive as the GLM 5 line.
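To make the price gaps between tiers concrete, here is a minimal sketch that compares models by multiplier. It assumes a reply's credit cost scales linearly with the multiplier, which this page does not state outright; the function name `relative_cost` is illustrative and not part of any Reverie API.

```python
# Multipliers as listed on this page. Free models are left out.
MULTIPLIERS = {
    "DeepSeek V4 Flash": 0.3,
    "DeepSeek V3.2": 0.5,
    "GLM 4.7": 1.0,
    "DeepSeek R1": 1.0,
    "Gemini 3 Flash Preview": 1.0,
    "GLM 5": 1.3,
    "GLM 5.1": 2.0,
}


def relative_cost(model: str, baseline: str = "DeepSeek V4 Flash") -> float:
    """Return how many times more a reply costs on `model` than on `baseline`.

    Assumption: credit cost is proportional to the multiplier.
    """
    return MULTIPLIERS[model] / MULTIPLIERS[baseline]


if __name__ == "__main__":
    for name in MULTIPLIERS:
        print(f"{name}: {relative_cost(name):.1f}x the default's cost")
```

Under that assumption, one GLM 5.1 reply costs roughly as much as six or seven DeepSeek V4 Flash replies, which is why it is better saved for the scenes that matter.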

When to use GLM 5 or GLM 5.1 (subscribers only)

GLM 5 (1.3×) and GLM 5.1 (2.0×) are the highest-end models in Reverie's catalog and live in the advanced tier — they require an active subscription to use. Both excel at:

Dramatic prose
Emotional nuance
Long-form coherence
Subtext-heavy dialogue

GLM 5.1 is a meaningful step up from GLM 5: pricier but visibly better. Use it for novel-mode generation and important story chapters.

When to use DeepSeek R1

DeepSeek R1 is an always-reasoning model. It is slower than the others, but it works out subtle scenes step by step, and the visible thinking pane lets you follow along.

Good for:

Mystery / detective scenes (the character deduces)
Tutoring (the character explains its work)
Emotional confrontation (the character considers what not to say)

Avoid for fast-paced banter — the reasoning step adds latency that breaks the rhythm.

When to use Gemini 3 Flash Preview (subscribers only)

Million-token context is the killer feature. For long story arcs that have run to hundreds of messages, or for novels you don't want summarized, Gemini 3 holds it all. It is also multimodal and handles image input well. It has a 1.0× multiplier and sits in the advanced tier, so it requires a subscription.

When to use Llama 3.1 8B

Free. For quick chats or to keep talking without burning credits, it's there.

Quality is roughly "early ChatGPT 3.5 era" — usable, somewhat shallower, but consistent.

There's a dynamic rate limit on free models (shared across web/bots/API) so they can't be used as an unlimited backdoor. Hit the limit and you switch to a paid model — or wait it out. See Free models.

When to use Qwen3 VL 30B

Free + vision-capable. Drop in an image and ask the character to describe or react. Good for image-driven chats; weaker on pure conversational depth.

Per-scene model strategy

You don't have to pick one model and stick with it. A common pattern among power users:

Open daily chat on DeepSeek V4 Flash. 90% of casual talk goes here.
Important scene incoming? Switch to GLM 5 or 5.1 for one or two replies.
Need long context? Switch to Gemini 3.
Want to see the reasoning? Switch to DeepSeek R1.
Back to DeepSeek V4 Flash for the next casual exchange.

The switch is free and instantaneous. Use it.
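The pattern above can be written down as a small lookup. This is only a sketch of the decision logic, not something Reverie exposes: the `Scene` categories and `pick_model` function are made up for illustration, and the fallbacks for non-subscribers (GLM 4.7 for important scenes, DeepSeek V4 Flash's 163K window for long context) are this example's own assumption.

```python
from enum import Enum


class Scene(Enum):
    """Hypothetical scene categories, mirroring the steps listed above."""
    CASUAL = "casual"
    IMPORTANT = "important"
    LONG_CONTEXT = "long_context"
    SHOW_REASONING = "show_reasoning"


def pick_model(scene: Scene, subscriber: bool = False) -> str:
    """Suggest a model for one scene, following this page's recommendations."""
    if scene is Scene.IMPORTANT:
        # GLM 5 / 5.1 need a subscription; the GLM 4.7 fallback is an assumption.
        return "GLM 5.1" if subscriber else "GLM 4.7"
    if scene is Scene.LONG_CONTEXT:
        # Gemini 3 Flash Preview needs a subscription; the fallback is an assumption.
        return "Gemini 3 Flash Preview" if subscriber else "DeepSeek V4 Flash"
    if scene is Scene.SHOW_REASONING:
        return "DeepSeek R1"
    # Everything else goes to the cheap daily driver.
    return "DeepSeek V4 Flash"


if __name__ == "__main__":
    for scene in Scene:
        print(scene.value, "->", pick_model(scene, subscriber=True))
```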

How to A/B compare

You have three good options, in increasing effort:

1. Read the model picker's community stats. Each model card shows a quality / like rate, a speed score, and a context chip. Tap a card to expand a full bar chart and head-to-head matchups against other models; these come from real users picking between responses in first-reply comparisons. The numbers carry a confidence rating (⭐⭐⭐⭐⭐) based on sample size; ignore low-confidence ones.
2. Let the first-reply comparison decide for you. When you start a new chat, Reverie shows two responses (yours + an alternative model's) side by side. Pick one. Your choice both gives you the better opening and feeds the community stats from #1. See Dual-model comparison.
3. Fork manually. Fork at an interesting moment, switch models in the new branch, regenerate. Best for taste-testing on a specific scene rather than averages.
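If you jot down numbers from the model cards to compare them, the "ignore low-confidence ones" advice amounts to a filter before ranking. The sketch below assumes a star rating from 1 to 5 and a cutoff of 3 stars; the `ModelStats` shape, the cutoff, and the sample values are all made up for illustration and are not real Reverie data.

```python
from dataclasses import dataclass


@dataclass
class ModelStats:
    name: str
    like_rate: float  # hypothetical: fraction of comparisons won, 0.0 to 1.0
    confidence: int   # star rating, 1 to 5


def rank_trusted(stats: list[ModelStats], min_confidence: int = 3) -> list[ModelStats]:
    """Drop low-confidence entries, then sort the rest by like rate, best first."""
    trusted = [s for s in stats if s.confidence >= min_confidence]
    return sorted(trusted, key=lambda s: s.like_rate, reverse=True)


# Placeholder values only; these are not measurements of any real model.
sample = [
    ModelStats("Model A", 0.62, 5),
    ModelStats("Model B", 0.80, 1),  # high like rate, but too few samples to trust
    ModelStats("Model C", 0.55, 4),
]

if __name__ == "__main__":
    print([s.name for s in rank_trusted(sample)])
```

Model B has the highest like rate in the sample but is dropped, because a one-star confidence rating means the number rests on too few comparisons.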

Free models commitment

Reverie commits to always offering at least one fully free LLM so the platform remains accessible regardless of credit balance. Currently Llama 3.1 8B and Qwen3 VL 30B fill that role — both subject to a shared anti-abuse rate limit. Details →
