Local AI for Creative Writers: KoboldCpp, SillyTavern, and Uncensored Models

A non-technical guide for fiction writers who want AI tools that respect creative freedom and privacy. How to set up KoboldCpp, SillyTavern, and uncensored models for brainstorming, worldbuilding, and prose generation.

If you are a fiction writer who has tried using ChatGPT or Claude for creative work, you have hit the wall. You know exactly which wall I mean.

You are writing a thriller and the AI refuses to describe violence. You are writing romance and it will not generate anything past a fade-to-black. You are exploring dark themes — trauma, addiction, war — and the model lectures you about sensitivity instead of helping you write a compelling scene. You ask it to write a villain’s monologue and it inserts moral disclaimers that break character.

Cloud AI models are designed to be maximally safe for the broadest possible audience. That is a reasonable product decision for a general-purpose chatbot. It is a terrible design for a creative writing tool. Fiction requires the full range of human experience — including the parts that make content moderators nervous.

Local AI solves this completely. You run the model on your hardware. There is no content policy. No refusals. No one looking over your shoulder. Your writing stays on your machine. This is not about generating harmful content — it is about having a creative tool that treats you like an adult writing fiction for other adults.

Here is how to set it up, even if you have never touched a command line.

The Tools: A Plain-Language Overview

You need three things: an inference engine (the software that runs the AI model), a model (the AI brain itself), and a frontend (the interface you interact with). Think of it like a game console, a game cartridge, and a TV.

KoboldCpp — The Engine

KoboldCpp is the inference engine purpose-built for creative writing. It runs GGUF model files on your computer and provides a writing-focused interface. It works on Windows, Mac, and Linux. It has a built-in web interface, and it can also serve as a backend for other frontends.

Why KoboldCpp over Ollama or LM Studio? Those tools are great for general AI use. KoboldCpp is specifically optimized for the creative writing use case: it has better support for long-form text generation, “story mode” features, memory management for ongoing narratives, and fine-grained control over generation parameters that matter for prose quality.

SillyTavern — The Premium Frontend

SillyTavern is a web-based frontend that connects to KoboldCpp (or Ollama, or other backends). It is the most feature-rich creative writing interface in the local AI ecosystem. It supports:

  • Character cards — Define detailed characters with personalities, backgrounds, and writing styles
  • Scenario management — Set up scenes with world info, author’s notes, and contextual memory
  • Multiple chat and story modes — Interactive fiction, collaborative writing, solo generation
  • World Info / Lorebooks — Define your fictional world’s rules, locations, and characters so the AI stays consistent
  • Fine-grained generation controls — Temperature, repetition penalty, top-K, top-P, and more

SillyTavern has a reputation as a “character chat” tool, but it is genuinely powerful for serious creative writing when configured properly.

Uncensored Models — The Brain

“Uncensored” is the community term for models that have been fine-tuned or modified to remove or reduce refusal behavior. These models will engage with dark or explicit creative scenarios that mainstream models refuse. The most popular options:

  • Midnight Miqu 103B — A community merge known for exceptional creative writing quality at the cost of enormous hardware requirements
  • Llama 4 variants with uncensored fine-tunes — Various community fine-tunes that remove refusals while preserving quality
  • Qwen 3 uncensored — Excellent multilingual creative writing
  • Mistral-based creative models — Particularly strong for European literary styles
  • WizardLM uncensored variants — Good balance of intelligence and creative freedom

For most writers, a well-tuned 13-32B uncensored model is the sweet spot. Large enough to produce quality prose, small enough to run on consumer hardware.

Setting Up: Windows Walkthrough

I am writing this for people who use Word, not Vim. Every step is explained.

What You Need

  • A computer with a dedicated GPU (NVIDIA with at least 8GB VRAM is ideal, but KoboldCpp also runs on AMD GPUs and CPUs)
  • At least 16GB of system RAM
  • About 30GB of free disk space

If you have a gaming PC from the last 4-5 years, you almost certainly have enough hardware.

Step 1: Install KoboldCpp

  1. Go to KoboldCpp releases
  2. Download the latest koboldcpp.exe (Windows) — it is a single file, no installation needed
  3. Put it in a folder you will remember, like C:\KoboldCpp\

Step 2: Download a Model

You need a model file in GGUF format. Here is where to get good creative writing models:

  1. Go to Hugging Face
  2. Search for “GGUF” plus a model name

Recommended starter models by hardware:

| Your GPU VRAM | Recommended Model | File Size |
| --- | --- | --- |
| 6-8 GB | Mistral 7B Uncensored Q5_K_M | ~5 GB |
| 8-12 GB | Llama 3.1 8B Uncensored Q6_K | ~6.5 GB |
| 12-16 GB | Qwen 3 14B Creative Q5_K_M | ~10 GB |
| 16-24 GB | Llama 4 Scout Creative Q4_K_M | ~18 GB |
| 24+ GB | Midnight Miqu 70B Q3_K_M | ~23 GB |

Download the GGUF file and put it in the same folder as KoboldCpp.
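If you want to sanity-check a download before committing to it, the rough arithmetic is parameter count times bits per weight. The sketch below is only an estimate: the bits-per-weight figures are assumed averages for each quantization level, the 1.5 GB headroom is an arbitrary allowance for context memory, and the function names are made up for illustration. The file size shown on the model's download page is always the better number.

```python
# Approximate bits stored per weight for common GGUF quantization levels.
# These are rough, assumed averages; real files vary by model architecture.
APPROX_BITS_PER_WEIGHT = {
    "Q3_K_M": 3.9,
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
}


def estimate_file_size_gb(params_billions: float, quant: str) -> float:
    """Estimate GGUF file size in GB from parameter count and quant level."""
    bits = APPROX_BITS_PER_WEIGHT[quant]
    # billions of parameters * bits per weight / 8 = billions of bytes (GB)
    return params_billions * bits / 8


def fits_in_vram(file_size_gb: float, vram_gb: float, headroom_gb: float = 1.5) -> bool:
    """True if the file plus an assumed headroom for context fits in VRAM."""
    return file_size_gb + headroom_gb <= vram_gb


if __name__ == "__main__":
    size = estimate_file_size_gb(7, "Q5_K_M")
    print(f"7B at Q5_K_M: about {size:.1f} GB")
    print("Fits in 8 GB VRAM:", fits_in_vram(size, 8))
```

If the estimate lands close to or above your VRAM, pick a smaller quantization or a smaller model.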

Step 3: Launch KoboldCpp

  1. Double-click koboldcpp.exe
  2. A configuration window appears
  3. Click “Browse” next to Model, select your downloaded GGUF file
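  4. Under GPU Layers, set a high number such as 999 to offload every layer to the GPU. If the model fails to load or runs out of VRAM, lower this number so some layers stay on the CPU.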
  5. Set Context Size to 8192 (this determines how much of your story the AI “remembers”)
  6. Click “Launch”

A browser window will open with the KoboldCpp interface. You are running a local AI. That was it.
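You do not need any code to use KoboldCpp, but if you are curious, the same server that powers the browser interface also answers simple web requests. The optional sketch below asks it which model is loaded. It assumes KoboldCpp is listening on its default address, http://localhost:5001, and uses the KoboldAI-style /api/v1/model endpoint; the function names are illustrative.

```python
import json
import urllib.request

KOBOLD_URL = "http://localhost:5001"  # assumed default KoboldCpp address


def model_endpoint(base_url: str) -> str:
    """Build the URL of the 'which model is loaded' endpoint."""
    return base_url.rstrip("/") + "/api/v1/model"


def get_loaded_model(base_url: str = KOBOLD_URL, timeout: float = 5.0) -> str:
    """Ask the running KoboldCpp server for the name of its loaded model."""
    with urllib.request.urlopen(model_endpoint(base_url), timeout=timeout) as resp:
        data = json.load(resp)
    return data["result"]


if __name__ == "__main__":
    try:
        print("Loaded model:", get_loaded_model())
    except OSError as err:
        # Connection failures and timeouts are OSError subclasses
        print("Could not reach KoboldCpp:", err)
```

If this prints a model name, the engine is running and ready for a frontend to connect.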

Step 4: Install SillyTavern

SillyTavern provides a much richer writing experience. You need Node.js installed first:

  1. Download and install Node.js (LTS version)
  2. Download SillyTavern — click the green “Code” button, then “Download ZIP”
  3. Extract the ZIP to a folder like C:\SillyTavern\
  4. Open that folder, double-click start.bat
  5. SillyTavern opens in your browser
  6. In SillyTavern’s connection settings, select “KoboldAI” as the API type and enter http://localhost:5001 as the URL
  7. Click Connect

You now have SillyTavern talking to KoboldCpp talking to your model. Everything runs on your computer.

Configuring for Creative Writing

The default settings for most AI tools are optimized for chat, not prose. Here are the settings that make a difference for fiction writing:

Temperature

This controls randomness. Higher temperature means more creative (and less predictable) text.

  • 0.5-0.7: Conservative, safe prose. Good for technical or formal writing.
  • 0.8-1.0: Balanced creativity. Good default for most fiction.
  • 1.0-1.3: Highly creative, surprising word choices. Good for poetry and experimental fiction.
  • Above 1.3: Usually too chaotic. The AI starts producing incoherent text.

Start at 0.85 and adjust based on results.
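For the curious, here is roughly what temperature does under the hood. The model assigns a score to every possible next word, and temperature rescales those scores before one is picked. This is a minimal sketch of the standard technique with made-up scores for three candidate words, not KoboldCpp's actual code.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate words
    for t in (0.5, 0.85, 1.3):
        probs = softmax_with_temperature(logits, t)
        print(f"temperature {t}:", [round(p, 2) for p in probs])
```

At low temperature the top-scoring word takes most of the probability, which is why the prose feels safe. At high temperature the probabilities even out, so unlikely words get picked more often.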

Repetition Penalty

This discourages the model from reusing words and phrases it has already produced. Set it between 1.05 and 1.15. Too low and the AI repeats itself. Too high and it starts avoiding common words and producing awkward prose.
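A common way to implement this setting is to lower the score of any word piece (token) that has already appeared. The sketch below shows that widely used formulation with invented scores. KoboldCpp's own implementation has extra options and may differ in detail, so treat this as an illustration of the idea only.

```python
def apply_repetition_penalty(logits, seen_token_ids, penalty):
    """Make already-used tokens less likely by shrinking their scores."""
    adjusted = list(logits)
    for token_id in set(seen_token_ids):
        if adjusted[token_id] > 0:
            adjusted[token_id] /= penalty  # positive scores get smaller
        else:
            adjusted[token_id] *= penalty  # negative scores get more negative
    return adjusted


if __name__ == "__main__":
    scores = [2.0, -1.0, 0.5]   # hypothetical scores for tokens 0, 1, 2
    already_used = [0, 1]       # tokens 0 and 1 appeared earlier in the text
    print(apply_repetition_penalty(scores, already_used, penalty=1.1))
```

Because the penalty hits every previously used token, a high value also punishes ordinary words like "the" and "said", which is where the awkward prose comes from.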

Context Window

This is how much of your story the AI can “see” when generating the next passage. More context means better coherence but slower generation. 8192 tokens (about 6,000 words) is a good starting point. If your model and hardware can handle it, 16384 is better for novel-length work.
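The word counts above come from a rule of thumb of roughly 0.75 English words per token. Keep in mind that the window is shared: the reply being generated and any fixed text such as notes or lore take space away from your story. The sketch below does that arithmetic; the 0.75 ratio and the 300 and 500 token reservations are illustrative assumptions, not measured values.

```python
WORDS_PER_TOKEN = 0.75  # rough rule of thumb for English; an assumption


def tokens_to_words(tokens: int) -> int:
    """Convert a token count to an approximate word count."""
    return round(tokens * WORDS_PER_TOKEN)


def story_tokens_available(context_size: int, reply_tokens: int, fixed_tokens: int) -> int:
    """Tokens left for story text after reserving room for the reply and fixed text."""
    return max(context_size - reply_tokens - fixed_tokens, 0)


if __name__ == "__main__":
    for context_size in (8192, 16384):
        story = story_tokens_available(context_size, reply_tokens=300, fixed_tokens=500)
        print(
            f"{context_size} tokens is about {tokens_to_words(context_size)} words in total,",
            f"with room for about {tokens_to_words(story)} words of story",
        )
```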

Author’s Note

In SillyTavern, the Author’s Note is injected into the context near the end, where it has the strongest influence on generation. Use it to steer style:

[Author's Note: Write in third-person limited perspective. Use vivid sensory 
details. Maintain a dark, atmospheric tone. Show, don't tell. Dialogue should 
feel natural and terse.]

This is one of the most powerful and underused tools for creative writers using local AI.
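To see why placement matters, it helps to picture the story as a list of passages with the note slipped in a fixed distance from the end. The sketch below illustrates that idea of an insertion depth. It is not SillyTavern's actual code, and the function name and the depth parameter are hypothetical.

```python
def inject_authors_note(paragraphs, note, depth=1):
    """Insert the note `depth` paragraphs before the end of the story."""
    position = max(len(paragraphs) - depth, 0)
    return paragraphs[:position] + [note] + paragraphs[position:]


if __name__ == "__main__":
    story = [
        "The rain had not stopped for three days.",
        "Mara checked the lock twice before she sat down.",
        "Someone knocked.",
    ]
    note = "[Author's Note: Maintain a dark, atmospheric tone.]"
    for line in inject_authors_note(story, note, depth=1):
        print(line)
```

The note ends up just before the most recent passage, so it is among the last things the model reads before it writes.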

Practical Workflows for Writers

Brainstorming

Use the AI as a brainstorming partner by asking open-ended questions in chat mode:

  • “Give me five possible motivations for a character who betrays their closest friend”
  • “What would be an unexpected twist for a detective story set in a space station?”
  • “Describe five unique magic systems that haven’t been done to death”

The uncensored model will explore dark, complex, and morally ambiguous territory that cloud models shy away from. This is enormously valuable for developing nuanced antagonists, morally grey protagonists, and realistic conflict.

Worldbuilding

Create a World Info / Lorebook entry for each major element of your world. SillyTavern will automatically inject relevant entries when their trigger words appear in the conversation:

Entry: "The Void"
Trigger: void, darkness, shadow realm
Content: The Void is a parallel dimension that mirrors the physical world but 
is populated by shadow entities. Exposure to the Void for more than 12 hours 
causes permanent psychological changes. The Void can only be accessed through 
obsidian mirrors during a new moon.

This keeps the AI consistent with your worldbuilding even across long stories.
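The mechanism is simple keyword matching: scan the recent text, and if a trigger appears, add that entry's content to what the model sees. Here is a minimal sketch of the idea using the entry above. The LoreEntry class and function name are made up for illustration, and SillyTavern's real matching has more options than this.

```python
import re
from dataclasses import dataclass


@dataclass
class LoreEntry:
    name: str
    triggers: list[str]
    content: str


def triggered_entries(entries, recent_text):
    """Return the entries whose trigger words appear as whole words in the text."""
    matches = []
    for entry in entries:
        for trigger in entry.triggers:
            # \b keeps "void" from matching inside words such as "avoid"
            pattern = r"\b" + re.escape(trigger) + r"\b"
            if re.search(pattern, recent_text, re.IGNORECASE):
                matches.append(entry)
                break
    return matches


if __name__ == "__main__":
    void = LoreEntry(
        name="The Void",
        triggers=["void", "darkness", "shadow realm"],
        content="The Void is a parallel dimension that mirrors the physical world.",
    )
    text = "She stepped through the obsidian mirror and into the Void."
    for entry in triggered_entries([void], text):
        print(f"Injecting lore for {entry.name}: {entry.content}")
```

Choose triggers that are distinctive to your world. A very common word as a trigger will pull the entry into almost every passage and use up context space.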

Prose Generation

For actual prose generation, use “story mode” in KoboldCpp or “continue” mode in SillyTavern. Write a paragraph or scene opening, then let the AI continue. The key is iteration:

  1. Write your opening paragraph
  2. Let the AI generate a continuation
  3. Edit the AI output — keep what works, rewrite what does not
  4. Continue from your edited version
  5. Repeat
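The loop above works entirely through the browser interface, but it can also be scripted. The optional sketch below sends your text to a running KoboldCpp server and prints the continuation. It assumes the default address http://localhost:5001 and the KoboldAI-style /api/v1/generate endpoint; the function names and default values are illustrative choices, not recommendations from the KoboldCpp project.

```python
import json
import urllib.request

GENERATE_URL = "http://localhost:5001/api/v1/generate"  # assumed default address


def build_payload(story_so_far, max_length=200, temperature=0.85, rep_pen=1.1):
    """Assemble the generation request: the text so far plus sampler settings."""
    return {
        "prompt": story_so_far,
        "max_length": max_length,    # how many tokens to generate
        "temperature": temperature,
        "rep_pen": rep_pen,          # repetition penalty
    }


def continue_story(story_so_far, **settings):
    """Send the story to KoboldCpp and return the generated continuation."""
    body = json.dumps(build_payload(story_so_far, **settings)).encode("utf-8")
    request = urllib.request.Request(
        GENERATE_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=300) as resp:
        data = json.load(resp)
    return data["results"][0]["text"]


if __name__ == "__main__":
    opening = "The lighthouse had been dark for eleven years when the lamp came on."
    try:
        print(opening + continue_story(opening))
    except OSError as err:
        print("Could not reach KoboldCpp:", err)
```

Editing stays a human job: paste the output into your manuscript, cut what does not work, and send the revised text back as the next prompt.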

Writers who work this way commonly report that about 30-50% of an AI-generated first draft is usable, and they then edit it into their own voice. This is not “AI writing your book.” It is AI accelerating your drafting process while you maintain creative control.

Character Dialogue

Create detailed character cards in SillyTavern and use chat mode to explore how characters would respond to different situations. This is remarkably useful for:

  • Testing dialogue voice — does the character sound distinct?
  • Exploring character reactions — how would they respond to unexpected news?
  • Finding character-specific phrasing and idioms

The Privacy Argument

For creative writers, privacy is not abstract. Consider what you are putting into a cloud AI when you write fiction:

  • Your unpublished ideas and plot outlines
  • Your unfinished manuscripts
  • Scenes exploring sensitive themes that could be misinterpreted out of context
  • Character development notes that reflect deeply personal experiences

With cloud AI, all of this goes to a company’s servers, may be used for training depending on the provider’s terms, and is subject to content policies that could flag your literary exploration as a policy violation.

With local AI, none of this leaves your computer. Your creative process is private by architecture, not by policy. No terms of service can change that.

Tool Comparison for Writers

| Feature | KoboldCpp + SillyTavern | Ollama + Open WebUI | LM Studio |
| --- | --- | --- | --- |
| Creative writing focus | Excellent | Good | Good |
| Character cards | Yes (SillyTavern) | Limited | No |
| World Info / Lorebooks | Yes (SillyTavern) | No | No |
| Story mode | Yes | No | No |
| Author’s Note | Yes (SillyTavern) | No | No |
| Ease of setup | Moderate | Easy | Very Easy |
| Generation controls | Extensive | Basic | Moderate |
| Uncensored model support | Full | Full | Full |

For serious creative writing, the KoboldCpp + SillyTavern combination is unmatched. If you want something simpler for occasional brainstorming, Ollama + Open WebUI or LM Studio work fine.

A Recommended Path for Beginners

If you are brand new to local AI and want to use it for writing, here is the path I recommend:

  1. Week 1: Install LM Studio (easiest setup). Download a small model. Get comfortable chatting with it and using it for brainstorming. No code, no configuration files.

  2. Week 2: Install KoboldCpp. Try a larger model. Experiment with generation parameters (temperature, repetition penalty). Notice how different settings change the prose quality.

  3. Week 3: Install SillyTavern. Set up character cards and World Info for a current project. Use it for a real writing session.

  4. Week 4: Explore different models. Try a creative-focused fine-tune. Find the model that matches your prose style.

By the end of the month, you will have a fully personalized AI writing setup that costs nothing to use, respects your privacy, and does not lecture you about the themes you are exploring.

A Note on Ethics

I want to address this directly because it always comes up: using uncensored models for creative writing is not the same as generating harmful content. Fiction has always explored the full range of human experience, including violence, trauma, abuse, and moral complexity. The world’s greatest literature does not shy away from darkness — it illuminates it.

Cormac McCarthy wrote extreme violence. Vladimir Nabokov wrote about a pedophile’s perspective. Toni Morrison wrote about slavery’s brutality. These works are not harmful — they are essential. A writing tool that refuses to engage with difficult themes is a tool that cannot help you write literature.

Local, uncensored AI is the tool that lets you explore any theme, any perspective, any scenario — in the privacy of your own creative process. What you do with that freedom is your responsibility as an author, just as it has always been.

If you are a creative writer using local AI, we would love to feature your workflow. Reach out through our community page.