Mobile AI · Built by Cognisoc · MIT License

Llamafu

Flutter plugin enabling on-device AI inference on Android and iOS with complete privacy. Supports text generation, chat, vision, function calling, and structured JSON output.

Platforms: Android, iOS

Llamafu is a Flutter plugin that brings large language model inference directly to Android and iOS devices. It runs models entirely on-device, with no server dependency and no data leaving the phone, making it a natural fit for mobile applications that need AI capabilities with complete user privacy. Built on llama.cpp’s proven inference core, Llamafu exposes a Dart-native API that integrates cleanly into Flutter’s reactive framework.

Key Features

On-device privacy. All inference happens locally on the user’s device. No prompts, responses, or user data are transmitted to external servers. This makes Llamafu suitable for applications handling sensitive information — health data, personal notes, financial queries — where cloud-based AI is not an option.

Text generation and chat. Llamafu supports both raw text completion and structured chat with system, user, and assistant message roles. Conversation history management, context windowing, and token streaming are handled by the plugin, giving developers a high-level API without manual KV cache management.
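A streaming chat session might look like the sketch below. The class and method names (`Llamafu.loadModel`, `ChatMessage`, `chatStream`, `dispose`) are illustrative assumptions, not the plugin's documented API; the model path is likewise hypothetical.

```dart
import 'dart:io';
import 'package:llamafu/llamafu.dart'; // assumed package import

Future<void> runChat() async {
  // Load a quantized GGUF model bundled with the app (hypothetical path).
  final model = await Llamafu.loadModel('assets/models/qwen2-1.5b-q4.gguf');

  final messages = [
    ChatMessage.system('You are a concise assistant.'),
    ChatMessage.user('Summarize what on-device inference means.'),
  ];

  // Token-by-token output arrives on a Dart stream, so it can feed a
  // StreamBuilder widget directly.
  final Stream<String> tokens = model.chatStream(messages);
  await for (final token in tokens) {
    stdout.write(token);
  }

  await model.dispose(); // release native llama.cpp resources when done
}
```

Because the output is an ordinary `Stream<String>`, it composes with Flutter's `StreamBuilder` and standard stream transformers without extra glue code.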

Vision and multimodal input. Pass images alongside text prompts to multimodal models. Llamafu handles image preprocessing and encoding, enabling use cases like photo description, document analysis, and visual question answering directly on the device.
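A photo-description call could be sketched as follows. The `images:` parameter and the other names here are assumptions about how a multimodal message might be passed; the actual plugin API may differ.

```dart
import 'dart:io';
import 'package:llamafu/llamafu.dart'; // assumed package import

Future<String> describePhoto(String path) async {
  // Hypothetical multimodal model and API names.
  final model = await Llamafu.loadModel('assets/models/llava-q4.gguf');
  final imageBytes = await File(path).readAsBytes();

  final reply = await model.chat([
    ChatMessage.user(
      'Describe this photo in one sentence.',
      images: [imageBytes], // plugin handles preprocessing and encoding
    ),
  ]);

  await model.dispose();
  return reply.text;
}
```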

Function calling and structured output. Define tool schemas and receive structured JSON responses from models that support function calling. This enables mobile AI agents that can interact with device APIs, databases, and external services through well-defined interfaces.
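A tool-calling flow might be sketched like this. The `ToolDefinition` shape, the `tools:` parameter, and the `toolCalls` field are all assumptions used for illustration; the schema itself follows the common JSON Schema convention for function-calling models.

```dart
import 'dart:convert';
import 'package:llamafu/llamafu.dart'; // assumed package import

Future<void> askWeather() async {
  // Hypothetical tool definition; the model is prompted to emit a
  // structured call matching this JSON Schema instead of free-form prose.
  final weatherTool = ToolDefinition(
    name: 'get_weather',
    description: 'Look up current weather for a city.',
    parameters: {
      'type': 'object',
      'properties': {
        'city': {'type': 'string'},
      },
      'required': ['city'],
    },
  );

  final model = await Llamafu.loadModel('assets/models/hermes-7b-q4.gguf');
  final reply = await model.chat(
    [ChatMessage.user('What is the weather in Oslo?')],
    tools: [weatherTool],
  );

  // Dispatch each structured call to the matching device-side handler.
  for (final call in reply.toolCalls) {
    final args = jsonDecode(call.arguments) as Map<String, dynamic>;
    print('${call.name}(${args['city']})');
  }

  await model.dispose();
}
```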

Flutter-native integration. Llamafu uses Dart streams for token-by-token output, integrates with Flutter’s widget lifecycle for proper resource management, and supports hot reload during development. The API follows Flutter conventions so it feels natural alongside other Flutter plugins.

When to Use Llamafu

Use Llamafu when building Flutter applications that need on-device AI without cloud dependencies. It is ideal for privacy-first apps, offline-capable assistants, and mobile tools that need to function without internet connectivity. Models in the 1B-7B parameter range at Q4 quantization run well on modern smartphones with 6GB+ RAM.
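Fitting a model to a phone's memory budget is mostly a matter of load-time configuration. The parameter names below (`contextLength`, `gpuLayers`, `threads`) are assumptions modeled on common llama.cpp bindings, not confirmed plugin options.

```dart
import 'package:llamafu/llamafu.dart'; // assumed package import

Future<LlamafuModel> loadForPhone() {
  // A ~3B-parameter model at Q4 quantization takes roughly 2 GB of RAM,
  // leaving headroom on a 6 GB device (hypothetical model path and options).
  return Llamafu.loadModel(
    'assets/models/phi-3-mini-q4.gguf',
    contextLength: 4096, // smaller contexts reduce KV cache memory
    gpuLayers: -1,       // offload all layers to Metal/Vulkan when available
    threads: 4,          // match performance cores on the target SoC
  );
}
```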

Ecosystem Role

Llamafu brings the local AI stack to mobile. It uses the same GGUF model format as llama.cpp, Ollama, and LM Studio, so models you test on desktop work on mobile without conversion. For desktop or server inference, pair it with Ollama or Mullama for a consistent cross-platform AI strategy.