Architecture
The Cheshire Cat framework consists of four components: the Core, the Vector Database, the LLM, and the Embedder.
The Core and the Admin Portal are implemented within the framework, while the Vector Database, the LLM, and the Embedder are external dependencies.
The Core communicates with the Vector Database, the LLM, and the Embedder, while the Admin Portal communicates with the Core.
The Core is implemented in Python; Qdrant is used as the Vector Database; the Core supports different LLMs and Embedders (see the complete list below); the Admin Portal is implemented using the Vue framework.
Core
Docker Images
To make the Cat's user experience easier, faster, and more standardized, the repository ships with a configuration for running inside Docker.
You can use the pre-built images available in the repo's Docker Registry or build the image from scratch:

- To use the pre-built image, set `ghcr.io/cheshire-cat-ai/core:<tag-version>` as the value of `image` under the service name in the docker-compose file.
- To build from scratch, run `docker compose build` in the folder of the freshly cloned repo. This generates two Docker images; the first one contains the Cat Core and the Admin Portal. The container name of the Core is `cheshire_cat_core`.
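For reference, the first option can be sketched as a minimal docker-compose fragment (the tag `1.5.1` is just an example; the container name and port mapping mirror the full docker-compose example later on this page):

```yaml
services:
  cheshire-cat-core:
    # Pre-built image from the repo's Docker Registry
    image: ghcr.io/cheshire-cat-ai/core:1.5.1
    container_name: cheshire_cat_core
    ports:
      - 1865:80
```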
The Cat Core path `./core` is mounted into the `cheshire_cat_core` image. By default, changes to files in this folder force a restart of the Core; this behavior can be disabled using the `DEBUG` environment variable.
Admin Portal
By default the Admin Portal connects to the Core using `localhost` and the port exposed by the container; this value can be customized using environment variables. That port is the only one exposed by the `cheshire_cat_core` image.
Logging
All log messages are printed to the standard output, and the log level can be configured with the `LOG_LEVEL` environment variable. You can check the logging system documentation here.
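As an illustrative sketch (not the Cat's actual logging implementation), this is how a `LOG_LEVEL` environment variable typically drives Python's standard `logging` module:

```python
import logging
import os

# Read the desired level from the environment, defaulting to WARNING
# (the same default used in the docker-compose example on this page).
level_name = os.environ.get("LOG_LEVEL", "WARNING").upper()

# Map the name ("DEBUG", "INFO", ...) to a numeric level, falling
# back to WARNING for unrecognized values.
level = getattr(logging, level_name, logging.WARNING)

logging.basicConfig(level=level)
logger = logging.getLogger("cheshire_cat")

logger.debug("hidden unless LOG_LEVEL=DEBUG")
logger.warning("shown at the default WARNING level")
```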
Configuration
Some options of the Core can be customized using environment variables.
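As a hedged illustration of the pattern (the authoritative list of options is the Core's own documentation), environment variables with defaults can be read like this; the variable names and default values below mirror the docker-compose example on this page:

```python
import os

# Defaults mirror the docker-compose example on this page.
config = {
    "CORE_HOST": os.environ.get("CORE_HOST", "localhost"),
    "CORE_PORT": int(os.environ.get("CORE_PORT", "1865")),
    "QDRANT_HOST": os.environ.get("QDRANT_HOST", "cheshire_cat_vector_memory"),
    "QDRANT_PORT": int(os.environ.get("QDRANT_PORT", "6333")),
    "LOG_LEVEL": os.environ.get("LOG_LEVEL", "WARNING"),
}
```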
Compatible Models
The Cat is model-agnostic, meaning you can attach your preferred LLM and Embedder model/provider. The most used ones are provided by default, and you can add more models/providers through plugins. Here is a list of the most used:
- OpenAI and Azure OpenAI
- Cohere
- Ollama (LLM model only)
- HuggingFace TextInference API (LLM model only)
- Google Gemini
- Qdrant FastEmbed (Embedder model only)
Vector Memory
What do we use as vector memory?
The Cat connects to Qdrant through its Python client.
By default, the Core tries to connect to a Qdrant database; if the connection fails, it switches to a local Qdrant database.
Connecting the Cat to a dedicated Qdrant database is highly recommended to increase performance and capacity!
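The fallback described above can be sketched as follows; `choose_backend` is a hypothetical helper, not part of the Cat's API, and the real Core uses the Qdrant Python client rather than a raw socket probe:

```python
import socket

def choose_backend(host: str, port: int, timeout: float = 1.0) -> str:
    """Return "remote" if a server answers at host:port,
    otherwise fall back to "local" (file-based Qdrant)."""
    try:
        # Cheap reachability probe: open and immediately close a TCP
        # connection to the would-be Qdrant server.
        with socket.create_connection((host, port), timeout=timeout):
            return "remote"
    except OSError:
        return "local"

# With no server listening on the default Qdrant port, the
# Core-style fallback picks the local store.
backend = choose_backend("127.0.0.1", 6333, timeout=0.2)
```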
Cloud Use and Hosting
Qdrant provides two options:
- Host it yourself using Docker; here is an example using docker-compose:

```yaml
version: '3.7'

services:
  cheshire-cat-core:
    image: ghcr.io/cheshire-cat-ai/core:1.5.1
    container_name: cheshire_cat_core
    depends_on:
      - cheshire-cat-vector-memory
    environment:
      - PYTHONUNBUFFERED=1
      - WATCHFILES_FORCE_POLLING=true
      - CORE_HOST=${CORE_HOST:-localhost}
      - CORE_PORT=${CORE_PORT:-1865}
      - QDRANT_HOST=${QDRANT_HOST:-cheshire_cat_vector_memory}
      - QDRANT_PORT=${QDRANT_PORT:-6333}
      - CORE_USE_SECURE_PROTOCOLS=${CORE_USE_SECURE_PROTOCOLS:-}
      - API_KEY=${API_KEY:-}
      - LOG_LEVEL=${LOG_LEVEL:-WARNING}
      - DEBUG=${DEBUG:-true}
      - SAVE_MEMORY_SNAPSHOTS=${SAVE_MEMORY_SNAPSHOTS:-false}
    ports:
      - ${CORE_PORT:-1865}:80
    volumes:
      - ./cat/static:/app/cat/static
      - ./cat/plugins:/app/cat/plugins
      - ./cat/data:/app/cat/data
    restart: unless-stopped

  cheshire-cat-vector-memory:
    image: qdrant/qdrant:v1.7.1
    container_name: cheshire_cat_vector_memory
    expose:
      - 6333
    volumes:
      - ./cat/long_term_memory/vector:/qdrant/storage
    restart: unless-stopped
```
- Use the cloud version by setting the `QDRANT_HOST`, `QDRANT_PORT`, and `QDRANT_API_KEY` environment variables. Below is an example of a `.env` file:
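For example (a sketch with placeholder values; replace them with the host, port, and API key of your own Qdrant Cloud cluster):

```
QDRANT_HOST=<your-cluster>.cloud.qdrant.io
QDRANT_PORT=6333
QDRANT_API_KEY=<your-qdrant-api-key>
```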
Admin Portal
Use case
The Admin Portal is an administration/debugging panel for interacting with the Cat: chatting, uploading files, inspecting the memory, and changing the LLM and Embedder models, while providing minimal authentication through an API key.