# Docker Installation Guide

## Getting Started with Cortex on Docker

This guide provides comprehensive instructions for installing and running Cortex in a Docker environment, including sensible defaults for security and performance.
## Prerequisites
Before beginning, ensure you have:
- Docker (version 20.10.0 or higher) or Docker Desktop
- At least 8GB of RAM and 10GB of free disk space
- For GPU support, install the `nvidia-container-toolkit`. For example, on Ubuntu:

```bash
# Install NVIDIA Container Toolkit
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

# Add the repository
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install and configure
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
## Installation Methods

### Method 1: Using a Pre-built Image (Recommended)

```bash
# Pull the latest stable release
docker pull menloltd/cortex:latest

# Or pull a specific version (recommended for production)
docker pull menloltd/cortex:nightly-1.0.1-224
```
#### Version Tags

- `latest`: Most recent stable release
- `nightly`: Latest development build
- `x.y.z` (e.g., `1.0.1`): Specific version release
### Method 2: Building from Source

1. Clone the repo:

```bash
git clone https://github.com/janhq/cortex.cpp.git
cd cortex.cpp
git submodule update --init
```

2. Build the Docker image:

**Latest Build**

```bash
docker build -t menloltd/cortex:local \
  --build-arg CORTEX_CPP_VERSION=$(git rev-parse HEAD) \
  -f docker/Dockerfile .
```

**Specific Version**

```bash
docker build \
  --build-arg CORTEX_LLAMACPP_VERSION=0.1.34 \
  --build-arg CORTEX_CPP_VERSION=$(git rev-parse HEAD) \
  -t menloltd/cortex:local \
  -f docker/Dockerfile .
```
## Running Cortex (Securely)

1. [Optional] Create a dedicated user and data directory:

```bash
# Create a dedicated user
sudo useradd -r -s /bin/false cortex
export CORTEX_UID=$(id -u cortex)

# Create data directory with proper permissions
sudo mkdir -p /opt/cortex/data
sudo chown -R ${CORTEX_UID}:${CORTEX_UID} /opt/cortex
```

2. Set up persistent storage:

```bash
docker volume create cortex_data
```

3. Launch the container:

**GPU Mode**

```bash
docker run --gpus all -d \
  --name cortex \
  --user ${CORTEX_UID}:${CORTEX_UID} \
  --memory=4g \
  --memory-swap=4g \
  --security-opt=no-new-privileges \
  -v cortex_data:/root/cortexcpp:rw \
  -v /opt/cortex/data:/data:rw \
  -p 127.0.0.1:39281:39281 \
  menloltd/cortex:latest
```

**CPU Mode**

```bash
docker run -d \
  --name cortex \
  --user ${CORTEX_UID}:${CORTEX_UID} \
  --memory=4g \
  --memory-swap=4g \
  --security-opt=no-new-privileges \
  -v cortex_data:/root/cortexcpp:rw \
  -v /opt/cortex/data:/data:rw \
  -p 127.0.0.1:39281:39281 \
  menloltd/cortex:latest
```
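If you prefer Docker Compose, the CPU-mode flags above can be expressed as a Compose service. This is an untested sketch, assuming Compose v2, that `CORTEX_UID` is exported as shown earlier, and that the `cortex_data` volume already exists:

```yaml
services:
  cortex:
    image: menloltd/cortex:latest
    container_name: cortex
    user: "${CORTEX_UID}:${CORTEX_UID}"
    mem_limit: 4g
    memswap_limit: 4g
    security_opt:
      - no-new-privileges
    ports:
      - "127.0.0.1:39281:39281"
    volumes:
      - cortex_data:/root/cortexcpp
      - /opt/cortex/data:/data

volumes:
  cortex_data:
    external: true
```

Run it with `docker compose up -d`; for GPU mode you would additionally need a device reservation for the NVIDIA driver, which is omitted here.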
## Verification and Testing

1. Check container status:

```bash
docker ps | grep cortex
docker logs cortex
```

Expected output should show:

```
Cortex server starting...
Initialization complete
Server listening on port 39281
```

2. Test the API:

```bash
curl http://127.0.0.1:39281/healthz
```
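In scripted setups you usually want to wait until the server answers on `/healthz` before sending requests. Below is a minimal readiness poll; the generic retry loop is shown as runnable code, while the actual HTTP call is only a commented sketch, under the assumption that `/healthz` returns HTTP 200 once the server is up.

```python
import time
from typing import Callable


def wait_for_ready(check: Callable[[], bool], retries: int = 30, delay: float = 1.0) -> bool:
    """Poll `check` until it returns True; give up after `retries` attempts."""
    for _ in range(retries):
        if check():
            return True
        time.sleep(delay)
    return False


# In practice, `check` would wrap an HTTP GET against the health endpoint, e.g.:
#
#   import urllib.request
#   def healthz() -> bool:
#       try:
#           with urllib.request.urlopen("http://127.0.0.1:39281/healthz", timeout=2) as r:
#               return r.status == 200
#       except OSError:
#           return False
#
#   wait_for_ready(healthz)
```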
## Working with Cortex

Once your container is running, here's how to interact with Cortex. Make sure you have `curl` installed on your system.
### 1. Check Available Engines

```bash
curl --request GET --url http://localhost:39281/v1/engines --header "Content-Type: application/json"
```

You'll see something like:

```json
{
  "data": [
    {
      "description": "This extension enables chat completion API calls using the Onnx engine",
      "format": "ONNX",
      "name": "onnxruntime",
      "status": "Incompatible"
    },
    {
      "description": "This extension enables chat completion API calls using the LlamaCPP engine",
      "format": "GGUF",
      "name": "llama-cpp",
      "status": "Ready",
      "variant": "linux-amd64-avx2",
      "version": "0.1.37"
    }
  ],
  "object": "list",
  "result": "OK"
}
```
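Before loading a model, a script can confirm that a compatible engine reports `Ready`. The helper below is illustrative, not part of Cortex; it only assumes the response shape shown above.

```python
import json


def ready_engines(payload: str) -> list[str]:
    """Return names of engines reported as Ready in a /v1/engines response."""
    doc = json.loads(payload)
    return [e["name"] for e in doc.get("data", []) if e.get("status") == "Ready"]


# Trimmed version of the response shown above:
sample = """
{
  "data": [
    {"name": "onnxruntime", "format": "ONNX", "status": "Incompatible"},
    {"name": "llama-cpp", "format": "GGUF", "status": "Ready",
     "variant": "linux-amd64-avx2", "version": "0.1.37"}
  ],
  "object": "list",
  "result": "OK"
}
"""

print(ready_engines(sample))  # ['llama-cpp']
```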
### 2. Download Models

First, set up event monitoring:

1. Install `websocat` following these instructions.
2. Open a terminal and run:

```bash
websocat ws://localhost:39281/events
```

Then, in a new terminal, download your desired model:

**Pull a model from Cortex's Hugging Face hub**

```bash
curl --request POST --url http://localhost:39281/v1/models/pull --header 'Content-Type: application/json' --data '{"model": "tinyllama:gguf"}'
```

**Pull a model directly from a URL**

```bash
curl --request POST --url http://localhost:39281/v1/models/pull --header 'Content-Type: application/json' --data '{"model": "https://huggingface.co/afrideva/zephyr-smol_llama-100m-sft-full-GGUF/blob/main/zephyr-smol_llama-100m-sft-full.q2_k.gguf"}'
```

To see your downloaded models:

```bash
curl --request GET --url http://localhost:39281/v1/models
```
### 3. Using the Model

First, start your model:

```bash
curl --request POST --url http://localhost:39281/v1/models/start --header 'Content-Type: application/json' --data '{"model": "tinyllama:gguf"}'
```

Then, send it a query:

```bash
curl --request POST --url http://localhost:39281/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --data '{
    "frequency_penalty": 0.2,
    "max_tokens": 4096,
    "messages": [{"content": "Tell me a joke", "role": "user"}],
    "model": "tinyllama:gguf",
    "presence_penalty": 0.6,
    "stop": ["End"],
    "stream": true,
    "temperature": 0.8,
    "top_p": 0.95
  }'
```
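Because `stream` is `true`, the response arrives incrementally rather than as one JSON body. Here is a small client-side sketch, assuming Cortex follows the OpenAI-style streaming format (`data:` lines carrying `choices[0].delta.content` chunks, terminated by `data: [DONE]`); both helpers are illustrative, not part of Cortex.

```python
import json


def chat_request(model: str, prompt: str, **overrides) -> dict:
    """Build a /v1/chat/completions body matching the curl example above."""
    body = {
        "model": model,
        "messages": [{"content": prompt, "role": "user"}],
        "stream": True,
        "max_tokens": 4096,
        "temperature": 0.8,
        "top_p": 0.95,
        "frequency_penalty": 0.2,
        "presence_penalty": 0.6,
        "stop": ["End"],
    }
    body.update(overrides)  # override any default parameter per call
    return body


def collect_stream(lines) -> str:
    """Join content deltas from OpenAI-style `data: {...}` stream lines."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        chunk = line[len("data:"):].strip()
        if chunk == "[DONE]":
            break
        delta = json.loads(chunk)["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)
```

`chat_request("tinyllama:gguf", "Tell me a joke")` reproduces the payload from the curl example; `collect_stream` would be fed the response body line by line.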
### 4. Shutting Down

When you're done, stop the model:

```bash
curl --request POST --url http://localhost:39281/v1/models/stop --header 'Content-Type: application/json' --data '{"model": "tinyllama:gguf"}'
```
## Maintenance and Troubleshooting

### Common Issues

**Permission Denied Errors:**

```bash
sudo chown -R ${CORTEX_UID}:${CORTEX_UID} /opt/cortex/data
docker restart cortex
```

**Container Won't Start:**

```bash
docker logs cortex
docker system info  # Check available resources
```
### Cleanup

```bash
# Stop and remove the container
docker stop cortex
docker rm cortex

# Remove data (optional)
docker volume rm cortex_data
sudo rm -rf /opt/cortex/data

# Remove the image
docker rmi menloltd/cortex:latest
```
### Updating Cortex

```bash
# Pull the latest version
docker pull menloltd/cortex:latest

# Stop and remove the old container
docker stop cortex
docker rm cortex

# Start a new container (use the run command from above)
```
## Best Practices

- Always use specific version tags in production
- Regularly back up your data volume
- Monitor container resources with `docker stats cortex`
- Keep your Docker installation updated