Chroma

Chroma is an open-source vector database designed for AI applications that need to store, query, and manage embeddings efficiently. While Upsun does not provide an official service image for Chroma, you can run the database as a Python application in a multi-application project. This approach gives you full control over the Chroma configuration and provides persistent storage across deployments. Chroma is well suited to several use cases:

Semantic Search: Find documents based on meaning rather than exact keyword matches
Retrieval Augmented Generation (RAG): Enhance LLMs with relevant context from your knowledge base
Recommendation Systems: Build similarity-based recommendation engines
Content Classification: Automatically categorize documents based on their semantic content
Duplicate Detection: Identify similar or duplicate content across large document collections
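
To make the idea behind these use cases concrete, here is a minimal sketch of similarity search over embeddings in plain Python. It is not Chroma's implementation: cosine similarity is only one common metric, and the three-dimensional vectors and document ids are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)

def top_k(query, embeddings, k=2):
    """Return the k document ids most similar to the query, best first."""
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in embeddings.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Toy embeddings (hypothetical values, not produced by a real model).
EMBEDDINGS = {
    "doc-cats": [0.9, 0.1, 0.0],
    "doc-dogs": [0.8, 0.3, 0.1],
    "doc-taxes": [0.0, 0.2, 0.9],
}
```

A vector database performs the same kind of nearest-neighbor lookup, but with indexes that keep it fast across large collections.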

Configuration

1. Configure the Chroma application

Define the Chroma application in your .upsun/config.yaml. The example configuration uses a HIGH_MEMORY container profile; refer to the Container profiles documentation for more information.

2. Connect from your application

To connect to Chroma from another application in your project, add a relationship in that application's configuration block. Because the example uses chroma as the relationship name, your application has access to environment variables prefixed with CHROMA_.
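
As a quick way to see what the relationship exposes, the following sketch collects every variable that starts with the upper-cased relationship name. The helper name relationship_settings is hypothetical, and the exact set of variables is an assumption; CHROMA_HOST and CHROMA_PORT are the ones used by the connection examples on this page.

```python
import os

def relationship_settings(relationship_name, environ=None):
    """Collect variables whose names start with the upper-cased relationship name."""
    environ = os.environ if environ is None else environ
    prefix = relationship_name.upper() + "_"
    # Strip the prefix and lower-case the rest: CHROMA_HOST becomes "host".
    return {
        key[len(prefix):].lower(): value
        for key, value in environ.items()
        if key.startswith(prefix)
    }
```

Calling relationship_settings("chroma") inside the deployed application returns a dictionary you can log at startup to confirm that the relationship is wired up.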

3. Use the relationship in your application

Connect to Chroma using the relationship configuration:

Python

import os
import chromadb

def get_chroma_client():
    """Create ChromaDB client based on environment variables."""
    # Check for remote ChromaDB configuration
    chroma_host = os.getenv("CHROMA_HOST")
    chroma_port = os.getenv("CHROMA_PORT", "8000")
    chroma_ssl = os.getenv("CHROMA_SSL", "false").lower() == "true"
    chroma_headers = {}

    # Optional authentication
    if os.getenv("CHROMA_AUTH_TOKEN"):
        chroma_headers["Authorization"] = f"Bearer {os.getenv('CHROMA_AUTH_TOKEN')}"

    if chroma_host:
        # Remote ChromaDB instance
        return chromadb.HttpClient(
            host=chroma_host,
            port=int(chroma_port),
            ssl=chroma_ssl,
            headers=chroma_headers
        )
    else:
        # Local ChromaDB instance
        return chromadb.Client()

Node.js

import { ChromaClient } from 'chromadb';

function getChromaClient(): ChromaClient {
  const chromaHost = process.env.CHROMA_HOST;
  const chromaPort = parseInt(process.env.CHROMA_PORT || '8000');
  const chromaSsl = process.env.CHROMA_SSL?.toLowerCase() === 'true';

  if (chromaHost) {
    const auth = process.env.CHROMA_AUTH_TOKEN
      ? { provider: 'token', credentials: process.env.CHROMA_AUTH_TOKEN }
      : undefined;

    return new ChromaClient({
      path: `http${chromaSsl ? 's' : ''}://${chromaHost}:${chromaPort}`,
      auth
    });
  } else {
    return new ChromaClient();
  }
}

While the examples above are based on Python and Node.js applications, the same concept can be applied to any other runtime.
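
Once a client is available, storing and querying embeddings could look like the sketch below. It assumes the embeddings are computed elsewhere and passed in as plain lists; the function name store_and_search and its parameters are illustrative, while get_or_create_collection, add, and query are methods of the chromadb client and collection objects.

```python
def store_and_search(client, collection_name, ids, embeddings, documents,
                     query_embedding, n_results=2):
    """Add documents with precomputed embeddings, then return the closest matches."""
    collection = client.get_or_create_collection(name=collection_name)
    collection.add(ids=ids, embeddings=embeddings, documents=documents)
    result = collection.query(
        query_embeddings=[query_embedding],
        n_results=n_results,
    )
    # query() returns one result list per query embedding; only one was sent.
    return list(zip(result["ids"][0], result["documents"][0]))
```

With the Python example above, the client argument would be the object returned by get_chroma_client().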

Persistent storage

The configuration includes persistent storage through mounts:

.db: Stores the main Chroma database files
.chroma: Stores additional Chroma metadata

These mounts ensure that your vector data persists between deployments and application restarts.

Access Chroma

Chroma runs as an internal application without external HTTP access. Other applications in your project connect to it using the chroma.internal hostname through relationships. For development and debugging, you can use port forwarding to access your Chroma instance locally:

upsun tunnel:open

This creates a secure tunnel to your Chroma application, allowing you to connect local tools and clients during development.
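
When connecting a local client through the tunnel, you need the host and port of the local address that the CLI reports. The sketch below splits such a URL into keyword arguments; the helper name and the address in the usage note are hypothetical, so substitute the values from your own tunnel.

```python
from urllib.parse import urlsplit

def client_kwargs_from_url(url):
    """Split a URL into host, port, and ssl arguments for an HTTP client."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"not a full URL: {url!r}")
    ssl = parts.scheme == "https"
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if ssl else 80)
    return {"host": parts.hostname, "port": port, "ssl": ssl}
```

For example, chromadb.HttpClient(**client_kwargs_from_url("http://127.0.0.1:30000")) would point a local Python client at a tunnel listening on that assumed address.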

Exposing Chroma on the public internet

If you want to make the Chroma database publicly accessible, add a route to the application in the .upsun/config.yaml file:

.upsun/config.yaml

routes:
  "https://chroma.{default}/":
    type: upstream
    upstream: "chroma:http"

Exposing Chroma

Exposing Chroma publicly carries security risk. Unless you configure authentication, anyone who can reach the route may be able to read or modify your collections.

Exporting Data

Chroma stores its vector database on disk as SQLite files within the service container’s mount directory. The data is directly accessible via the mount path and can be downloaded using the CLI or rsync.

1. Identify the mount path where Chroma stores its data (typically defined in your .upsun/config.yaml as a source: service mount pointing to the Chroma service).
2. Download the data directory using the CLI:

Terminal

upsun mount:download --mount <CHROMA_MOUNT_PATH> --target ./chroma-backup

The downloaded directory contains the SQLite database files (.db) used by Chroma. These can be restored by placing them back in the mount path of a new Chroma service.
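
Before relying on a backup, it can help to confirm that the downloaded files open as valid SQLite databases. The following sketch uses Python's standard sqlite3 module; the function names are hypothetical, and the *.db pattern and ./chroma-backup directory follow the description and command above, so adjust them if your files are named differently.

```python
import sqlite3
from pathlib import Path

def check_sqlite_file(path):
    """Run SQLite's integrity check and list the tables in one database file."""
    # Open read-only so that inspecting a backup never modifies it.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    connection = sqlite3.connect(uri, uri=True)
    try:
        status = connection.execute("PRAGMA integrity_check").fetchone()[0]
        tables = [row[0] for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    finally:
        connection.close()
    return status, tables

def check_backup(directory, pattern="*.db"):
    """Check every matching file under a backup directory, keyed by file path."""
    return {str(path): check_sqlite_file(path)
            for path in sorted(Path(directory).rglob(pattern))}
```

Running check_backup("./chroma-backup") returns one entry per database file. A status of "ok" means SQLite found no corruption in that file.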

Other resources

DevCenter: Store embeddings in chroma with persistent storage (nodejs and python examples)
