download the newest (vector-search-compatible) cqlsh utility from this link;
extract the cqlsh archive to a location of your liking, e.g. /home/user/myCqlsh;
ensure you are still in the repo's root directory;
source the environment file you prepared in the previous step with . .env;
launch the script that populates the database (adjust the path as needed):

/home/user/myCqlsh/cqlsh-astra/bin/cqlsh \
    -b "$ASTRA_DB_SECURE_BUNDLE_PATH" \
    -u token \
    -p "$ASTRA_DB_APPLICATION_TOKEN" \
    -k "$ASTRA_DB_KEYSPACE" \
    -f setup/provision_db/write_sample_data.cql
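
If you prefer to drive this step from Python, the sketch below assembles the same cqlsh invocation and fails early when a variable is missing. It assumes the variables from .env are exported, so that child processes can see them; the function name build_cqlsh_command is hypothetical and not part of this repo.

```python
import os
import subprocess

# Variables the command above reads from the environment.
REQUIRED_VARS = (
    "ASTRA_DB_SECURE_BUNDLE_PATH",
    "ASTRA_DB_APPLICATION_TOKEN",
    "ASTRA_DB_KEYSPACE",
)


def build_cqlsh_command(cqlsh_path, env,
                        script="setup/provision_db/write_sample_data.cql"):
    """Return the cqlsh argument list, or raise if a variable is unset."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return [
        cqlsh_path,
        "-b", env["ASTRA_DB_SECURE_BUNDLE_PATH"],
        "-u", "token",
        "-p", env["ASTRA_DB_APPLICATION_TOKEN"],
        "-k", env["ASTRA_DB_KEYSPACE"],
        "-f", script,
    ]


def populate_database(cqlsh_path):
    """Run the provisioning script; must be called from the repo's root."""
    subprocess.run(build_cqlsh_command(cqlsh_path, os.environ), check=True)
```

Calling populate_database("/home/user/myCqlsh/cqlsh-astra/bin/cqlsh") from the repo's root directory would then be equivalent to the shell command above (adjust the path as needed).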

Astra DB's in-browser console

Alternatively, you can populate the database entirely from your browser, without a local cqlsh client.

Locate the CQL Console in your Astra DB instance, then:

Enter the command USE cassio_tutorials; and press Enter (replace cassio_tutorials with your keyspace name if you named it differently).
Paste the contents of this file into the Console and wait for the script to complete.

LLM Credentials

In this repo's root directory again, create a .api_keys file where the secrets necessary for your LLM of choice are defined. You can copy the provided .api_keys.template and adjust the values therein.

Check out the LLM Pre-requisites for a list of supported LLMs: each will require a different variable to be set here.

Automatic choice of LLM

The code examples generally rely on a helper function to determine which LLM to use, based on which secrets are detected in this file. You can define your preferred LLM (e.g. in case you define more than one secret) by setting the environment variable PREFERRED_LLM_PROVIDER in .api_keys.

Remember to "source" this file before launching notebooks or Python scripts:

. .api_keys
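
To make the selection logic concrete, here is a minimal sketch of how such a helper could work. It is not the repo's actual helper: the provider names and secret variable names below are placeholders, and only PREFERRED_LLM_PROVIDER comes from the text above.

```python
# Hypothetical mapping from provider name to the secret it requires.
# The real names are listed in the LLM Pre-requisites, not here.
PROVIDER_SECRETS = {
    "provider_a": "PROVIDER_A_API_KEY",
    "provider_b": "PROVIDER_B_API_KEY",
}


def choose_llm_provider(env):
    """Pick a provider based on which secrets are present in `env`."""
    available = [
        provider
        for provider, secret_name in PROVIDER_SECRETS.items()
        if env.get(secret_name)
    ]
    if not available:
        raise ValueError("No LLM secret found; did you source .api_keys?")
    # Honor the stated preference only if its secret is actually defined;
    # otherwise fall back to the first provider with a secret.
    preferred = env.get("PREFERRED_LLM_PROVIDER")
    if preferred in available:
        return preferred
    return available[0]
```

In a script you would typically pass os.environ as env, which only contains these values if .api_keys was sourced (and its variables exported) beforehand.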

Framework-specific setup

The database and the LLM are now set up for running the examples locally.

You will still have to prepare, for each framework, a specific Python environment with the right dependencies. The instructions are given in the section of these docs specific to that framework: for example, here is how you start the LangChain examples locally.

Use a local Vector-capable Cassandra

At the time of writing, the Vector Similarity Search capabilities are not yet included in the distributions of Cassandra (see CEP-30 to track the status).

If you want to use the latest Cassandra that implements Vector Search, you can still do so by building the code yourself. The following instructions describe how to do that and get a single-node vector-capable Cassandra cluster running locally.

Warning

This is a development branch at the moment: it is not guaranteed to be stable. Please refrain from using it in production environments for the time being.

These instructions require some knowledge of Java building tools.

Build and run

You need JDK 11. Go to a directory of your choice and execute:

# get the code
git clone https://github.com/datastax/cassandra.git
cd cassandra
git checkout vsearch
git pull

# set up usage of JDK 11
sudo update-alternatives --config java    # ... and pick JDK 11
export CASSANDRA_USE_JDK11=true
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64/   # adapt this command

# clean and build
ant realclean
ant jar

If the above succeeded, you can launch the cluster with:

sudo bin/cassandra -f -R

You may need to create the log directory first, if it is not already present:

sudo mkdir -p /var/log/cassandra
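
If the build or launch steps misbehave, one thing worth ruling out is that a Java version other than 11 is active. The sketch below reads the output of java -version and extracts the major version; the function names are hypothetical, and the parsing assumes the usual version "..." string that the java launcher prints.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("Could not find a version string in the output")
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (e.g. "1.8.0_371").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def active_java_major():
    """Run `java -version`; raises FileNotFoundError if java is not on PATH."""
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=True
    )
    # The version banner is normally written to stderr.
    return parse_java_major(result.stderr + result.stdout)
```

A result other than 11 from active_java_major() suggests revisiting the update-alternatives and JAVA_HOME settings shown above.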

CQL Console

Using the vector-aware cqlsh utility that can be downloaded here, open a CQL Console by launching:

# adjust the path as needed
/home/user/myCqlsh/cqlsh-astra/bin/cqlsh

Populate the database

Still in the CQL Console, create a keyspace for the examples by executing the following:

CREATE KEYSPACE IF NOT EXISTS cassio_tutorials
    WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1 };

You can check that the keyspace exists with:

DESC KEYSPACES;
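
The keyspace can also be created from Python instead of the CQL Console. The sketch below builds the same statement for an arbitrary keyspace name; the function name is hypothetical, and the name check is a deliberately conservative assumption about unquoted CQL identifiers, not the full grammar.

```python
import re


def create_keyspace_statement(keyspace, replication_factor=1):
    """Build a CREATE KEYSPACE statement for a single-node setup."""
    # Keyspace names cannot be passed as bound parameters, so validate
    # the name before interpolating it into the statement.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", keyspace):
        raise ValueError(f"Unsupported keyspace name: {keyspace!r}")
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        "WITH REPLICATION = {'class': 'SimpleStrategy', "
        f"'replication_factor': {int(replication_factor)}}};"
    )
```

Given a driver session connected to the local node, session.execute(create_keyspace_statement("cassio_tutorials")) would have the same effect as the statement above.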

Exit the CQL Console (EXIT + Enter) and go back to this repo's root directory (you should have cloned it earlier to /some/directory/cassio-website/). Then run this script, which populates the local database with the sample tables used by some of the examples you'll encounter:

# adjust the path as needed
/home/user/myCqlsh/cqlsh-astra/bin/cqlsh \
    -k cassio_tutorials \
    -f setup/provision_db/write_sample_data.cql
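
To confirm that the script created its tables, you can query the schema tables from Python. This is a minimal sketch: list_tables is a hypothetical helper, and session is assumed to be a connected cassandra-driver Session (or any object with a compatible execute method).

```python
def list_tables(session, keyspace):
    """Return the sorted names of the tables defined in `keyspace`."""
    rows = session.execute(
        "SELECT table_name FROM system_schema.tables WHERE keyspace_name = %s",
        (keyspace,),
    )
    return sorted(row.table_name for row in rows)
```

An empty list from list_tables(session, "cassio_tutorials") would indicate that the script did not run against this keyspace.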

Use the local Cassandra in the code

Your local Cassandra is ready to support all examples.

The only remaining thing is to make sure the Session object used in the code is a connection to your local database: how exactly this is achieved depends on the framework you use.

With LangChain, for example, you can simply use the provided getCQLSession and getCQLKeyspace functions, making sure you pass the parameter mode="local" when calling them.

More generally, you can build a connection to the local Cassandra with Python code similar to the following:

from cassandra.cluster import Cluster

cluster = Cluster()
session = cluster.connect()

The session object, along with the keyspace name (a string), can then be used e.g. in the cassio.init(...) invocations, or when instantiating individual objects, exactly in the same way as you would a connection to Astra DB (see the drivers documentation for more options).
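
As a slightly more explicit variant, the sketch below spells out the contact point and port (127.0.0.1 and 9042 are the driver defaults) and sets the working keyspace. The helper connect_local and its cluster_factory parameter are hypothetical, added only so the function can be exercised without a running database.

```python
def connect_local(keyspace, contact_points=("127.0.0.1",), port=9042,
                  cluster_factory=None):
    """Return a Session bound to `keyspace` on a local Cassandra node."""
    if cluster_factory is None:
        # Imported lazily so this module loads even without the driver.
        from cassandra.cluster import Cluster
        cluster_factory = Cluster
    cluster = cluster_factory(contact_points=list(contact_points), port=port)
    session = cluster.connect()
    session.set_keyspace(keyspace)
    return session
```

Assuming cassio.init accepts the session and keyspace as keyword arguments, the result could then be passed along as cassio.init(session=connect_local("cassio_tutorials"), keyspace="cassio_tutorials").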
