Installation

There are several ways to install Elassandra, described in the sections below.

Elassandra is based on Cassandra and Elasticsearch, so it will be easier if you’re already familiar with one of these technologies.

Important

Be aware that Elassandra needs more memory than Cassandra when Elasticsearch is used, and should be installed on a machine with at least 4 GB of RAM.

Tarball

Elassandra requires at least Java 8. Oracle JDK is the recommended version, but OpenJDK should also work. Check which version is installed on your computer:

$ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
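The version check above can also be scripted. The following is a minimal sketch, not part of the Elassandra distribution: java_major is a hypothetical helper name, and the script only assumes that java -version prints a quoted version string on stderr, as in the output shown above.

```shell
#!/bin/sh
# Hypothetical helper: extract the major version from a Java version string.
# "1.8.0_121" -> 8 (legacy numbering), "11.0.2" -> 11 (current numbering).
java_major() {
    case "$1" in
        1.*) field=2 ;;
        *)   field=1 ;;
    esac
    # Keep the chosen dot-separated field and drop any non-numeric suffix.
    echo "$1" | cut -d. -f"$field" | sed 's/[^0-9].*$//'
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, so redirect it before parsing.
    version=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    major=$(java_major "$version")
    if [ "${major:-0}" -ge 8 ]; then
        echo "Java $version satisfies the Java 8 minimum"
    else
        echo "Java $version is older than Java 8"
    fi
else
    echo "java not found on PATH"
fi
```

The helper handles both the legacy 1.x numbering and the newer scheme, so the comparison against 8 stays a plain integer test.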

Once Java is correctly installed, download the Elassandra tarball:

wget https://github.com/strapdata/elassandra/releases/download/v6.8.4.5/elassandra-6.8.4.5.tar.gz

Then extract its content:

tar -xzf elassandra-6.8.4.5.tar.gz

Go to the extracted directory:

cd elassandra-6.8.4.5
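When scripting these steps, it can help to keep the version number in one place. This is a sketch only: elassandra_tarball_url is a hypothetical helper, and it assumes that other releases follow the same URL pattern as the 6.8.4.5 link above, which should be verified on the releases page.

```shell
#!/bin/sh
# Version used in the commands above.
VERSION=6.8.4.5

# Hypothetical helper: build the release tarball URL for a given version.
# Assumption: every release uses the same URL layout as v6.8.4.5.
elassandra_tarball_url() {
    printf 'https://github.com/strapdata/elassandra/releases/download/v%s/elassandra-%s.tar.gz\n' "$1" "$1"
}

echo "Tarball URL: $(elassandra_tarball_url "$VERSION")"

# To download and unpack for real, uncomment the following line:
# wget "$(elassandra_tarball_url "$VERSION")" && tar -xzf "elassandra-$VERSION.tar.gz" && cd "elassandra-$VERSION"
```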

Configure conf/cassandra.yaml if necessary, and then run:

bin/cassandra -e

This starts Cassandra with Elasticsearch enabled (the -e option).

Get the node status:

bin/nodetool status
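In the nodetool status output, each node line starts with a two-letter code: U or D for Up/Down, followed by N, L, J or M for Normal/Leaving/Joining/Moving. A healthy node shows UN. The sketch below turns that into a scripted check; all_nodes_up_normal is a hypothetical helper name, not a nodetool feature.

```shell
#!/bin/sh
# Hypothetical helper: read `nodetool status` output on stdin and succeed
# only if at least one node is listed and every node is in state UN.
all_nodes_up_normal() {
    awk '/^[UD][NLJM][ \t]/ {
             seen = 1
             if ($1 != "UN") bad = 1
         }
         END { exit (seen && !bad) ? 0 : 1 }'
}
```

From the extracted directory, it could be used as: bin/nodetool status | all_nodes_up_normal && echo "all nodes are up"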

Now connect to the node with cqlsh:

bin/cqlsh

You’re now able to type CQL commands. See the CQL reference.

Check the Elasticsearch API:

curl -X GET http://localhost:9200/

You should get something like this:

{
  "name" : "127.0.0.1",
  "cluster_name" : "Test Cluster",
  "cluster_uuid" : "7cb65cea-09c1-4d6a-a17a-24efb9eb7d2b",
  "version" : {
    "number" : "6.8.4.5",
    "build_hash" : "b0b4cb025cb8aa74538124a30a00b137419983a3",
    "build_timestamp" : "2017-04-19T13:11:11Z",
    "build_snapshot" : true,
    "lucene_version" : "5.5.2"
  },
  "tagline" : "You Know, for Search"
}
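The HTTP endpoint may not answer immediately after the node starts, so a script that runs the curl check right after bin/cassandra -e can fail spuriously. One way around this is a small retry loop; retry is a hypothetical helper, and the attempt count and delay are arbitrary values to adjust.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to $1 times, waiting $2 seconds
# between attempts, and succeed as soon as the command succeeds.
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep only if another attempt remains.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

For example: retry 30 2 curl -sf http://localhost:9200/ >/dev/null && echo "Elasticsearch API is up"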

You’re done!

In a production environment, we recommend modifying some system settings, such as disabling swap. This guide shows you how to do it. On Linux, you should install jemalloc.
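As a quick way to see whether swap is currently in use on a Linux host, the sketch below inspects /proc/swaps, which lists active swap devices after a header line. swap_is_active is a hypothetical helper; the script only reports and does not change any setting.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given file (in /proc/swaps format)
# lists at least one swap device after its header line.
swap_is_active() {
    awk 'NR > 1 { found = 1 } END { exit (found ? 0 : 1) }' "$1"
}

if [ ! -r /proc/swaps ]; then
    echo "cannot read /proc/swaps (not a Linux host?)"
elif swap_is_active /proc/swaps; then
    echo "swap is active; 'sudo swapoff -a' turns it off until the next reboot"
else
    echo "no active swap detected"
fi
```

Note that swapoff -a is not persistent: swap entries in /etc/fstab also need to be removed or commented out for the change to survive a reboot.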

Deb

Important

Cassandra and Elassandra packages conflict. You should remove Cassandra before installing Elassandra.

The Java Runtime 1.8 is required to run Elassandra. On recent distributions it should be resolved automatically as a dependency. On Debian Jessie it can be installed from backports:

sudo apt-get install -t jessie-backports openjdk-8-jre-headless

You may need to install apt-transport-https and other utilities as well:

sudo apt-get install software-properties-common apt-transport-https gnupg2

Add our repository and GPG key:

sudo add-apt-repository 'deb [arch=all] https://nexus.repo.strapdata.com/repository/apt-releases/ stretch main'
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys B335A4DD

Then install Elassandra with:

sudo apt-get update && sudo apt-get install elassandra

Start Elassandra with Systemd:

sudo systemctl start cassandra

or SysV:

sudo service cassandra start

File locations:

  • /usr/bin: startup script, cqlsh, nodetool, elasticsearch-plugin
  • /etc/cassandra and /etc/default/cassandra: configurations
  • /var/lib/cassandra: data
  • /var/log/cassandra: logs
  • /usr/share/cassandra: plugins, modules, libs, …
  • /usr/share/cassandra/tools: cassandra-stress, sstabledump…
  • /usr/lib/python2.7/dist-packages/cqlshlib/: python library for cqlsh

Rpm

Important

Cassandra and Elassandra packages conflict. You should remove Cassandra before installing Elassandra.

The Java runtime 1.8 must be installed in order to run Elassandra. You can install it yourself or let the package manager pull it automatically as a dependency.

Create a file called elassandra.repo in the /etc/yum.repos.d/ directory, or the equivalent for your distribution (RedHat, openSUSE…):

[strapdata]
name=Strapdata
baseurl=https://nexus.repo.strapdata.com/repository/rpm-releases/
enabled=1
gpgcheck=0
priority=1

[strapdata-snapshots]
name=Strapdata Snapshots
baseurl=https://nexus.repo.strapdata.com/repository/rpm-snapshots/
enabled=1
gpgcheck=0
priority=1

Then install Elassandra with:

sudo yum install elassandra

Start Elassandra with Systemd:

sudo systemctl start cassandra

or SysV:

sudo service cassandra start

File locations:

  • /usr/bin: startup script, cqlsh, nodetool, elasticsearch-plugin
  • /etc/cassandra and /etc/sysconfig/cassandra: configurations
  • /var/lib/cassandra: data
  • /var/log/cassandra: logs
  • /usr/share/cassandra: plugins, modules, libs…
  • /usr/share/cassandra/tools: cassandra-stress, sstabledump…
  • /usr/lib/python2.7/site-packages/cqlshlib/: python library for cqlsh

Docker image

We provide an image on Docker Hub:

docker pull strapdata/elassandra

This image is based on the official Cassandra image, whose documentation also applies to Elassandra.

The source code is on GitHub at strapdata/docker-elassandra.

Start an Elassandra server instance

Starting an Elassandra instance is pretty simple:

docker run --name node0 -d strapdata/elassandra:6.8.4.5

Run nodetool, cqlsh and curl:

docker exec -it node0 nodetool status
docker exec -it node0 cqlsh
docker exec -it node0 curl localhost:9200

Environment Variables

When you start the Elassandra image, you can adjust the configuration of the Elassandra instance by passing one or more environment variables on the docker run command line.

  • CASSANDRA_LISTEN_ADDRESS: controls which IP address to listen on for incoming connections. The default value is auto, which sets the listen_address option in cassandra.yaml to the IP address of the container when it starts. This default should work in most use cases.
  • CASSANDRA_BROADCAST_ADDRESS: controls which IP address to advertise to other nodes. The default value is the value of CASSANDRA_LISTEN_ADDRESS. It sets the broadcast_address and broadcast_rpc_address options in cassandra.yaml.
  • CASSANDRA_RPC_ADDRESS: controls which address to bind the thrift rpc server to. If you do not specify an address, the wildcard address (0.0.0.0) is used. It sets the rpc_address option in cassandra.yaml.
  • CASSANDRA_START_RPC: controls whether the thrift rpc server is started. It sets the start_rpc option in cassandra.yaml. Because Elasticsearch uses this port in Elassandra, it is enabled by default.
  • CASSANDRA_SEEDS: the comma-separated list of IP addresses used by gossip for bootstrapping new nodes joining a cluster. It sets the seeds value of the seed_provider option in cassandra.yaml. The CASSANDRA_BROADCAST_ADDRESS is added to the seeds passed in so that the server can also talk to itself.
  • CASSANDRA_CLUSTER_NAME: sets the name of the cluster. It must be the same for all nodes in the cluster. It sets the cluster_name option of cassandra.yaml.
  • CASSANDRA_NUM_TOKENS: sets the number of tokens for this node. It sets the num_tokens option of cassandra.yaml.
  • CASSANDRA_DC: sets the datacenter name of this node. It sets the dc option of cassandra-rackdc.properties.
  • CASSANDRA_RACK: sets the rack name of this node. It sets the rack option of cassandra-rackdc.properties.
  • CASSANDRA_ENDPOINT_SNITCH: sets the snitch implementation used by the node. It sets the endpoint_snitch option of cassandra.yaml.
  • CASSANDRA_DAEMON: the Cassandra entry-point class: org.apache.cassandra.service.ElassandraDaemon to start with Elasticsearch enabled (default), org.apache.cassandra.service.CassandraDaemon otherwise.
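As an illustration of passing these variables, the sketch below assembles a docker run command line with -e options and prints it for review instead of running it. elassandra_run_cmd is a hypothetical helper, and the container name, cluster name and seed address are made-up example values (use values without spaces, since nothing is quoted here).

```shell
#!/bin/sh
# Hypothetical helper: print a docker run command that sets
# CASSANDRA_CLUSTER_NAME and CASSANDRA_SEEDS for a new container.
# Arguments: container name, cluster name, comma-separated seed addresses.
elassandra_run_cmd() {
    name=$1
    cluster=$2
    seeds=$3
    printf 'docker run --name %s -d -e CASSANDRA_CLUSTER_NAME=%s -e CASSANDRA_SEEDS=%s strapdata/elassandra:6.8.4.5\n' \
        "$name" "$cluster" "$seeds"
}

# Example values only: node1 joins a cluster named demo through a seed at 172.17.0.2.
elassandra_run_cmd node1 demo 172.17.0.2
```

Once the printed command looks right, it can be pasted into a terminal or piped to sh.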

File locations

The Elassandra Docker image is based on the Debian package installation:

  • /etc/cassandra: elassandra configuration
  • /usr/share/cassandra: elassandra installation
  • /var/lib/cassandra: data (sstables, lucene segments, commitlogs, …)
  • /var/log/cassandra: log files

/var/lib/cassandra is automatically managed as a Docker volume, but it is also a good target for a bind mount from the host filesystem.

Exposed ports

  • 7000: intra-node communication
  • 7001: TLS intra-node communication
  • 7199: JMX
  • 9042: CQL
  • 9160: thrift service
  • 9200: ElasticSearch HTTP
  • 9300: ElasticSearch transport

Create a cluster

If there is only one Elassandra instance per Docker host, the easiest way is to start the container with --net=host.

When using the host network is not an option, you can map the necessary ports with -p 9042:9042, -p 9200:9200 and so on, but be aware that the default Docker network will considerably reduce performance.
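To avoid typing each mapping by hand, the -p options can be generated from a list of ports. This is a sketch under simple assumptions: publish_flags is a hypothetical helper, each container port is mapped to the same port number on the host, and the resulting command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: turn a list of ports into docker -p options that map
# each container port to the same port on the host.
publish_flags() {
    flags=""
    for port in "$@"; do
        flags="$flags -p $port:$port"
    done
    # Strip the leading space before printing.
    echo "${flags# }"
}

# Ports taken from the exposed ports list; publish only the ones you need.
echo "docker run --name node0 -d $(publish_flags 9042 9200) strapdata/elassandra:6.8.4.5"
```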

Note

Creating a cluster from the standalone image is probably fine for testing environments. But if you plan to run long-lived Elassandra clusters on containers, Kubernetes is the way to go.

Helm chart

Helm Tiller must be initialised on the target Kubernetes cluster.

Add our helm repository:

helm repo add strapdata https://charts.strapdata.com

Then create a cluster with the following command:

helm install -n elassandra --set image.tag="6.8.4.5" strapdata/elassandra

After installation succeeds, you can check the status of the chart:

helm status elassandra

As shown below, the Elassandra chart creates two ClusterIP services, one for Elasticsearch and one for Cassandra:

kubectl get all -o wide -n elassandra
NAME                          READY     STATUS    RESTARTS   AGE
pod/elassandra-0              1/1       Running   0          51m
pod/elassandra-1              1/1       Running   0          50m
pod/elassandra-2              1/1       Running   0          49m

NAME                               TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                                                          AGE
service/elassandra                 ClusterIP   None           <none>        7199/TCP,7000/TCP,7001/TCP,9300/TCP,9042/TCP,9160/TCP,9200/TCP   51m
service/elassandra-cassandra       ClusterIP   10.0.174.13    <none>        9042/TCP,9160/TCP                                                51m
service/elassandra-elasticsearch   ClusterIP   10.0.131.15    <none>        9200/TCP                                                         51m

NAME                          DESIRED   CURRENT   AGE
statefulset.apps/elassandra   3         3         51m
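The READY column in the pod listing can be checked from a script, for instance before running queries against the cluster. The sketch below assumes the default kubectl get pods column layout (NAME, then READY as ready/total); all_pods_ready is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods` output on stdin and succeed
# only if at least one pod is listed and every pod has all containers ready
# (READY column of the form n/n).
all_pods_ready() {
    awk 'NR > 1 && NF >= 2 {
             seen = 1
             split($2, r, "/")
             if (r[1] != r[2]) bad = 1
         }
         END { exit (seen && !bad) ? 0 : 1 }'
}
```

For example: kubectl get pods -n elassandra | all_pods_ready && echo "all Elassandra pods are ready"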

More information is available on GitHub.

Google Kubernetes Marketplace

You can deploy an Elassandra cluster on GKE with a few clicks using our Elassandra Kubernetes App (requires an existing GCP project and a running Google Kubernetes cluster).

Running Cassandra only

In a cluster, you may need to run a Cassandra datacenter without Elasticsearch indexing. In that case, change the CASSANDRA_DAEMON variable to org.apache.cassandra.service.CassandraDaemon in /etc/default/cassandra on all nodes of your Cassandra-only datacenter.
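The change can be scripted when many nodes are involved. The following sketch assumes that the file holds plain shell-style assignments of the form CASSANDRA_DAEMON=..., which should be checked against the actual file first; set_cassandra_daemon is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: set CASSANDRA_DAEMON in a shell-style defaults file,
# replacing an existing assignment or appending one if none is present.
# Assumption: assignments are written as CASSANDRA_DAEMON=<class> at line start.
set_cassandra_daemon() {
    file=$1
    class=$2
    if grep -q '^CASSANDRA_DAEMON=' "$file"; then
        # Keep a .bak copy of the original file next to it.
        sed -i.bak "s|^CASSANDRA_DAEMON=.*|CASSANDRA_DAEMON=$class|" "$file"
    else
        printf 'CASSANDRA_DAEMON=%s\n' "$class" >> "$file"
    fi
}
```

On a node, it would be run with root privileges against /etc/default/cassandra, passing org.apache.cassandra.service.CassandraDaemon as the class, followed by a restart of the service.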