Integration¶
Integration with an existing cassandra cluster¶
Elassandra includes a modified version of cassandra, available in the strapdata-cassandra repository, so all nodes of a cluster must run elassandra binaries. However, you can start a node with or without elasticsearch support. Obviously, all nodes of a given datacenter should run either cassandra only or cassandra with elasticsearch.
Rolling upgrade to elassandra¶
Before starting any elassandra node with elasticsearch enabled, do a rolling replacement of the cassandra binaries with the elassandra ones. For each node:
- Install elassandra.
- Replace the elassandra configuration files with the ones from your existing cluster (cassandra.yaml and the snitch configuration file).
- Stop your cassandra node.
- Restart it as cassandra only
bin/cassandra
or as cassandra with elasticsearch enabled
bin/cassandra -e
Create a new elassandra datacenter¶
The overall procedure is similar to the cassandra procedure described in Adding a datacenter to a cluster.
For each node in your new datacenter:
- Install elassandra.
- Set
auto_bootstrap: false
in your conf/cassandra.yaml.
- Start cassandra-only nodes in your new datacenter and check that all nodes join the cluster.
bin/cassandra
- Restart all nodes in your new datacenter with elasticsearch enabled. You should see started shards but empty indices.
bin/cassandra -e
- Set the replication factor of indexed keyspaces to one or more in your new datacenter.
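For example, assuming the indexed keyspace is named twitter, the existing datacenter DC1 and the new one DC2 (all three names are placeholders), the replication map can be altered from cqlsh:

```sql
-- Hypothetical keyspace and datacenter names; adjust to your cluster.
ALTER KEYSPACE twitter WITH replication =
  {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 1};
```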
- Pull data from your existing datacenter.
nodetool rebuild <source-datacenter-name>
Once the rebuild is complete on all your new nodes, you should see the same number of documents for each index in your new and existing datacenters.
- Set
auto_bootstrap: true
(default value) in your conf/cassandra.yaml.
- Create new elasticsearch indices or map some existing cassandra tables.
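To map an existing cassandra table, Elassandra supports a discover option in the index mapping; a minimal sketch, assuming a keyspace twitter containing a table tweet (both names are placeholders):

```json
{
  "mappings": {
    "tweet": { "discover": ".*" }
  }
}
```

PUT this body to the index named after the keyspace (e.g. http://localhost:9200/twitter); the regular expression selects which columns of the tweet table are added to the elasticsearch mapping.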
Tip
If you need to replay this procedure for a node:
- stop your node
- nodetool removenode <id-of-node-to-remove>
- clear the data, commitlog and saved_caches directories.
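The last step above can be sketched as follows; run it after the node is stopped and nodetool removenode has completed on a live node. The paths are assumptions matching a default package install.

```shell
# Wipe the local state of the node before restarting it.
# Paths assume a default package layout; adjust to your installation.
rm -rf /var/lib/cassandra/data \
       /var/lib/cassandra/commitlog \
       /var/lib/cassandra/saved_caches
```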
Installing Elasticsearch plugins¶
Elasticsearch plugin installation remains unchanged, see elasticsearch plugin installation.
- bin/plugin install <url>
Running Kibana with Elassandra¶
Kibana can run with Elassandra, providing a visualization tool for cassandra and elasticsearch data.
- If you want to load the sample data from the Kibana Getting started tutorial, apply the following changes to logstash.jsonl with a sed command.
s/logstash-2015.05.18/logstash_20150518/g
s/logstash-2015.05.19/logstash_20150519/g
s/logstash-2015.05.20/logstash_20150520/g
s/article:modified_time/articleModified_time/g
s/article:published_time/articlePublished_time/g
s/article:section/articleSection/g
s/article:tag/articleTag/g
s/og:type/ogType/g
s/og:title/ogTitle/g
s/og:description/ogDescription/g
s/og:site_name/ogSite_name/g
s/og:url/ogUrl/g
s/og:image:width/ogImageWidth/g
s/og:image:height/ogImageHeight/g
s/og:image/ogImage/g
s/twitter:title/twitterTitle/g
s/twitter:description/twitterDescription/g
s/twitter:card/twitterCard/g
s/twitter:image/twitterImage/g
s/twitter:site/twitterSite/g
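A hypothetical one-pass invocation applying two of the rules above to a fabricated one-line sample standing in for logstash.jsonl (the real file comes from the Kibana tutorial; run the full rule list the same way, or put the rules in a file and use sed -f):

```shell
# Fabricated one-line sample standing in for the real logstash.jsonl.
printf '%s\n' '{"_index":"logstash-2015.05.18","og:type":"article"}' > logstash.jsonl
# Apply two of the substitutions; add the remaining -e expressions for a full run.
sed -e 's/logstash-2015.05.18/logstash_20150518/g' \
    -e 's/og:type/ogType/g' logstash.jsonl > logstash_fixed.jsonl
cat logstash_fixed.jsonl
# -> {"_index":"logstash_20150518","ogType":"article"}
```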
JDBC Driver sql4es + Elassandra¶
The Elasticsearch JDBC driver sql4es can be used with elassandra. Here is a code example:
import java.sql.*;

// Load the sql4es driver and open a connection to the cluster transport port (9300).
Class.forName("nl.anchormen.sql4es.jdbc.ESDriver");
Connection con = DriverManager.getConnection("jdbc:sql4es://localhost:9300/twitter?cluster.name=Test%20Cluster");
Statement st = con.createStatement();
// Aggregation query executed through elasticsearch.
ResultSet rs = st.executeQuery("SELECT user,avg(size),count(*) FROM tweet GROUP BY user");
ResultSetMetaData rsmd = rs.getMetaData();
int nrCols = rsmd.getColumnCount();
while (rs.next()) {
    for (int i = 1; i <= nrCols; i++) {
        System.out.println(rs.getObject(i));
    }
}
rs.close();
con.close();
Running Spark with Elassandra¶
A modified version of the elasticsearch-hadoop connector is available for elassandra at https://github.com/strapdata/elasticsearch-hadoop. This connector works with spark as described in the elasticsearch documentation available at https://www.elastic.co/guide/en/elasticsearch/hadoop/current/index.html.
For example, to submit a spark job in client mode:
bin/spark-submit --driver-class-path <yourpath>/elasticsearch-spark_2.10-2.2.0.jar --master spark://<sparkmaster>:7077 --deploy-mode client <application.jar>