Apache Cassandra Learning Step by Step (1)

By Bright Zheng (IT进行时)

1.   About Apache Cassandra

Apache Cassandra is one of the powerful NoSQL platforms.

Link: http://cassandra.apache.org/

 

The following comparisons of some classic NoSQL platforms are useful when deciding which one to adopt in this domain.

      1. Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Membase vs Neo4j comparison: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

      2. HBase vs Cassandra: http://blog.adku.com/2011/02/hbase-vs-cassandra.html

2.  Development environment Setup

2.1.  Software required

SN | Name | Description

1 | Apache Cassandra | Mandatory. Current stable version is 1.0.7 @ http://cassandra.apache.org/download/

2 | Cassandra Tutorial | Mandatory. Source code for the following writeups. @ http://github.com/zznate/cassandra-tutorial.git

3 | Git | Optional. Tool for downloading and syncing the latest source code.

4 | Maven | Optional. I’m using 3.x, or 3.0.3 to be exact.

5 | Hector | Optional. Recommended Java client implementation for a deeper learning experience, since it wraps Cassandra’s concepts/APIs in higher-level concepts/APIs.

2.2. Software install & configuration

2.2.1. Apache Cassandra (as single node first)

Download & unzip, and you are almost done.

It is better to configure two files so that runtime info (data/log/etc.) goes to a known location.

 

1. $APACHE_CASSANDRA$/conf/cassandra.yaml

cluster_name: 'Test Cluster'

 

# directories where Cassandra should store data on disk.

data_file_directories:

    - ../runtime/data

 

# commit log

commitlog_directory: ../runtime/commitlog

 

# saved caches

saved_caches_directory: ../runtime/saved_caches

 

2. $APACHE_CASSANDRA$/conf/log4j-server.properties

log4j.appender.R.File=../runtime/system.log
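The relative paths in the two files above are resolved against the directory Cassandra is started from, which is assumed here to be $APACHE_CASSANDRA$/bin (the case when cassandra.bat is double-clicked). A minimal sketch, run from Git Bash, that pre-creates the runtime folders and prints where they end up. The CASSANDRA_HOME location below is a hypothetical throwaway folder for illustration, not something defined by Cassandra's configuration:

```shell
# Hypothetical install location (assumption); point this at your own unzip folder.
CASSANDRA_HOME="${CASSANDRA_HOME:-${TMPDIR:-/tmp}/apache-cassandra-demo}"

# Create bin (normally already present) and the runtime folders named in the config above.
mkdir -p "$CASSANDRA_HOME/bin" \
         "$CASSANDRA_HOME/runtime/data" \
         "$CASSANDRA_HOME/runtime/commitlog" \
         "$CASSANDRA_HOME/runtime/saved_caches"

# Resolve ../runtime the way a process started from bin would see it.
runtime_dir="$(cd "$CASSANDRA_HOME/bin" && cd ../runtime && pwd)"
echo "runtime directory: $runtime_dir"
ls "$runtime_dir"
```

If Cassandra is started from a different working directory, the same relative paths would point somewhere else, so absolute paths may be the safer choice in that case.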

 

Double click $APACHE_CASSANDRA$/bin/cassandra.bat to start Cassandra.

 

If you want to enable JMX for runtime monitoring, please download MX4J, extract mx4j.jar and mx4j-tools.jar to $APACHE_CASSANDRA$/lib, and then restart Cassandra.

One more thing you should understand is that the JMX port defaults to 8081 (refer to the source code). If you run into a port conflict, there are two ways to change the port.

1. Add the parameter -Dmx4jport=8082 to the startup bat ($APACHE_CASSANDRA$/bin/cassandra.bat), or

2. Change the bat directly at the following lines:

@REM original

@REM "%JAVA_HOME%\bin\java" %JAVA_OPTS% %CASSANDRA_PARAMS% -cp %CASSANDRA_CLASSPATH% "%CASSANDRA_MAIN%"

@REM change to the following

set mx4jport=7599

"%JAVA_HOME%\bin\java" %JAVA_OPTS% %CASSANDRA_PARAMS% -Dmx4jport=%mx4jport% -cp %CASSANDRA_CLASSPATH% "%CASSANDRA_MAIN%"

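Before picking a replacement port, it may help to check that nothing is already listening on it. A minimal sketch for Git Bash, assuming your bash build supports the /dev/tcp redirection. The helper name port_in_use is made up for illustration, and the candidate ports are simply the ones mentioned above:

```shell
# Returns 0 if something accepts TCP connections on host:port, non-zero otherwise.
# Relies on bash's /dev/tcp pseudo-device (an assumption about your shell build).
port_in_use() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Try the default port and the two alternatives used in this section.
for port in 8081 8082 7599; do
    if port_in_use 127.0.0.1 "$port"; then
        echo "port $port is already taken"
    else
        echo "port $port looks free"
    fi
done
```

A "looks free" result only means that the connection attempt failed, so treat it as a hint rather than a guarantee.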
2.2.2. Git & Maven

Omitted…

2.2.3.  Cassandra Tutorial

1. Open Git Bash:

Double-click Git Bash under the Git install folder.

 

2. Download the source:

git clone http://github.com/zznate/cassandra-tutorial.git

Note:

If your network goes through a proxy, run the following command first:

export http_proxy={YOUR PROXY HOST/IP}:{PORT}

 

3. Import to Eclipse:

This is a Maven project, so we can import it into Eclipse as an existing Maven project.
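For the proxy note in step 2, a slightly fuller sketch: GitHub may redirect the http:// clone URL to https://, so it can be useful to set https_proxy as well. The host and port below are hypothetical placeholders, not real values:

```shell
# Hypothetical proxy host and port (assumptions); replace with your own values.
PROXY_HOST="proxy.example.com"
PROXY_PORT="8080"

# Git picks these environment variables up for http:// and https:// URLs respectively.
export http_proxy="http://${PROXY_HOST}:${PROXY_PORT}"
export https_proxy="$http_proxy"

echo "using proxy: $http_proxy"
```

Both variables only live for the current Git Bash session, so they need to be set again after reopening the shell.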

2.2.4. Hector

1. Open Git Bash:

Double-click Git Bash under the Git install folder.

 

2. Download the source:

git clone http://github.com/rantav/hector.git

 

3. Import to Eclipse:

Hector is a set of Maven projects, so we can import it into Eclipse as an existing Maven project.

It has 3 sub projects: core, orm & test.

 

Note that the orm subproject lacks the javax.persistence dependency.

I added it as org.hibernate.javax.persistence:hibernate-jpa-2.0-api.jar

TODO Items:

3. Samples Step by Step: By combining Concept Intro, CLI Usage & Sample Java Code

4. Clustering & Tuning: To evaluate Cassandra's so-called Linear Scalability and proven Fault-tolerance