Apache Cassandra Learning Step by Step (1)
By Bright Zheng (IT进行时)
1. About Apache Cassandra
Apache Cassandra is one of the more powerful NoSQL platforms.
Link: http://cassandra.apache.org/
The following comparisons of some classic NoSQL platforms are useful when deciding which one to adopt in this domain.
1. Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Membase vs Neo4j comparison: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
2. HBase vs Cassandra: http://blog.adku.com/2011/02/hbase-vs-cassandra.html
2. Development environment Setup
2.1. Software required
SN | Name | Description |
1 | Apache Cassandra | Mandatory. Current stable version is 1.0.7. |
2 | Cassandra Tutorial | Mandatory. Source code for the following write-ups. |
3 | Git | Optional. Tool for downloading and syncing the latest source code. |
4 | Maven | Optional. I'm using 3.x (3.0.3, to be exact). |
5 | Hector | Optional. Recommended Java client for a deeper learning experience, since it wraps Cassandra's concepts and APIs in a higher-level layer of concepts and APIs. |
2.2. Software install & configuration
2.2.1. Apache Cassandra (as single node first)
Download and unzip the archive; that is almost all there is to it.
It is better to edit two files so that runtime output (data, logs, etc.) goes to a known location.
1. $APACHE_CASSANDRA$/conf/cassandra.yaml
cluster_name: 'Test Cluster'
# directories where Cassandra should store data on disk.
data_file_directories:
    - ../runtime/data
# commit log
commitlog_directory: ../runtime/commitlog
# saved caches
saved_caches_directory: ../runtime/saved_caches
2. $APACHE_CASSANDRA$/conf/log4j-server.properties
log4j.appender.R.File=../runtime/system.log
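Both files point at a ../runtime folder. Relative paths like these are resolved against the directory the Cassandra process is started from, so it helps to know where that folder will end up. The following is a minimal sketch for Git Bash, assuming Cassandra is started from its bin folder (so runtime sits beside bin). CASSANDRA_DIR is a hypothetical variable introduced for this example only; it falls back to a scratch directory so the sketch can be tried safely. Cassandra normally creates these folders itself, so pre-creating them is just a convenient way to confirm the location is writable.

```shell
# CASSANDRA_DIR is a hypothetical variable (not part of the original setup).
# Point it at the unzipped Cassandra folder; if unset, a scratch directory is used.
CASSANDRA_DIR="${CASSANDRA_DIR:-$(mktemp -d)}"

# Assumption: Cassandra runs from $CASSANDRA_DIR/bin, so ../runtime is here.
RUNTIME_DIR="${CASSANDRA_DIR}/runtime"

# Create the three folders named in cassandra.yaml.
for sub in data commitlog saved_caches; do
  mkdir -p "${RUNTIME_DIR}/${sub}"
done

# List what was created.
ls "${RUNTIME_DIR}"
```

If the folders show up somewhere unexpected after the first start, the working directory of the start script is probably not bin, and absolute paths in both files may be the simpler choice.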
Double click $APACHE_CASSANDRA$/bin/cassandra.bat to start Cassandra.
If you want to browse JMX data over HTTP for runtime monitoring, download MX4J, extract mx4j.jar and mx4j-tools.jar to $APACHE_CASSANDRA$/lib, and then restart Cassandra.
One more thing you should understand is that the MX4J HTTP port defaults to 8081 (see the Cassandra source code). If you run into a port conflict, there are two ways to change the port.
1. Pass -Dmx4jport=8082 as a parameter to the startup script ($APACHE_CASSANDRA$/bin/cassandra.bat), or
2. Edit the bat file directly at the following lines:
@REM original
@REM "%JAVA_HOME%\bin\java" %JAVA_OPTS% %CASSANDRA_PARAMS% -cp %CASSANDRA_CLASSPATH% "%CASSANDRA_MAIN%"
@REM change to the following
set mx4jport=7599
"%JAVA_HOME%\bin\java" %JAVA_OPTS% %CASSANDRA_PARAMS% -Dmx4jport=%mx4jport% -cp %CASSANDRA_CLASSPATH% "%CASSANDRA_MAIN%"
2.2.2. Git & Maven
Omitted…
2.2.3. Cassandra Tutorial
1. Open Git Bash:
Double-click Git Bash in the Git install folder.
2. Download the source:
git clone http://github.com/zznate/cassandra-tutorial.git
Note:
If your network goes through a proxy, run the following command first:
export http_proxy={YOUR PROXY HOST/IP}:{PORT}
3. Import to Eclipse:
This is a Maven project, so we can import it into Eclipse as an existing Maven project.
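The proxy note in step 2 can be spelled out a little further. This is a minimal sketch for Git Bash, assuming a hypothetical proxy at proxy.example.com on port 8080; substitute your own values. The http:// prefix and the https_proxy line are additions that the original command does not include. They are shown because Git's HTTP transport is built on libcurl, which reads these environment variables.

```shell
# Hypothetical proxy host and port; replace with your own.
PROXY_HOST="proxy.example.com"
PROXY_PORT="8080"

# Used for http:// remotes such as the clone URL in step 2.
export http_proxy="http://${PROXY_HOST}:${PROXY_PORT}"
# Also set the HTTPS variant in case the remote redirects to https://.
export https_proxy="${http_proxy}"

echo "Using proxy: ${http_proxy}"

# Then clone as in step 2:
# git clone http://github.com/zznate/cassandra-tutorial.git
```

The variables last only for the current Git Bash session, so they need to be set again in each new window before cloning.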
2.2.4. Hector
1. Open Git Bash:
Double-click Git Bash in the Git install folder.
2. Download the source:
git clone http://github.com/rantav/hector.git
3. Import to Eclipse:
Hector is a multi-module Maven project, so we can import it into Eclipse as an existing Maven project.
It has 3 subprojects: core, orm & test.
Note that the orm subproject lacks the javax.persistence dependency.
I added it as org.hibernate.javax.persistence:hibernate-jpa-2.0-api.jar
TODO Items:
3. Samples Step by Step: By combining Concept Intro, CLI Usage & Sample Java Code
4. Clustering & Tuning: To evaluate Cassandra's so-called linear scalability and proven fault tolerance