Friday, August 3, 2012

Installing Apache Cassandra on Ubuntu

Apache Cassandra is a distributed, highly scalable, highly available, and fault-tolerant NoSQL database that was started at Facebook and later open sourced as an Apache project. Cassandra's data model is inspired by Google Bigtable, and its distribution model is inspired by Amazon Dynamo. If you are interested in knowing more about Cassandra, you can refer to the paper written by Facebook.

This post shows how to install Cassandra on Ubuntu 12.04.
  1. Install the latest updates using the following commands
    sudo apt-get update
    sudo apt-get upgrade
  2. Open /etc/apt/sources.list using the following command
    sudo gedit /etc/apt/sources.list
    and add the following lines to it
    deb 10x main
    deb-src 10x main
  3. Run update again and you will get the following error. It means that you need to add the repository's public key, which the next step covers.
    GPG error: unstable Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 4BD736A82B5C1B00
  4. Fetch and register the public key, then update again. Note that your key ID may differ; use the one reported after NO_PUBKEY in your own error message.
    gpg --keyserver --recv-keys 4BD736A82B5C1B00
    sudo apt-key add ~/.gnupg/pubring.gpg
    sudo apt-get update
  5. Install Cassandra using the following command
    sudo apt-get install cassandra
  6. Start the Cassandra server in the foreground using the following command
    sudo cassandra -f
    After the Cassandra server starts, you will see that it is listening for Thrift clients.
    INFO 12:18:29,140 Listening for thrift clients...
  7. To stop the Cassandra server process, first find its process ID and then kill it.

    To find the process ID, use the following command
    ps auwx | grep cassandra
    The output will look something like this. Here, 3595 is the process ID for Cassandra.
    root      3595  0.0  0.0  60048  1908 pts/0    S+   12:18   0:00 sudo cassandra -f
    To kill the process, use the following command
    sudo kill <pid>
    After killing the process, you will see that the Cassandra server has stopped listening for Thrift clients.
    INFO 13:04:08,663 Stop listening to thrift clients
    INFO 13:04:08,666 Waiting for messaging service to quiesce
    INFO 13:04:08,667 MessagingService shutting down server thread.
  8. Use the following command to start Cassandra as a service
    sudo /etc/init.d/cassandra start
  9. Use the following command to stop the Cassandra service
    sudo /etc/init.d/cassandra stop
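Two of the steps above involve reading a value out of command output by eye: the key ID after NO_PUBKEY in step 3 and the process ID in step 7. The following is a minimal sketch of how that could be scripted. The function names missing_keys and cassandra_pids are hypothetical helpers written for this post, not part of apt or Cassandra, and they only filter text; they do not add keys or kill anything.

```shell
#!/bin/sh
# Hypothetical helpers (illustration only; not shipped with apt or Cassandra).

# Read text on stdin and print each key ID that follows "NO_PUBKEY".
missing_keys() {
    grep -o 'NO_PUBKEY [0-9A-Fa-f]*' | awk '{print $2}' | sort -u
}

# Read "ps auwx" style lines on stdin and print the PID (second column)
# of every line that mentions cassandra, skipping the grep process itself.
cassandra_pids() {
    awk '/cassandra/ && !/grep/ {print $2}'
}

# Possible usage on a real system:
#   sudo apt-get update 2>&1 | missing_keys
#   ps auwx | cassandra_pids
```

The printed PID can then be passed to sudo kill as in step 7. Keep in mind that the match is on the word cassandra anywhere in the line, so it is worth checking the output before killing anything.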
The installation creates the following directories. Their purposes are given in parentheses.
  • /var/lib/cassandra (data directories)
  • /var/log/cassandra (log directory)
  • /var/run/cassandra (runtime files)
  • /usr/share/cassandra (environment settings)
  • /usr/share/cassandra/lib (JAR files)
  • /usr/bin (binary files)
  • /usr/sbin
  • /etc/cassandra (configuration files)
  • /etc/init.d (service startup script)
  • /etc/security/limits.d (cassandra user limits)
  • /etc/default
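To confirm that the package really created these directories on your machine, a quick check along the following lines may help. This is only a sketch: report_dirs is a hypothetical helper name, and the paths passed to it are simply a few taken from the list above.

```shell
#!/bin/sh
# Hypothetical helper: for each path given as an argument,
# print whether it exists as a directory.
report_dirs() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf 'present  %s\n' "$dir"
        else
            printf 'missing  %s\n' "$dir"
        fi
    done
}

# Check a few of the directories listed above.
report_dirs /var/lib/cassandra /var/log/cassandra /usr/share/cassandra/lib /etc/cassandra
```

On a machine without Cassandra installed, every line would be reported as missing.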
Installing JNA (Java Native Access) on Linux platforms can improve Cassandra's memory usage. To install JNA, download jna.jar from here and add it to the /usr/share/cassandra/lib directory.

If you get the following error when you try to start a Cassandra server, it means that Cassandra is already running in the background. You will need to stop that background process first. You can probably use the stop command mentioned above to stop any Cassandra server running in the background.
Error: Exception thrown by the agent : java.rmi.server.ExportException: Port already in use: 7199; nested exception is: Address already in use
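Before starting the server, you could check whether something is already listening on port 7199, the port named in the error above. The sketch below assumes that ss -ltn (or netstat -ltn) is available and that the local address appears in the fourth column of its output, which is the case for the versions I am aware of. port_listening is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read "ss -ltn" or "netstat -ltn" style output on stdin
# and succeed (exit 0) if the local address column (assumed to be field 4)
# ends with ":PORT".
port_listening() {
    awk -v port="$1" '
        $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Possible usage on a real system:
#   if ss -ltn | port_listening 7199; then
#       echo "Port 7199 is already in use; Cassandra may be running."
#   fi
```

If the check succeeds, stop the running instance before starting a new one.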
Here I have done the packaged installation; alternatively, you can install Cassandra on Ubuntu from the binary tarball. Use this link for that.


Official Package To Install On Debian(tm) (not a product of Debian(tm))
