Benchmarking Cassandra
Keywords:
Distributed databases, NoSQL databases, Database Performance, Parameters of performance
Abstract
The growing need to store unstructured data has increased the demand for NoSQL databases. One of the most widely used is Cassandra, a column-oriented store. As Cassandra adoption grows, evaluating its performance becomes crucial for applications that rely on it for large-scale storage. Companies developing Cassandra applications may need new tools to monitor database performance efficiently, because developers have difficulty optimizing what they cannot see. When performance problems occur and proper analysis is needed, the statistical data generated by a monitoring tool is of great help. To optimize NoSQL applications, developers need to understand how the database behaves under different working scenarios. Cassandra is easy to configure, but proper performance tuning requires studying the performance requirements of the particular application, which a monitoring tool makes possible. This paper describes the design of such a monitoring tool and the results it generates, i.e., statistics and graphs. The tool is intended primarily for low-end machines, as they are cost-effective.
License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.