Posts Tagged ‘databases’

Jul 11

Today there are many big applications several hundreds transaction per seconds and millions of concurrent users. Query language limits concurrent transaction. Some big company is using non relational database to make their application more scalable. They are company with extreme in terms of the amount of data and the concurrent user. They break the rules of relational databases. Here a few example of non relational database and the company who is using it :

cloud_computing

Voldemort Project : LinkedIn
Voldemort is not a relational database, it does not attempt to satisfy arbitrary relations while satisfying ACID properties. Nor is it an object database that attempts to transparently map object reference graphs. Nor does it introduce a new abstraction such as document-orientation. It is basically just a big, distributed, persistent, fault-tolerant hash table.

Cassandra Project : Facebook, Rackspace, Digg
Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store. Cassandra brings together the distributed systems technologies from Dynamo and the data model from Google’s BigTable. Like Dynamo, Cassandra is eventually consistent. Like BigTable, Cassandra provides a ColumnFamily-based data model richer than typical key/value systems.
Cassandra was open sourced by Facebook in 2008, where it was designed by one of the authors of Amazon’s Dynamo. In a lot of ways you can think of Cassandra as Dynamo 2.0. Cassandra is in production use at Facebook but is still under heavy development.

HBase : Adobe, Amazon,Yahoo, and more..
HBase is the Hadoop database. Its an open-source, distributed, column-oriented store modeled after the Google paper, Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop.
HBase’s goal is the hosting of very large tables — billions of rows X millions of columns — atop clusters of commodity hardware. Try it if your plans for a data store run to big.

Hypertable : Baidu,Zvents, Rediff.com
Hypertable is an open source project based on published best practices and our own experience in solving large-scale data-intensive tasks. Our goal is to bring the benefits of new levels of both performance and scale to many data-driven businesses who are currently limited by previous-generation platforms. Our goal is nothing less than that Hypertable become one of the world’s most massively parallel high performance database platforms.

Dynamo : Amazon
Dynamo, a highly available data storage technology that addresses the needs of these important classes of services. Dynamo has a simple key/value interface, is highly available with a clearly defined consistency window, is efficient in its resource usage, and has a simple scale out scheme to address growth in data set size or request rates. Each service that uses Dynamo runs its own Dynamo instances.

Apache CouchDB : Gpirate, Source Weaver and more..
Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API. Among other features, it provides robust, incremental replication with bi-directional conflict detection and resolution, and is queryable and indexable using a table-oriented view engine with JavaScript acting as the default view definition language.
CouchDB is written in Erlang, but can be easily accessed from any environment that provides means to make HTTP requests. There are a multitude of third-party client libraries that make this even easier for a variety of programming languages and environments.

Images taken from http://www.flickr.com/photos/davidrn/1717425332/