Is MongoDB the next big thing?

MongoDB is an open-source, scalable, schema-free, document-oriented database written in C++ and designed for high performance. Its goal is to bridge the gap between key-value stores and traditional RDBMSs. MongoDB stores data in collections of JSON-like documents. This lets many applications model data more naturally, because data can be nested in complex hierarchies and still be queried and indexed.

Features

Simple Queries
MongoDB is a document store with no transactions and no joins, so queries are easier to write and easier to tune.
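
To give a feel for what a query without joins looks like, here is a minimal sketch in plain Python, not MongoDB's own implementation. The query is itself a small document, and a record matches when every field in the query agrees with it. The `matches` and `find` helpers and the sample `users` data are hypothetical names made up for this illustration.

```python
def matches(document, query):
    """Return True if every field in `query` agrees with `document`."""
    for field, expected in query.items():
        actual = document.get(field)
        if actual == expected:
            continue
        # A scalar in the query also matches when it appears inside an array field.
        if isinstance(actual, list) and expected in actual:
            continue
        return False
    return True


def find(collection, query):
    """Return all documents in an in-memory list that match the query document."""
    return [doc for doc in collection if matches(doc, query)]


# Illustrative data only; the field names are assumptions.
users = [
    {"name": "alice", "city": "Pune", "langs": ["python", "ruby"]},
    {"name": "bob", "city": "Pune", "langs": ["java"]},
    {"name": "carol", "city": "Delhi", "langs": ["python"]},
]

result = find(users, {"city": "Pune", "langs": "python"})
print([user["name"] for user in result])
```

Everything needed to answer the query lives in one record, so there is nothing to join.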

Document-based data model
The basic unit of storage is analogous to a Ruby hash, a JSON object or a Python dictionary: a rich data structure capable of holding arrays and other documents. The advantage of a document-based data model is that you can represent as a single entity a construct that would require several tables to represent properly in a relational database.
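
As a rough illustration, here is how a blog post could look as a single nested document, written as an ordinary Python dictionary. The field names and values are assumptions chosen for this example, not a schema MongoDB requires.

```python
# Hypothetical blog post modelled as one nested document (a plain Python dict).
# Field names are illustrative assumptions, not taken from any real schema.
post = {
    "title": "Is MongoDB the next big thing?",
    "author": {"name": "Mayur", "site": "http://www.tutkiun.com/"},
    "tags": ["mongodb", "nosql", "databases"],
    "comments": [
        {"by": "Marko", "text": "Can you elaborate on durability?"},
        {"by": "Matt", "text": "It has been quite stable for me."},
    ],
}


def comment_authors(document):
    """Return the names of everyone who commented on the document."""
    return [comment["by"] for comment in document.get("comments", [])]


print(comment_authors(post))
```

In a relational design the same data would typically be spread over separate tables for posts, authors, tags and comments; here it travels together as one entity.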

Sharding
If you have lots of data and are running out of space on a single database server, sharding can help. It scales your database horizontally: you bring in new machines and divide the load among them, which not only gives you more storage but also improves performance and increases disk throughput. The main advantage of sharding is that you don’t have to plan today for the capacity your database will need five years from now: no premature optimization and no downtime.
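
The idea of dividing data across machines can be sketched in a few lines of Python. This is a simplified illustration of routing by key range, not MongoDB's actual balancer; the split points and shard names below are made up for the example.

```python
import bisect

# Hypothetical layout: each shard owns a contiguous range of the shard key.
# BOUNDARIES[i] is the first key that belongs to shard i + 1.
BOUNDARIES = ["h", "p"]                      # assumed split points
SHARDS = ["shard-a", "shard-b", "shard-c"]   # assumed machine names


def shard_for(key):
    """Pick the shard whose key range contains `key`."""
    return SHARDS[bisect.bisect_right(BOUNDARIES, key)]


def distribute(keys):
    """Group keys by the shard that would store them."""
    placement = {name: [] for name in SHARDS}
    for key in keys:
        placement[shard_for(key)].append(key)
    return placement


print(distribute(["alice", "mayur", "zoe", "bryan", "sven"]))
```

Adding capacity then amounts to introducing a new split point and a new machine, while the application keeps asking for keys the same way.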

GridFS
A simple but very useful concept in MongoDB! With a traditional database such as MySQL, files are usually kept outside the database, although they can be stored as BLOBs. For example, if I want to store a user’s profile picture, I typically store the URL or path of the image in the database rather than the image itself. With GridFS, MongoDB lets you store the file in the database. Because MongoDB has built-in replication and sharding, the files are not only stored there but also replicated, which helps with backing them up.
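
GridFS works by splitting a file into chunks and storing each chunk as its own document, alongside one document describing the file. The sketch below imitates that idea in plain Python. The field names (`files_id`, `n`, `data`, `length`, `chunkSize`) follow the ones GridFS uses as far as I know, but the helper functions and the tiny chunk size are assumptions made for readability.

```python
CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow


def split_into_chunks(file_id, data, chunk_size=CHUNK_SIZE):
    """Return (file_doc, chunk_docs) describing `data` as numbered chunks."""
    chunks = [
        {"files_id": file_id, "n": index, "data": data[offset:offset + chunk_size]}
        for index, offset in enumerate(range(0, len(data), chunk_size))
    ]
    file_doc = {"_id": file_id, "length": len(data), "chunkSize": chunk_size}
    return file_doc, chunks


def reassemble(chunks):
    """Rebuild the original bytes by joining chunks in order of `n`."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


file_doc, chunks = split_into_chunks("avatar.png", b"0123456789")
print(file_doc["length"], len(chunks))
print(reassemble(chunks) == b"0123456789")
```

Because each chunk is an ordinary document, it can be replicated and sharded like any other data.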

MultiKeys
An interesting feature: when you index a field that holds an array, MongoDB automatically indexes every value in the array. For example, if you have stored an article with multiple tags, MongoDB will index all the tags, and you can use any of them to search for articles in the database. Also, if the article is a blog post, you might want to store its comments in the same place as the article. You can do this by embedding the comment objects in an array.
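
Conceptually, an index over an array field gets one entry per array element, each pointing back to the document. The following is a toy in-memory version of that idea, not how MongoDB stores its indexes; the `build_multikey_index` helper and the sample articles are hypothetical.

```python
from collections import defaultdict

# Illustrative articles; titles and tags are made up for the example.
articles = [
    {"_id": 1, "title": "Intro to MongoDB", "tags": ["mongodb", "nosql"]},
    {"_id": 2, "title": "Scaling MySQL", "tags": ["mysql", "scaling"]},
    {"_id": 3, "title": "Sharding basics", "tags": ["mongodb", "scaling"]},
]


def build_multikey_index(documents, field):
    """Map every value found in the array `field` to the ids of documents holding it."""
    index = defaultdict(list)
    for doc in documents:
        for value in doc.get(field, []):
            index[value].append(doc["_id"])
    return index


tag_index = build_multikey_index(articles, "tags")
print(tag_index["mongodb"])
```

Looking up a single tag then finds every article carrying it without scanning the whole collection.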

MapReduce
A tool for batch manipulation of data and for aggregation operations. It is particularly useful when data arrives from different sources and has to be processed in parallel. Map/reduce is invoked via a database command. The database creates a temporary collection to hold the output of the operation. The collection is cleaned up when the client connection closes or when it is explicitly dropped.
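
The map/reduce pattern itself is easy to show in plain Python. The sketch below counts how often each tag appears across a handful of posts. It runs in memory and only mirrors the shape of the technique; the function names and sample data are assumptions, and MongoDB's own map and reduce functions are written in JavaScript and run on the server.

```python
from collections import defaultdict


def map_tags(doc):
    """Map step: emit a (tag, 1) pair for every tag on a document."""
    for tag in doc["tags"]:
        yield tag, 1


def reduce_counts(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(documents, map_fn, reduce_fn):
    """Group emitted values by key, then reduce each group to one result."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


# Illustrative input only.
posts = [
    {"title": "Intro to MongoDB", "tags": ["mongodb", "nosql"]},
    {"title": "Sharding basics", "tags": ["mongodb", "scaling"]},
]

tag_counts = map_reduce(posts, map_tags, reduce_counts)
print(tag_counts)
```

The result dictionary plays the role of the output collection described above.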

Better Performance
As I mentioned above, MongoDB doesn’t require any joins or transactions. The result is better performance.

Advantages

  • MongoDB is extremely fast
  • No schema, no data mapping → faster development
  • No separate query language to learn → gentler learning curve
  • Your code is future-proof: you can easily add more fields, even complex ones, to your objects, so as requirements change you can adapt your code quickly.
  • Horizontally scalable
  • No need to worry about migrations.



Things to keep in mind

So far, after reading through the features and advantages, you might be thinking, “Let’s use Mongo for my next project.” But here are a few things you should keep in mind before switching over to MongoDB (from the NoSQL blog):

  • MongoDB assumes you have a large (very large) amount of hard drive space
  • MongoDB assumes RAM can be used instead of disk
  • MongoDB assumes that you have a 64-bit machine
  • MongoDB assumes that you’re using a little-endian system
  • MongoDB assumes that you have more than one server
  • MongoDB assumes you want fast/unsafe, but lets you do slow/safe
  • MongoDB developers assume you’ll complain if something goes wrong

Disadvantages

Stability
One of the major blows to MongoDB is that it is not yet stable. There are memory leaks in the server process, so you will need to reboot your server every few days. The lack of stability also makes it unsuitable for production environments where the data is very important.

No SQL Queries
Because MongoDB doesn’t support SQL queries, you can’t make use of the enormous range of SQL-based reporting and business intelligence tools. This means any tool that relies on ODBC or JDBC to generate graphs, reports or dashboards is unavailable to you.

Durability
MongoDB isn’t designed for durability, which means your data is more likely to be lost if you run in a “single instance configuration”.

Not enough documentation
As MongoDB is relatively new, documentation is lacking. If you want to try something new, you might find it difficult to get help.

Lack of available talent
You might find it difficult to hire talented developers who know MongoDB well.

Who should go for it?

At its current stage, MongoDB is most suitable for those who want extremely fast database responses and can tolerate the loss of a few records. Forums, blogs and news sites can use Mongo to improve performance. It suits them well because the initial investment is comparatively small and the infrastructure can be scaled whenever needed in the future.

Have something to say about MongoDB? Post it in the comments.

  • Anonymous Coward

    Some feedback: MongoDB is pretty stable (from my experience with several production applications that store almost 1 TB of data in total). Your code usually has to be backwards-compatible, not "future-proof", so that point is irrelevant: a flexible schema is both good and bad when it comes to data migration between releases. Relational databases store and replicate files as blobs; Mongo's GridFS may have advantages, but there is nothing new. The most important feature of Mongo is that you don't have to modify application code to switch to a sharded setup, including advanced cases where each shard is a group of machines. And, of course, the lack of transactions is a no-no for some apps, so software engineers need to understand that Mongo is not a replacement for relational databases; it is a complementary thing.

  • Rahulchugh

    Nice article.
    I remember my old days in 1998 (99), when I used to work with the database then known as Rogue Wave. It was developed in C++, and in order to run queries that returned table rows and schema information, one had to write code in C++ and get the data back as objects.
    Things are coming back around.

  • http://www.facebook.com/profile.php?id=500104631 Bryan Migliorisi

    Not sure why you say its unstable – there are actually many companies, some very, very big ones, using MongoDB in production. I think that speaks to its stability.

    Look here: http://www.mongodb.org/display/DOCS/Production+Deployments

  • http://twitter.com/mattparlane Matt Parlane

    +1 on the comments on stability so far – it's quite stable. I'm using it in production and don't have to reboot anything, it's been running for weeks now (1.6.0).

    Also, I've found the documentation to be excellent – it's written by developers for developers.

    And while you might be right about the lack of experience with MongoDB in the community, the actual developers themselves are very active (and helpful) on the mailing list – they've answered every question I've had so far, and answered them well. It's only a matter of time before some significant momentum builds.

  • http://twitter.com/mr_adubb_atl A-Dubb Wimberly

    Thanks for a great article.

  • http://www.marko.anastasov.name Marko Anastasov

    Can you elaborate a bit on this part:

    MongoDB isn't designed for durability that means your data is more likely to be lost if you are in “single instance configuration”

    I'm beginning to use MongoDB and I'm curious where has this been seen and documented.

  • http://www.tutkiun.com/ Mayur

    Hey Marko,

    MongoDB is not designed to protect your data; rather, it is designed for high performance. So, some records might go missing. In "single instance configuration", MongoDB stores just one copy of the data, with no replication or sharding. That's why your data is more likely to get lost.

    You can find more about MongoDB's durability at

    http://nosql.mypopescu.com/post/392868405/mongodb-durability-a-tradeoff-to-be-aware-of

    and

    http://ivoras.sharanet.org/blog/tree/2010-02-20.mongodb-and-durability.html

  • http://www.tutkiun.com/ Mayur

    Welcome! :)

  • http://www.tutkiun.com/ Mayur

    Great! But the lack of experience was meant in the context of small and medium-sized organizations that would like to go for MongoDB but won't find a suitable candidate, as it is new and yet to be fully developed.

  • http://www.tutkiun.com/ Mayur

    Thanks for taking your time to read Tutkiun!

    Here is why I didn't say it is stable.

    http://www.blue74.com/2010/06/scatter/were-back-so-long-mongodb/

    But yes, with the new versions MongoDB is quite stable!

  • http://www.marko.anastasov.name Marko Anastasov

    Thanks for those links Mayur. Now I'm more clear about the tradeoff. It's not really that for a random reason data will eventually be lost, it's that MongoDB's design is more prone to having corrupt data in case of a machine fail, in single instance conf.
    Some improvement should come with release 1.7 or 1.8.

  • http://twitter.com/stilburg Sven Tilburg

    Excuse me, I really don't want to offend anyone…but this guy moves first from MySQL to MongoDB because it's schema free and then moves back to MySQL because he doesn't like the idea of a scheme free database. So why did he move to Mongo in the first place?

    Same goes for "I have to reboot my server every couple of days" – what the heck is he doing there? I never had to reboot my server because of Mongo and yes if you just reboot a server without correctly shutting down Mongo you may loose data (sic!)

    I am heavily under the impression the blog article you are quoting is just a rant by someone incapable of handling new technology.

  • http://javarevisited.blogspot.com/2011/01/how-classpath-work-in-java.html JP@ classpath in Java

    I am seeing a lot of posts about MongoDB nowadays. I have not used it yet, but there is lots of buzz around it. What is its real benefit over other databases?

    • http://www.tutkiun.com/ Mayur

      Here are some of the advantages of MongoDB for building web applications:

      A document-based data model. The basic unit of storage is analogous to JSON, Python dictionaries, Ruby hashes, etc. This is a rich data structure capable of holding arrays and other documents. This means you can often represent in a single entity a construct that would require several tables to properly represent in a relational db.

      Deep query-ability. MongoDB supports dynamic queries on documents using a document-based query language that’s nearly as powerful as SQL.

      No schema migrations. Since MongoDB is schema-free, your code defines your schema.

      Better performance. There are many reasons for this. One is that, since the document model frequently doesn’t need joins, MongoDB doesn’t support them; another is that MongoDB uses memory-mapped files and a different consistency model.

      A clear path to horizontal scalability.

      You’ll need to read more about it and play with it to get a better idea. Here’s an online demo:

      http://mongo.kylebanker.com/
