Thursday, November 01, 2018

MongoDB Storage Engines

The storage engine is the component that manages how data is stored, both in memory and on disk. MongoDB supports multiple storage engines, each with features suited to particular workloads. In this blog, we discuss the various storage engines and their features.
Types of Storage Engine:

Production workloads differ from application to application: some are write-intensive, some are read-intensive, and some require encryption. MongoDB provides the flexibility to handle such workloads by offering multiple storage engines, listed below.
  • WiredTiger
  • MMAPv1
  • Encrypted
  • In-memory
Let's see the key features of each storage engine.
WiredTiger:
  • WiredTiger (WT) has been available since MongoDB 3.0 and is the default storage engine from MongoDB 3.2
  • The WT storage engine uses document-level concurrency control for write operations, so multiple clients can modify different documents of a collection at the same time.
  • It uses only intent locks at the global, database and collection levels. When the storage engine detects a conflict between two operations, one of them incurs a write conflict, and MongoDB transparently retries that operation.
  • MongoDB utilizes both the WiredTiger internal cache and the filesystem cache. By default, the WiredTiger cache uses the larger of 50% of (RAM minus 1 GB) or 256 MB.
  • Efficient use of CPU cores and RAM
  • Allows for more tuning of the storage engine than MMAPv1
  • Up to 7 to 10x better write performance than MMAPv1, depending on the workload
  • Up to 80% less storage with compression
  • Compression minimizes storage use at the expense of additional CPU.
  • Collection-level data in the WiredTiger internal cache is uncompressed and uses a different representation from the on-disk format.
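
To make the default cache sizing concrete, here is a minimal Python sketch of the arithmetic. It assumes the rule as documented for MongoDB 3.4 and later (the larger of 50% of (RAM minus 1 GB) or 256 MB) and treats GB as GiB. The function name is made up for illustration and is not part of any MongoDB API.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def default_wiredtiger_cache_bytes(total_ram_bytes: int) -> int:
    """Return the assumed default WiredTiger cache size in bytes.

    Rule sketched here: the larger of 50% of (RAM - 1 GiB) and 256 MiB.
    """
    half_of_remainder = (total_ram_bytes - GIB) // 2
    return max(half_of_remainder, 256 * MIB)


if __name__ == "__main__":
    # 4 GiB host: 50% of (4 - 1) GiB = 1.5 GiB
    print(default_wiredtiger_cache_bytes(4 * GIB) / GIB)
    # 1.25 GiB host: 50% of 0.25 GiB is only 128 MiB, so the 256 MiB floor applies
    print(default_wiredtiger_cache_bytes(GIB + 256 * MIB) / MIB)
```

On small hosts the 256 MB floor dominates, which is why the cache never shrinks to nothing even when RAM is scarce.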
MMAPv1:
  • The MMAPv1 storage engine uses memory-mapped files to store its data
  • A memory-mapped file is a segment of virtual memory that has been assigned a direct byte-for-byte correlation with some portion of a file
  • It is the traditional storage engine (deprecated as of MongoDB 4.0) and performs well for read-heavy applications
  • Data and indexes are mapped into virtual address space
  • Data that is accessed is paged into RAM
  • When the OS runs out of RAM and an application requests memory, the OS swaps memory out to disk to make space for the newly requested data
  • The operating system’s virtual memory subsystem manages MongoDB’s memory
  • Deployments with enough memory to fit the application’s working data set in RAM will achieve the best performance.
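
The byte-for-byte correlation between a memory-mapped region and its file can be seen with Python's standard mmap module. This is only a sketch of the general operating system mechanism, not MongoDB's own code; the function name and the sample bytes are made up for illustration.

```python
import mmap
import os
import tempfile


def demo_memory_mapped_write() -> bytes:
    """Write through a memory mapping and read the result back from the file."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello mmap storage")
        # Map the whole file into virtual memory (length 0 means "entire file").
        with mmap.mmap(fd, 0) as mapped:
            # Assigning to the mapping changes the corresponding file bytes.
            mapped[0:5] = b"HELLO"
            mapped.flush()  # ask the OS to write dirty pages back to the file
        # Read the file through the normal file API to confirm the change.
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 64)
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(demo_memory_mapped_write())  # b'HELLO mmap storage'
```

The program never calls write() for the modified bytes. The operating system's virtual memory subsystem decides when pages move between RAM and disk, which is the behaviour the bullets above describe.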

Encrypted:
  • Available in MongoDB Enterprise only.
  • The default encryption mode that MongoDB Enterprise uses is AES256-CBC
  • All data files are fully encrypted from a file system perspective
  • Data exists in an unencrypted state only in memory and during transmission
  • Master keys and database keys are used for encryption
  • Data is encrypted with the database keys; the master key encrypts the database keys
  • Encryption is not part of replication: keys are not replicated
  • In replication, data is not natively encrypted over the wire
  • Application Level Encryption provides encryption on a per-field or per-document basis within the application layer
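
The two-level key hierarchy described above (the master key protects the database keys, and the database keys protect the data) can be sketched in a few lines of Python. Important assumption: the cipher below is a toy keystream built from HMAC-SHA256 so that the example runs with the standard library only. It is NOT AES256-CBC and not MongoDB's implementation, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stand-in cipher (NOT AES): XOR data with an HMAC-SHA256 keystream.

    Applying it twice with the same key and nonce returns the original data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(
            hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        )
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def envelope_round_trip(document: bytes) -> dict:
    """Encrypt a document with a database key that is itself wrapped by a master key."""
    master_key = secrets.token_bytes(32)    # kept outside the data files
    database_key = secrets.token_bytes(32)  # one per database in this sketch

    key_nonce = secrets.token_bytes(16)
    data_nonce = secrets.token_bytes(16)

    # Write path: data is encrypted with the database key,
    # and the database key is encrypted with the master key.
    wrapped_key = toy_cipher(master_key, key_nonce, database_key)
    ciphertext = toy_cipher(database_key, data_nonce, document)

    # Read path: unwrap the database key first, then decrypt the data.
    recovered_key = toy_cipher(master_key, key_nonce, wrapped_key)
    plaintext = toy_cipher(recovered_key, data_nonce, ciphertext)

    return {
        "database_key": database_key,
        "recovered_key": recovered_key,
        "ciphertext": ciphertext,
        "plaintext": plaintext,
    }


if __name__ == "__main__":
    result = envelope_round_trip(b'{"name": "alice"}')
    print(result["plaintext"])
```

One practical consequence of this layering is that rotating the master key only requires re-encrypting the small database keys, not all of the data.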

In-memory:
  • It is available in MongoDB Enterprise starting from version 3.2.6.
  • Handles ultra-high throughput with low latency and high availability
  • The in-memory storage engine is generally available (GA)
  • More predictable, lower latency with less in-memory infrastructure
  • Supports high-level infrastructure based on zone sharding
  • MongoDB's rich query capability and indexing support

Third-party pluggable storage engines:
  • MongoDB supports third-party storage engines as “modules” that can be independently updated.
  • When building MongoDB, any storage engine modules will be automatically detected, configured and integrated in the final binaries.
  • The RocksDB storage engine is the first one to use this new module system for its MongoDB storage integration layer
  • RocksDB for MongoDB is based on the key-value store optimized for fast storage.
  • It is developed by Facebook and designed to handle write-intensive workloads.

Storage Engine application API:
As mentioned, each application's load is different from the others. Choosing the right storage engine will definitely boost performance. Differentiating the storage engines with respect to the workload helps in choosing the right one.
Comparison chart:
The overall feature comparison for all the storage engines is listed below:
