8.3 Data File (Relevant for WiredTiger)
In this section, you will look at the contents of the data directory when mongod is started with the WiredTiger storage engine.
When WiredTiger is the selected storage engine, data, journals, and indexes are compressed on disk. The compression is done based on the compression algorithm specified when starting the mongod.
Under the data directory, there are separate compressed .wt files corresponding to each collection and each index. Journals have their own folder under the data directory.
The compressed files are actually created
when data is inserted in the collection (the files are allocated at
write time, no preallocation).
For example, if you create a collection called users, it will be stored in a file such as collection-0--2259994602858926461.wt, and the associated indexes will be stored in index-1--2259994602858926461.wt, index-2--2259994602858926461.wt, and so on.
In addition to the compressed collection and index files, there is a _mdb_catalog file that stores metadata mapping collections and indexes to the files in the data directory. In the above example, it stores the mapping of the users collection to the .wt file collection-0--2259994602858926461. See Figure 8-11.
When specifying the dbPath, you need to ensure that the directory corresponds to the storage engine, which is specified using the --storageEngine option when starting the mongod. The mongod will fail to start if the dbPath contains files created by a storage engine other than the one specified with --storageEngine. For example, if MMAPv1 files are found in the dbPath, a mongod started with WiredTiger will fail to start.
Internally, WiredTiger uses the traditional B+ tree structure for storing and managing data, but that is where the similarity ends. Unlike a conventional B+ tree implementation, it does not perform in-place updates.
The WiredTiger cache is used for any read/write operations on the data. The trees in the cache are optimized for in-memory access.
8.4 Reads and Writes
You will briefly look at how the reads and
writes happen. As mentioned, when MongoDB updates and reads from
the DB, it is actually reading and writing to memory.
If a modification operation in the MongoDB MMAPv1 storage engine increases the record size beyond the space allocated for it, the entire record is moved to a larger space with extra padding bytes. By default, MongoDB uses power-of-2-sized allocations, so every document in MongoDB is stored in a record that contains the document itself plus extra space (padding). Padding allows the document to grow as the result of updates while minimizing the likelihood of reallocations. Once the record is moved, the space it originally occupied is freed up and is tracked in free lists of different sizes. As mentioned, this is the $freelist namespace in the .ns file.
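The power-of-2 allocation strategy can be illustrated with a small sketch. The following Python snippet is a conceptual illustration only (the minimum allocation size of 32 bytes is an assumption, not MongoDB's actual value): it rounds a document's size up to the next power of two, which is the record size MMAPv1 would allocate, and the difference is the padding available for in-place growth.

def power_of_2_record_size(document_size_bytes):
    """Return the next power of two >= the document size (conceptual sketch)."""
    size = 32  # assumed small minimum allocation, for illustration only
    while size < document_size_bytes:
        size *= 2
    return size

doc_size = 700                       # a 700-byte document...
record = power_of_2_record_size(doc_size)
padding = record - doc_size          # ...gets a 1024-byte record with 324 bytes of padding
print(record, padding)               # 1024 324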
In the MMAPv1 storage engine, as objects are deleted, modified, or created, fragmentation will occur over time, which will affect performance. The compact command should be executed to move the fragmented data into contiguous space.
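As a hedged example, the compact command can be issued from PyMongo through a database command; the database and collection names used here (practicaldb, users) are placeholders.

from pymongo import MongoClient

client = MongoClient("localhost", 27017)
db = client.practicaldb

# Rewrites and defragments the users collection. Under MMAPv1 this blocks
# operations on the database while it runs, so schedule it in a maintenance window.
result = db.command("compact", "users")
print(result)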
Every 60 seconds the files in RAM are flushed to disk. To prevent data loss in the event of a power failure, the default is to run with journaling switched on. The behavior of the journal is dependent on the configured storage engine. The MMAPv1 journal file is flushed to disk every 100 ms, and if there is a power loss, it is used to bring the database back to a consistent state.
In WiredTiger, the data in the cache is stored in a B+ tree structure that is optimized for in-memory use. The cache maintains an on-disk page image in association with an index, which is used to identify where the data being asked for actually resides within the page (see Figure 8-12).
Whenever an operation is issued to WiredTiger, internally it is broken into multiple transactions wherein each transaction works within the context of an in-memory snapshot. The snapshot is of the committed version before the transaction started. Writers can create new versions concurrently with the readers.
Write operations do not change the page; instead, the updates are layered on top of the page. A skiplist data structure is used to maintain all the updates, with the most recent update at the top. Thus, whenever a user reads or writes the data, the index checks whether a skiplist exists. If no skiplist exists, data is returned from the on-disk page image. If a skiplist exists, the data at the head of the list is returned to the threads, which then update the data. Once a commit is performed, the updated data is added to the head of the list and the pointers are adjusted accordingly. This way multiple users can access data concurrently without any conflict. A conflict occurs only when multiple threads are trying to update the same record; in that case, one update wins and the other concurrent update needs to retry.
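To make this read/write path concrete, here is a minimal Python sketch of the idea (not WiredTiger's actual implementation): a page keeps its clean on-disk image, and committed updates are layered in a most-recent-first list that stands in for the skiplist.

class Page:
    def __init__(self, on_disk_image):
        self.on_disk_image = on_disk_image  # clean page image read from disk
        self.updates = []                   # stand-in for the skiplist, newest first

    def read(self):
        # If no updates have been layered on, serve the on-disk image;
        # otherwise the head of the list holds the most recent committed value.
        return self.updates[0] if self.updates else self.on_disk_image

    def commit_update(self, new_value):
        # A commit places the new version at the head of the list;
        # the page image itself is never modified in place.
        self.updates.insert(0, new_value)

page = Page({"_id": 1, "name": "Alice"})
page.commit_update({"_id": 1, "name": "Alice", "city": "Pune"})
print(page.read())   # the latest committed version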
Any changes to the tree structure due to the updates, such as splitting pages when they grow too large, relocating data, and so on, are later reconciled by a background process. This accounts for the fast write operations of the WiredTiger engine; the task of data arrangement is left to the background process. See Figure 8-13.
WiredTiger uses the MVCC approach to
ensure concurrency control wherein multiple versions of the data
are maintained. It also ensures that every thread that is trying to
access the data sees the most consistent version of the data. As
you have seen, the writes are not in place; instead they are
appended on top of the data in a skipList data structure with the
most recent update on the top. Threads accessing the data get the
latest copy, and they continue to work with that copy uninterrupted
until the time they commit. Once they commit, the update is
appended at the top of the list and thereafter any thread accessing
the data will see that latest update.
This enables multiple threads to access
the same data concurrently without any locking or contention. This
also enables the writers to create new versions concurrently with
the readers. The conflict occurs only when multiple threads are
trying to update the same record. In that case, one update wins and
the other concurrent update needs to retry.
The WiredTiger journal ensures that
writes are persisted to disk between checkpoints. WiredTiger uses
checkpoints to flush data to disk by default every 60 seconds or
after 2GB of data has been written. Thus, by default, WiredTiger
can lose up to 60 seconds of writes if running without journaling,
although the risk of this loss will typically be much less if using
replication for durability. The WiredTiger transaction log is not
necessary to keep the data files in a consistent state in the event
of an unclean shutdown, and so it is safe to run without journaling
enabled, although to ensure durability the “replica safe” write
concern should be configured. Another feature of the WiredTiger
storage engine is the ability to compress the journal on disk,
thereby reducing storage space.
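As a hedged illustration of configuring a replica-acknowledged write concern from PyMongo (the exact settings depend on your durability requirements; requiring majority acknowledgement plus a journal commit is a common choice, and the database/collection names here are placeholders):

from pymongo import MongoClient, WriteConcern

client = MongoClient("localhost", 27017)
db = client.practicaldb

# Require acknowledgement from a majority of replica set members and a journal
# commit before the write is considered successful.
users = db.get_collection("users",
                          write_concern=WriteConcern(w="majority", j=True))
users.insert_one({"name": "Alice"})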
8.5 How Data Is Written Using Journaling
MongoDB disk writes are lazy, which means that if there are 1,000 increments in one second, the data will only be written once. The physical write occurs a few seconds after the operation.
As you know, in the MongoDB system, mongod is the primary daemon process. So the disk has the data files and the journal files. See Figure 8-15.
When the mongod is started, the data files
are mapped to a shared view. In other words, the data file is
mapped to a virtual address space. See Figure 8-16.
Basically, the OS recognizes that your data file is 2000 bytes on disk, so it maps this to the memory address range 1,000,000 - 1,002,000. Note that the data will not actually be loaded until it is accessed; the OS just maps it and keeps the mapping.
Up to this point, the memory is still backed by the data files. Thus, any change in memory will be flushed to the underlying files by the OS. This is how the mongod works when journaling is not enabled: every 60 seconds the in-memory changes are flushed by the OS.
Now let's look at writes with journaling enabled. When journaling is enabled, a second mapping of the data files is made to a private view by the mongod. That's why the amount of virtual memory used by mongod doubles when journaling is enabled. See Figure 8-17.
You can see in Figure 8-17
how the data file is not directly connected to the private view, so
the changes will not be flushed from the private view to the disk
by the OS.
Let's see what sequence of events happens when a write operation is initiated. When a write operation is initiated, it first writes to the private view (Figure 8-18).
Next, the changes are written to the
journal file, appending a brief description of what’s changed in
the files (Figure 8-19).
The journal keeps appending the change descriptions as and when it receives changes. If the mongod fails at this point, the journal can replay all the changes even though the data file has not yet been modified, which makes the write safe.
Finally, at a very fast rate, the changes are written to the disk. By default, the mongod requests the OS to do this every 60 seconds (Figure 8-21).
In the last step, the shared view is
remapped to the private view by the mongod. This is done to prevent
the private view from getting too dirty (Figure 8-22).
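The write path just described can be summarized in a small conceptual sketch (plain Python, not mongod internals): writes go to the private view, a change description is appended to the journal, the journal is applied to the shared view, the OS flushes the shared view to disk, and the private view is then remapped.

shared_view = {}    # memory mapped to the data files (flushed by the OS)
private_view = {}   # mongod's second mapping; never flushed directly
journal = []        # append-only log of change descriptions

def write(key, value):
    private_view[key] = value                 # step 1: write to the private view
    journal.append((key, value))              # step 2: append a change description

def flush_to_disk(view):
    pass  # placeholder; the real flush is done by the OS against the mapped files

def group_commit():
    for key, value in journal:                # step 3: apply the journal to the shared view
        shared_view[key] = value
    flush_to_disk(shared_view)                # step 4: OS flushes the shared view (roughly every 60s)
    journal.clear()
    private_view.clear()
    private_view.update(shared_view)          # step 5: remap the private view

write("user:1", {"name": "Alice"})
group_commit()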
8.6 GridFS – The MongoDB File System
You have looked at what happens under the hood. You saw that MongoDB stores data in BSON documents, which have a size limit of 16MB.
GridFS is MongoDB’s specification for
handling large files that exceed BSON’s document size limit. This
section will briefly cover GridFS.
Here “specification” means that it is
not a MongoDB feature by itself, so there is no code in MongoDB
that implements it. It just specifies how large files need to be
handled. The language drivers such as PHP, Python, etc. implement
this specification and expose an API to the user of that driver,
enabling them to store/retrieve large files in MongoDB.
8.6.1 The Rationale of GridFS
By design, a MongoDB document (i.e. a
BSON object) cannot be larger than 16MB. This is to keep
performance at an optimum level, and the size is well suited for
our needs. For example, 4MB of space might be sufficient for
storing a sound clip or a profile picture. However, if the
requirement is to store high quality audio or movie clips, or even
files that are more than several hundred megabytes in size,
MongoDB has you covered by using GridFS.
GridFS specifies a mechanism for dividing
a large file among multiple documents. The language driver that
implements it, for example, the PHP driver, takes care of the
splitting of the stored files (or merging the split chunks when
files are to be retrieved) under the hood. The developer using the
driver does not need to know of such internal details. This way
GridFS allows the developer to store and manipulate files in a
transparent and efficient way.
GridFS uses two collections for storing
the file. One collection maintains the metadata of the file and
the other collection stores the file’s data by breaking it into
small pieces called chunks. This means the file is divided into
smaller chunks and each chunk is stored as a separate document. By
default the chunk size is limited to 255KB.
This approach not only makes the storing of data scalable and easy but also makes range queries easier to use when a specific portion of a file is retrieved.
8.6.2 GridFS Under the Hood
There's no "special case" handling done at the MongoDB server for GridFS requests. All the work is done on the client side.
GridFS enables you to store large files
by splitting them up into smaller chunks and storing each of the
chunks as separate documents. In addition to these chunks, there’s
one more document that contains the metadata about the file. Using
this metadata information, the chunks are grouped together,
forming the complete file.
The storage overhead for the chunks can
be kept to a minimum, as MongoDB supports storing binary data in
documents.
The two collections used by GridFS for storing large files are by default named fs.files and fs.chunks, although a bucket name different from fs can be chosen. The chunks are stored by default in the fs.chunks collection; if required, this can be overridden. All of the file data is contained in the fs.chunks collection.
The fs.files collection stores the
metadata for each file. Each document within this collection
represents a single file in GridFS. In addition to the general
metadata information, each document of the collection can contain
custom metadata specific to the file it’s representing.
The following are the keys that are mandated by the GridFS specification:
-
_id: The unique identifier of the stored file.
-
length: The size of the file's contents in bytes.
-
chunkSize: The size of each chunk in bytes (255KB by default).
-
uploadDate: The date when the file was first stored.
-
md5: This is generated on the server side and is the md5 checksum of the file's contents. The MongoDB server generates its value by using the filemd5 command, which computes the md5 checksum of the uploaded chunks. This implies that the user can check this value to ensure that the file was uploaded correctly.
8.6.3 Using GridFS
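The original code figures are not reproduced in this text; the following PyMongo sketch (the database name practicaldb and the filename are illustrative) shows how a small file might be stored with GridFS and how the resulting fs.files and fs.chunks documents can be inspected, producing output similar to what is shown next.

from pymongo import MongoClient
import gridfs

client = MongoClient("localhost", 27017)
db = client.practicaldb
fs = gridfs.GridFS(db)

# Store a small text file in GridFS.
file_id = fs.put(b"This is my new sample file. It is just grand!",
                 filename="samplefile.txt")

# Inspect the metadata document and the chunk document that GridFS created.
print(list(db.fs.files.find()))
print(list(db.fs.chunks.find()))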
[{u'length': 38, u'_id': ObjectId('52fdd6189cd2fd08288d5f5c'), u'uploadDate': datetime.datetime(2014, 11, 4, 4, 20, 41, 800000), u'md5': u'332de5ca08b73218a8777da69293576a', u'chunkSize': 262144}]
[{u'files_id':
ObjectId('52fdd6189cd2fd08288d5f5c'), u'_id':
ObjectId('52fdd6189cd2fd08288d5f5d'), u'data': Binary('This is my
new sample file. It is just grand!', 0), u'n': 0}]
Let's force split the file. This is done by specifying a small chunkSize when creating the file, like so:
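A hedged PyMongo sketch of this (the 10-byte chunk size is deliberately tiny just to force multiple chunks; collection setup as before):

from pymongo import MongoClient
import gridfs

client = MongoClient("localhost", 27017)
db = client.practicaldb
fs = gridfs.GridFS(db)

# Store the same content with a very small chunk size so it is split
# across several fs.chunks documents.
file_id = fs.put(b"This is my new sample file. It is just grand!",
                 filename="samplefile.txt",
                 chunkSize=10)

print(db.fs.chunks.count_documents({"files_id": file_id}))  # several chunks now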
You now know how the file is actually stored in the database. Next, using the client driver, you will read the file back:
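A hedged PyMongo sketch of reading the file back by its filename (the driver fetches the chunks from fs.chunks and reassembles them transparently):

from pymongo import MongoClient
import gridfs

client = MongoClient("localhost", 27017)
db = client.practicaldb
fs = gridfs.GridFS(db)

# Read the most recent version of the file stored under this name.
grid_out = fs.get_last_version("samplefile.txt")
print(grid_out.read())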
The user need not be aware of the chunks at all. You simply use the APIs exposed by the client driver to read and write files from GridFS.
8.6.3.1 Treating GridFS More Like a File System
You can pass any number of keyword arguments to new_file(). These will be added to the fs.files document:
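A hedged sketch of the kind of call that could produce a document like the one shown next (the content written here is a stand-in, chosen only so that its length matches the output; the extra attribute names mirror that output):

from pymongo import MongoClient
import gridfs

client = MongoClient("localhost", 27017)
db = client.practicaldb
fs = gridfs.GridFS(db)

# Any extra keyword arguments are stored on the file's fs.files document.
f = fs.new_file(filename="practicalfile.txt",
                contentType="text/plain",
                my_other_attribute=42)
f.write(b"8 bytes.")   # stand-in payload; the original content is not shown in the book
f.close()

print(db.fs.files.find_one({"filename": "practicalfile.txt"}))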
{u'contentType': u'text/plain', u'chunkSize': 262144, u'my_other_attribute': 42, u'filename': u'practicalfile.txt', u'length': 8, u'uploadDate': datetime.datetime(2014, 11, 4, 9, 1, 32, 800000), u'_id': ObjectId('52fdd8db9cd2fd08288d5f66'), u'md5': u'681e10aecbafd7dd385fa51798ca0fd6'}
A file can be overwritten using its filename. Since _id is used for indexing files in GridFS, the old file is not removed; instead, a version of the file is maintained. In this case, get_version or get_last_version can be used to retrieve the file by its filename.
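A hedged sketch of this versioning behavior (collection setup as before): the second put creates a new version rather than replacing the first, and deleting by _id removes only that one version.

from pymongo import MongoClient
import gridfs

client = MongoClient("localhost", 27017)
db = client.practicaldb
fs = gridfs.GridFS(db)

# Writing twice with the same filename keeps both documents; each has its own _id.
old_id = fs.put(b"version one", filename="practicalfile.txt")
new_id = fs.put(b"version two", filename="practicalfile.txt")

print(fs.get_last_version("practicalfile.txt").read())   # b'version two'
print(fs.get_version("practicalfile.txt", 0).read())     # b'version one' (oldest)

# Deleting by _id removes only that version of the file.
fs.delete(old_id)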
Note that only one version of practicalfile.txt was removed. You still have a file named practicalfile.txt in GridFS.
8.7 Indexing
In this part of the book, you will briefly examine what an index is in the MongoDB context. Following that, we will highlight the various types of indexes available in MongoDB, concluding the section with their behaviors and limitations.
An index is a data structure that speeds up read operations. In layman's terms, it is comparable to a book index, where you can reach any chapter by looking it up in the index and jumping directly to the page number, rather than scanning the entire book to reach the chapter, which would be the case if no index existed.
Similarly, an index is defined on fields, which can help in searching for information in a more efficient manner.
As in other databases, indexes in MongoDB serve a similar purpose: they are used to speed up the find() operation. The types of queries you run help determine efficient indexes for the databases. For example, if most of the queries use a Date field, it would be beneficial to create an index on the Date field. It can be tricky to figure out which index is optimal for your query, but it is worth the effort because queries that otherwise take minutes will return results instantaneously if a proper index is in place.
In MongoDB, an index can be created on any field or sub-field of a document. Let's now look at the various types of indexes that can be created in MongoDB.
8.7.1 Types of Indexes
8.7.1.2 Secondary Indexes
MongoDB indexes maintain references to the fields. The references are maintained in either ascending or descending order. This is done by specifying a number with the key when creating an index; this number indicates the index direction. The possible options are 1 and -1, where 1 stands for ascending and -1 stands for descending.
In a single-key index, the direction might not be too important; however, it is very important in compound indexes.
Consider an Events collection that
includes both username and timestamp. Your query is to return
events ordered by username first and then with the most recent
event first. The following index will be used:
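The book's original shell command is not reproduced in this text; a PyMongo equivalent (assuming an events collection in a placeholder database) would look something like this:

from pymongo import MongoClient, ASCENDING, DESCENDING

client = MongoClient("localhost", 27017)
db = client.practicaldb

# username ascending, timestamp descending: events are grouped by username,
# and within each username the most recent event comes first.
db.events.create_index([("username", ASCENDING), ("timestamp", DESCENDING)])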
At times you may need to ensure that the values being stored in the indexed field are unique. In such cases, you can create indexes with the unique property set to true (by default it's false).
Say you want a unique index on the field user_id. The following command can be run to create the unique index:
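A PyMongo sketch of the command (the users collection name is an assumption; the shell equivalent passes {unique: true} to ensureIndex/createIndex):

from pymongo import MongoClient, ASCENDING

client = MongoClient("localhost", 27017)
db = client.practicaldb

# Reject any insert or update that would duplicate an existing user_id value.
db.users.create_index([("user_id", ASCENDING)], unique=True)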
This command ensures that you have unique values in the user_id field. A few points that you need to note about the uniqueness constraint follow.
If you are creating a unique index on a collection that already has documents, the creation might fail because some documents may contain duplicate values in the indexed field. In such scenarios, the dropDups option can be used to force creation of the unique index. It works by keeping the first occurrence of the key value and deleting all the documents with subsequent values. By default, dropDups is false.
A sparse index is an index that holds entries only for the documents within a collection that have the field on which the index is created. If you want to create a sparse index on the LastName field of the User collection, the following command can be issued:
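A PyMongo sketch of the command (the shell equivalent passes {sparse: true} to ensureIndex):

from pymongo import MongoClient, ASCENDING

client = MongoClient("localhost", 27017)
db = client.practicaldb

# Only documents that actually contain a LastName field get an index entry.
db.User.create_index([("LastName", ASCENDING)], sparse=True)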
The index is said to be sparse because it contains entries only for documents that have the indexed field and skips documents in which the field is missing. Due to this nature, sparse indexes provide significant space savings.
In contrast, a non-sparse index includes all documents irrespective of whether the indexed field is available in the document or not; a null value is stored when the field is missing.
A new index property was introduced in version 2.2 that enables you to remove documents from the collection automatically after the specified time period has elapsed. This property (a TTL index) is ideal for scenarios such as logs, session information, and machine-generated event data, where the data needs to persist only for a limited period.
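As a hedged example, a TTL index on a hypothetical session_log collection might look like this in PyMongo; documents are removed roughly an hour after the time stored in their createdAt field.

from pymongo import MongoClient, ASCENDING

client = MongoClient("localhost", 27017)
db = client.practicaldb

# A background task removes documents once createdAt is more than 3600 seconds old.
db.session_log.create_index([("createdAt", ASCENDING)], expireAfterSeconds=3600)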
With the rise of the smartphone, it’s
becoming very common to query for things near a current location.
In order to support such location-based queries, MongoDB provides
geospatial indexes.
A geospatial index assumes that the
values will range from -180 to 180 by default. If this needs to
be changed, it can be specified along with ensureIndex as
follows:
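A hedged PyMongo equivalent (the places collection, the loc field, and the -500/500 range are illustrative; the shell version passes min and max to ensureIndex):

from pymongo import MongoClient, GEO2D

client = MongoClient("localhost", 27017)
db = client.practicaldb

# Override the default -180..180 coordinate range for the 2d index.
db.places.create_index([("loc", GEO2D)], min=-500, max=500)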
Any documents with values beyond the
maximum and the minimum values will be rejected. You can also
create compound geospatial indexes.
Let's understand how this index works with an example. Say you have documents of the following type:
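The book's sample documents are not reproduced in this text; they might look something like the following (the collection and field names are assumptions, chosen to be consistent with the index that follows):

from pymongo import MongoClient

client = MongoClient("localhost", 27017)
db = client.practicaldb

db.shops.insert_one({"name": "Star Coffee",
                     "type": "coffee shop",
                     "loc": [55.48, 42.57]})   # a coordinate pair within the default -180..180 range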
If the user's query is to find all coffee shops near her location, the following compound index can help:
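A hedged PyMongo sketch of such a compound index, using the field names from the sample document above:

from pymongo import MongoClient, GEO2D, ASCENDING

client = MongoClient("localhost", 27017)
db = client.practicaldb

# The 2d key supports $near queries on loc; the additional type key lets the
# query filter for coffee shops using the same index.
db.shops.create_index([("loc", GEO2D), ("type", ASCENDING)])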
Geohaystack indexes are bucket-based geospatial indexes (also called geospatial haystack indexes). They are useful for queries that need to find locations in a small area and also need to be filtered along another dimension, such as finding documents with coordinates within 10 miles and a type field value of restaurant.
While defining the index, it's mandatory to specify the bucketSize parameter, as it determines the granularity of the haystack index. For example,
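a hedged PyMongo sketch might look like this (the collection and field names are illustrative; note that geoHaystack indexes require MongoDB versions before 5.0, where they were removed):

from pymongo import MongoClient, ASCENDING

client = MongoClient("localhost", 27017)
db = client.practicaldb

# bucketSize=1 groups keys within 1 unit of longitude/latitude into the same bucket;
# type is the additional field searched within each bucket.
db.shops.create_index([("loc", "geoHaystack"), ("type", ASCENDING)],
                      bucketSize=1)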
This example creates an index wherein
keys within 1 unit of latitude or longitude are stored together
in the same bucket. You can also include an additional category
in the index, which means that information will be looked up at
the same time as finding the location details.
If your use case typically searches for
"nearby" locations (i.e. "restaurants within 25
miles"), a haystack index can be more efficient.
The matches for the additional indexed
field (e.g. category) can be found and counted within each
bucket.
If, instead, you are searching for
"nearest restaurant" and would like to return results
regardless of distance, a normal 2d index will be more efficient.
8.7.1.3 Index Intersection
Index intersection was introduced in version 2.6, wherein multiple indexes can be intersected to satisfy a query. To explain it a bit further, let's consider a products collection that holds documents of the following format:
{ "_id": ObjectId(...), "category": ["food", "grocery"], "item": "Apple", "location": "16th Floor Store", "arrival": Date(...) }
With two single-field indexes in place, you can run explain() on a query that filters on both fields to determine whether index intersection is used; a sketch follows this paragraph. The explain output will include either of the following stages: AND_SORTED or AND_HASH. When doing index intersection, either the entire index or only the index prefix can be used.
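The original example indexes and query are not reproduced in this text; the following hedged PyMongo sketch creates two single-field indexes that MongoDB could intersect to satisfy a query on both fields:

from pymongo import MongoClient, ASCENDING

client = MongoClient("localhost", 27017)
db = client.practicaldb

db.products.create_index([("item", ASCENDING)])
db.products.create_index([("location", ASCENDING)])

# A query filtering on both fields is a candidate for index intersection.
cursor = db.products.find({"item": "Apple", "location": "16th Floor Store"})
print(cursor.explain())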
Next, you need to understand how the index intersection feature impacts compound index creation.
When creating a compound index, both the order in which the keys are listed in the index and the sort order (ascending or descending) of each key matter. Thus, a compound index may not support a query that does not use the index prefix or that uses keys with a different sort order.
To explain it a bit further, let’s
consider a products collection that has the following compound
index:
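A hedged sketch of such a compound index in PyMongo (the key order shown matches the discussion that follows):

from pymongo import MongoClient, ASCENDING

client = MongoClient("localhost", 27017)
db = client.practicaldb

# Compound index with item as the prefix key, followed by location.
db.products.create_index([("item", ASCENDING), ("location", ASCENDING)])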
In addition to queries that refer to all the fields of the compound index, the above compound index can also support queries that use an index prefix (for example, queries that use only the item field). But it won't be able to support queries that use only the location field, or that use the item key with a different sort order.
Instead, if you create two separate indexes, one on item and the other on location, these two indexes can, either individually or through intersection, support the queries mentioned above. Thus, the choice between creating a compound index and relying on intersection of indexes depends on the system's needs.
Note that index intersection will not
apply when the sort() operation needs an index that is completely
separate from the query predicate.
That is, MongoDB will not use the { item: 1 } index for the query, and the separate { location: 1 } or { location: 1, arrival_date: -1 } index for the sort.
8.7.2 Behaviors and Limitations
Finally, the following are a few
behaviors and limitations that you need to be aware of:
-
A collection cannot have more than 64 indexes.
-
Index keys cannot be larger than 1024 bytes.
-
An index name (including the namespace) must be less than 128 characters.
-
Since each clause of an $or query executes in parallel, each can use a different index.
-
Queries that use the $or operator are not supported by 2d geospatial queries.