9.5 Monitoring MongoDB
As a MongoDB server administrator, it’s
important to monitor the system’s performance and health. In this
section, you will learn ways of monitoring the system.
9.5.1 mongostat
mongostat comes as part of the MongoDB
distribution. This tool provides simple statistics on the server;
although it's not extensive, it provides a good overview. The
following shows the statistics of a mongod running on the localhost. Open a terminal
window and execute the following:
mongostat
The first six columns show the rates at
which various operations are handled by the mongod server. Apart
from these columns, the following column is also worth mentioning
and can be of use when diagnosing problems:
-
conn: This is an indicator of the
number of connections to the mongod instance. A high value here
can indicate that connections are not being released or closed
by the application; that is, the application opens connections
but does not close them after the operation completes.
Starting from version 3.0, mongostat can
also return its response in JSON format using the --json option.
{"ANOC9":{"ar|aw":"0|0","command":"1|0","conn":"1","delete":"*0","faults":"1","flushes":"0","getmore":"0","host":"ANOC9","insert":"*0","locked":"",
"mapped":"560.0M","netIn":"79b","netOut":"10k","non
mapped":"","qr|qw":"0|0","query":"*0","res":"153.0M","time":"05:16:17","update":"*0","vsize":"1.2G"}}
9.5.2 mongod Web Interface
Whenever a mongod is started up, it
also creates a web port, which by default is 1,000 higher than the port
number mongod uses to listen for connections; with the default
connection port of 27017, the HTTP port is 28017.
This mongod web interface is accessed via
your web browser, and it displays most of the statistical
information. If the mongod is running on localhost and is listening
for the connections on port 27017, then the HTTP status page can be
accessed using the following URL: http://localhost:28017. The page
looks like Figure 9-2.
9.5.3 Third-Party Plug-Ins
In addition to these tools, there are
various third-party adapters available for MongoDB that let you
use common open source or commercial monitoring systems such as
Cacti, Ganglia, etc. On its website, 10gen maintains a page that
shares the latest information about available MongoDB monitoring
interfaces.
To get an up-to-date list of third-party
plug-ins, go to
www.mongodb.org/display/DOCS/Monitoring+and+Diagnostics.
9.5.4 MongoDB Cloud Manager
In addition to the tools and techniques
discussed above for monitoring and backup purposes, there is the
MongoDB Cloud Manager (formerly known as MMS, MongoDB Monitoring
Service). It's developed by the team behind MongoDB and
is free to use with a 30-day trial license. In contrast to the
techniques discussed above, MongoDB Cloud Manager provides a user
interface as well as logs and performance details in the form of
graphs and charts.
MongoDB Cloud Manager charts are
interactive, enabling the user to set a custom date range, as
depicted in Figure 9-3.
Another neat feature of the Cloud Manager
is the ability to use email and text alerts in case of different
events. This is depicted in Figure 9-4.
Not only does Cloud Manager provide
graphs and alerts, it also lets you view the slower queries ordered
by response time, so you can easily see how your queries are
performing, all in one place. Figure 9-5
shows the graph that charts query performance.
For AWS users, it offers direct
integration so that MongoDB can be launched on AWS without ever
leaving Cloud Manager. You saw how to provision with AWS in Chapter
tk.
Cloud Manager also helps you discover
inefficiencies in your system and make corrections for smooth
operation.
It collects and reports metrics using an
agent you install. Cloud Manager provides a quick glance at the
MongoDB system's health and helps you identify the root causes of
performance issues.
Next, you will look at the key metrics
that should be used for any performance investigation. Along the
way, you will also look at what combinations of these metrics
indicate.
9.5.4.1 Metrics
You will primarily focus on the
following key metrics; they play a key role when
investigating a performance problem. They provide an
immediate glance at what's happening inside the MongoDB system
and which of the system resources (i.e., CPU, RAM, or disk) are the
bottleneck.
To view the charts mentioned below,
click the Deployment link under the Deployment section, select the
MongoDB instance that has been configured to be monitored by Cloud
Manager, and then select the required graphs/charts from the Manage
Charts section.
Page Faults shows the average number of
page faults per second occurring in the system. Figure 9-6
shows the page faults graph.
OpCounters shows the average number of
operations per second being performed on the system. See Figure
9-7.
For the Page Faults to Opcounters ratio:
page faults depend on the operations being performed on the
system and on what's currently in memory. Hence the ratio of page
faults per second to opcounters per second can provide a
fair picture of the disk I/O requirement. See Figure 9-8.
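As a sketch of how this ratio can be derived yourself, here are two serverStatus-style samples taken some interval apart. The field paths (extra_info.page_faults and opcounters) exist in serverStatus output; the numbers here are invented for illustration:

```javascript
// Two snapshots of serverStatus-style counters, taken one sample apart.
var t0 = { extra_info: { page_faults: 1000 },
           opcounters: { insert: 500, query: 2000, update: 300, delete: 50 } };
var t1 = { extra_info: { page_faults: 1150 },
           opcounters: { insert: 700, query: 2600, update: 400, delete: 50 } };

// Sum all operation counters in one snapshot.
function totalOps(s) {
  var sum = 0;
  for (var k in s.opcounters) sum += s.opcounters[k];
  return sum;
}

var faults = t1.extra_info.page_faults - t0.extra_info.page_faults; // 150
var ops = totalOps(t1) - totalOps(t0);                              // 900
var ratio = faults / ops; // ~0.17: roughly one page fault per six operations
```

A ratio approaching 1 would suggest that nearly every operation is hitting disk, a sign that the working set no longer fits in RAM.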
The Queues graph displays the number of
operations waiting for a lock to be released at any given time. See
Figure 9-9.
The CPU Time (IOWaits and User) graph
shows how the CPU cores are spending their cycles. See Figure
9-10.
IOWait indicates the time the CPU spends
waiting for other resources, such as disks or the network. See
Figure 9-11.
User time indicates the time spent
performing computations such as updating documents, updating and
rebalancing indexes, selecting or ordering query results, or
running aggregation framework commands, Map/Reduce, or server-side
JavaScript. See Figure 9-12.
9.6 Summary
10. MongoDB Use Cases
Shakuntala Gupta Edward and Navin Sabharwal
10.1 Use Case 1 - Performance Monitoring
In this section, you will explore how to
use MongoDB to store and retrieve performance data. You'll focus
on the data model that you will use for storing the data;
retrieving will consist of simply reading from the respective
collection. You will also look at how you can apply sharding
and replication for better performance and data safety.
We assume a monitoring tool that is
collecting server-defined parameter data in CSV format. In
general, monitoring tools either store the data as text files
in a designated folder on the server or redirect their output
to a reporting database server. In this use case, there's a
scheduler that reads this shared folder path and
imports the data into the MongoDB database.
10.1.1 Schema Design
The first step in designing a solution is
to decide on the schema. The schema depends on the format of the
data that the monitoring tool is capturing.
Although this captures the data, it makes
no sense to the user; if you want to find out which events are from a
particular server, you need to use regular expressions, which
lead to a full scan of the collection, which is very inefficient.
Instead, you can extract the data from
the log file and store it as meaningful fields in MongoDB
documents.
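As an illustration, assuming the monitoring tool emits CSV lines of the form host,timestamp,event (these field names are hypothetical, chosen to match the queries later in this section), the extraction step might look like:

```javascript
// Hypothetical sketch: turn a raw CSV monitoring line into a structured
// document with typed fields, instead of storing the whole line as a string.
function parseLogLine(line) {
  var parts = line.split(",");        // assumes: host,timestamp,event
  return {
    Host: parts[0],                   // queryable string field
    GeneratedOn: new Date(parts[1]),  // native date, not a string
    Event: parts[2]
  };
}

var doc = parseLogLine("Host1,2015-07-10T05:16:17Z,disk full");
// doc.Host === "Host1"; doc.GeneratedOn is a Date; doc.Event === "disk full"
```

With fields broken out like this, a query on Host is an equality match that can use an index, rather than a regular expression scan.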
Note that when designing the structure,
it’s very important to use the correct data type. This not only
saves space but also has a significant impact on the performance.
For example, if you store the date and
time field of the log as a string, it will not only use more bytes
but will also make it difficult to fire date range queries. If,
instead of a string, you store the date as a UTC timestamp, it
will take 8 bytes (as opposed to 28 bytes for a string), and it will
be easier to execute date range queries. As you can see, using
proper types for the data increases querying flexibility.
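A quick illustration of the difference: formatted date strings compare lexicographically, which breaks range logic, while native dates compare correctly:

```javascript
// String dates compare character by character, so "Jul 9" sorts AFTER "Jul 10".
var s1 = "Jul 9 2015", s2 = "Jul 10 2015";
var wrong = s1 < s2;   // false: '9' > '1' in the fifth character

// Native dates compare by their underlying timestamp, so ordering is correct.
var d1 = new Date("2015-07-09"), d2 = new Date("2015-07-10");
var right = d1 < d2;   // true
```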
The actual
log data might have extra fields; if you capture it all, it’ll
lead to a large document, which is an inefficient use of storage
and memory. When designing the schema, you should omit the
details that are not required. It’s very important to identify
which fields you must capture in order to meet your requirements.
10.1.2 Operations
Having designed the document structure,
next you will look at the various operations that you need to
perform on the system.
10.1.2.1 Inserting Data
1.
2.
3.
10.1.2.2 Bulk Insert
Inserting events in bulk is always
beneficial when using stringent write concerns, as in your case,
because this enables MongoDB to distribute the incurred
performance penalty across a group of inserts.
If possible, bulk inserts should be used
for inserting the monitoring data, because the data is huge
and is generated within seconds. Grouping multiple events together
and inserting them as one batch has a better impact, because
multiple events get saved in the same wait time. So for this use
case, you will group multiple events using a bulk insert.
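The grouping step can be sketched as follows; the function name and batch size are illustrative, and each batch would then be handed to a single bulk insert on the server:

```javascript
// Group a stream of event documents into fixed-size batches, so that one
// bulk insert replaces many single-document inserts.
function batchEvents(events, batchSize) {
  var batches = [];
  for (var i = 0; i < events.length; i += batchSize) {
    batches.push(events.slice(i, i + batchSize));
  }
  return batches;
}

// 1,000 events grouped into batches of 100 -> 10 bulk inserts instead of 1,000.
var events = [];
for (var i = 0; i < 1000; i++) events.push({ Host: "Host1", seq: i });
var batches = batchEvents(events, 100);
// each batch would then go to a bulk insert on the events collection
```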
10.1.2.3 Querying Performance Data
You have seen how to insert the event
data. The value of maintaining data comes when you are able to
respond to specific queries by querying the data.
For example, you may want to view all
the performance data associated with a specific field, say Host.
You will look at a few query patterns for
fetching the data, and then you will look at how to optimize these
operations.
Query2: Fetching Data Within a
Date Range from July 10, 2015 to July 20, 2015
This is important if you want to
consider and analyze the data collected for a specific date
range. In this case, an index on the time field (GeneratedOn) will
have a positive impact on the performance.
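As a sketch of what this range scan does, here is the same filter simulated over an in-memory array; the equivalent mongo shell query, following the collection and field names used later in this section, appears in the trailing comment:

```javascript
// Simulate the date-range filter over an in-memory stand-in for the collection.
var docs = [
  { Host: "Host1", GeneratedOn: new Date("2015-07-05") },
  { Host: "Host2", GeneratedOn: new Date("2015-07-15") },
  { Host: "Host1", GeneratedOn: new Date("2015-07-25") }
];
var from = new Date("2015-07-10"), to = new Date("2015-07-20");
var inRange = docs.filter(function (d) {
  return d.GeneratedOn >= from && d.GeneratedOn <= to;
});
// inRange holds only the July 15 document.
// Equivalent server-side query:
// db.perfpoc.find({GeneratedOn: {"$gte": ISODate("2015-07-10"),
//                                "$lte": ISODate("2015-07-20")}})
```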
Query3: Fetching Data Within a
Date Range from July 10, 2015 to July 20, 2015 for a Specific
Host
In such queries where multiple fields
are involved, the indexes that are used have a significant impact
on the performance. For example, for the above query, creating a
compound index will be beneficial.
Also note that the order of the fields
within the compound index has an impact. Let's understand the
difference with an example. First, create a compound index as
follows:
> db.perfpoc.ensureIndex({"GeneratedOn": 1, "Host": 1})
Next, check how the query uses it:
>
db.perfpoc.find({GeneratedOn:{"$gte":
ISODate("2015-07-10"), "$lte":
ISODate("2015-07-20")}, Host:
"Host1"}).explain("allPlansExecution")
Then drop this index, create the compound index with the field
order reversed ({"Host": 1, "GeneratedOn": 1}), and re-run the
same explain:
>
db.perfpoc.find({GeneratedOn:{"$gte":
ISODate("2015-07-10"), "$lte":
ISODate("2015-07-20")}, Host:
"Host1"}).explain("allPlansExecution")
Using explain(), you can figure out the
impact of indexes and accordingly decide on the indexes based on
your application usage.
It's also recommended to have a single
compound index covering the maximum number of queries, rather than
multiple single-key indexes.
Based on your application usage and the
results of the explain statistics, you will use only one compound
index, on {'GeneratedOn': 1, 'Host': 1}, to cover all the
above-mentioned queries.
Query4: Fetching Count of
Performance Data by Host and Day
Listing the
data is good, but most often queries on performance data are
performed to find a count, average, sum, or other
aggregate during analysis. Here you will see how to
use the aggregate command to select, process, and aggregate the
results to fulfill the need for powerful ad hoc queries.
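A minimal sketch of the grouping Query4 performs, simulated in plain JavaScript; an equivalent aggregation pipeline (using $dateToString, available from version 3.0) is shown in the trailing comment:

```javascript
// Count events per (host, day) over an in-memory stand-in for the collection.
var docs = [
  { Host: "Host1", GeneratedOn: new Date("2015-07-10T05:00:00Z") },
  { Host: "Host1", GeneratedOn: new Date("2015-07-10T09:00:00Z") },
  { Host: "Host2", GeneratedOn: new Date("2015-07-11T05:00:00Z") }
];
var counts = {};
docs.forEach(function (d) {
  var key = d.Host + "|" + d.GeneratedOn.toISOString().slice(0, 10);
  counts[key] = (counts[key] || 0) + 1;
});
// counts -> { "Host1|2015-07-10": 2, "Host2|2015-07-11": 1 }
// Equivalent aggregation pipeline:
// db.perfpoc.aggregate([
//   { $project: { Host: 1, day: { $dateToString:
//       { format: "%Y-%m-%d", date: "$GeneratedOn" } } } },
//   { $group: { _id: { host: "$Host", day: "$day" }, count: { $sum: 1 } } }
// ])
```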
10.1.3 Sharding
The performance monitoring data set is
humongous, so sooner or later it will exceed the capacity of a
single server. As a result, you should consider using a sharded
cluster.
In this section, you will look at which
shard key best suits this performance data use case, so that
the load is distributed across the cluster and no one server is
overloaded.
The shard key controls how data is
distributed and the resulting system’s capacity for queries and
writes. Ideally, the shard key should have the following two
characteristics:
1.
2.
3.
However, the
biggest potential drawback is that all data collected for a
single host must go to the same chunk, since all of its documents
have the same shard key. This will not be a problem if the
data is collected evenly across all the hosts, but if the
monitoring collects a disproportionate amount of data for one
host, you can end up with a large chunk that is completely
unsplittable, causing an unbalanced load on one shard.
4.
10.1.4 Managing the Data
Since the performance data is humongous
and it continues to grow, you can define a data retention policy
which states that you will be maintaining the data for a specified
period (say 6 months).
So how do you remove the old data? You
can use the following patterns:
1.
Multiple
collections to store the data: The third pattern is to have a
day-wise collection created, which contains documents that store
that day's performance data. This way you will end up with
multiple collections within a database. Although this
complicates querying (to fetch two days' worth of
data, you might need to read from two collections), dropping a
collection is fast, and the space can be reused effectively
without any data fragmentation. In this use case, you are using
this pattern for managing the data.
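A small sketch of how the day-wise naming might work; the perf_ prefix is an assumption for illustration, not the book's prescribed name:

```javascript
// Derive the day-wise collection name for an event's timestamp.
function collectionNameFor(date) {
  // "2015-07-10T05:16:17.000Z" -> "2015-07-10" -> "perf_20150710"
  return "perf_" + date.toISOString().slice(0, 10).replace(/-/g, "");
}

var name = collectionNameFor(new Date("2015-07-10T05:16:17Z"));
// name === "perf_20150710"
// inserting would target db[name]; retiring a day older than the retention
// period is then a single, cheap db[oldName].drop()
```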
10.2 Use Case 2 – Social Networking
In this section, you will explore how to
use MongoDB to store and retrieve data of a social networking site.
This use case
is basically a friendly social networking site that allows users to
share their statuses and photos. The solution provided for this use
case assumes the following:
1.
2.
3.
4.
10.2.1 Schema Design
The solution you are providing aims
at minimizing the number of documents that must be loaded in order
to display any given page. The application in question has two
main pages: the user's wall (which is
intended to display posts created by or directed to a particular
user), and a social news page, where all the notifications and
activities of all the people the user is following, or who are
following the user, are displayed.
In addition to these two pages, there is
a user profile page that displays the user's profile-related
details, with information on their friend group (those who follow
them or whom they follow). In order to cater to this
requirement, the schema for this use case consists of the
following collections.
{Commentedby: {id: "user_id",
name: "user name"}, ts: ISODate(), Commenttext: "comment
text"}, .....
-
This collection is basically for
displaying all of the user's activities. The by field provides
information on the user who created the post. circles controls the
visibility of the post to other users. type is used to identify
the content of the post. ts is the datetime when the post was
created. detail contains the post text, with comments
embedded within it.
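Pulling the described fields together, a hypothetical post document could look like the following (the field values and the exact shapes are placeholders, assembled from the descriptions above rather than copied from the book's listing):

```javascript
// A hypothetical post document combining the fields described above.
var post = {
  by: { id: "user_id", name: "user name" },   // who created the post
  circles: ["public"],                        // visibility control
  type: "status",                             // kind of content
  ts: new Date(),                             // creation datetime
  detail: {
    text: "post text",
    comments: [                               // comments embedded in the post
      { Commentedby: { id: "user_id", name: "user name" },
        ts: new Date(),
        Commenttext: "comment text" }
    ]
  }
};
```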
The third collection is user.wall, which
is used for rendering the user's wall page in a fraction of a
second. This collection takes data from the second collection
and stores it in a summarized format for fast rendering of the
wall page.
The fourth collection is social.posts,
which is used for quick rendering of the social news screen. This
is the screen where all posts get displayed.
10.2.2 Operations
10.2.2.1 Viewing Posts
Since the social.posts and user.wall
collections are optimized for rendering the news feed or wall
posts in a fraction of a second, the query is fairly
straightforward.
Both collections have a similar
schema, so the fetch operation can be supported by the same code.
Below is the pseudocode; the function takes the
following parameters:
The above function retrieves all the
posts on the given user’s wall or news feed in
reverse-chronological order.
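The fetch described above can be sketched as follows; the parameter and field names (posts, user_id, ts, limit) are assumptions standing in for the book's pseudocode, and the array stands in for the user.wall or social.posts collection:

```javascript
// Return a user's posts in reverse-chronological order, capped at `limit`.
function getPosts(posts, userId, limit) {
  return posts
    .filter(function (p) { return p.user_id === userId; })  // this user's feed
    .sort(function (a, b) { return b.ts - a.ts; })           // newest first
    .slice(0, limit);                                        // one page's worth
}

var feed = getPosts([
  { user_id: "u1", ts: 1, text: "old" },
  { user_id: "u1", ts: 2, text: "new" },
  { user_id: "u2", ts: 3, text: "other" }
], "u1", 10);
// feed[0].text === "new"
```

In the real system, the same shape of query runs server-side with find(), sort({ts: -1}), and limit().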
When rendering posts, there are certain
checks that you need to apply. The following are a few of them.
First, when users are viewing their
own page, while rendering posts on the wall you need to check
whether each post can be displayed on their wall. A user's wall
contains the posts they have created and the posts of the users
they are following. The following function takes two parameters:
the user to whom the wall belongs, and the post that is being
rendered:
The above loop goes through the circles
specified in the user.profile collection, and if the mentioned
post was posted by a user on the list, it returns true.
This function first checks whether the
post's circle is public. If it's public, the post will be
displayed to all users.
If the post's circle is not set to
public, the post will be displayed if the viewer is following
the poster. If neither is true, the function goes through the
circles the post was shared with: if the viewer belongs to one of
those circles, the viewer is receiving the post, so the post will
be visible. If none of these conditions is met, the post will not
be visible to the user.
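The checks described above can be sketched as follows; the data layout is an assumption (post.circles as a list of circle names, posterCircles mapping a circle name to its member ids), not the book's exact schema:

```javascript
// Decide whether `viewerId` may see `post`, following the three checks above.
function canView(post, posterCircles, viewerId, viewerFollowing) {
  // 1. Public posts are visible to everyone.
  if (post.circles.indexOf("public") !== -1) return true;
  // 2. Otherwise, visible if the viewer follows the poster.
  if (viewerFollowing.indexOf(post.by.id) !== -1) return true;
  // 3. Otherwise, visible if the viewer is a member of a circle
  //    that the post was shared with.
  for (var i = 0; i < post.circles.length; i++) {
    var members = posterCircles[post.circles[i]] || [];
    if (members.indexOf(viewerId) !== -1) return true;
  }
  return false;
}
```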
10.2.2.2 Creating Comments
To create a comment by a user on a given
post containing the given text, you need to execute code similar
to the following:
Set the comment document as {"by":
{"id": commentedby["id"], "name": commentedby["name"]},
"ts": commentedon, "text": commenttext}
Since you are displaying a maximum of
three comments in both dependent collections (the user.wall and
social.posts collections), you need to run the following update
statement periodically:
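The trimming logic can be sketched as follows. This is a logic-only sketch; on the server, the same effect is achievable with an update using the $push operator's $slice modifier (e.g. {$push: {comments: {$each: [], $slice: -3}}}):

```javascript
// Keep only the `max` most recent embedded comments on a post.
function trimComments(post, max) {
  if (post.comments.length > max) {
    post.comments = post.comments.slice(-max);   // newest `max` comments
  }
  return post;
}

var trimmed = trimComments({ comments: [1, 2, 3, 4, 5] }, 3);
// trimmed.comments is now [3, 4, 5]
```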