Tuesday, 27 February 2018

When And Why To Use HugePages on Linux x86-64


Applies to:

Oracle Database - Standard Edition - Version 10.1.0.2 to 12.2.0.1 [Release 10.1 to 12.2]
Linux x86-64
This document applies only to databases with a large SGA and/or many sessions.

Many documents describe the usage of HugePages on Linux as a possibility but do not emphasize enough that HugePages becomes a requirement in certain database configurations.
For a database with a small SGA or a small number of connected users (a low 'sessions' value), configuring HugePages brings no improvement.
But for a database with a large SGA (above 2 GB) and many connected users (sessions > 500), configuring HugePages becomes mandatory in order to reduce memory usage at the OS level and to improve the overall performance of the database.

Symptoms

The following symptoms can be observed on Linux x86-64 systems:
- ORA-4030
- poor performance of the database, slow sessions, slow queries
- low free physical memory on the system or swapping
All these symptoms occur with no obvious evidence of high memory usage on the instance side: SGA + total PGA used is much less than physical memory.
Looking at the OS level, we see a high 'Filesystem cache', shown as "cache" or "buff/cache" in the output of 'free', even though the database uses direct I/O (I/O to the datafiles bypasses the filesystem cache).
A database uses direct I/O when it uses ASM or has either of the following parameter settings:
filesystemio_options=setall
filesystemio_options=directio

During the problem time, the size of PageTables in /proc/meminfo is high, e.g. it can be equal to or greater than the entire SGA.

Changes

The SGA was increased to a larger value, or the number of sessions was increased, or more users started to use the database, and one of the described symptoms started to appear.
This does not happen all the time, but only during peak hours, when there are many connected sessions at the database level.
This document applies only to databases with a large SGA and/or many connected sessions.

Cause

Many documents describe the usage of HugePages on Linux as a possibility but do not emphasize enough that it is mandatory to use HugePages in certain database configurations.
For a database with a small SGA or a small number of connected users (a low 'sessions' value), configuring HugePages brings no improvement. But for a database with a large SGA (above 2 GB) and/or many connected users (many open sessions), configuring HugePages becomes a requirement to reduce memory consumption at the OS level.
The foreground process of each session has a memory structure called a PageTable, through which that process accesses the SGA.
In a nutshell, because the default page size on Linux is small (an Intel architecture limitation), applications that use very large amounts of memory end up with page tables that become too big and unmanageable.
HugePages is the solution developed to overcome this. It is definitely not a kernel bug. For Oracle databases with a large SGA, using HugePages is the best practice, if not an outright requirement.

The bigger the SGA is, the bigger the PageTable of each process will be. A PageTable exists for each process, so with many connected sessions the sum of all PageTables grows even bigger.
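As a rough back-of-envelope sketch of this growth (not part of the original note): on x86-64 a page-table entry is 8 bytes, the default page is 4 KB and a huge page is typically 2 MB. The function name below is hypothetical, and the estimate assumes that every process has touched the whole SGA and ignores the much smaller upper levels of the page-table tree, so treat the result as an order of magnitude only.

```python
PTE_BYTES = 8              # size of one x86-64 page-table entry
SMALL_PAGE = 4 * 1024      # default 4 KB page
HUGE_PAGE = 2 * 1024**2    # typical 2 MB huge page on x86-64
GIB = 1024**3


def estimate_pagetables_bytes(sga_bytes, processes, page_size=SMALL_PAGE):
    """Rough estimate: one entry per mapped SGA page, per process.

    Assumption: every process has touched the entire SGA; upper-level
    page-table pages are ignored.
    """
    entries = -(-sga_bytes // page_size)  # ceiling division
    return entries * PTE_BYTES * processes


small = estimate_pagetables_bytes(25 * GIB, 500)
huge = estimate_pagetables_bytes(25 * GIB, 500, HUGE_PAGE)
print(f"4 KB pages: {small / GIB:.1f} GiB")      # 24.4 GiB
print(f"2 MB pages: {huge / 1024**2:.1f} MiB")   # 48.8 MiB
```

For a 25 GB SGA and 500 processes, the 4 KB estimate is in the same range as the SGA itself, while the 2 MB estimate is several hundred times smaller.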
The total size of memory structure PageTables can be seen with:
grep PageTables /proc/meminfo
The 'free' utility does not explicitly show this type of memory, but includes it in the Filesystem cache under "cache" or "buff/cache".
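For monitoring scripts, the same value can be read programmatically. The following is a minimal sketch, assuming the usual "Name: value kB" layout of /proc/meminfo; the function names are hypothetical.

```python
def pagetables_kb(meminfo_text):
    """Return the PageTables value in kB from /proc/meminfo text, or None."""
    for line in meminfo_text.splitlines():
        if line.startswith("PageTables:"):
            # Line layout: "PageTables:   <number> kB"
            return int(line.split()[1])
    return None


def read_pagetables_kb(path="/proc/meminfo"):
    """Read the live value on a Linux host."""
    with open(path) as f:
        return pagetables_kb(f.read())


# Sample line taken from the first example in this note.
sample = "PageTables:     26365324 kB\n"
print(pagetables_kb(sample))  # 26365324
```

Keeping the parser separate from the file read makes it easy to check against saved meminfo snapshots from the problem period.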
A database with a large SGA and/or many sessions must be configured with HugePages, both for performance and for a smaller memory footprint.
This could be a database with a relatively low sga_target=2-5 GB but many sessions=1000-2000, or a database with a huge sga_target=200 GB and few sessions=500.
In these particular configurations, PageTables can consume additional memory on the machine equal to or greater than the total SGA size.
The PageTables memory comes on top of the SGA and total PGA memory consumed by the database on that machine.
Some examples:
1/ For a database with:
sga_target=25 GB
sessions= 500
at peak time (near 500 sessions connected) PageTables consumes 20 GB of the machine's physical memory:
grep PageTables /proc/meminfo
PageTables: 26365324 kB
So at peak times this database would use 25 GB (SGA) + 20 GB (PageTables) + 10 GB (total PGA) = 55 GB.
A machine with 45 GB of physical RAM, which we would normally expect to easily accommodate a database with a 25 GB SGA and 10 GB total PGA, will start swapping, using 10 GB of swap.
Performance goes down, and if more users connect to the database, the machine will eventually run out of memory.
2/ For a database with:
sga_target = 250G
sessions= 5000
a. Before configuring HugePages, we see a huge 'buff/cache' of 209 GB, almost equal to the SGA:
free -g
        total   used   free   shared   buff/cache   available
Mem:      503    118    175      201          209         181
Swap:      19      1     18
b. After configuring HugePages, 'buff/cache' drops by 95% and 'free' memory increases considerably:
free -g
        total   used   free   shared   buff/cache   available
Mem:      503    210    283        0           10         291
Swap:      19      1     18

3/ For a database with:
sga_target=2Gb
sessions=500
 
a. without HugePages, with 500 connected sessions, PageTables consumes 2.5 GB from the physical memory of the machine
    grep PageTables /proc/meminfo
    PageTables: 2617248 kB
b. after configuring HugePages, with 500 connected sessions, PageTables consumes only 200 MB from the physical memory of the machine
    grep PageTables /proc/meminfo
    PageTables: 226364 kB
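
The arithmetic behind these examples can be replayed with a short sketch. The helper names are hypothetical; the input figures are the ones quoted in examples 1, 2 and 3.

```python
def shortfall_gb(ram_gb, sga_gb, pga_gb, pagetables_gb):
    """Memory the machine lacks at peak (0 if everything fits in RAM)."""
    peak = sga_gb + pga_gb + pagetables_gb
    return max(0, peak - ram_gb)


def reduction_pct(before, after):
    """Percentage drop from 'before' to 'after'."""
    return 100.0 * (before - after) / before


# Example 1: 25 GB SGA + 10 GB PGA + 20 GB PageTables on a 45 GB machine
shortfall = shortfall_gb(45, 25, 10, 20)
print(shortfall)  # 10 (GB that must go to swap)

# Example 2: 'buff/cache' before and after HugePages, in GB
cache_drop = reduction_pct(209, 10)
print(f"{cache_drop:.0f}%")  # 95%

# Example 3: PageTables before and after HugePages, in kB
pt_drop = reduction_pct(2617248, 226364)
print(f"{pt_drop:.0f}%")  # 91%
```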

- With HugePages enabled, the system uses fewer PageTables, reducing the overhead for maintaining and accessing them. Huge pages remain pinned in memory and are not replaced, so the kernel swap daemon has no work to do in managing them, and the kernel does not need to perform page table lookups for them. The smaller number of pages reduces the overhead involved in performing memory operations, and also reduces the likelihood of a bottleneck when accessing page tables.
- Without HugePages, the operating system manages memory in 4 KB pages (normally the default page size). When such pages are allocated to the SGA, the kernel must keep the lifecycle of every page (dirty, free, mapped to a process, and so on) up to date, which creates performance problems.

Solution

IF 
  SGA >= 2GB AND sessions >= 500
THEN
enabling HugePages becomes a requirement on a Linux 64-bit system.
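
For scripted health checks, the rule above could be expressed as a small predicate. This is only a sketch: the function name and parameters are hypothetical, and it encodes the strict AND form shown here, whereas the Cause section also flags databases that meet only one of the two conditions.

```python
def hugepages_required(sga_gb, sessions,
                       sga_threshold_gb=2, sessions_threshold=500):
    """True when both thresholds from the Solution rule are met."""
    return sga_gb >= sga_threshold_gb and sessions >= sessions_threshold


# The three example databases from this note
for sga_gb, sessions in [(25, 500), (250, 5000), (2, 500)]:
    print(sga_gb, sessions, hugepages_required(sga_gb, sessions))
```

All three example databases satisfy the rule; a 1 GB SGA with 100 sessions would not.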
