Thursday, 23 June 2022

Routine Database Maintenance and Optimization on PostgreSQL

 Required tasks

Just like a car, your database needs routine maintenance to help achieve optimum performance. There are four required tasks:

  • Create backup copies.
  • Vacuum the database.
  • Manage log files.
  • Update planner statistics.

Backup copies

Without a recent backup, you have little chance of recovery after a catastrophic event. Successful recovery requires the creation of backups of the data on a regular schedule.

Catastrophic events include disk failure, fire, or mistakenly dropping a critical table.

Vacuuming

PostgreSQL databases require periodic vacuuming. For many installations, it is sufficient to run the autovacuum daemon. This course will explore vacuuming in an upcoming lesson. 


Log-file management

Log output is valuable to users when diagnosing problems. You should save the server's log output. Users also need to rotate log files so that new log files are started and old ones are removed. Set a retention period for system logs by using the rds.log_retention_period parameter.

You can also use Amazon Relational Database Service (Amazon RDS) or Amazon Aurora to manage log files.

Updating planner statistics

The query planner relies on statistical information about the contents of tables to generate good plans for queries. It is important to have accurate statistics. Otherwise, poor choices of plans might degrade database performance.
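As a quick illustration, you can refresh statistics manually with the ANALYZE command; the table name below is only an example:

ANALYZE;                   -- gather statistics for every table in the current database
ANALYZE pgbench_accounts;  -- or target a single table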


Bloat 

Bloat, or fragmentation, can happen in tables and indexes. It is a side effect of PostgreSQL multiversion concurrency control (MVCC). MVCC is what enables multiple transactions or sessions to access the same rows at the same time. When a user runs UPDATE or DELETE statements, dead space is left behind. This dead space is unavailable and takes up disk space.

Under normal use, dead space is often left in the table and indexes after UPDATE and DELETE statements are run. If the database is maintained properly using VACUUM and the workload permits, this dead space can be reused. However, leftover space that is never reused occupies disk unnecessarily and slows down INSERT, UPDATE, DELETE, and SELECT commands.

When you delete a row, it is not actually deleted. It is only marked as unavailable to future transactions. This dead space is bloat. Every time you run a query against a table, the query checks the visibility flags on individual rows and index entries to determine whether they are available to the transaction. Dead rows add time to a transaction. This is an example of how bloat can lead to poor performance.

Index bloat is checked independently of table bloat. A table could have little to no bloat, but one or more of its indexes could be badly bloated. Index bloat is easier to solve because you can re-index or create a new index, whereas compacting a bloated table (for example, with VACUUM FULL) requires exclusive access to the table. You will learn more about re-indexing in a later lesson.

Finding bloat

Finding table bloat is done by using estimates. More exact methods are available, but the extra overhead rarely justifies the need for precision. PostgreSQL comes with the pgstattuple contrib module. When that module is installed, users can get an accurate assessment of the amount of bloat in their database.


pgstattuple

pgstattuple is a simple and powerful extension. It does not require additional libraries to be loaded. When statistics are needed, the pgstattuple extension runs specific functions on database objects.

The key function to detect bloat is the default pgstattuple(regclass). It returns information about live and dead tuples and the free space in the object.


To use the pgstattuple contrib module, follow these steps:

  1. Install the pgstattuple module.
  2. Run the CREATE EXTENSION command.
  3. Call the pgstattuple_approx function.


This function only works for tables. The following code snippet highlights the process for adding the pgstattuple contrib module and using pgstattuple_approx to retrieve information about a table. The user can view table sizes and use them to identify possible areas of bloat. It is important to note that although this function is fast, it sacrifices accuracy because it does not review each individual tuple.

CREATE EXTENSION pgstattuple;

select * from pgstattuple_approx('table');

For a precise calculation of bloat, call pgstattuple('pgbench_accounts'). This function examines every row in the entire table, which is resource intensive and will temporarily affect performance. For more information about bloat and table bloat queries, see the PostgreSQL documentation.
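The following hedged example shows the exact call; the table and index names come from the pgbench sample schema and are only illustrative. The pgstatindex function, also provided by the module, reports comparable statistics for a B-tree index:

SELECT * FROM pgstattuple('pgbench_accounts');        -- exact: scans every row
SELECT * FROM pgstatindex('pgbench_accounts_pkey');   -- index-level statistics, including leaf density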


Transaction ID wraparound

The concept of transaction IDs (XIDs) is critical to understand when you are maintaining a PostgreSQL database. XIDs can be exhausted, or used up, if not monitored properly. If exhaustion is ever reached, it will force the database to shut down. The repair could cause an indeterminate amount of downtime.

A transaction ID (XID) is assigned for every transaction that occurs in a database. PostgreSQL's MVCC transaction isolation semantics compare XID numbers to decide which row versions each transaction can see.

XID logic

A row version with an insertion XID greater than the XID of the current transaction is viewed as in the future. It is not visible to the current transaction.


XID wraparound

XID wraparound is a condition that occurs when the XID counter wraps around to zero. Suddenly transactions that were in the past appear to be in the future—which means their output becomes invisible. The database logic says that you can't see something that hasn't occurred yet.


XID limitation

XIDs are limited to 32 bits. A cluster that runs long enough (more than about 4 billion transactions) would suffer XID wraparound. This size limitation is the cause of XID wraparound.

FrozenTransactionId

To prevent XID wraparound, PostgreSQL reserves a special XID, FrozenTransactionId. The FrozenTransactionId does not follow the normal XID comparison rules. It is always considered older than every normal XID.

Frozen row versions are treated as if the insertion XID were FrozenTransactionId, so that they appear to be in the past to all normal transactions regardless of wraparound issues. Such row versions are valid until deleted.

Vacuuming

Periodic vacuuming solves the problem because the VACUUM command marks old rows as frozen. The effects of their past insertion transactions are then certain to be visible to all current and future transactions.

Avoiding XID wraparound

XID wraparound can result in loss of access to data. It is critical to vacuum every table in every database at least once every 2 billion transactions.
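A simple way to watch for wraparound risk is to track XID age; this is a general-purpose query, not tied to any particular workload:

SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;   -- databases closest to the 2 billion limit appear first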


The VACUUM command

PostgreSQL databases require periodic vacuuming. This vacuuming prevents XID wraparound from causing problems. Vacuums can run manually or automatically. This lesson will review both.

The VACUUM command marks dead space left in tables and indexes as available for reuse. Note that running VACUUM does not return disk space to the operating system. Disk space can be reclaimed, however, with some more intrusive options. Two of these options, the VACUUM FULL command and the pg_repack extension, will be covered later in this course. The drawback of VACUUM FULL is that the table cannot be in use while it runs, unlike with the standard VACUUM command; pg_repack avoids this by working online.


Manual vacuum mode and options

There are two variants of VACUUM: standard VACUUM and VACUUM FULL. We will cover the standard VACUUM first. The standard default VACUUM command vacuums the entire table and all associated indexes. It looks for free space and marks it in the Free Space Map (FSM).

The FSM keeps track of pages that have free space available for use. You can add parameters to the VACUUM command to fine-tune results and accomplish additional tasks.
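If you want to look at the FSM directly, the pg_freespacemap contrib module (assuming it is available in your installation) exposes the free space tracked for each page; the table name is illustrative:

CREATE EXTENSION pg_freespacemap;
SELECT * FROM pg_freespace('pgbench_accounts') LIMIT 10;  -- free bytes recorded per page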

To learn more about the VACUUM command and key parameters, review the following commands.

This is the basic command used to vacuum a table (the table name is a placeholder):

VACUUM table_name;

This command vacuums and performs a statistical analysis as well:

VACUUM (ANALYZE) table_name;

This command vacuums, performs a statistical analysis, initiates the FREEZE operation, and resets xmin (the identity of the inserting transaction for this row version) to 2:

VACUUM (ANALYZE, FREEZE) table_name;

This command vacuums, performs a statistical analysis, and prints the statistics:

VACUUM (ANALYZE, VERBOSE) table_name;



VACUUM FULL

VACUUM FULL actively compacts tables by writing a completely new version of the table with no dead space. This minimizes the size of the table but can take a long time. It also requires extra disk space for the new copy of the table until the operation is complete.


VACUUM (FULL, ANALYZE, VERBOSE) table_name;


The goal of routine vacuuming is to run standard VACUUMs often enough to avoid ever needing a VACUUM FULL. Running a VACUUM FULL can reclaim more disk space but runs much slower than a standard VACUUM. Another drawback is that the VACUUM FULL command requires an exclusive lock on the table it is working on, which means the table cannot be used while it runs. Administrators should strive to use a standard VACUUM and avoid VACUUM FULL if possible.


vacuumdb

The vacuumdb utility is for cleaning a database. If you have multiple databases in a cluster, remember to vacuum each one.

The vacuumdb utility will also generate internal statistics used by the query optimizer. The vacuumdb utility is a wrapper around the SQL command VACUUM. 

Autovacuum

PostgreSQL has an optional but highly recommended feature called autovacuum. Autovacuum is a daemon that automates the launch of VACUUM and ANALYZE commands (to gather statistics). Autovacuum checks for bloated tables in the database and reclaims the space for reuse. The autovacuum daemon is activated by default in Aurora and Amazon RDS PostgreSQL.


When enabled, the AUTOVACUUM background process checks for tables that have had a large number of inserted, updated, or deleted tuples.

 

These checks use the statistics collection facility. To run an AUTOVACUUM, the track_counts parameter must be set to true.



In the default configuration, autovacuuming is activated and the related configuration parameters are appropriately set. VACUUM can be run manually. However, PostgreSQL recommends that you run the autovacuum daemon.

The autovacuum daemon consists of multiple processes. There is a persistent daemon process, called the autovacuum launcher, that is in charge of starting autovacuum worker processes for all databases.

A maximum of autovacuum_max_workers worker processes can run at the same time. If there are more databases to process than the maximum, the next database is processed as soon as a worker finishes.

Each worker process will check each table in its database, and run VACUUM and ANALYZE as necessary.

To monitor the autovacuum workers' activity, you can set the log_autovacuum_min_duration parameter.
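For example, on a self-managed server you might enable this logging as follows (on Amazon RDS or Aurora, change the parameter in the DB parameter group instead); the 1000 ms threshold is only an example:

ALTER SYSTEM SET log_autovacuum_min_duration = 1000;  -- log autovacuum actions that take 1 second or longer
SELECT pg_reload_conf();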


Autovacuum uses the following workflow:

  1. Wake up.
  2. Look for a table that has hit a certain threshold (the default threshold is 50 tuples).
  3. Vacuum the table.
  4. Sleep.


Autovacuum supports the same parameters as the manual VACUUM command, with the exception of FULL.

The following settings control the behavior of the autovacuum daemon.


A maximum of autovacuum_max_workers worker processes is allowed to run at the same time. If there are more than autovacuum_max_workers databases to be processed, the next database will be processed as soon as the first worker finishes.


The default is three. You can only set this parameter at the server start.  



The autovacuum_naptime parameter specifies the minimum delay between autovacuum runs on any given database.


In each round, the daemon examines the database and issues VACUUM and ANALYZE commands as necessary for tables in that database. If this value is specified without units, it is presumed to be seconds.


The default is one minute. This parameter can only be set in the postgresql.conf file or on the server command line.




Throttling is a feature built into the autovacuum daemon. It prevents the cleanup from consuming too many resources. The autovacuum_vacuum_cost_limit parameter specifies the cost limit value that will be used in automatic VACUUM operations. If -1 is specified (which is the default), the regular vacuum_cost_limit value will be used.


Note that the value is distributed proportionally among the running autovacuum workers if there is more than one. This is so that the sum of the limits for each worker does not exceed the value of this variable.



The maximum time that a table can go unvacuumed is 2 billion transactions minus the vacuum_freeze_min_age value at the time of the last aggressive vacuum. If it were to go unvacuumed for longer than that, data loss could result. To help ensure that this does not happen, autovacuum is invoked on any table that might contain unfrozen rows with XIDs older than the age specified by the configuration parameter autovacuum_freeze_max_age.  This will happen even if autovacuum is disabled.


The autovacuum_freeze_max_age parameter specifies the maximum age (in transactions) that a table's pg_class.relfrozenxid field can attain before a VACUUM operation is forced to prevent transaction ID wraparound in the table.


Vacuum also allows removal of old files from the pg_xact subdirectory, which is why the default is a relatively low 200 million transactions. This parameter can only be set at server start, but the setting can be reduced for individual tables by changing table storage parameters. For more information, see PostgreSQL Section 24.1.5.
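For example, the limit can be lowered for a single table with a storage parameter; the table name and value here are only illustrative:

ALTER TABLE pgbench_history SET (autovacuum_freeze_max_age = 100000000);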



Autovacuum starvation

Autovacuum operates based on certain thresholds. It determines which tables to vacuum first based on:

  • Volatility and bloat present in the table
  • Number of changes to a table
  • Counters in pg_stat_user_tables, such as n_tup_upd and n_tup_del

If a table receives ultra-high-volume data changes, it effectively appears at the top of the list for autovacuum each time it runs. This reduces your available max_workers by one, which could lead to starvation. Starvation means that autovacuum is always focused on the table at the top of the list, starving the rest of the database of cleanup.
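To see which tables are generating the most dead tuples, and therefore dominating autovacuum's attention, you can query the statistics views; this is a generic query, not specific to any workload:

SELECT relname, n_tup_upd, n_tup_del, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;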


Dealing with starvation

One way to deal with starvation is to modify global thresholds using postgresql.conf. This is not preferable, especially if only one or two tables are causing issues. You can instead modify thresholds on a per-table basis by using the ALTER TABLE command.


Additionally, you can increase max_workers by the number of bloated tables you identify. You can also disable autovacuum for the problem table and run vacuum from an external job instead, as in the sketch that follows.
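A minimal sketch of that approach; the table name is hypothetical, and the VACUUM would normally be issued by a scheduler such as cron rather than by hand:

ALTER TABLE pgbench_history SET (autovacuum_enabled = false);  -- stop autovacuum from touching this table
VACUUM (ANALYZE) pgbench_history;                              -- run from an external job instead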

Per table thresholds

The default values controlling autovacuum are adequate for most tables. However, a more complex table might need more aggressive settings.
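For example, to make autovacuum trigger much earlier on a hot table, you can lower its thresholds with storage parameters; the table name and values are illustrative:

ALTER TABLE pgbench_accounts
  SET (autovacuum_vacuum_scale_factor = 0.01, autovacuum_vacuum_threshold = 1000);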

Autovacuum workers rarely block other commands. If a process attempts to acquire a lock that conflicts with the SHARE UPDATE EXCLUSIVE lock held by autovacuum, lock acquisition will interrupt the autovacuum.

Users should take care because regularly running commands that acquire locks conflicting with the SHARE UPDATE EXCLUSIVE lock can effectively prevent autovacuums from ever completing.


Data Statistics


Planner/optimizer

A query plan is a sequence of steps used to access data. PostgreSQL devises a query plan for each query it receives. Choosing the right plan to match the query structure and the properties of the data is critical for good performance. The planner/optimizer creates an optimal plan. As a user, you should refresh outdated statistics before they can create problems in your database operation.


Out-of-date, or expired, query plans and statistics should be updated so that you get accurate statistics. The PostgreSQL query planner relies on statistical information about the contents of tables to generate good plans for queries. These statistics are gathered by the ANALYZE command, which can be invoked by itself or as an optional step in VACUUM. AUTOVACUUM also gathers statistics automatically. It is important to have reasonably accurate statistics; otherwise, poor choices of plans might degrade database performance.


Cost optimizer

PostgreSQL uses a cost-based optimizer to generate query plans. Each operation in a query plan has a cost derived from table and index statistics.

The cost optimizer compares all query methods and selects the lowest cost option.


Age of statistics

Depending on the volatility of data in a table, statistics can get stale very quickly. A table whose data turns over every second will have more outdated statistics than a table that is used infrequently. What counts as old depends on how often statistics are gathered and on the application's workload. The following query shows when each table was last analyzed, either manually or by autovacuum:


select relname, last_analyze, last_autoanalyze from pg_stat_user_tables;



Updating table statistics

For large tables, ANALYZE takes a random sample of the table contents, rather than examining every row. The sampling method of ANALYZE enables a quick analysis on large tables.

However, the statistics are only approximate and will change slightly each time ANALYZE is run, even if the actual table contents did not change. This can result in small changes in the planner's estimated costs shown by EXPLAIN. In rare situations, ANALYZE will cause the planner's choices of query plans to change after it runs. To avoid this, increase the number of statistics collected by ANALYZE by adjusting the default_statistics_target.
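For example, you can raise the statistics target for one column and re-analyze; the table, column, and target value are illustrative:

ALTER TABLE pgbench_accounts ALTER COLUMN abalance SET STATISTICS 500;
ANALYZE pgbench_accounts;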



pg_repack

Unlike CLUSTER and VACUUM FULL, pg_repack works online, without holding an exclusive lock on the processed tables while it runs. pg_repack requires both a server-side extension and a client-side utility.

The following example shows how to use the CREATE EXTENSION command to activate the pg_repack utility.

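A minimal example, assuming the pg_repack packages are already installed on the server:

CREATE EXTENSION pg_repack;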
The -k (no superuser check) option is required when you run the pg_repack client utility against an Amazon RDS database; include it on the pg_repack command line.



Routine Reindexing

It is a good idea to periodically reindex your database, especially if you don't vacuum frequently enough to keep your indexes clean. This is a best practice for maintaining a healthy database.

In some situations, it is worthwhile to periodically rebuild indexes using the REINDEX command. 



 

REINDEX rebuilds an index using the data stored in the index's table, replacing the old copy of the index.
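For example, assuming a hypothetical index and table name:

REINDEX INDEX idx_orders_customer;   -- rebuild one index
REINDEX TABLE orders;                -- rebuild every index on a table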

 

It is also critical that the index is properly constructed to make sure it is efficiently used by the query optimizer. Because indexes add significant overhead to any table change operation, unused indexes should be removed.




Empty index pages can be reclaimed for reuse, but bloat can still occur. Use the REINDEX command to rebuild an index if:

  • An index has become corrupted and no longer contains valid data.
  • An index has become bloated, which means it contains many empty or nearly empty pages.
  • The index build fails with the CONCURRENTLY option, and this failure leaves the index invalid. Such indexes are useless. You can use REINDEX to rebuild them.

Minimizing reindex locking

To avoid locking on the index's parent table and index while rebuilding the index, an alternative is to create a new index concurrently and then drop the old index. PostgreSQL supports rebuilding indexes with minimum locking of writes. This method is invoked by specifying the CONCURRENTLY option of REINDEX. When this method is used, PostgreSQL must perform two scans of the table for each index that must be rebuilt. PostgreSQL then waits for termination of all existing transactions that could potentially use the index.

Rebuilding indexes concurrently

This method requires more total work than a standard index rebuild. It also takes significantly longer to complete because it must wait for unfinished transactions that might modify the index. However, because it allows normal operations to continue while the index is being rebuilt, this method is useful for rebuilding indexes in a production environment. Of course, the extra central processing unit (CPU), memory, and I/O load imposed by the index rebuild might slow down other operations.

The following snippet shows the steps in the script process. First, the script creates a new index using the same columns as the current index. This creates a new index without bloat. Second, the script drops the old bloated index. Third, the script renames the new index to replace the old index and commits changes.
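A minimal sketch of that pattern, using hypothetical index and table names (note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block):

CREATE INDEX CONCURRENTLY idx_orders_customer_new ON orders (customer_id);  -- build a fresh, bloat-free index
DROP INDEX CONCURRENTLY idx_orders_customer;                                -- remove the bloated index
ALTER INDEX idx_orders_customer_new RENAME TO idx_orders_customer;          -- give the new index the old name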


Unused indexes

When creating and using indexes, remember that indexes add overhead for every INSERT and UPDATE. To avoid performance issues, remove an unused index if it does not enforce a constraint.

The following snippet shows a command string that is used to detect unused indexes.


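One way to find candidates, assuming indexes that back unique or primary key constraints should be excluded, is to look for indexes that have never been scanned:

SELECT s.schemaname, s.relname AS table_name, s.indexrelname AS index_name,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size
FROM pg_stat_user_indexes s
JOIN pg_index i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan = 0
  AND NOT i.indisunique
  AND NOT i.indisprimary;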







