Deep dive into postgres stats: pg_stat_all_tables

Everything you always wanted to know about Postgres stats

Today's post is about pg_stat_all_tables. This view contains various statistics about tables usage and may be useful in different scenarios. I'm going to discuss the following ones:
  1. Sequential scans.
  2. Table's write activity.
  3. Autovacuum queue.
Sequential scans. One of the most useful types of information that you can get from checking pg_stat_all_tables is the number of scans. It shows you how many times tables were accessed directly or through indexes and how many rows were returned by these scans - this information is located in seq_scan/seq_tup_read and idx_scan/idx_tup_fetch columns.

We need the first two columns that show number of times the tables were accessed through sequential scans and number of tuples returned as a result.

Why use sequential scan? It's not a problem when seqscan handles small tables, but in case of larger tables sequential scan might read the whole table and this might take a while. It also becomes an issue when postgres handles many sequential scans concurrently and performance of the storage drops significantly. As a rule, it's an absence of index for new kind of queries, inaccurate statistics used by query planner or simply forgotten LIMIT clause in the query. Anyway, pg_stat_all_tables allow to quickly check, whether there are any sequential scans in the systems. Using the mentioned columns we can write quite simple query and get the right information:

    schemaname, relname,
    seq_scan, seq_tup_read,
    seq_tup_read / seq_scan as avg_seq_tup_read
FROM pg_stat_all_tables
WHERE seq_scan > 0

schemaname |         relname        | seq_scan | seq_tup_read | avg_seq_tup_read


public     | deals                  |      621 |  81712449358 |        131582044
public     | client_balance         |       26 |    574164012 |         22083231
public     | events_by_date_summary |     2698 |  57342287963 |         21253627
public     | clients_summary        |     5924 |  91655173288 |         15471838
public     | advertising_stats      |      505 |   5055606286 |         10011101

In this query we use manually calculated avg_seq_tup_read, It's an average number of rows returned by one scan. You should pay attention to tables and queries when the average number of rows go beyond million rows per scan. You also can add pg_relation_size() function to get an idea of tables’ sizes and a rough estimate of the amount of data read during each scan.

So, when sequentially accessed tables are found, you should remember which queries use these tables, review them and try to fix causes of scans.

Table's write activity. Table's data input/output in Postgres is not a simple process, especially writing operations. The INSERT and DELETE commands are simpler than UPDATE, because that one doesn't modify target rows - UPDATE command inserts new versions of the rows and marks old versions as removed. Also, if there are any indexes referenced in the updated rows, similar changes within indexes will be made. Thus, update operations aren't lightweight as they might seem. In other words, Postgres doesn't like heavy update-intensive workload. To alleviate overhead caused by writing operations various improvements have been made and the most known one is the HOT (Heap-only tuples) updates introduced in 8.3.

In short, it allows leave index's entries untouched when non-indexed values are updated within rows. This, however, only works when free space (or space which marked for reuse) is available within page where the target rows are - HOT updates don't work when page is completely filled by the rows.

What about pg_stat_all_tables? Using this view we can estimate HOT updates ratio for the most updated tables. Each table within this view has n_tup_upd and n_tup_hot_upd columns that correspond to total number of regular and HOT updates for particular tables. Thus, the task comes down to find the tables with highest writing activity and calculate their HOT rate. Example of the query can be found here.

What's next? Tables with high HOT rate are the “good” tables, we should pay attention to tables with high number of writes and low or zero HOT rate. General rule for these is change in fillfactor settings - that allows to reserve free space when new rows inserted and table is extended. Reserved space guarantees that rows will be updated within page and there is a high chance that HOT update will occur. The fillfactor setting can be changed on the fly with ALTER TABLE command and good starting value for it is 70 or 80. Also, you need to know a few rules to properly work with fillfactor.
  • The first, after fillfactor changes your table will take up more space on the disk. 
  • The second is that fillfactor is applied only to newly allocated pages and if you want new fillfactor for all table' pages you will need to rebuild the table and it might be quite painful (hello VACUUM FULL). 
  • The third and the last, is that fillfactor is useful only in cases of queries that update non-indexed values, otherwise there won’t be any positive effect.
Autovacuum queue. Autovacuum is the really important feature in Postgres that allows you to keep tables and indexes intact - it cleans dead rows’ versions, thus in case of ineffective autovacuum, tables and indexes will bloat so it would permanently affect the performance. Number of concurrently working autovacuums is limited by autovacuum_max_workers parameter and by default it's 3. When database has a lot of tables with high number of writing operations, autovacuum_max_workers might create a bottleneck and tables that require vacuuming might wait a long time before they will be cleaned. Since Postgres 9.6 autovacuum can be observed with new pg_stat_progress_vacuum view, but there is still no information about how many tables require vacuuming. Using information from other views it's possible to estimate size of so-called autovacuum queue.

I'd like to introduce another useful query for listing tables that require autovacuum. This query is slightly long, so here is the link. Without getting into nitty-gritty of this query workings , I'd like to mention a few key points to keep in mind:
  • the query shows list of tables that require normal or wraparound vacuum, or analyze. 
  • query takes into account information about table's storage parameters.
  • important to remember: the query also shows tables that are currently processed by autovacuum.
Using this query, you can estimate queue size and tune autovacuum accordingly. It's ok, when queue is empty, it means that autovacuum can cope with the amount of work. When the queue isn't empty, it might be a good idea to configure the autovacuum in a more aggressive way. There are several possible ways to do that:
  1. increase number of autovacuum_max_workers - this allows to run more autovacuum workers concurrently.
  2. increase autovacuum_vacuum_cost_limit or vacuum_cost_limit - this allows to process more pages per round and vacuum faster.
  3. decrease autovacuum_vacuum_cost_delay - this is a pause between rounds when autovacuum sleeps, reducing delay allows the vacuum to rest less and work more.
  4. decrease autovacuum_vacuum_scale_factor and autovacuum_analyze_scale_factor - this is the ratio of dead rows or changes since the last analyze, and is used to calculate threshold that vacuum or analyze of table should trigger. After reducing these scale factors, vacuum operations will occur more frequently on tables that require vacuuming.
All these methods can be applied in various combinations and the most important point here is to remember to check the storage utilization, because too aggressive vacuum might generate higher number of IO operations which in turn might affect responsiveness and overall performance efficiency. 
It would be a good idea to add information on running autovacuum into the monitoring and to always have a clear picture about it.
postgres autovacuum monitoring

Finally, I'd like to point out that the pg_stat_all_tables is quite useful view and this post by all means doesn't cover all possible cases for its use. Moreover, there is a similar pg_statio_all_tables view, that contains information about table's buffer IO and you can join this table with pg_stat_all_tables to make your stats queries even more informative.

Hope you enjoyed this post, if you have any questions please comment!


Popular Posts