Safeguarding from Postgres RDS Lock with session-level statement_timeout

Published on October 23, 2024

Safeguarding Against PostgreSQL Locking with Session-Level statement_timeout

PostgreSQL’s locking mechanism is essential for maintaining data consistency and ensuring proper concurrent access to resources. However, in Amazon RDS (Relational Database Service), long-running locks can lead to application delays or failures, especially under high concurrency.

What is database lock?

A database lock is a mechanism that ensures data integrity by controlling how multiple transactions access shared resources (such as data rows, tables, or pages) in a database concurrently. Locks prevent conflicts, such as two transactions trying to modify the same data simultaneously, which could lead to inconsistencies or corruption.

AWS RDS Performance Insight

The symptom can usually be observed from AWS RDS Performance Insight, where there is a long Average active session with “LWLock: BufferIO” and “LWLock: LockManager.”

BufferIO is usually related to the DataFileRead event, where an SQL query tries to access the shared resources but is locked by ongoing SQL execution.

Example on RDS Performance Insight

You can further investigate by running the below query:

SELECT * FROM pg_stat_activity
WHERE wait_event_type LIKE '%Lock%'
ORDER BY wait_event_type desc;

Release Lock

To release the lock, you can run the below query to find out the blocking PID, then cancel/terminate the process PID.

SELECT pg_cancel_backend(pid) FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
AND wait_event_type LIKE '%Lock%';

In case you are unable to cancel the process, you can use the following:

SELECT pg_terminate_backend(pid) FROM pg_stat_activity
WHERE pid <> pg_backend_pid()
AND wait_event_type LIKE '%Lock%';

Consequences of Database lock

1. Lock Contention

  • Increased waiting times: Transactions may spend more time waiting for locks to be released, reducing throughput.
  • Reduced concurrency: If too many locks are required or locks are held for a long time, fewer transactions can run simultaneously, limiting database performance.
  • Poor user experience: Applications that rely on the database may experience slowdowns, leading to longer response times for end users.

2. Starvation

  • Indefinite delays: Some transactions may never get the chance to run, leading to tasks not completing.
  • Business impact: Critical tasks might be starved, delaying the processing of essential business operations.

3. Performance Degradation

  • Lower throughput: More time is spent managing locks than executing transactions, reducing overall system performance.
  • Inefficient resource usage: Overly restrictive locking reduces database concurrency and wastes processing resources as transactions sit idle, waiting for locks to be released.

4. Cost of Conflict Resolution

  • Higher resource usage: CPU, memory, and disk resources are consumed, leaving fewer resources for transaction processing.

Prevention DB lock

statement_timeout

To safeguard the PostgreSQL database, statement_timeout is available to set any running statement's maximum duration.

statement_timeout is the parameter that sets the maximum amount of time, in milliseconds, that any SQL statement can run before it is automatically terminated. This helps prevent long-running or runaway queries from consuming too many resources or affecting database performance.

We can implement statement timeout at three levels

  1. Database Level
  2. User Level
  3. Session Level

Use Case:

  • Preventing Long-Running Queries: This is useful in environments where runaway or inefficient queries might affect overall database performance. By setting a reasonable timeout, you can ensure that no query monopolizes resources for too long.
  • Protecting Critical Applications: In applications where query performance and response time are critical, setting statement_timeout helps ensure that the application does not hang waiting for a query that takes an unexpectedly long time to complete.
  • Optimizing Resource Usage: In multi-tenant or shared databases, setting a statement_timeout helps prevent one long-running query from affecting other users.

Database Level

This is the global setting for the statement timeout. This affects all sessions unless overridden by a session-specific setting.

Not Recommended to set this because it affects long-running maintenance queries like vacuum and analyze

# Modify statement_timeout on a DB parameter group
aws rds modify-db-parameter-group \
--db-parameter-group-name \
--parameters "ParameterName=statement_timeout,ParameterValue=,ApplyMethod=immediate"

# Modify statement_timeout on a DB cluster parameter group
aws rds modify-db-cluster-parameter-group \
--db-cluster-parameter-group-name \
--parameters "ParameterName=statement_timeout,ParameterValue=,ApplyMethod=immediate

User Level

This allows you to enforce a specific setting for particular users without affecting the global configuration or other users.

ALTER USER  SET statement_timeout='30s';

Session Level

This setting applies only to the current session. It can be set at the start of a session or modified dynamically within the session as needed. Once a session ends, the timeout setting reverts to the default value for new sessions.

On the session level, it can be set on the JDBC URL for the config map.

Example of springboot configuration:

datasource:
url: jdbc:postgresql://localhost:5432/process?options=-c%20statement_timeout=30000

Benefit of Session-Level statement_timeout

  1. Avoid Long-Running Queries
  • Automatic query termination: By setting a statement_timeout, any query that exceeds the specified duration will automatically be terminated. This prevents long-running queries from consuming too many resources and affecting the overall performance of the database.
  • Better resource management: Long-running queries can hold locks for an extended time, potentially causing contention or blocking other queries. By capping the execution time, you ensure no query holds resources too long.

2. Prevents Accidental Resource Exhaustion

  • Mitigates impact of inefficient queries: Queries with inefficient execution plans (e.g., due to missing indexes, suboptimal joins, or table scans) can run much longer than expected. A session-level statement_timeout helps prevent such queries from monopolizing CPU, memory, or disk I/O.
  • Protects from accidental mistakes: Developers or applications may occasionally unintentionally issue costly queries. The timeout acts as a safeguard, preventing these queries from running indefinitely and bringing down the system’s performance.

3. Improves Database Stability

  • Reduces locking issues: Long-running queries can hold row-level or table-level locks for extended periods, causing other transactions to wait. With a session-level statement_timeout, you reduce the chances of creating blocking situations that can lead to lock contention or even deadlocks.
  • Prevents transaction bloat: When a query is terminated due to the timeout, PostgreSQL rolls back the transaction, freeing any locks and avoiding potential transaction bloat.

4. Customizable Per Session

  • Session-specific tuning: You can set the statement_timeout at session level to customize it based on the expected workload or client needs. This is useful in environments where different applications or users have varying performance requirements
  • For example, analytical queries might need more extended execution than transaction-based workloads. Customizing the timeout ensures the application balances performance and stability best.

5. Improves Application Responsiveness

  • Prevents client-side hang: In web applications or APIs, users expect timely responses. A runaway query can delay response times significantly, leading to a poor user experience. Setting a session-level statement_timeout helps avoid this by ensuring queries fail gracefully after a set time, allowing the application to handle the failure (e.g., by retrying, reporting an error, or returning cached data).
  • Better time-bound expectations: Time-sensitive applications must avoid waiting indefinitely for results. With the timeout set, applications can handle delayed responses or failures more predictably.

6. Provides a Fallback When Application-Side Timeouts Fail

  • Catch missed application-level timeouts: Sometimes, application-layer timeouts (e.g., HTTP request timeouts, ORM-level query timeouts) may not be correctly set or fail to trigger. By setting the statement_timeout at the database level, you ensure that the database has a built-in safety mechanism to catch runaway queries, even if the application does not.

7. Helps in Query Optimization

  • Identifies problematic queries: Queries that consistently hit the statement_timeout are likely inefficient or need optimization. By monitoring such queries, you can focus on optimizing them (e.g., by adding indexes, improving query logic, or partitioning tables).
  • Encourages better query design: Developers and DBAs are likelier to write optimized queries if they know a hard time limit. It pushes them to think about performance while designing and testing queries.

8. Prevents Connection Pool Saturation

  • Reduces long-held connections: Long-running queries can hog connections in applications using connection pooling, causing a bottleneck where new queries cannot be executed until the pool is freed up. A session-level statement_timeout ensures that such queries are terminated, freeing up the connection pool faster and maintaining throughput.

Key Takeaway

RDS Performance Insights:

AWS RDS provides a Performance Insights feature, which can help you track query performance, resource consumption, and locking activity. These tools make identifying problematic queries or periods of high lock contention easier.

Session-level statement_timeout :

Overall, setting a session-level statement_timeout provides a simple yet effective safeguard against the risks posed by inefficient or long-running queries, leading to a more stable and performant PostgreSQL environment.


Safeguarding from Postgres RDS Lock with session-level statement_timeout was originally published in Government Digital Products, Singapore on Medium, where people are continuing the conversation by highlighting and responding to this story.