Learn more about PostgreSQL
Benechmarking PostgreSQL Server
Pgbench is a benchmarking tool that is included in the PostgreSQL distribution. It is used to test the performance of PostgreSQL servers under a variety of workloads. Pgbench can be used to test a wide range of factors, including:
- Transaction throughput
- Latency
- Memory usage
- CPU usage
- Disk I/O
Pgbench works by creating a number of concurrent client sessions that execute a series of SQL statements. The SQL statements are typically representative of the types of transactions that are executed in a real-world application. Pgbench then measures the performance of the PostgreSQL server under the load generated by the client sessions.
Pgbench can be used to test a variety of workloads, including:
- TPC-B: A benchmark that simulates a complex online transaction processing (OLTP) workload.
- TPROC: A benchmark that simulates a simple OLTP workload.
- TPC-C: A benchmark that simulates a warehouse management system workload.
- Custom workloads: Pgbench can also be used to test custom workloads by specifying a script file that contains the SQL statements to be executed.
To run pgbench, you simply need to specify the following parameters:
- The number of concurrent client sessions
- The duration of the benchmark test
- The database to connect to
- The transaction script to use
Pgbench will then generate a report that shows the performance of the PostgreSQL server under the load generated by the client sessions.
The report will include the following metrics:
- Transactions per second (tps): The average number of transactions that were completed per second.
- Latency: The average time it took to complete a transaction.
- Memory usage: The amount of memory that was used by the PostgreSQL server during the benchmark.
- CPU usage: The amount of CPU time that was used by the PostgreSQL server during the benchmark.
- Disk I/O: The amount of disk I/O that was performed by the PostgreSQL server during the benchmark.
Here are some tips for using pgbench for newbies:
- Start with a simple benchmark: If you are new to pgbench, it is best to start with a simple benchmark, such as the TPROC benchmark. This will help you to get familiar with the tool and to learn how to interpret the results.
- Use a small number of clients: It is also best to start with a small number of concurrent client sessions when running pgbench benchmarks. You can then gradually increase the number of clients to see how the performance of the PostgreSQL server changes under load.
- Monitor the performance of the PostgreSQL server: When running pgbench benchmarks, it is important to monitor the performance of the PostgreSQL server. You can use tools such as pgAdmin or pgBouncer to monitor metrics such as CPU usage, memory usage, and disk I/O.
- Analyze the results: Once you have run a pgbench benchmark, it is important to analyze the results. This will help you to identify any performance bottlenecks and to make changes to improve the performance of the PostgreSQL server.
Now, We will explain in this workshop:
pgbench -i --fillfactor=90 --scale=500 --host=rdspg-fcj-labs.cssuddr073hp.us-east-1.rds.amazonaws.com --username masteruser pglab
- -i: *creates four tables
pgbench_accounts
, pgbench_branches
, pgbench_history
, and pgbench_tellers
, destroying any existing tables of these names.
- -f (–fillfactor=90): Specifies that the test data should be inserted using a fill factor of 90%. This means that each page of the database will be filled to 90% capacity before moving on to the next page.
- -s (–scale=500): Specifies that the test data should be scaled to 500 times the size of the default test data set
- -h (–host=rdspg-fcj-labs.cssuddr073hp.us-east-1.rds.amazonaws.com): Specifies the hostname of the PostgreSQL server to connect to.
- -U (–username=masteruser): Specifies the username to use to connect to the PostgreSQL server.
- pglab: The name of the database to create and populate.
pgbench --host=rdspg-fcj-labs.cssuddr073hp.us-east-1.rds.amazonaws.com --username masteruser --protocol=prepared -P 30 --time=300 --client=200 --jobs=200 pglab
- -h (–host=rdspg-fcj-labs.cssuddr073hp.us-east-1.rds.amazonaws.com): Specifies the hostname of the PostgreSQL server to connect to.
- -U (–username masteruser): Specifies the username to use to connect to the PostgreSQL server.
- -p (–protocol=prepared): Specifies that prepared statements should be used to execute the benchmark workload.
- –progress (-P 30): Specifies that the benchmark should run for 30 seconds.
- -T (–time=300): Specifies that the benchmark should run for 300 seconds.
- -c (–client=200): Specifies that 200 concurrent client sessions should be used to execute the benchmark workload.
- -j (–jobs=200): Specifies that 200 jobs should be created to execute the benchmark workload.
- pglab: The name of the database to create and populate.