Sunday, April 24, 2016

A quick overview of DRITA

A while back, I had to use a tool from EDB called Dynamic Runtime Instrumentation Tools Architecture (DRITA) to measure wait times in EDB Postgres Advanced Server (EPAS). More recently, a project came my way where system baselines had to be established, and compared at a later point in time. I decided to use DRITA for EDB Postgres as the tool to collect and view database performance for this project. 

DRITA builds on features in PostgreSQL like the Postgres statistics collector, extensions such as pg_stat_statements, and modules like pg_buffercacheDRITA is especially helpful to Oracle DBAs who are familiar with Oracle Statspack, but new to Postgres. Here is a quick overview. As always, read the docs




DRITA requires timed_statistics to be enabled. 

demo=# \! grep timed $PGDATA/postgresql.conf
timed_statistics = on # record wait timings, defaults to on

demo=# show timed_statistics;
 timed_statistics
------------------
 on
(1 row)

Some information will be excluded from the reports if pg_buffercache is not installed. 

demo=# create extension pg_buffercache;

CREATE EXTENSION

Initialize pgbench, so we can generate some server activity

demo=# \! pgbench -i -F 90 -s 10 demo
NOTICE:  table "pgbench_history" does not exist, skipping
NOTICE:  table "pgbench_tellers" does not exist, skipping
NOTICE:  table "pgbench_accounts" does not exist, skipping
NOTICE:  table "pgbench_branches" does not exist, skipping
creating tables...
100000 of 1000000 tuples (10%) done (elapsed 0.10 s, remaining 0.92 s)
200000 of 1000000 tuples (20%) done (elapsed 0.24 s, remaining 0.94 s)
300000 of 1000000 tuples (30%) done (elapsed 0.39 s, remaining 0.91 s)
400000 of 1000000 tuples (40%) done (elapsed 0.56 s, remaining 0.84 s)
500000 of 1000000 tuples (50%) done (elapsed 0.73 s, remaining 0.73 s)
600000 of 1000000 tuples (60%) done (elapsed 0.87 s, remaining 0.58 s)
700000 of 1000000 tuples (70%) done (elapsed 1.04 s, remaining 0.45 s)
800000 of 1000000 tuples (80%) done (elapsed 1.38 s, remaining 0.34 s)
900000 of 1000000 tuples (90%) done (elapsed 3.43 s, remaining 0.38 s)
1000000 of 1000000 tuples (100%) done (elapsed 4.36 s, remaining 0.00 s)
vacuum...
set primary keys...
done.

Check database size. 

demo=# select pg_size_pretty(pg_database_size('demo'));
 pg_size_pretty
----------------
 174 MB
(1 row)

Call the edbsnap function to take a starting snapshot.  

demo=# select * from edbsnap();
       edbsnap
----------------------
 Statement processed.

(1 row)

Generate server activity with pgbench . 

demo=# \! pgbench -n -c 2 -T 600 demo
transaction type: TPC-B (sort of)
scaling factor: 10
query mode: simple
number of clients: 2
number of threads: 1
duration: 600 s
number of transactions actually processed: 313975
latency average: 3.822 ms
tps = 523.287709 (including connections establishing)
tps = 523.290766 (excluding connections establishing)

Call the edbsnap function again to take an ending snapshot of the workload. 

demo=# select * from edbsnap();
       edbsnap
----------------------
 Statement processed.

(1 row)

We can check which snapshots are available by calling the get_snaps function. 

demo=# select * from get_snaps();
          get_snaps
-----------------------------
 1 19-APR-16 17:39:53.043944
 2 19-APR-16 17:58:34.766845
(2 rows)

There are five available reports for comparing two snapshots. The edbreport function includes data from all five reports.  We can view the report in psql, or export the report analyze and compare later. 


demo=# \o /tmp/demo_pgbench_20160419
demo=# select * from edbreport(1,2);

demo=# \o

-bash-4.2$ cat demo_pgbench_20160419
                                                                       edbreport
-------------------------------------------------------------------------------------------------------------------------------------------------------
    EnterpriseDB Report for database demo        19-APR-16
 Version: EnterpriseDB 9.5.1.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-55), 64-bit

      Begin snapshot: 1 at 19-APR-16 18:23:52.980122
      End snapshot:   2 at 19-APR-16 18:35:04.78618

 Size of database demo is 191 MB
      Tablespace: pg_default Size: 857 MB Owner: enterprisedb
      Tablespace: pg_global Size: 920 kB Owner: enterprisedb

 Schema: pg_toast_temp_1                Size: 0 bytes         Owner: enterprisedb
 Schema: public                         Size: 180 MB          Owner: enterprisedb

                Top 10 Relations by pages

 TABLE                                        RELPAGES
 ----------------------------------------------------------------------------------
 pgbench_accounts                              18182
 pgbench_history                               1976
 pg_proc                                       140
 pg_attribute                                  83
 pg_depend                                     78
 pg_description                                37
 pg_collation                                  36
 pg_class                                      20
 pg_rewrite                                    20
 pgbench_tellers                               19

                Top 10 Indexes by pages

 INDEX                                        RELPAGES
 ----------------------------------------------------------------------------------
 pgbench_accounts_pkey                         2745
 pg_depend_reference_index                     65
 pg_depend_depender_index                      63
 pg_proc_proname_args_nsp_index                51
 pg_attribute_relid_attnam_index               22
 pg_description_o_c_o_index                    21
 pg_attribute_relid_attnum_index               16
 pg_proc_oid_index                             13
 pg_shdepend_depender_index                    9
 pg_shdepend_reference_index                   8

                Top 10 Relations by DML

 SCHEMA          RELATION                                     UPDATES   DELETES   INSERTS
 --------------------------------------------------------------------------------------------------
 public          pgbench_accounts                             313975    0         1000000
 public          pgbench_tellers                              313975    0         100
 public          pgbench_branches                             313975    0         10
 public          pgbench_history                              0         0         313975

   DATA from pg_stat_database

 DATABASE   NUMBACKENDS  XACT COMMIT  XACT ROLLBACK   BLKS READ  BLKS HIT   BLKS ICACHE HIT      HIT RATIO  ICACHE HIT RATIO
 ------------------------------------------------------------------------------------------------------------------------------
 demo       0            314407       0               4966       5469881    0                    99.91      0.00

   DATA from pg_buffercache

 RELATION                            BUFFERS
 -----------------------------------------------------------------------------
 pgbench_accounts                    18190
 pgbench_accounts_pkey               2745
 pgbench_history                     2006
 pg_toast_2618                       110
 pg_statistic                        31
 pg_rewrite                          24
 pgbench_tellers                     23
 pg_depend                           22
 pgbench_branches                    17
 edb$stat_all_indexes                16

   DATA from pg_stat_all_tables ordered by seq scan

 SCHEMA               RELATION                       SEQ SCAN   REL TUP READ IDX SCAN   IDX TUP READ INS    UPD    DEL
 -----------------------------------------------------------------------------------------------------------------------
 public               pgbench_branches               128602     1206750      193301     193301       0      313975 0
 public               pgbench_tellers                121518     12045800     193517     193517       0      313975 0
 pg_catalog           pg_extension                   268        1240         0          0            0      0      0
 pg_catalog           pg_database                    110        1210         2796       2796         0      0      0
 pg_catalog           pg_tablespace                  107        166          820        820          0      0      0
 pg_catalog           pg_class                       105        42461        3741       3149         0      0      0
 pg_catalog           pg_am                          91         91           0          0            0      0      0
 pg_catalog           pg_namespace                   18         558          431        348          0      0      0
 pg_catalog           pg_constraint                  14         182          0          0            0      0      0
 pg_catalog           pg_db_role_setting             14         65           3320       830          0      0      0

   DATA from pg_stat_all_tables ordered by rel tup read

 SCHEMA               RELATION                       SEQ SCAN   REL TUP READ IDX SCAN   IDX TUP READ INS    UPD    DEL
 -----------------------------------------------------------------------------------------------------------------------
 public               pgbench_tellers                121518     12045800     193517     193517       0      313975 0
 public               pgbench_branches               128602     1206750      193301     193301       0      313975 0
 pg_catalog           pg_proc                        14         44138        3100       4512         0      0      0
 pg_catalog           pg_class                       105        42461        3741       3149         0      0      0
 pg_catalog           pg_extension                   268        1240         0          0            0      0      0
 pg_catalog           pg_database                    110        1210         2796       2796         0      0      0
 pg_catalog           pg_index                       5          815          1161       1461         0      0      0
 pg_catalog           pg_namespace                   18         558          431        348          0      0      0
 pg_catalog           pg_constraint                  14         182          0          0            0      0      0
 pg_catalog           pg_tablespace                  107        166          820        820          0      0      0

   DATA from pg_statio_all_tables

 SCHEMA               RELATION             HEAP     HEAP     HEAP     IDX      IDX      IDX      TOAST    TOAST    TOAST    TIDX     TIDX     TIDX
                                           READ     HIT      ICACHE   READ     HIT      ICACHE   READ     HIT      ICACHE   READ     HIT      ICACHE
                                                             HIT                        HIT                        HIT                        HIT
 -----------------------------------------------------------------------------------------------------------------------------------------------------
 public               pgbench_accounts     0        996476   0        2745     1884521  0        0        0        0        0        0        0
 public               pgbench_branches     17       913444   0        2        193513   0        0        0        0        0        0        0
 public               pgbench_tellers      21       682650   0        7        397685   0        0        0        0        0        0        0
 public               pgbench_history      2006     328733   0        0        0        0        0        0        0        0        0        0
 pg_catalog           pg_proc              0        5334     0        1        8606     0        0        0        0        0        0        0
 pg_catalog           pg_class             0        5089     0        0        7724     0        0        0        0        0        0        0
 pg_catalog           pg_attribute         24       4381     0        2        8702     0        0        0        0        0        0        0
 pg_catalog           pg_database          0        2912     0        0        4456     0        0        0        0        0        0        0
 pg_catalog           pg_operator          0        1876     0        0        1973     0        0        0        0        0        0        0
 pg_catalog           pg_type              0        1787     0        0        4756     0        0        0        0        0        0        0

   DATA from pg_stat_all_indexes

 SCHEMA               RELATION                  INDEX                               IDX SCAN   IDX TUP READ IDX TUP FETCH
 -------------------------------------------------------------------------------------------------------------------------
 public               pgbench_accounts          pgbench_accounts_pkey               627950     627950       627950
 public               pgbench_tellers           pgbench_tellers_pkey                193517     226804       193517
 public               pgbench_branches          pgbench_branches_pkey               193301     413723       129226
 pg_catalog           pg_attribute              pg_attribute_relid_attnum_index     4287       10820        10820
 pg_catalog           pg_db_role_setting        pg_db_role_setting_databaseid_rol_i 3320       830          830
 pg_catalog           pg_class                  pg_class_oid_index                  2819       3318         2822
 pg_catalog           pg_proc                   pg_proc_oid_index                   2480       2480         2480
 pg_catalog           pg_cast                   pg_cast_source_target_index         2038       355          355
 pg_catalog           pg_type                   pg_type_oid_index                   1741       1741         1741
 pg_catalog           pg_database               pg_database_datname_index           1559       1559         1559

   DATA from pg_statio_all_indexes

 SCHEMA               RELATION                  INDEX                               IDX BLKS READ   IDX BLKS HIT    IDX BLKS ICACHE HIT
 ---------------------------------------------------------------------------------------------------------------------------------------
 public               pgbench_accounts          pgbench_accounts_pkey               2745            1884521         0
 public               pgbench_tellers           pgbench_tellers_pkey                7               397685          0
 public               pgbench_branches          pgbench_branches_pkey               2               193513          0
 pg_catalog           pg_attribute              pg_attribute_relid_attnum_index     2               8702            0
 pg_catalog           pg_class                  pg_class_oid_index                  0               5735            0
 pg_catalog           pg_proc                   pg_proc_oid_index                   1               5050            0
 pg_catalog           pg_db_role_setting        pg_db_role_setting_databaseid_rol_i 0               4150            0
 pg_catalog           pg_type                   pg_type_oid_index                   0               3573            0
 pg_catalog           pg_proc                   pg_proc_proname_args_nsp_index      0               3556            0
 pg_catalog           pg_statistic              pg_statistic_relid_att_inh_index    0               2548            0

    System Wait Information

 WAIT NAME                                COUNT      WAIT TIME       % WAIT
 ---------------------------------------------------------------------------
 wal flush                                2111       7.738697        37.57
 query plan                               7189       5.251493        25.49
 wal write                                1806       3.828541        18.59
 wal file sync                            1806       3.704491        17.98
 db file read                             34         0.072955        0.35
 db file extend                           109        0.003226        0.02
 xid gen lock acquire                     1          0.000190        0.00
 clog control lock acquire                3          0.000016        0.00

    Database Parameters from postgresql.conf

 PARAMETER                         SETTING                                  CONTEXT     MINVAL       MAXVAL
 -------------------------------------------------------------------------------------------------------------------------
 allow_system_table_mods            off                                      postmaster
 application_name                   psql.bin                                 user
 archive_command                    rsync -aq %p /mnt/pgbackups/archivedir/% sighup
...

We can also get individual session wait information with the following functions. 

demo=# \df sys.sess*
                               List of functions
 Schema |     Name     | Result data type |    Argument data types    |  Type
--------+--------------+------------------+---------------------------+--------
 sys    | sess_rpt     | SETOF text       | integer, integer, integer | normal
 sys    | sesshist_rpt | SETOF text       | integer, integer          | normal
 sys    | sessid_rpt   | SETOF text       | integer, integer, integer | normal

(3 rows)

Here are the 5 individual report functions plus sys_rpt for viewing wait informaiton. 

SELECT
  n.nspname ||'.'|| p.proname as edbreports
FROM pg_catalog.pg_proc p
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace
WHERE p.protype = '0'
  AND p.proname ~ '^(.*_rpt)$'
AND p.proname NOT LIKE 'sess%'
AND n.nspname ~ '^(sys)$';

"psql.edit.107272.sql" 8L, 259C written
       edbreports
------------------------
 sys.stat_db_rpt
 sys.stat_indexes_rpt
 sys.stat_tables_rpt
 sys.statio_indexes_rpt
 sys.statio_tables_rpt
 sys.sys_rpt
(6 rows)

An example of viewing just the top 10 wait events.

demo=# select * from sys_rpt(1,2,10);

                                   sys_rpt
-----------------------------------------------------------------------------
 WAIT NAME                                COUNT      WAIT TIME       % WAIT
 ---------------------------------------------------------------------------
 wal flush                                2111       7.738697        37.57
 query plan                               7189       5.251493        25.49
 wal write                                1806       3.828541        18.59
 wal file sync                            1806       3.704491        17.98
 db file read                             34         0.072955        0.35
 db file extend                           109        0.003226        0.02
 xid gen lock acquire                     1          0.000190        0.00
 clog control lock acquire                3          0.000016        0.00
(10 rows))

DRITA catalog views are available as well.

demo=# \dt edb$*
                    List of relations
 Schema |           Name           | Type  |    Owner
--------+--------------------------+-------+--------------
 sys    | edb$session_wait_history | table | enterprisedb
 sys    | edb$session_waits        | table | enterprisedb
 sys    | edb$snap                 | table | enterprisedb
 sys    | edb$stat_all_indexes     | table | enterprisedb
 sys    | edb$stat_all_tables      | table | enterprisedb
 sys    | edb$stat_database        | table | enterprisedb
 sys    | edb$statio_all_indexes   | table | enterprisedb
 sys    | edb$statio_all_tables    | table | enterprisedb
 sys    | edb$system_waits         | table | enterprisedb
(9 rows)

demo=# select edb_id,dbname,backend_id,wait_name,elapsed,p1 from edb$session_wait_history where wait_name = 'query plan' limit 1;
 edb_id | dbname | backend_id | wait_name  | elapsed | p1
--------+--------+------------+------------+---------+----
      1 | demo   |      98079 | query plan |      15 |  0
(1 row)

Tuesday, October 27, 2015

Getting Started with PostGIS

Released nearly 15 years ago, PostGIS is an open source extension to Postgres that spatially enables your database server, and follows the OGC Simple Features for SQL specification. If you are coming from another standards based spatial database you may notice similarities in the spatial data types, naming conventions, and functionality.

You can easily install PostGIS using EDB StackBuilder and then connect to your database and run the create extension command to enable PostGIS. Now you can create tables with a spatial column, or use the AddGeometryColumn function to add a spatial column to an existing table. And don’t forget to add a spatial index to your spatial column using the standard create index syntax - ‘create index idx_regions_geom on hr.regions using gist (geom);’.

Loading spatial data can be accomplished with standard SQL or by using a tool. For desktop application users with a repository of Esri shapefiles laying around, shp2pgsql is a simple tool for loading those shapefiles to Postgres. I have ran into a number of databases out in the field with tables containing latitude/longitude values stored as numeric data types. In those cases, we can leverage the many type conversion functions such as st_geomfromtext in a INSERT or UPDATE statement to convert to a PostGIS geometry. Alternatively, we can use one of the st_make functions to make a geometry object, for example ‘update regions set geom = st_setsrid(st_makepoint(lon,lat),4326) where regid = regid;’.

Once your data is in PostGIS, you can now answer formely complex questions with simple SQL - such as tell me the number of atms within one mile of a given point: ‘select count(*) from atms b where st_dwithin(b.geom,st_geomfromtext(‘POINT()’,4326)),1609 * 1);’. 
  • Determine the length of any line such as a walking path using st_length.
  • Determine the area of a polygon using st_area(geom).
  • Tell me if my spatial object intersects with a protected area using st_intersects(geom,geom).
Those are just a few of the powerful functions available in PostGIS, there are many more available to answer complex location based questions with just a little bit of SQL.

Along with powerful functions, PostGIS stands out from other geospatial programs by inheriting the flexibility and power of Postgres. PostGIS stands out from the crowd with:
  • Easily scale out your workload with Streaming Replication.
  • Support for multiple readers and writers with MVCC.
  • Use triggers to track edit history or automatically extend your workflow.
  • Mash up data from other data sources using Foreign Data Wrappers.
  • Use table-partitioning to break large data sets down into more manageable pieces.
  • Build custom functions using one of the many procedural languages in Postgres like PL/Python or PL/R. 

Database professionals who are new to GIS technology should not expect to become experts after reading this blog post. But with any luck, I gave you an introduction to PostGIS and what it means to support a spatial database.

Sunday, December 14, 2014

Highlighting xDB Replication Server

This is the first in a series of posts highlighting common business requirements for managing data I see in the field, and the tools that are available to help meet those requirements.

One common requirement many organizations have is being able to move data around to different database systems. While I deal with a lot of requests for help moving data from Oracle to Postgres, I also assist businesses that routinely push data from Postgres to Oracle. Typically because the corporate BI database is still on Oracle.

This post captures a simple use-case that I encounter often. "We need to move data in Postgres to a staging table in our Oracle BI database." For a job such as this one, no custom scripts are necessary - xDB from EnterpriseDB is the perfect tool.


I can verify the tables do not already exist in Oracle by querying with a oci-dblink connection from Postgres.

edbstore=# select * from cat@edbstore_ora;
 table_name | table_type
------------+------------
(0 rows)

Let's describe the tables in our edbstore database. These tables should look familiar if you've ever taken one of EnterpriseDB's training courses.                               
edbstore=# \dt+
                          List of relations
  Schema  |    Name    | Type  |  Owner   |    Size    | Description
----------+------------+-------+----------+------------+-------------
 edbstore | categories | table | edbstore | 8192 bytes |
 edbstore | cust_hist  | table | edbstore | 2640 kB    |
 edbstore | customers  | table | edbstore | 3936 kB    |
 edbstore | dept       | table | edbstore | 8192 bytes |
 edbstore | emp        | table | edbstore | 8192 bytes |
 edbstore | inventory  | table | edbstore | 464 kB     |
 edbstore | job_grd    | table | edbstore | 16 kB      |
 edbstore | jobhist    | table | edbstore | 8192 bytes |
 edbstore | locations  | table | edbstore | 8192 bytes |
 edbstore | orderlines | table | edbstore | 3104 kB    |
 edbstore | orders     | table | edbstore | 920 kB     |
 edbstore | products   | table | edbstore | 832 kB     |
 edbstore | reorder    | table | edbstore | 0 bytes    |
(13 rows)

Now I'm going to add the edbstore database to our xDB Publication Server.



Next, I'm adding the Oracle target database to the Subscription server using the xDB Replication Console. 



A Publication can be created with the xDB Replication Console. In this case we are only interested in the  customers and orders table. 
Note: Click on the Table Filters tab if you need to add a filter.



Now we can move over the the Subscription Server and subscribe to the newly created Publication from within the same console.




Take a Snapshot.



The Data Load and Migration Summary will be displayed toward the end of the console messages:
[orders] Table Data Load Summary: Total Time(s): 6.379 Total Rows: 12000
Data Load Summary: Total Time (sec): 14.398 Total Rows: 32000 Total Size(MB): 0.0

Schema edbstore imported successfully.

Let's verify the data is in Oracle using SQL*Plus                                                
SQL> select
  2  (select count(*) from customers) AS customers_count,
  3  (select count(*) from orders) AS orders_count
  4  from dual;

CUSTOMERS_COUNT ORDERS_COUNT
--------------- ------------
 20000        12000

Configure Synchronization to whatever schedule suits your needs.                                            
Synchronization can be configured as Daily, Weekly, Monthly or a Cron Expression. Let's go with Continuously. 


From the xDB Replication Console:






Verify that changes are immediately being replicated to the target Oracle Database.
I'm going to COPY 50 records into the Orders table in Postgres and verify the updates are immediately available in Oracle as well.

edbstore=# select
edbstore-# (select count(*) from orders) AS orders_postgres,
edbstore-# (select count(*) from orders@edbstore_ora) AS orders_oracle
edbstore-# from dual;
 orders_postgres | orders_oracle
-----------------+---------------
           12050 |         12050
(1 row)