
Sequences/SERIAL/IDENTITY

PostgreSQL supports sequences, and SQLAlchemy uses these as the default means of creating new primary key values for integer-based primary key columns. When creating tables, SQLAlchemy will issue the SERIAL datatype for integer-based primary key columns, which generates a sequence and server side default corresponding to the column.
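
For example, a table defined as Table('sometable', metadata, Column('id', Integer, primary_key=True)) would be expected to emit DDL along these lines (a sketch of typical output):

CREATE TABLE sometable (
    id SERIAL NOT NULL,
    PRIMARY KEY (id)
)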

To specify a specific named sequence to be used for primary key generation, use the ~sqlalchemy.schema.Sequence construct:

Table('sometable', metadata,
    Column('id', Integer, Sequence('some_id_seq'), primary_key=True)
)

When SQLAlchemy issues a single INSERT statement, to fulfill the contract of having the "last insert identifier" available, a RETURNING clause is added to the INSERT statement which specifies the primary key columns should be returned after the statement completes. The RETURNING functionality only takes place if PostgreSQL 8.2 or later is in use. As a fallback approach, the sequence, whether specified explicitly or implicitly via SERIAL, is executed independently beforehand, the returned value to be used in the subsequent insert. Note that when an ~sqlalchemy.sql.expression.insert() construct is executed using "executemany" semantics, the "last inserted identifier" functionality does not apply; no RETURNING clause is emitted nor is the sequence pre-executed in this case.
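
As a brief illustration of the "last insert identifier" contract, the generated primary key value is available from the result of a single-row INSERT via the CursorResult.inserted_primary_key attribute. A minimal sketch, assuming an engine and a variant of the sometable table above that also includes a data column:

with engine.connect() as conn:
    result = conn.execute(sometable.insert().values(data="some data"))
    # populated via RETURNING on PostgreSQL 8.2+, otherwise via
    # pre-execution of the sequence
    print(result.inserted_primary_key)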

To disable the implicit usage of RETURNING, specify the flag implicit_returning=False to _sa.create_engine.
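
For example, with an illustrative connection URL:

engine = create_engine(
    "postgresql+psycopg2://scott:tiger@localhost/test",
    implicit_returning=False
)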

PostgreSQL 10 and above IDENTITY columns

PostgreSQL 10 and above have a new IDENTITY feature that supersedes the use of SERIAL. The _schema.Identity construct in a _schema.Column can be used to control its behavior:

from sqlalchemy import Table, Column, MetaData, Integer, Identity, String

metadata = MetaData()

data = Table(
    "data",
    metadata,
    Column(
        'id', Integer, Identity(start=42, cycle=True), primary_key=True
    ),
    Column('data', String)
)

The CREATE TABLE for the above _schema.Table object would be:

CREATE TABLE data (
    id INTEGER GENERATED BY DEFAULT AS IDENTITY (START WITH 42 CYCLE),
    data VARCHAR,
    PRIMARY KEY (id)
)
Changed in version 1.4: Added _schema.Identity construct in a _schema.Column to specify the option of an autoincrementing column.

Note

Previous versions of SQLAlchemy did not have built-in support for rendering of IDENTITY, and could use the following compilation hook to replace occurrences of SERIAL with IDENTITY:

from sqlalchemy.schema import CreateColumn
from sqlalchemy.ext.compiler import compiles


@compiles(CreateColumn, 'postgresql')
def use_identity(element, compiler, **kw):
    text = compiler.visit_create_column(element, **kw)
    text = text.replace(
        "SERIAL", "INT GENERATED BY DEFAULT AS IDENTITY"
    )
    return text

Using the above, a table such as:

t = Table(
    't', m,
    Column('id', Integer, primary_key=True),
    Column('data', String)
)

will generate on the backing database as:

CREATE TABLE t (
    id INT GENERATED BY DEFAULT AS IDENTITY,
    data VARCHAR,
    PRIMARY KEY (id)
)

Server Side Cursors

Server-side cursor support is available for the psycopg2 and asyncpg dialects, and may also be available in others.

Server side cursors are enabled on a per-statement basis by using the :paramref:`.Connection.execution_options.stream_results` connection execution option:

with engine.connect() as conn:
    result = conn.execution_options(stream_results=True).execute(text("select * from table"))

Note that some kinds of SQL statements may not be supported with server side cursors; generally, only SQL statements that return rows should be used with this option.
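
For example, a streamed result might be consumed in batches using the Result.partitions() method, which fetches a limited number of rows at a time; a sketch, with an illustrative table name:

from sqlalchemy import text

with engine.connect() as conn:
    result = conn.execution_options(stream_results=True).execute(
        text("select * from some_large_table")
    )
    # fetch rows in batches of 100 rather than buffering the full result
    for partition in result.partitions(100):
        for row in partition:
            print(row)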

Deprecated since version 1.4: The dialect-level server_side_cursors flag is deprecated and will be removed in a future release. Please use the :paramref:`_engine.Connection.execution_options.stream_results` execution option for unbuffered cursor support.

Transaction Isolation Level

Most SQLAlchemy dialects support setting of transaction isolation level using the :paramref:`_sa.create_engine.execution_options` parameter at the _sa.create_engine level, and at the _engine.Connection level via the :paramref:`.Connection.execution_options.isolation_level` parameter.

For PostgreSQL dialects, this feature works either by making use of the DBAPI-specific features, such as psycopg2's isolation level flags which will embed the isolation level setting inline with the "BEGIN" statement, or for DBAPIs with no direct support by emitting SET SESSION CHARACTERISTICS AS TRANSACTION ISOLATION LEVEL <level> ahead of the "BEGIN" statement emitted by the DBAPI. For the special AUTOCOMMIT isolation level, DBAPI-specific techniques are used, typically an .autocommit flag on the DBAPI connection object.

To set isolation level using _sa.create_engine:

engine = create_engine(
    "postgresql+pg8000://scott:tiger@localhost/test",
    execution_options={
        "isolation_level": "REPEATABLE READ"
    }
)

To set using per-connection execution options:

with engine.connect() as conn:
    conn = conn.execution_options(
        isolation_level="REPEATABLE READ"
    )
    with conn.begin():
        # ... work with transaction

Valid values for isolation_level on most PostgreSQL dialects include:

  • READ COMMITTED
  • READ UNCOMMITTED
  • REPEATABLE READ
  • SERIALIZABLE
  • AUTOCOMMIT

Setting READ ONLY / DEFERRABLE

Most PostgreSQL dialects support setting the "READ ONLY" and "DEFERRABLE" characteristics of the transaction, which is in addition to the isolation level setting. These two attributes can be established either in conjunction with or independently of the isolation level by passing the postgresql_readonly and postgresql_deferrable flags with _engine.Connection.execution_options. The example below illustrates passing the "SERIALIZABLE" isolation level at the same time as setting "READ ONLY" and "DEFERRABLE":

with engine.connect() as conn:
    conn = conn.execution_options(
        isolation_level="SERIALIZABLE",
        postgresql_readonly=True,
        postgresql_deferrable=True
    )
    with conn.begin():
        #  ... work with transaction

Note that some DBAPIs such as asyncpg only support "readonly" with SERIALIZABLE isolation.

New in version 1.4: added support for the postgresql_readonly and postgresql_deferrable execution options.

Setting Alternate Search Paths on Connect

The PostgreSQL search_path variable refers to the list of schema names that will be implicitly referred towards when a particular table or other object is referenced in a SQL statement. As detailed in the next section :ref:`postgresql_schema_reflection`, SQLAlchemy is generally organized around the concept of keeping this variable at its default value of public. However, in order to have it set to any arbitrary name or names when connections are used automatically, the "SET SESSION search_path" command may be invoked for all connections in a pool using the following event handler, as discussed at :ref:`schema_set_default_connections`:

from sqlalchemy import event
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://scott:tiger@host/dbname")

@event.listens_for(engine, "connect", insert=True)
def set_search_path(dbapi_connection, connection_record):
    existing_autocommit = dbapi_connection.autocommit
    dbapi_connection.autocommit = True
    cursor = dbapi_connection.cursor()
    cursor.execute("SET SESSION search_path='%s'" % schema_name)
    cursor.close()
    dbapi_connection.autocommit = existing_autocommit

The reason the recipe is complicated by use of the .autocommit DBAPI attribute is so that when the SET SESSION search_path directive is invoked, it is invoked outside of the scope of any transaction and therefore will not be reverted when the DBAPI connection has a rollback.

Remote-Schema Table Introspection and PostgreSQL search_path

Section Best Practices Summarized

Keep the search_path variable set to its default of public, without any other schema names. For other schema names, name these explicitly within _schema.Table definitions. Alternatively, the postgresql_ignore_search_path option will cause all reflected _schema.Table objects to have a _schema.Table.schema attribute set up.

The PostgreSQL dialect can reflect tables from any schema, as outlined in :ref:`schema_table_reflection`.

With regards to tables which these _schema.Table objects refer to via foreign key constraint, a decision must be made as to how the .schema is represented in those remote tables, in the case where that remote schema name is also a member of the current PostgreSQL search path.

By default, the PostgreSQL dialect mimics the behavior encouraged by PostgreSQL's own pg_get_constraintdef() builtin procedure. This function returns a sample definition for a particular foreign key constraint, omitting the referenced schema name from that definition when the name is also in the PostgreSQL schema search path. The interaction below illustrates this behavior:

test=> CREATE TABLE test_schema.referred(id INTEGER PRIMARY KEY);
CREATE TABLE
test=> CREATE TABLE referring(
test(>         id INTEGER PRIMARY KEY,
test(>         referred_id INTEGER REFERENCES test_schema.referred(id));
CREATE TABLE
test=> SET search_path TO public, test_schema;
test=> SELECT pg_catalog.pg_get_constraintdef(r.oid, true) FROM
test-> pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n
test-> ON n.oid = c.relnamespace
test-> JOIN pg_catalog.pg_constraint r  ON c.oid = r.conrelid
test-> WHERE c.relname='referring' AND r.contype = 'f'
test-> ;
               pg_get_constraintdef
---------------------------------------------------
 FOREIGN KEY (referred_id) REFERENCES referred(id)
(1 row)

Above, we created a table referred as a member of the remote schema test_schema, however when we added test_schema to the PG search_path and then asked pg_get_constraintdef() for the FOREIGN KEY syntax, test_schema was not included in the output of the function.

On the other hand, if we set the search path back to the typical default of public:

test=> SET search_path TO public;
SET

The same query against pg_get_constraintdef() now returns the fully schema-qualified name for us:

test=> SELECT pg_catalog.pg_get_constraintdef(r.oid, true) FROM
test-> pg_catalog.pg_class c JOIN pg_catalog.pg_namespace n
test-> ON n.oid = c.relnamespace
test-> JOIN pg_catalog.pg_constraint r  ON c.oid = r.conrelid
test-> WHERE c.relname='referring' AND r.contype = 'f';
                     pg_get_constraintdef
---------------------------------------------------------------
 FOREIGN KEY (referred_id) REFERENCES test_schema.referred(id)
(1 row)

SQLAlchemy will by default use the return value of pg_get_constraintdef() in order to determine the remote schema name. That is, if our search_path were set to include test_schema, and we invoked a table reflection process as follows:

>>> from sqlalchemy import Table, MetaData, create_engine, text
>>> engine = create_engine("postgresql://scott:tiger@localhost/test")
>>> with engine.connect() as conn:
...     conn.execute(text("SET search_path TO test_schema, public"))
...     metadata_obj = MetaData()
...     referring = Table('referring', metadata_obj,
...                       autoload_with=conn)
...
<sqlalchemy.engine.result.CursorResult object at 0x101612ed0>

The above process would deliver to the _schema.MetaData.tables collection the referred table named without the schema:

>>> metadata_obj.tables['referred'].schema is None
True

To alter the behavior of reflection such that the referred schema is maintained regardless of the search_path setting, use the postgresql_ignore_search_path option, which can be specified as a dialect-specific argument to both _schema.Table as well as _schema.MetaData.reflect:

>>> with engine.connect() as conn:
...     conn.execute(text("SET search_path TO test_schema, public"))
...     metadata_obj = MetaData()
...     referring = Table('referring', metadata_obj,
...                       autoload_with=conn,
...                       postgresql_ignore_search_path=True)
...
<sqlalchemy.engine.result.CursorResult object at 0x1016126d0>

We will now have test_schema.referred stored as schema-qualified:

>>> metadata_obj.tables['test_schema.referred'].schema
'test_schema'

Best Practices for PostgreSQL Schema reflection

The description of PostgreSQL schema reflection behavior is complex, and is the product of many years of dealing with widely varied use cases and user preferences. But in fact, there's no need to understand any of it if you just stick to the simplest use pattern: leave the search_path set to its default of public only, never refer to the name public as an explicit schema name otherwise, and refer to all other schema names explicitly when building up a _schema.Table object. The options described here are only for those users who can't, or prefer not to, stay within these guidelines.

Note that in all cases, the "default" schema is always reflected as None. The "default" schema on PostgreSQL is that which is returned by the PostgreSQL current_schema() function. On a typical PostgreSQL installation, this is the name public. So a table that refers to another which is in the public (i.e. default) schema will always have the .schema attribute set to None.

See Also

:ref:`reflection_schema_qualified_interaction` - discussion of the issue from a backend-agnostic perspective

The Schema Search Path - on the PostgreSQL website.

INSERT/UPDATE...RETURNING

The dialect supports PG 8.2's INSERT..RETURNING, UPDATE..RETURNING and DELETE..RETURNING syntaxes. INSERT..RETURNING is used by default for single-row INSERT statements in order to fetch newly generated primary key identifiers. To specify an explicit RETURNING clause, use the .UpdateBase.returning method on a per-statement basis:

# INSERT..RETURNING
result = table.insert().returning(table.c.col1, table.c.col2).\
    values(name='foo')
print(result.fetchall())

# UPDATE..RETURNING
result = table.update().returning(table.c.col1, table.c.col2).\
    where(table.c.name=='foo').values(name='bar')
print(result.fetchall())

# DELETE..RETURNING
result = table.delete().returning(table.c.col1, table.c.col2).\
    where(table.c.name=='foo')
print(result.fetchall())

INSERT...ON CONFLICT (Upsert)

Starting with version 9.5, PostgreSQL allows "upserts" (update or insert) of rows into a table via the ON CONFLICT clause of the INSERT statement. A candidate row will only be inserted if that row does not violate any unique constraints. In the case of a unique constraint violation, a secondary action can occur which can be either "DO UPDATE", indicating that the data in the target row should be updated, or "DO NOTHING", which indicates to silently skip this row.

Conflicts are determined using existing unique constraints and indexes. These constraints may be identified either using their name as stated in DDL, or they may be inferred by stating the columns and conditions that comprise the indexes.

SQLAlchemy provides ON CONFLICT support via the PostgreSQL-specific _postgresql.insert() function, which provides the generative methods _postgresql.Insert.on_conflict_do_update and ~.postgresql.Insert.on_conflict_do_nothing:

>>> from sqlalchemy.dialects.postgresql import insert
>>> insert_stmt = insert(my_table).values(
...     id='some_existing_id',
...     data='inserted value')
>>> do_nothing_stmt = insert_stmt.on_conflict_do_nothing(
...     index_elements=['id']
... )
>>> print(do_nothing_stmt)
{opensql}INSERT INTO my_table (id, data) VALUES (%(id)s, %(data)s)
ON CONFLICT (id) DO NOTHING
{stop}

>>> do_update_stmt = insert_stmt.on_conflict_do_update(
...     constraint='pk_my_table',
...     set_=dict(data='updated value')
... )
>>> print(do_update_stmt)
{opensql}INSERT INTO my_table (id, data) VALUES (%(id)s, %(data)s)
ON CONFLICT ON CONSTRAINT pk_my_table DO UPDATE SET data = %(param_1)s
{stop}

New in version 1.1.

See Also

INSERT .. ON CONFLICT - in the PostgreSQL documentation.

Specifying the Target

Both methods supply the "target" of the conflict using either a named constraint or column inference:

  • The :paramref:`_postgresql.Insert.on_conflict_do_update.index_elements` argument specifies a sequence containing string column names, _schema.Column objects, and/or SQL expression elements, which would identify a unique index:

    >>> do_update_stmt = insert_stmt.on_conflict_do_update(
    ...     index_elements=['id'],
    ...     set_=dict(data='updated value')
    ... )
    >>> print(do_update_stmt)
    {opensql}INSERT INTO my_table (id, data) VALUES (%(id)s, %(data)s)
    ON CONFLICT (id) DO UPDATE SET data = %(param_1)s
    {stop}
    
    >>> do_update_stmt = insert_stmt.on_conflict_do_update(
    ...     index_elements=[my_table.c.id],
    ...     set_=dict(data='updated value')
    ... )
    >>> print(do_update_stmt)
    {opensql}INSERT INTO my_table (id, data) VALUES (%(id)s, %(data)s)
    ON CONFLICT (id) DO UPDATE SET data = %(param_1)s
    
  • When using :paramref:`_postgresql.Insert.on_conflict_do_update.index_elements` to infer an index, a partial index can be inferred by also specifying the :paramref:`_postgresql.Insert.on_conflict_do_update.index_where` parameter:

    >>> stmt = insert(my_table).values(user_email='a@b.com', data='inserted data')
    >>> stmt = stmt.on_conflict_do_update(
    ...     index_elements=[my_table.c.user_email],
    ...     index_where=my_table.c.user_email.like('%@gmail.com'),
    ...     set_=dict(data=stmt.excluded.data)
    ... )
    >>> print(stmt)
    {opensql}INSERT INTO my_table (data, user_email)
    VALUES (%(data)s, %(user_email)s) ON CONFLICT (user_email)
    WHERE user_email LIKE %(user_email_1)s DO UPDATE SET data = excluded.data
    
  • The :paramref:`_postgresql.Insert.on_conflict_do_update.constraint` argument is used to specify an index directly rather than inferring it. This can be the name of a UNIQUE constraint, a PRIMARY KEY constraint, or an INDEX:

    >>> do_update_stmt = insert_stmt.on_conflict_do_update(
    ...     constraint='my_table_idx_1',
    ...     set_=dict(data='updated value')
    ... )
    >>> print(do_update_stmt)
    {opensql}INSERT INTO my_table (id, data) VALUES (%(id)s, %(data)s)
    ON CONFLICT ON CONSTRAINT my_table_idx_1 DO UPDATE SET data = %(param_1)s
    {stop}
    
    >>> do_update_stmt = insert_stmt.on_conflict_do_update(
    ...     constraint='my_table_pk',
    ...     set_=dict(data='updated value')
    ... )
    >>> print(do_update_stmt)
    {opensql}INSERT INTO my_table (id, data) VALUES (%(id)s, %(data)s)
    ON CONFLICT ON CONSTRAINT my_table_pk DO UPDATE SET data = %(param_1)s
    {stop}
    
  • The :paramref:`_postgresql.Insert.on_conflict_do_update.constraint` argument may also refer to a SQLAlchemy construct representing a constraint, e.g. .UniqueConstraint, .PrimaryKeyConstraint, .Index, or .ExcludeConstraint. In this use, if the constraint has a name, it is used directly. Otherwise, if the constraint is unnamed, then inference will be used, where the expressions and optional WHERE clause of the constraint will be spelled out in the construct. This use is especially convenient to refer to the named or unnamed primary key of a _schema.Table using the _schema.Table.primary_key attribute:

    >>> do_update_stmt = insert_stmt.on_conflict_do_update(
    ...     constraint=my_table.primary_key,
    ...     set_=dict(data='updated value')
    ... )
    >>> print(do_update_stmt)
    {opensql}INSERT INTO my_table (id, data) VALUES (%(id)s, %(data)s)
    ON CONFLICT (id) DO UPDATE SET data = %(param_1)s
    

The SET Clause

ON CONFLICT...DO UPDATE is used to perform an update of the already existing row, using any combination of new values as well as values from the proposed insertion. These values are specified using the :paramref:`_postgresql.Insert.on_conflict_do_update.set_` parameter. This parameter accepts a dictionary which consists of direct values for UPDATE:

>>> stmt = insert(my_table).values(id='some_id', data='inserted value')
>>> do_update_stmt = stmt.on_conflict_do_update(
...     index_elements=['id'],
...     set_=dict(data='updated value')
... )
>>> print(do_update_stmt)
{opensql}INSERT INTO my_table (id, data) VALUES (%(id)s, %(data)s)
ON CONFLICT (id) DO UPDATE SET data = %(param_1)s

Warning

The _expression.Insert.on_conflict_do_update method does not take into account Python-side default UPDATE values or generation functions, e.g. those specified using :paramref:`_schema.Column.onupdate`. These values will not be exercised for an ON CONFLICT style of UPDATE, unless they are manually specified in the :paramref:`_postgresql.Insert.on_conflict_do_update.set_` dictionary.

Updating using the Excluded INSERT Values

In order to refer to the proposed insertion row, the special alias ~.postgresql.Insert.excluded is available as an attribute on the _postgresql.Insert object; this object is a _expression.ColumnCollection alias that contains all columns of the target table:

>>> stmt = insert(my_table).values(
...     id='some_id',
...     data='inserted value',
...     author='jlh'
... )
>>> do_update_stmt = stmt.on_conflict_do_update(
...     index_elements=['id'],
...     set_=dict(data='updated value', author=stmt.excluded.author)
... )
>>> print(do_update_stmt)
{opensql}INSERT INTO my_table (id, data, author)
VALUES (%(id)s, %(data)s, %(author)s)
ON CONFLICT (id) DO UPDATE SET data = %(param_1)s, author = excluded.author

Additional WHERE Criteria

The _expression.Insert.on_conflict_do_update method also accepts a WHERE clause using the :paramref:`_postgresql.Insert.on_conflict_do_update.where` parameter, which will limit those rows which receive an UPDATE:

>>> stmt = insert(my_table).values(
...     id='some_id',
...     data='inserted value',
...     author='jlh'
... )
>>> on_update_stmt = stmt.on_conflict_do_update(
...     index_elements=['id'],
...     set_=dict(data='updated value', author=stmt.excluded.author),
...     where=(my_table.c.status == 2)
... )
>>> print(on_update_stmt)
{opensql}INSERT INTO my_table (id, data, author)
VALUES (%(id)s, %(data)s, %(author)s)
ON CONFLICT (id) DO UPDATE SET data = %(param_1)s, author = excluded.author
WHERE my_table.status = %(status_1)s

Skipping Rows with DO NOTHING

ON CONFLICT may be used to skip inserting a row entirely if any conflict with a unique or exclusion constraint occurs; below this is illustrated using the ~.postgresql.Insert.on_conflict_do_nothing method:

>>> stmt = insert(my_table).values(id='some_id', data='inserted value')
>>> stmt = stmt.on_conflict_do_nothing(index_elements=['id'])
>>> print(stmt)
{opensql}INSERT INTO my_table (id, data) VALUES (%(id)s, %(data)s)
ON CONFLICT (id) DO NOTHING

If DO NOTHING is used without specifying any columns or constraint, it has the effect of skipping the INSERT for any unique or exclusion constraint violation which occurs:

>>> stmt = insert(my_table).values(id='some_id', data='inserted value')
>>> stmt = stmt.on_conflict_do_nothing()
>>> print(stmt)
{opensql}INSERT INTO my_table (id, data) VALUES (%(id)s, %(data)s)
ON CONFLICT DO NOTHING

FROM ONLY ...

The dialect supports PostgreSQL's ONLY keyword for targeting only a particular table in an inheritance hierarchy. This can be used to produce the SELECT ... FROM ONLY, UPDATE ONLY ..., and DELETE FROM ONLY ... syntaxes. It uses SQLAlchemy's hints mechanism:

# SELECT ... FROM ONLY ...
result = table.select().with_hint(table, 'ONLY', 'postgresql')
print(result.fetchall())

# UPDATE ONLY ...
table.update(values=dict(foo='bar')).with_hint('ONLY',
                                               dialect_name='postgresql')

# DELETE FROM ONLY ...
table.delete().with_hint('ONLY', dialect_name='postgresql')

PostgreSQL-Specific Index Options

Several extensions to the .Index construct are available, specific to the PostgreSQL dialect.

Covering Indexes

The postgresql_include option renders INCLUDE(colname) for the given string names:

Index("my_index", table.c.x, postgresql_include=['y'])

would render the index as CREATE INDEX my_index ON table (x) INCLUDE (y)

Note that this feature requires PostgreSQL 11 or later.

New in version 1.4.

Partial Indexes

Partial indexes add criterion to the index definition so that the index is applied to a subset of rows. These can be specified on .Index using the postgresql_where keyword argument:

Index('my_index', my_table.c.id, postgresql_where=my_table.c.value > 10)

Operator Classes

PostgreSQL allows the specification of an operator class for each column of an index (see https://www.postgresql.org/docs/8.3/interactive/indexes-opclass.html). The .Index construct allows these to be specified via the postgresql_ops keyword argument:

Index(
    'my_index', my_table.c.id, my_table.c.data,
    postgresql_ops={
        'data': 'text_pattern_ops',
        'id': 'int4_ops'
    })

Note that the keys in the postgresql_ops dictionaries are the "key" name of the _schema.Column, i.e. the name used to access it from the .c collection of _schema.Table, which can be configured to be different than the actual name of the column as expressed in the database.
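
For example, a sketch where the Column key differs from the database column name (names here are illustrative):

from sqlalchemy import Column, Index, Integer, MetaData, String, Table

metadata = MetaData()
my_table = Table(
    "my_table", metadata,
    Column("id", Integer, primary_key=True),
    # accessed as my_table.c.data, but rendered as "data_col" in DDL
    Column("data_col", String, key="data"),
)

# postgresql_ops refers to the key "data", not the column name "data_col"
Index("my_index", my_table.c.data, postgresql_ops={"data": "text_pattern_ops"})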

If postgresql_ops is to be used against a complex SQL expression such as a function call, then to apply to the column it must be given a label that is identified in the dictionary by name, e.g.:

Index(
    'my_index', my_table.c.id,
    func.lower(my_table.c.data).label('data_lower'),
    postgresql_ops={
        'data_lower': 'text_pattern_ops',
        'id': 'int4_ops'
    })

Operator classes are also supported by the _postgresql.ExcludeConstraint construct using the :paramref:`_postgresql.ExcludeConstraint.ops` parameter. See that parameter for details.

New in version 1.3.21: added support for operator classes with _postgresql.ExcludeConstraint.

Index Types

PostgreSQL provides several index types: B-Tree, Hash, GiST, and GIN, as well as the ability for users to create their own (see https://www.postgresql.org/docs/8.3/static/indexes-types.html). These can be specified on .Index using the postgresql_using keyword argument:

Index('my_index', my_table.c.data, postgresql_using='gin')

The value passed to the keyword argument will be simply passed through to the underlying CREATE INDEX command, so it must be a valid index type for your version of PostgreSQL.
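
The GIN example above, for instance, would be expected to render as:

CREATE INDEX my_index ON my_table USING gin (data)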

Index Storage Parameters

PostgreSQL allows storage parameters to be set on indexes. The storage parameters available depend on the index method used by the index. Storage parameters can be specified on .Index using the postgresql_with keyword argument:

Index('my_index', my_table.c.data, postgresql_with={"fillfactor": 50})

New in version 1.0.6.

PostgreSQL allows the tablespace in which to create the index to be specified. The tablespace can be specified on .Index using the postgresql_tablespace keyword argument:

Index('my_index', my_table.c.data, postgresql_tablespace='my_tablespace')

New in version 1.1.

Note that the same option is available on _schema.Table as well.

Indexes with CONCURRENTLY

The PostgreSQL index option CONCURRENTLY is supported by passing the flag postgresql_concurrently to the .Index construct:

tbl = Table('testtbl', m, Column('data', Integer))

idx1 = Index('test_idx1', tbl.c.data, postgresql_concurrently=True)

The above index construct will render DDL for CREATE INDEX, assuming PostgreSQL 8.2 or higher is detected or for a connection-less dialect, as:

CREATE INDEX CONCURRENTLY test_idx1 ON testtbl (data)

For DROP INDEX, assuming PostgreSQL 9.2 or higher is detected or for a connection-less dialect, it will emit:

DROP INDEX CONCURRENTLY test_idx1

New in version 1.1: support for CONCURRENTLY on DROP INDEX. The CONCURRENTLY keyword is now only emitted if a high enough version of PostgreSQL is detected on the connection (or for a connection-less dialect).

When using CONCURRENTLY, the PostgreSQL database requires that the statement be invoked outside of a transaction block. The Python DBAPI enforces that even for a single statement, a transaction is present, so to use this construct, the DBAPI's "autocommit" mode must be used:

metadata = MetaData()
table = Table(
    "foo", metadata,
    Column("id", String))
index = Index(
    "foo_idx", table.c.id, postgresql_concurrently=True)

with engine.connect() as conn:
    with conn.execution_options(isolation_level='AUTOCOMMIT'):
        table.create(conn)

PostgreSQL Index Reflection

The PostgreSQL database creates a UNIQUE INDEX implicitly whenever the UNIQUE CONSTRAINT construct is used. When inspecting a table using _reflection.Inspector, the _reflection.Inspector.get_indexes and _reflection.Inspector.get_unique_constraints methods will report on these two constructs distinctly; in the case of the index, the key duplicates_constraint will be present in the index entry if it is detected as mirroring a constraint. When performing reflection using Table(..., autoload_with=engine), the UNIQUE INDEX is not returned in _schema.Table.indexes when it is detected as mirroring a .UniqueConstraint in the _schema.Table.constraints collection.

Changed in version 1.0.0: - _schema.Table reflection now includes .UniqueConstraint objects present in the _schema.Table.constraints collection; the PostgreSQL backend will no longer include a "mirrored" .Index construct in _schema.Table.indexes if it is detected as corresponding to a unique constraint.
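
A sketch of observing this distinction with the inspector, assuming a table named "mytable" that includes a unique constraint:

from sqlalchemy import create_engine, inspect

engine = create_engine("postgresql+psycopg2://scott:tiger@localhost/test")
insp = inspect(engine)

# the implicit UNIQUE INDEX appears here, carrying a
# "duplicates_constraint" key when it mirrors a constraint
print(insp.get_indexes("mytable"))

# the UNIQUE CONSTRAINT itself is reported here
print(insp.get_unique_constraints("mytable"))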

Special Reflection Options

The _reflection.Inspector used for the PostgreSQL backend is an instance of .PGInspector, which offers additional methods:

from sqlalchemy import create_engine, inspect

engine = create_engine("postgresql+psycopg2://localhost/test")
insp = inspect(engine)  # will be a PGInspector

print(insp.get_enums())

PostgreSQL Table Options

Several options for CREATE TABLE are supported directly by the PostgreSQL dialect in conjunction with the _schema.Table construct:

  • TABLESPACE:

    Table("some_table", metadata, ..., postgresql_tablespace='some_tablespace')
    

    The above option is also available on the .Index construct.

  • ON COMMIT:

    Table("some_table", metadata, ..., postgresql_on_commit='PRESERVE ROWS')
    
  • WITH OIDS:

    Table("some_table", metadata, ..., postgresql_with_oids=True)
    
  • WITHOUT OIDS:

    Table("some_table", metadata, ..., postgresql_with_oids=False)
    
  • INHERITS:

    Table("some_table", metadata, ..., postgresql_inherits="some_supertable")
    
    Table("some_table", metadata, ..., postgresql_inherits=("t1", "t2", ...))
    
    New in version 1.0.0.
    
  • PARTITION BY:

    Table("some_table", metadata, ...,
          postgresql_partition_by='LIST (part_column)')
    
    New in version 1.2.6.
    

Table values, Table and Column valued functions, Row and Tuple objects

PostgreSQL makes great use of modern SQL forms such as table-valued functions, tables and rows as values. These constructs are commonly used as part of PostgreSQL's support for complex datatypes such as JSON, ARRAY, and other datatypes. SQLAlchemy's SQL expression language has native support for most table-valued and row-valued forms.

Table-Valued Functions

Many PostgreSQL built-in functions are intended to be used in the FROM clause of a SELECT statement, and are capable of returning table rows or sets of table rows. A large portion of PostgreSQL's JSON functions, for example, such as json_array_elements(), json_object_keys(), json_each_text(), json_each(), json_to_record(), and json_populate_recordset(), use such forms. These classes of SQL function calling forms in SQLAlchemy are available using the _functions.FunctionElement.table_valued method in conjunction with _functions.Function objects generated from the _sql.func namespace.

Examples from PostgreSQL's reference documentation follow below:

  • json_each():

    >>> from sqlalchemy import select, func
    >>> stmt = select(func.json_each('{"a":"foo", "b":"bar"}').table_valued("key", "value"))
    >>> print(stmt)
    SELECT anon_1.key, anon_1.value
    FROM json_each(:json_each_1) AS anon_1
    
  • json_populate_record():

    >>> from sqlalchemy import select, func, literal_column
    >>> stmt = select(
    ...     func.json_populate_record(
    ...         literal_column("null::myrowtype"),
    ...         '{"a":1,"b":2}'
    ...     ).table_valued("a", "b", name="x")
    ... )
    >>> print(stmt)
    SELECT x.a, x.b
    FROM json_populate_record(null::myrowtype, :json_populate_record_1) AS x
    
  • json_to_record() - this form uses a PostgreSQL specific form of derived columns in the alias, where we may make use of _sql.column elements with types to produce them. The _functions.FunctionElement.table_valued method produces a _sql.TableValuedAlias construct, and the _sql.TableValuedAlias.render_derived method sets up the derived columns specification:

    >>> from sqlalchemy import select, func, column, Integer, Text
    >>> stmt = select(
    ...     func.json_to_record('{"a":1,"b":[1,2,3],"c":"bar"}').table_valued(
    ...         column("a", Integer), column("b", Text), column("d", Text),
    ...     ).render_derived(name="x", with_types=True)
    ... )
    >>> print(stmt)
    SELECT x.a, x.b, x.d
    FROM json_to_record(:json_to_record_1) AS x(a INTEGER, b TEXT, d TEXT)
    
  • WITH ORDINALITY - part of the SQL standard, WITH ORDINALITY adds an ordinal counter to the output of a function and is accepted by a limited set of PostgreSQL functions including unnest() and generate_series(). The _functions.FunctionElement.table_valued method accepts a keyword parameter with_ordinality for this purpose, which accepts the string name that will be applied to the "ordinality" column:

    >>> from sqlalchemy import select, func
    >>> stmt = select(
    ...     func.generate_series(4, 1, -1).table_valued("value", with_ordinality="ordinality")
    ... )
    >>> print(stmt)
    SELECT anon_1.value, anon_1.ordinality
    FROM generate_series(:generate_series_1, :generate_series_2, :generate_series_3) WITH ORDINALITY AS anon_1
    
New in version 1.4.0b2.

Column Valued Functions

Similar to the table valued function, a column valued function is present in the FROM clause, but delivers itself to the columns clause as a single scalar value. PostgreSQL functions such as json_array_elements(), unnest() and generate_series() may use this form. Column valued functions are available using the _functions.FunctionElement.column_valued method of _functions.FunctionElement:

  • json_array_elements():

    >>> from sqlalchemy import select, func
    >>> stmt = select(func.json_array_elements('["one", "two"]').column_valued("x"))
    >>> print(stmt)
    SELECT x
    FROM json_array_elements(:json_array_elements_1) AS x
    
  • unnest() - in order to generate a PostgreSQL ARRAY literal, the _postgresql.array construct may be used:

    >>> from sqlalchemy.dialects.postgresql import array
    >>> from sqlalchemy import select, func
    >>> stmt = select(func.unnest(array([1, 2])).column_valued())
    >>> print(stmt)
    SELECT anon_1
    FROM unnest(ARRAY[%(param_1)s, %(param_2)s]) AS anon_1
    

    The function can of course be used against an existing table-bound column that's of type _types.ARRAY:

    >>> from sqlalchemy import table, column, ARRAY, Integer
    >>> from sqlalchemy import select, func
    >>> t = table("t", column('value', ARRAY(Integer)))
    >>> stmt = select(func.unnest(t.c.value).column_valued("unnested_value"))
    >>> print(stmt)
    SELECT unnested_value
    FROM unnest(t.value) AS unnested_value
    

Row Types

Built-in support for rendering a ROW may be approximated using func.ROW with the _sa.func namespace, or by using the _sql.tuple_ construct:

>>> from sqlalchemy import table, column, func, tuple_
>>> t = table("t", column("id"), column("fk"))
>>> stmt = t.select().where(
...     tuple_(t.c.id, t.c.fk) > (1,2)
... ).where(
...     func.ROW(t.c.id, t.c.fk) < func.ROW(3, 7)
... )
>>> print(stmt)
SELECT t.id, t.fk
FROM t
WHERE (t.id, t.fk) > (:param_1, :param_2) AND ROW(t.id, t.fk) < ROW(:ROW_1, :ROW_2)

Table Types passed to Functions

PostgreSQL supports passing a table as an argument to a function, which it refers to as a "record" type. SQLAlchemy _sql.FromClause objects such as _schema.Table support this special form using the _sql.FromClause.table_valued method, which is comparable to the _functions.FunctionElement.table_valued method except that the collection of columns is already established by that of the _sql.FromClause itself:

>>> from sqlalchemy import table, column, func, select
>>> a = table("a", column("id"), column("x"), column("y"))
>>> stmt = select(func.row_to_json(a.table_valued()))
>>> print(stmt)
SELECT row_to_json(a) AS row_to_json_1
FROM a

New in version 1.4.0b2.

ARRAY Types

The PostgreSQL dialect supports arrays, both as multidimensional column types as well as array literals:

  • _postgresql.ARRAY - ARRAY datatype
  • _postgresql.array - array literal
  • _postgresql.array_agg - ARRAY_AGG SQL function
  • _postgresql.aggregate_order_by - helper for PG's ORDER BY aggregate function syntax.
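
A brief sketch combining an ARRAY column with an array literal (table and column names are illustrative):

from sqlalchemy import Column, Integer, MetaData, Table, select
from sqlalchemy.dialects.postgresql import ARRAY, array

metadata = MetaData()
t = Table("t", metadata, Column("vals", ARRAY(Integer)))

# compare an ARRAY column against a PostgreSQL array literal
stmt = select(t).where(t.c.vals == array([1, 2, 3]))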

JSON Types

The PostgreSQL dialect supports both JSON and JSONB datatypes, including psycopg2's native support and support for all of PostgreSQL's special operators:

  • _postgresql.JSON
  • _postgresql.JSONB
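
A short sketch of querying into a JSONB document (table and key names are illustrative); indexed access renders the -> operator and .astext renders ->>:

from sqlalchemy import Column, Integer, MetaData, Table, select
from sqlalchemy.dialects.postgresql import JSONB

metadata = MetaData()
doc = Table(
    "doc", metadata,
    Column("id", Integer, primary_key=True),
    Column("body", JSONB),
)

# WHERE body ->> 'name' = 'some name'
stmt = select(doc).where(doc.c.body["name"].astext == "some name")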

HSTORE Type

The PostgreSQL HSTORE type as well as hstore literals are supported:

  • _postgresql.HSTORE - HSTORE datatype
  • _postgresql.hstore - hstore literal
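
A short sketch of querying an HSTORE column (names are illustrative); indexed access renders the -> operator:

from sqlalchemy import Column, Integer, MetaData, Table, select
from sqlalchemy.dialects.postgresql import HSTORE

metadata = MetaData()
tags = Table(
    "tags", metadata,
    Column("id", Integer, primary_key=True),
    Column("attrs", HSTORE),
)

# WHERE attrs -> 'color' = 'blue'
stmt = select(tags).where(tags.c.attrs["color"] == "blue")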

ENUM Types

PostgreSQL has an independently creatable TYPE structure which is used to implement an enumerated type. This approach introduces significant complexity on the SQLAlchemy side in terms of when this type should be CREATED and DROPPED. The type object is also an independently reflectable entity. The following sections should be consulted:

  • _postgresql.ENUM - DDL and typing support for ENUM.
  • .PGInspector.get_enums - retrieve a listing of current ENUM types
  • .postgresql.ENUM.create, .postgresql.ENUM.drop - individual CREATE and DROP commands for ENUM (a brief sketch follows below).
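
A minimal sketch of the standalone CREATE/DROP cycle (type name and URL are illustrative):

from sqlalchemy import create_engine
from sqlalchemy.dialects.postgresql import ENUM

engine = create_engine("postgresql+psycopg2://scott:tiger@localhost/test")

mood = ENUM("sad", "ok", "happy", name="mood_enum")

# emit CREATE TYPE / DROP TYPE independently of any table
mood.create(engine, checkfirst=True)
mood.drop(engine, checkfirst=True)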

Using ENUM with ARRAY

The combination of ENUM and ARRAY is not directly supported by backend DBAPIs at this time. Prior to SQLAlchemy 1.3.17, a special workaround was needed in order to allow this combination to work, described below.

Changed in version 1.3.17: The combination of ENUM and ARRAY is now directly handled by SQLAlchemy's implementation without any workarounds needed.

import re

import sqlalchemy as sa
from sqlalchemy import TypeDecorator
from sqlalchemy.dialects.postgresql import ARRAY

class ArrayOfEnum(TypeDecorator):
    impl = ARRAY

    def bind_expression(self, bindvalue):
        return sa.cast(bindvalue, self)

    def result_processor(self, dialect, coltype):
        super_rp = super(ArrayOfEnum, self).result_processor(
            dialect, coltype)

        def handle_raw_string(value):
            inner = re.match(r"^{(.*)}$", value).group(1)
            return inner.split(",") if inner else []

        def process(value):
            if value is None:
                return None
            return super_rp(handle_raw_string(value))
        return process

E.g.:

Table(
    'mydata', metadata,
    Column('id', Integer, primary_key=True),
    Column('data', ArrayOfEnum(ENUM('a', 'b', 'c', name='myenum')))
)

This type is not included as a built-in type as it would be incompatible with a DBAPI that suddenly decides to support ARRAY of ENUM directly in a new version.

Using JSON/JSONB with ARRAY

Similar to using ENUM, prior to SQLAlchemy 1.3.17, for an ARRAY of JSON/JSONB we needed to render the appropriate CAST. Current psycopg2 drivers accommodate the result set correctly without any special steps.

Changed in version 1.3.17: The combination of JSON/JSONB and ARRAY is now directly handled by SQLAlchemy's implementation without any workarounds needed.

import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import ARRAY

class CastingArray(ARRAY):
    def bind_expression(self, bindvalue):
        return sa.cast(bindvalue, self)

E.g.:

Table(
    'mydata', metadata,
    Column('id', Integer, primary_key=True),
    Column('data', CastingArray(JSONB))
)