
Connecting to PyODBC

URLs are translated to PyODBC connection strings, as detailed in the PyODBC ConnectionStrings documentation.

DSN Connections

A DSN connection in ODBC means that a pre-existing ODBC datasource is configured on the client machine. The application then specifies the name of this datasource, which encompasses details such as the specific ODBC driver in use as well as the network address of the database. Assuming a datasource is configured on the client, a basic DSN-based connection looks like:

engine = create_engine("mssql+pyodbc://scott:tiger@some_dsn")

The above URL will pass the following connection string to PyODBC:

DSN=some_dsn;UID=scott;PWD=tiger

If the username and password are omitted, the DSN form will also add the Trusted_Connection=yes directive to the ODBC string.
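As a sketch of the rule above, the following helper (the function name is illustrative and not part of SQLAlchemy) assembles the ODBC string the same way the dialect does for a DSN-based URL, adding Trusted_Connection when no credentials are given:

```python
def build_dsn_connection_string(dsn, username=None, password=None):
    """Illustrative only: assemble the ODBC string produced for a DSN URL."""
    parts = [f"DSN={dsn}"]
    if username and password:
        parts.append(f"UID={username}")
        parts.append(f"PWD={password}")
    else:
        # no credentials: integrated (trusted) authentication is requested
        parts.append("Trusted_Connection=yes")
    return ";".join(parts)

print(build_dsn_connection_string("some_dsn", "scott", "tiger"))
# DSN=some_dsn;UID=scott;PWD=tiger
print(build_dsn_connection_string("some_dsn"))
# DSN=some_dsn;Trusted_Connection=yes
```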

Hostname Connections

Hostname-based connections are also supported by pyodbc. These are often easier to use than a DSN and have the additional advantage that the specific database name to connect to may be specified locally in the URL, rather than being fixed as part of a datasource configuration.

When using a hostname connection, the driver name must also be specified in the query parameters of the URL. As these names usually have spaces in them, the name must be URL encoded which means using plus signs for spaces:

engine = create_engine("mssql+pyodbc://scott:tiger@myhost:port/databasename?driver=ODBC+Driver+17+for+SQL+Server")
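The plus-sign encoding of spaces can be produced with the standard library's quote_plus, as this sketch shows (the host and port 1433 here are placeholder values):

```python
from urllib.parse import quote_plus

# spaces in the ODBC driver name must be URL-encoded as plus signs
driver = quote_plus("ODBC Driver 17 for SQL Server")
print(driver)  # ODBC+Driver+17+for+SQL+Server

# placeholder host/port, for illustration only
url = f"mssql+pyodbc://scott:tiger@myhost:1433/databasename?driver={driver}"
```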

Other keywords interpreted by the pyodbc dialect and passed to pyodbc.connect(), in both the DSN and hostname cases, include: odbc_autotranslate, ansi, unicode_results, autocommit, and authentication. Note that for the dialect to recognize these keywords (including the driver keyword above), they must be all lowercase. Multiple additional keyword arguments are separated by an ampersand (&), not a semicolon:

engine = create_engine(
    "mssql+pyodbc://scott:tiger@myhost:49242/databasename"
    "?driver=ODBC+Driver+17+for+SQL+Server"
    "&authentication=ActiveDirectoryIntegrated"
)

The equivalent URL can be constructed using _sa.engine.URL:

from sqlalchemy.engine import URL
connection_url = URL.create(
    "mssql+pyodbc",
    username="scott",
    password="tiger",
    host="myhost",
    port=49242,
    database="databasename",
    query={
        "driver": "ODBC Driver 17 for SQL Server",
        "authentication": "ActiveDirectoryIntegrated",
    },
)
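The query dictionary above corresponds to an ordinary URL query string; the standard library's urlencode illustrates the encoding that applies, since it also uses plus signs for spaces and ampersands between pairs:

```python
from urllib.parse import urlencode

query = {
    "driver": "ODBC Driver 17 for SQL Server",
    "authentication": "ActiveDirectoryIntegrated",
}
# urlencode quotes via quote_plus by default, so spaces become '+'
# and key/value pairs are joined with '&', matching the URL shown above
print(urlencode(query))
# driver=ODBC+Driver+17+for+SQL+Server&authentication=ActiveDirectoryIntegrated
```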

Pass through exact Pyodbc string

A PyODBC connection string can also be sent in pyodbc's format directly, as specified in the PyODBC documentation, using the parameter odbc_connect. A _sa.engine.URL object can help make this easier:

from sqlalchemy.engine import URL
connection_string = "DRIVER={SQL Server Native Client 10.0};SERVER=dagger;DATABASE=test;UID=user;PWD=password"
connection_url = URL.create("mssql+pyodbc", query={"odbc_connect": connection_string})

engine = create_engine(connection_url)
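If the URL is instead assembled by hand rather than through URL.create, the raw ODBC string must itself be percent-encoded so that its semicolons, braces, and equals signs survive URL parsing. A sketch using the standard library:

```python
from urllib.parse import quote_plus

connection_string = (
    "DRIVER={SQL Server Native Client 10.0};"
    "SERVER=dagger;DATABASE=test;UID=user;PWD=password"
)
# percent-encode the raw ODBC string before embedding it in the URL;
# ';' becomes %3B, '{' becomes %7B, '=' becomes %3D
url = "mssql+pyodbc:///?odbc_connect=" + quote_plus(connection_string)
print(url)
```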

Connecting to databases with access tokens

Some database servers are set up to only accept access tokens for login. For example, SQL Server allows the use of Azure Active Directory tokens to connect to databases. This requires creating a credential object using the azure-identity library. More information about the authentication step can be found in Microsoft's documentation.

After getting an engine, the credentials need to be sent to pyodbc.connect each time a connection is requested. One way to do this is to set up an event listener on the engine that adds the credential token to the dialect's connect call. This is discussed more generally in :ref:`engines_dynamic_tokens`. For SQL Server in particular, this is passed as an ODBC connection attribute with a data structure described by Microsoft.

The following code snippet will create an engine that connects to an Azure SQL database using Azure credentials:

import struct
from sqlalchemy import create_engine, event
from sqlalchemy.engine.url import URL
from azure import identity

SQL_COPT_SS_ACCESS_TOKEN = 1256  # Connection option for access tokens, as defined in msodbcsql.h
TOKEN_URL = "https://database.windows.net/"  # The token URL for any Azure SQL database

connection_string = "mssql+pyodbc://@my-server.database.windows.net/myDb?driver=ODBC+Driver+17+for+SQL+Server"

engine = create_engine(connection_string)

azure_credentials = identity.DefaultAzureCredential()

@event.listens_for(engine, "do_connect")
def provide_token(dialect, conn_rec, cargs, cparams):
    # remove the "Trusted_Connection" parameter that SQLAlchemy adds
    cargs[0] = cargs[0].replace(";Trusted_Connection=Yes", "")

    # create token credential
    raw_token = azure_credentials.get_token(TOKEN_URL).token.encode("utf-16-le")
    token_struct = struct.pack(f"<I{len(raw_token)}s", len(raw_token), raw_token)

    # apply it to keyword arguments
    cparams["attrs_before"] = {SQL_COPT_SS_ACCESS_TOKEN: token_struct}
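The token structure used above — a 4-byte little-endian byte count followed by the UTF-16-LE encoded token — can be checked in isolation with a dummy token (the token value here is a placeholder, not a real credential):

```python
import struct

# pack a dummy token the same way as in the listener above:
# a 4-byte little-endian length prefix, then the UTF-16-LE token bytes
raw_token = "dummy-token".encode("utf-16-le")
token_struct = struct.pack(f"<I{len(raw_token)}s", len(raw_token), raw_token)

assert token_struct[:4] == len(raw_token).to_bytes(4, "little")
assert token_struct[4:] == raw_token
```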

Tip

The Trusted_Connection directive is currently added by the SQLAlchemy pyodbc dialect when no username or password is present. It must be removed per Microsoft's documentation for Azure access tokens, which states that a connection string using an access token must not contain the UID, PWD, Authentication or Trusted_Connection parameters.

Enable autocommit for Azure SQL Data Warehouse (DW) connections

Azure SQL Data Warehouse does not support transactions, and that can cause problems with SQLAlchemy's "autobegin" (and implicit commit/rollback) behavior. We can avoid these problems by enabling autocommit at both the pyodbc and engine levels:

connection_url = sa.engine.URL.create(
    "mssql+pyodbc",
    username="scott",
    password="tiger",
    host="dw.azure.example.com",
    database="mydb",
    query={
        "driver": "ODBC Driver 17 for SQL Server",
        "autocommit": "True",
    },
)

engine = create_engine(connection_url).execution_options(
    isolation_level="AUTOCOMMIT"
)

Pyodbc Pooling / connection close behavior

PyODBC uses internal pooling by default, which means connections will be longer lived than they are within SQLAlchemy itself. As SQLAlchemy has its own pooling behavior, it is often preferable to disable PyODBC's. This can only be done globally at the PyODBC module level, before any connections are made:

import pyodbc

pyodbc.pooling = False

# don't use the engine before pooling is set to False
engine = create_engine("mssql+pyodbc://user:pass@dsn")

If this variable is left at its default value of True, the application will continue to maintain active database connections, even when the SQLAlchemy engine itself fully discards a connection or if the engine is disposed.

See Also

pooling - in the PyODBC documentation.

Driver / Unicode Support

PyODBC works best with Microsoft ODBC drivers, particularly in the area of Unicode support on both Python 2 and Python 3.

Using the FreeTDS ODBC drivers on Linux or OSX with PyODBC is not recommended; there have historically been many Unicode-related issues in this area, including before Microsoft offered ODBC drivers for Linux and OSX. Now that Microsoft offers drivers for all platforms, these are recommended for PyODBC use. FreeTDS remains relevant for non-ODBC drivers such as pymssql, where it works very well.

Rowcount Support

Pyodbc only has partial support for rowcount. See the notes at :ref:`mssql_rowcount_versioning` for important notes when using ORM versioning.

Fast Executemany Mode

The Pyodbc driver has added support for a "fast executemany" mode of execution which greatly reduces round trips for a DBAPI executemany() call when using Microsoft ODBC drivers, for limited-size batches that fit in memory. The feature is enabled by setting the attribute .fast_executemany on the DBAPI cursor when an executemany call is to be used. The SQLAlchemy pyodbc SQL Server dialect supports setting this flag automatically when the fast_executemany parameter is passed to _sa.create_engine; note that the ODBC driver must be the Microsoft driver in order to use this flag:

engine = create_engine(
    "mssql+pyodbc://scott:tiger@mssql2017:1433/test?driver=ODBC+Driver+13+for+SQL+Server",
    fast_executemany=True)

Warning

The pyodbc fast_executemany mode buffers all rows in memory and is not compatible with very large batches of data. A future version of SQLAlchemy may support this flag as a per-execution option instead.

New in version 1.3.

See Also

fast executemany - on github

Setinputsizes Support

The pyodbc cursor.setinputsizes() method can be used if necessary. To enable this hook, pass use_setinputsizes=True to _sa.create_engine:

engine = create_engine("mssql+pyodbc://...", use_setinputsizes=True)

The behavior of the hook can then be customized, as may be necessary particularly if fast_executemany is in use, via the .DialectEvents.do_setinputsizes hook. See that method for usage examples.

Changed in version 1.4.1: The pyodbc dialects will not use setinputsizes unless use_setinputsizes=True is passed.