If you want to connect to Databricks, Snowflake, AWS Redshift, or Google BigQuery, consider using Data Sources, which allows you to share access to these services with licensed users on Dash Enterprise.
Dash apps can use open source libraries to query external databases and datastores in callbacks or job queues.
The Python DB-API standardizes the interface for most Python database access modules, allowing you to write Dash
apps with code that can be modified with little effort to connect to different types of databases with different dialects.
For more information on the Python DB-API, see PEP-0249.
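To make this concrete, here is a minimal sketch of a Dash app whose callback queries a database through SQLAlchemy. The connection URL, the `data` table, and its `location` column are hypothetical stand-ins:

```python
import pandas as pd
from dash import Dash, Input, Output, dcc, html
from sqlalchemy import create_engine, text

# Hypothetical connection URL; see below for how to handle credentials.
engine = create_engine('postgresql+psycopg2://username:password@localhost/test')

app = Dash(__name__)
app.layout = html.Div([
    dcc.Input(id='country', type='text', value='Canada'),
    html.Div(id='row-count'),
])

@app.callback(Output('row-count', 'children'), Input('country', 'value'))
def count_rows(country):
    # Parameterized query; the data table and location column are hypothetical.
    query = text('SELECT COUNT(*) AS n FROM data WHERE location = :loc')
    df = pd.read_sql(query, con=engine, params={'loc': country})
    return f"{df['n'][0]} rows for {country}"

if __name__ == '__main__':
    app.run_server(debug=True)
```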
To query databases, you’ll need to install the Python database packages and, in many cases, system-level dependencies.
For complete examples, see the following apps from the App Catalog:
Many of the database apps follow a similar pattern:
Install the database driver and system-level dependencies. These vary by database type, so we recommend starting from the examples we’ve provided. Many database drivers are easier to install in Linux environments, so we recommend developing these apps in a Dash Enterprise workspace.
Create a connection. We recommend using SQLAlchemy if it’s available for your database.
Do not store your connection’s database password in your app’s code. Instead, we recommend storing it as an environment variable using the Dash Enterprise App Manager.
Learn how to add an environment variable to your app in Dash Enterprise.
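For example, the connection URL can be assembled at runtime from environment variables; the variable names used here are hypothetical:

```python
import os
from sqlalchemy import create_engine

# DATABASE_USER, DATABASE_PASSWORD, etc. are hypothetical names for
# environment variables set through the Dash Enterprise App Manager.
user = os.environ['DATABASE_USER']
password = os.environ['DATABASE_PASSWORD']
host = os.environ.get('DATABASE_HOST', 'localhost')
database = os.environ['DATABASE_NAME']

engine = create_engine(f'postgresql+psycopg2://{user}:{password}@{host}/{database}')
```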
Test that the database server has started successfully, verify that it is accepting connections, and validate the credentials and connection string. To achieve this, consider creating a simple `try_connection` function that sends a basic query to the database and checks whether it succeeds. If it fails, the connection is retried after an exponentially increasing delay. This lets you distinguish errors arising from issues with the callback logic from errors caused by database configuration or connectivity problems. Here’s an example of a `try_connection` function:
```python
from sqlalchemy import create_engine, text
from tenacity import retry, stop_after_attempt, wait_exponential

postgres_engine = create_engine('postgresql+psycopg2://username:password@localhost/test')

# Retry with exponential backoff: up to 5 attempts, waiting 1-10 seconds between them.
@retry(wait=wait_exponential(multiplier=2, min=1, max=10), stop=stop_after_attempt(5))
def try_connection():
    try:
        with postgres_engine.connect() as connection:
            stmt = text("SELECT 1")
            connection.execute(stmt)
            print("Connection to database successful.")
    except Exception:
        print("Connection to database failed, retrying.")
        raise  # re-raise so the retry decorator runs again
```
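Calling `try_connection` once at startup, before the app begins serving traffic, makes database misconfiguration fail fast instead of surfacing later inside a callback.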
Consider connection pooling. In the examples above, we don’t use database connection pools. Connection pools have tricky implications when used alongside Celery or Gunicorn’s `--preload` option. If you aren’t using `--preload` or Celery, then you can improve your query performance with a connection pool: connections are shared instead of recreated and discarded. It is possible to use `--preload` and Celery with a connection pool, but it requires extra caution. For more details, see the official SQLAlchemy documentation on connection pooling.
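If you do use a pool, a minimal sketch looks like the following; `QueuePool` is SQLAlchemy’s default, and the sizes shown are illustrative, not recommendations:

```python
from sqlalchemy import create_engine

engine = create_engine(
    'postgresql+psycopg2://username:password@localhost/test',
    pool_size=5,         # connections kept open in the pool
    max_overflow=10,     # additional connections allowed under load
    pool_pre_ping=True,  # verify a connection is alive before handing it out
)
```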
To disable connection pooling, use `NullPool`:

```python
from sqlalchemy import create_engine
from sqlalchemy.pool import NullPool

engine = create_engine(
    'postgresql+psycopg2://username:password@localhost/test',
    poolclass=NullPool)
```
Use raw SQL commands. This is the low-level option; it’s great if you want full control and transparency and are comfortable with SQL. We recommend using the `text` function in `sqlalchemy` and parameterized queries, which avoid SQL injection issues:
```python
import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine('postgresql+psycopg2://username:password@localhost/test')

# Parameterized query: the user-supplied value is bound safely to :user_id.
t = text("SELECT * FROM users WHERE id=:user_id")
result = pd.read_sql(t, con=engine, params={'user_id': 12})
```
Use the Pythonic classes available with `sqlalchemy`. For examples, see the SQLAlchemy Python Tutorial.
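A minimal sketch of this style, assuming a hypothetical `users` table matching the raw SQL example above:

```python
from sqlalchemy import Column, Integer, String, create_engine, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class User(Base):
    # Hypothetical table mirroring the raw SQL example above.
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine('postgresql+psycopg2://username:password@localhost/test')

with Session(engine) as session:
    user = session.execute(select(User).where(User.id == 12)).scalar_one_or_none()
```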
Use ibis, a library that uses SQLAlchemy internally but offers a more pandas-like interface.
Close your connections. If you aren’t using a connection pool, make sure that you close your connections when you’re done with them. With `sqlalchemy`, this can be done with the `with engine.connect()` clause: inside the `with` context statement, the connection returned by `Engine.connect()` is automatically closed by `Connection.close()` at the end of the block:

```python
# mssql_engine was created earlier with create_engine(...)
with mssql_engine.connect() as connection:
    countries = pd.read_sql('SELECT DISTINCT location from data', connection)
```
As we saw earlier when creating the SQLAlchemy engine object, a database URL needs to be passed to specify the dialect and driver you’ll use to connect to your database. This can be passed to the pool or directly to the Engine object to use a default pool:

```
dialect+driver://username:password@host:port/database

# For PostgreSQL
postgresql+psycopg2://username:password@localhost:8050/test-database
```
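For comparison, URLs for other dialects follow the same shape; the driver packages shown here (`pymysql`, `pyodbc`) are common choices, not the only options:

```
# For MySQL
mysql+pymysql://username:password@localhost:3306/test-database

# For Microsoft SQL Server
mssql+pyodbc://username:password@localhost:1433/test-database?driver=ODBC+Driver+17+for+SQL+Server
```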
For more information on dialects and drivers, see the official SQLAlchemy documentation.