This is the 5th chapter of the essential Dash Tutorial. The
previous chapter covered how to use callbacks
with the <code>dash_core_components.Graph</code> component. The rest of the Dash
documentation covers other topics like multi-page apps and component
libraries. Just getting started? Make sure to install the necessary
dependencies. The next and final chapter covers
frequently asked questions and gotchas.
Global variables will break your app
Dash is designed to work in multi-user environments
where multiple people may view the application at the
same time and will have independent sessions.
If your app uses modified global variables,
then one user’s session could set the variable to one value
which would affect the next user’s session.
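The failure mode can be sketched in plain Python, without Dash. The names `selected_country`, `set_country`, `render_title`, and `render_title_from_input` are hypothetical stand-ins for a global variable and the callbacks that touch it, and the two "users" are simply successive calls into the same process.

```python
# Anti-pattern: module-level state that a callback modifies.
# All names here are hypothetical; they only stand in for Dash callbacks.
selected_country = {"value": "all"}


def set_country(country):
    """Stand-in for a callback that writes to the global."""
    selected_country["value"] = country


def render_title():
    """Stand-in for a callback that reads the global."""
    return "Showing data for: " + selected_country["value"]


def render_title_from_input(country):
    """Safer variant: the value arrives as an argument, so nothing is shared."""
    return "Showing data for: " + country


set_country("Canada")             # user A picks Canada
set_country("Brazil")             # user B picks Brazil a moment later
title_seen_by_a = render_title()  # user A's page now reflects B's choice
```

In the safer variant, each user's value travels with the request as a callback input, so one session cannot overwrite another's.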
Dash is also designed to be able to run with multiple python
workers so that callbacks can be executed in parallel.
This is commonly done with
gunicorn using syntax like
$ gunicorn --workers 4 app:server
(app refers to a file named app.py; server refers to a variable
in that file named server: server = app.server).
When Dash apps run across multiple workers, their memory
is not shared. This means that if you modify a global
variable in one callback, that modification will not be
applied to the rest of the workers.
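The isolation between worker processes can be seen with the standard library alone. This is a rough sketch, not how gunicorn itself is implemented: it assumes a Unix-like system where the "fork" start method is available, and `counter`, `increment_in_worker`, and `value_seen_by_worker` are hypothetical names.

```python
import multiprocessing as mp

# Hypothetical global that a callback might modify.
counter = {"value": 0}


def increment_in_worker(queue):
    """Runs in a child process and modifies that process's copy of the global."""
    counter["value"] += 1
    queue.put(counter["value"])


def value_seen_by_worker():
    """Run increment_in_worker in a separate process and return what it saw."""
    # Assumption: a Unix-like OS where the "fork" start method exists.
    ctx = mp.get_context("fork")
    queue = ctx.Queue()
    worker = ctx.Process(target=increment_in_worker, args=(queue,))
    worker.start()
    seen = queue.get(timeout=30)
    worker.join()
    return seen


worker_value = value_seen_by_worker()  # the worker incremented its own copy
parent_value = counter["value"]        # the parent's copy was never touched
```

The worker reports the incremented value, while the parent process still holds the original one, which is the same situation as two gunicorn workers each holding their own copy of a global.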
In order to share data safely across multiple python
processes, we need to store the data somewhere that is accessible to
each of the processes.
There are three main places to store this data:
1 - In the user’s browser session
2 - On the disk (e.g. in a file or in a database)
3 - In a shared memory space like with Redis
The following three examples illustrate these approaches.
To save data in the user’s browser session:
- Implemented by saving the data as part of Dash’s front-end store
through methods explained in
- Data has to be converted to a string like JSON for storage and transport
- Data that is cached in this way will only be available in the
user’s current session.
- If you open up a new browser, the app’s callbacks will always
compute the data. The data is only cached and transported between
callbacks within the session.
- As such, unlike with caching, this method doesn’t increase the
memory footprint of the app.
- There could be a cost in network transport. If you’re sharing 10MB
of data between callbacks, then that data will be transported over
the network between each callback.
- If the network cost is too high, then compute the aggregations
upfront and transport those.
Your app likely won’t be displaying 10MB of data;
it will just be displaying a subset or an aggregation of it.
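The serialization step from the list above can be sketched with the standard `json` module. The functions below are hypothetical stand-ins for Dash callbacks (no Dash import is needed to follow the idea): one computes the data once and returns it as a string, and another parses that string before using it.

```python
import json

# Hypothetical rows standing in for a dataset held on the server.
rows = [
    {"city": "Montreal", "sales": 120},
    {"city": "Boston", "sales": 95},
    {"city": "Toronto", "sales": 150},
]


def process_data(min_sales):
    """Stand-in for the callback that filters the data and returns a JSON string."""
    filtered = [row for row in rows if row["sales"] >= min_sales]
    return json.dumps(filtered)


def total_sales(stored_json):
    """Stand-in for a downstream callback: parse the stored string, then use it."""
    return sum(row["sales"] for row in json.loads(stored_json))


stored = process_data(min_sales=100)
total = total_sales(stored)
# Size of the string that travels over the network for each callback that reads it.
payload_bytes = len(stored.encode("utf-8"))
```

Measuring `payload_bytes` for realistic data is a quick way to judge whether the network cost is acceptable.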
Sending the computed data over the network can be expensive if
the data is large. In some cases, serializing this data to JSON
can also be expensive.
In many cases, your app will only display a subset or an aggregation
of the computed or filtered data. In these cases, you could precompute
your aggregations in your data processing callback and transport these
aggregations to the remaining callbacks.
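As a rough illustration of the saving, the sketch below compares the size of a full serialized dataset with the size of its precomputed aggregations. The data and the name `compute_aggregations` are made up for the example; in an app, this function would be the body of the data processing callback.

```python
import json
from collections import defaultdict

# Hypothetical raw data: 10,000 rows split across two regions.
raw_rows = [
    {"region": "east" if i % 2 else "west", "sales": i % 7}
    for i in range(10_000)
]


def compute_aggregations(rows):
    """Reduce the rows to the per-region summaries the other callbacks need."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        totals[row["region"]] += row["sales"]
        counts[row["region"]] += 1
    return {"total_sales": dict(totals), "row_count": dict(counts)}


full_payload = json.dumps(raw_rows)                             # every row
aggregate_payload = json.dumps(compute_aggregations(raw_rows))  # summaries only
```

Only `aggregate_payload` would need to travel between callbacks, and it is far smaller than `full_payload`.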
The next example:
- Uses Redis via Flask-Caching for storing “global variables”.
This data is accessed through a function, the output of which is
cached and keyed by its input arguments.
- Uses the hidden div solution to send a signal to the other
callbacks when the expensive computation is complete.
- Note that instead of Redis, you could also save this to the file
system. See https://flask-caching.readthedocs.io/en/latest/
for more details.
- This “signaling” is cool because it allows the expensive
computation to only take up one process.
Without this type of signaling, each callback could end up
running the expensive computation in parallel,
locking four processes instead of one.
This approach is also advantageous in that future sessions can
use the pre-computed value.
This will work well for apps that have a small number of inputs.
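The caching-and-signaling pattern can be sketched without Redis or Flask-Caching. Here `functools.lru_cache` is swapped in as a stand-in for a memoizing cache: it is keyed by the function's arguments in the same spirit, but it lives inside a single process, so it would not share results across workers the way Redis or a filesystem cache does. All function names are hypothetical.

```python
import functools
import time

call_count = {"expensive": 0}


@functools.lru_cache(maxsize=32)
def global_store(value):
    """Expensive computation, cached and keyed by its input argument."""
    call_count["expensive"] += 1
    time.sleep(0.1)  # simulate slow work
    return {"input": value, "squares": [n * n for n in range(5)]}


def compute_value(value):
    """Signal callback: do the expensive work, then pass on only the small key."""
    global_store(value)
    return value


def update_graph_1(signal):
    """Downstream callback: reads the cached result instead of recomputing."""
    return global_store(signal)["squares"][:2]


def update_graph_2(signal):
    """Another downstream callback sharing the same cached result."""
    return global_store(signal)["squares"][-2:]


signal = compute_value("fruit")   # runs the expensive computation once
graph_1 = update_graph_1(signal)  # cache hit
graph_2 = update_graph_2(signal)  # cache hit
```

Because the downstream callbacks wait for the signal and then call the cached function with the same argument, the expensive work runs once no matter how many outputs depend on it.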
The previous example cached computations on the filesystem and
those computations were accessible for all users.
In some cases, you want to keep the data isolated to user sessions:
one user’s derived data shouldn’t update the next user’s derived data.
One way to do this is to save the data in a hidden div,
as demonstrated in the first example.
Another way to do this is to save the data on the
filesystem cache with a session ID and then reference the data
using that session ID. Because data is saved on the server
instead of transported over the network, this method is generally faster than the
“hidden div” method.
This example was originally discussed in a
Dash Community Forum thread.
This example:
- Caches data using the
flask_caching filesystem cache. You can also save to an in-memory database like Redis.
- Serializes the data as JSON.
- If you are using Pandas, consider serializing
with Apache Arrow (see the related community thread).
- Saves session data up to the number of expected concurrent users.
This prevents the cache from being overfilled with data.
- Creates unique session IDs by embedding a hidden random string into
the app’s layout and serving a unique layout on every page load.
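The session-keyed cache described in the list above can be sketched with the standard library. This is only an approximation of the idea: it writes one JSON file per session into a temporary directory rather than using the flask_caching filesystem cache, it does not cap the number of stored sessions, and every name in it is hypothetical.

```python
import json
import tempfile
import time
import uuid
from pathlib import Path

# Hypothetical cache directory; a real deployment would pick a persistent path.
CACHE_DIR = Path(tempfile.mkdtemp(prefix="session-cache-"))
query_calls = {"count": 0}


def new_session_id():
    """Random ID, generated once per page load and embedded in the layout."""
    return str(uuid.uuid4())


def query_data():
    """Stand-in for an expensive, user-specific query."""
    query_calls["count"] += 1
    return {"query_number": query_calls["count"], "computed_at": time.time()}


def get_data(session_id):
    """Return this session's data, computing and saving it on first access."""
    path = CACHE_DIR / (session_id + ".json")
    if path.exists():
        return json.loads(path.read_text())
    data = query_data()
    path.write_text(json.dumps(data))
    return data


session_a = new_session_id()
session_b = new_session_id()
first_read = get_data(session_a)     # computed and written to disk
second_read = get_data(session_a)    # read back from disk, timestamp unchanged
other_session = get_data(session_b)  # separate file, computed independently
```

Each callback would receive the session ID as an input and call `get_data` with it, so only the short ID travels over the network.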
Note: As with all examples that send data to the client, be aware
that these sessions aren’t necessarily secure or encrypted.
These session IDs may be vulnerable to
There are three things to notice in this example:
- The timestamps of the dataframe don’t update when we retrieve
the data. This data is cached as part of the user’s session.
- Retrieving the data initially takes five seconds but successive queries
are instant, as the data has been cached.
- The second session displays different data than the first session:
the data that is shared between callbacks is isolated to individual sessions.
Questions? Discuss these examples on the
Dash Community Forum.