ManhattanPlot Examples and Reference


Default ManhattanPlot

An example of a default ManhattanPlot component without any extra properties.

import pandas as pd

import dash
import dash_bio as dashbio
import dash_html_components as html
import dash_core_components as dcc


external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']

app = dash.Dash(__name__, external_stylesheets=external_stylesheets)

df = pd.read_csv(
    'https://raw.githubusercontent.com/plotly/dash-bio-docs-files/master/' +
    'manhattan_data.csv'
)

app.layout = html.Div([
    'Threshold value',
    dcc.Slider(
        id='manhattanplot-input',
        min=1,
        max=10,
        marks={
            i: {'label': str(i)} for i in range(1, 11)
        },
        value=6
    ),
    html.Br(),
    html.Div(
        dcc.Graph(
            id='my-dashbio-manhattanplot',
            figure=dashbio.ManhattanPlot(
                dataframe=df
            )
        )
    )
])


@app.callback(
    dash.dependencies.Output('my-dashbio-manhattanplot', 'figure'),
    [dash.dependencies.Input('manhattanplot-input', 'value')]
)
def update_manhattanplot(threshold):

    return dashbio.ManhattanPlot(
        dataframe=df,
        genomewideline_value=threshold
    )


if __name__ == '__main__':
    app.run_server(debug=True)


Line Colors

Change the colors of the suggestive line and the genome-wide line.

import pandas as pd
import dash_core_components as dcc
import dash_bio as dashbio

df = pd.read_csv("https://raw.githubusercontent.com/plotly/dash-bio-docs-files/master/manhattan_data.csv")

n_chr = 23  # number of chromosome pairs in humans
assert 'CHR' in df.columns
assert df['CHR'].max() == n_chr

# Trim down the data
DATASET = df.groupby('CHR').apply(lambda u: u.head(50))
DATASET = DATASET.droplevel('CHR').reset_index(drop=True)

manhattanplot = dashbio.ManhattanPlot(
    dataframe=DATASET,
    suggestiveline_color='#AA00AA',
    genomewideline_color='#AA5500'
)

dcc.Graph(figure=manhattanplot)  

Highlighted Points Color

Change the color of the points that are considered significant.

import pandas as pd
import dash_core_components as dcc
import dash_bio as dashbio

df = pd.read_csv("https://raw.githubusercontent.com/plotly/dash-bio-docs-files/master/manhattan_data.csv")

n_chr = 23  # number of chromosome pairs in humans
assert 'CHR' in df.columns
assert df['CHR'].max() == n_chr

# Trim down the data
DATASET = df.groupby('CHR').apply(lambda u: u.head(50))
DATASET = DATASET.droplevel('CHR').reset_index(drop=True)

manhattanplot = dashbio.ManhattanPlot(
    dataframe=DATASET,
    highlight_color='#00FFAA'
)

dcc.Graph(figure=manhattanplot)  

Access this documentation in your Python terminal with:
```python
import dash_bio
help(dash_bio.ManhattanPlot)
```

dataframe (dataframe; required): A pandas dataframe which must contain at least the following three columns:
- the chromosome number
- the genomic base-pair position
- a numeric quantity to plot, such as a p-value or z-score
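A quick pre-flight check can catch a missing column before the figure is built. The sketch below is only an illustration: `missing_columns` and `REQUIRED_COLUMNS` are hypothetical names, not part of dash_bio, and the column names assume the defaults of the chrm, bp, and p arguments ('CHR', 'BP', 'P').

```python
# Hypothetical helper, not part of dash_bio.
# Column names assume the defaults of the chrm, bp, and p arguments.
REQUIRED_COLUMNS = ("CHR", "BP", "P")


def missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in required if name not in present]


# All three required columns are present, so nothing is reported.
print(missing_columns(["CHR", "BP", "P", "SNP", "GENE"]))

# The position column is named 'POS' here, so 'BP' is reported as missing.
print(missing_columns(["CHR", "POS", "P"]))
```

With a pandas dataframe, the column index can be passed directly, for example `missing_columns(df.columns)`. If the dataframe uses other column names, pass them through `required` and through the matching ManhattanPlot arguments.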

annotation (string; optional): A string denoting the column to use as annotations. This column could be a string or a float. It could be any annotation information that you want to include in the plot (e.g., zscore, effect size, minor allele frequency).

bp (string; default 'BP'): A string denoting the column name for the chromosomal position.

chrm (string; default 'CHR'): A string denoting the column name for the chromosome. This column must be float or integer. Minimum number of chromosomes required is 1. If you have X, Y, or MT chromosomes, be sure to renumber these 23, 24, 25, etc.
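The renumbering described above could look something like the following. This is a minimal sketch that assumes human data (22 autosomes followed by X, Y, and MT) and labels such as '7', 'chrX', or 'MT'; `renumber_chromosome` and `SPECIAL_CHROMOSOMES` are hypothetical names, not part of dash_bio.

```python
# Assumed mapping for human data: 22 autosomes, then X, Y, and MT.
SPECIAL_CHROMOSOMES = {"X": 23, "Y": 24, "MT": 25}


def renumber_chromosome(label):
    """Convert a chromosome label such as '7', 'chrX', or 'MT' to an integer."""
    name = str(label).strip().upper()
    if name.startswith("CHR"):
        name = name[3:]  # drop an optional 'chr' prefix
    if name in SPECIAL_CHROMOSOMES:
        return SPECIAL_CHROMOSOMES[name]
    return int(name)  # raises ValueError for labels this sketch does not know


print([renumber_chromosome(c) for c in ["1", "chr22", "X", "chrY", "MT"]])
```

With a pandas dataframe, the same function could be applied to the chromosome column, for example `df['CHR'] = df['CHR'].map(renumber_chromosome)`. Other organisms or label conventions would need a different mapping.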

col (string; optional): A string representing the color of the points of the scatter plot. Can be in any color format accepted by plotly.graph_objects.

gene (string; default 'GENE'): A string denoting the column name for the GENE names. This column could be a string or a float. More generally, it could be any annotation information that you want to include in the plot.

genomewideline_value (bool | float; default -log10(5e-8)): Either False to deactivate the option, or a numerical value corresponding to the p-value above which the data points are considered significant.
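The default, -log10(5e-8), suggests that the threshold is expressed on the -log10 scale used for the y-axis. Under that reading, the arithmetic can be sketched as follows; `is_genomewide_significant` is a hypothetical helper for illustration and is not part of dash_bio.

```python
import math

# Conventional genome-wide significance level, taken from the default above.
GENOMEWIDE_P = 5e-8

# The same level on the -log10 scale; this is roughly 7.30.
genomewide_threshold = -math.log10(GENOMEWIDE_P)


def is_genomewide_significant(p_value, threshold=genomewide_threshold):
    """Return True if -log10(p_value) lies above the threshold line."""
    return -math.log10(p_value) > threshold


print(round(genomewide_threshold, 2))
print(is_genomewide_significant(1e-9))  # smaller p-value, above the line
print(is_genomewide_significant(1e-5))  # larger p-value, below the line
```

A stricter or looser cut-off can be obtained the same way, for example `-math.log10(1e-6)` gives a line at 6.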

genomewideline_color (string; default 'red'): Color of the genome-wide line. Can be in any color format accepted by plotly.graph_objects.

genomewideline_width (number; default 1): Width of the genome-wide line.

highlight (bool; default True): Whether to highlight the data points considered significant.

highlight_color (string; default 'red'): Color of the data points highlighted because they are significant. Can be in any color format accepted by plotly.graph_objects.

# … Example 1: Random Manhattan Plot
dataframe = pd.DataFrame(
    np.random.randint(0, 100, size=(100, 3)),
    columns=['P', 'CHR', 'BP'])
fig = create_manhattan(dataframe, title='XYZ Manhattan plot')
plotly.offline.plot(fig, image='png')

logp (bool; optional): If True, the -log10 of the p-value is plotted. It isn’t very useful to plot raw p-values; however, plotting the raw value could be useful for other genome-wide plots (e.g., peak heights, Bayes factors, test statistics, other “scores”, etc.)

p (string; default 'P'): A string denoting the column name for the float quantity to be plotted on the y-axis. This column must be numeric. It does not have to be a p-value. It can be any numeric quantity such as peak heights, Bayes factors, test statistics. If it is not a p-value, make sure to set logp = False.

point_size (number; default 5): Size of the points of the Scatter plot.

snp (string; default 'SNP'): A string denoting the column name for the SNP names (e.g., rs number). More generally, this column could be anything that identifies each point being plotted. For example, in an epigenome-wide association study (EWAS), this could be the probe name or cg number. This column should contain strings. This argument is optional; however, it must be specified if you want to highlight points on the plot using the highlight argument in the figure method.

showgrid (bool; default True): Boolean indicating whether gridlines should be shown.

showlegend (bool; default True): Boolean indicating whether legends should be shown.

suggestiveline_value (bool | float; default 8): A value which must be either False to deactivate the option, or a numerical value corresponding to the p-value at which the line should be drawn. The line has no influence on the data points.

suggestiveline_color (string; default 'grey'): Color of the suggestive line.

suggestiveline_width (number; default 2): Width of the suggestive line.

title (string; default 'Manhattan Plot'): The title of the graph.

xlabel (string; optional): Label of the x axis.

ylabel (string; default '-log10(p)'): Label of the y axis.