Nebula

v4.51

Analysis Status

📊 Data matrix

Try example data

Copy & Paste

Clean Data Matrix File

DIANN Protein Group Matrix

Row ID column for plots:

Expected format: DIANN `results.pg*_matrix.tsv` (first six columns kept as row annotations for search; intensity columns from column 7 onward).

DIANN Gene Group Matrix

Expected format: DIANN `results.gg*_matrix.tsv` (first column `Genes` as row ID; intensity columns from column 4 onward, parsed with the same DIANN header rules).

Data Format: TSV or CSV

First row: Column headers

First column: Row IDs

DIANN mode: Use "DIANN Protein Group Matrix" uploader for `results.pg*_matrix.tsv` (intensity starts at column 7; row IDs from your selected DIANN ID column, default `Protein.Names`; first six columns are stored for Row Profile search). Use "DIANN Gene Group Matrix" uploader for `results.gg*_matrix.tsv` (intensity starts at column 4; row IDs always use the first column `Genes`).

📋 Meta table

Upload Meta Table

Copy & Paste

Browse & Load File

Note: Meta table is auto-generated from matrix column names when you load a data matrix. Upload here to override.

Editable: Use Data PreProcess → Meta Table sub-tab to edit cells (Sample_ID is read-only).

💾 Session snapshot (JSON / report JS)

Save or reload the matrix, meta table, heatmap clustering result, PCA/t-SNE (if computed), optionally the Clustergrammer network, plus Differential, Enrichr, and SAINT results when present (formatVersion 3+). The JSON includes a plotting section with heatmap / PCA / t-SNE group-annotation checkbox and color-scheme choices. Export as qc_session.js and place it at data/qc_session.js to auto-load as a bundled report (see README).

Include Clustergrammer network (larger file)

Tip: Large matrices and the Clustergrammer network increase file size; uncheck the include option if you only need matrix, meta, and plotting settings.

Row Filter

Filter Rows by Valid values

Scope

Minimum count of valid values / Minimum percentage of valid values

Threshold

Column Filter

Column Select

Check or uncheck sample columns here. Pattern select, Choose by groups, and Valid values can update the same checkboxes.

Sel	Column

Pattern select

Pattern (substring / regex)

Contains Regex

Choose by groups

Group by (meta column) Then by (optional)

Choose a grouping column to list groups.

Filter columns by Valid values

Scope

Minimum count of valid values / Minimum percentage of valid values

Threshold

DIANN filters

Filter by DIANN pg-matrix annotation columns, then Apply to matrix. Only rows that pass all enabled criteria are kept for downstream analysis.

Annotation table: row order, sort, and pagination follow the Data Filter sub-tab for Passed. Use Filtered out to review rows dropped by the last Apply to matrix from this sidebar.

Name mapping

Map matrix row IDs to gene symbols, protein names, and UniProt accessions via MyGene.info (free). Results are stored for feature labels and Row Profile search (like DIANN annotations).

Species NCBI taxid (if Other) Interpret row IDs as

Idle.

API: MyGene.info (no key). Large tables are queried in batches; use Stop to keep partial results.

Offline: Serve over http://localhost if the browser blocks requests.

Annotation table: row order, sort, and pagination follow the Data Filter sub-tab.

Log

Meta Table

Note: Meta table is auto-generated from matrix column names. Override by uploading in Data Preparation.

Editable: Click cells to edit (Sample_ID is read-only)

Batch edit: Ctrl/Cmd+click to multi-select cells in one column, Shift+click for range, or use "Select All Visible (Same Column)".

Overall

High-level view of the loaded intensity matrix (current Data Filter result when applicable). Quantified ID = finite value > 0 in a cell.

Per-column summary (Plotly)

Bar metric

Uses all columns. Transform / treat-zero follow the Column correlation sidebar (shared controls).

Load a data matrix in Data Preparation (or apply filters), then use Refresh dashboard.

Row Profile Bar Chart

Row label (list & plot legend)

Uses matrix row IDs if DIANN protein-group annotations are not loaded.

Load data to search and select IDs.

Selected: 0 rows

Plot Settings

Plot type; Color theme; Use log scale (Y axis); Z-score per row (across samples): normalize each selected row to mean 0, sd 1 so trends are comparable between rows

Tip: Search and select rows above, then click "Plot selected rows".

Row profile plot not generated yet

Use quick selection or the checklist in the sidebar, then click "Plot selected rows".

Column profile

Per sample column: ranked intensities, cumulative %, treemap, pie, and log10(1+I) histogram + density for all features (interactive Plotly). Pick a sample in the sub-tabs above the plots.

Display options

Row/feature label Max features (bar, curve, treemap, pie) Use log10(1 + intensity) for bar & line chart

Column Correlation

Sample–sample QC: use Overall correlations for subset heatmaps, Paired correlation for scatter / QQ / Bland–Altman. Transform and zero-handling also apply to Data QC → Overall per-column summary.

Shared scale

Transform Treat 0 as missing

Pair of columns

Column X Column Y Feature label (hover) Bland–Altman

Matrices (subset)

Correlation metric Max columns in heatmaps Distance from correlation

Heatmap

Clustering

Cluster features (rows) Cluster samples (columns) Distance metric Linkage

Colors

Color scale Reverse scale (high = cold, low = hot)

Row labels

Feature label

Uses DIANN columns and/or Name mapping (Data PreProcess) when loaded. Changing this redraws the plot without re-clustering.

Figure & margins

Title (optional)

Width (px)

Height (px)

Row dendro %

Column dendro %

Margin left (px)

Margin right (px)

Margin bottom (px)

Increase if angled column names are clipped; 0 is allowed. The framed area scrolls if the figure is larger than the panel.

Heatmap not generated yet

Click "Generate Heatmap" button to create the visualization

Clustergrammer

Feature label (rows)

Applies to Cluster and visualize (current matrix) and to the heatmap-side Clustergrammer when a heatmap exists. Changing it refreshes an open Clustergrammer tab or rebuilds from the cached heatmap.

Distance Linkage Pre-filter rows by variance (optional) Log10 transform Z-score per row

Use current data to start.

PCA Controls

PC to plot on X-axis; PC to plot on Y-axis; PC to plot on Z-axis (3D); Show Column Labels; Auto label font/color; Label font size (manual); Label color (manual); Auto show arrows; Show arrows (manual); Show group confidence ellipses (95%); Plot Width (px); Plot Height (px); Main title (optional); Max Samples for PCA; Max Features for PCA

Reducing features significantly speeds up PCA for large datasets

Number of Components; Fast Mode (fewer iterations, faster but less precise)

Performance: Using optimized typed arrays for faster computation. For best speed, reduce features to 200-500.

Data Preprocessing: Log10 Transform; Column-wise Z-score Normalization; Row-wise Z-score Normalization

PCA plot not generated yet

Click "Generate PCA Plot" button to create the visualization

PCA 3D plot not generated yet

Generate PCA to explore interactive 3D rotation, zoom, and metadata tooltips.

t-SNE Controls

Perplexity (5–50); Max iterations; Show Column Labels; Auto label font/color; Label font size (manual); Label color (manual); Auto show arrows; Show arrows (manual); Plot Width (px); Plot Height (px); Main title (optional); Max Samples; Max Features

t-SNE works well for many samples; limit features for speed.

Data Preprocessing: Log10 Transform; Column-wise Z-score Normalization; Row-wise Z-score Normalization

t-SNE plot not generated yet

Click "Generate t-SNE Plot" to create the embedding and scatter plot

UpSet / Venn / Karnaugh

Each set is one sample column. A row (feature) belongs to a set if that cell passes the presence rule below. The three linked views use UpSet.js (AGPL-3.0). Each chart’s Pop out button (next to its title in the main panel) opens popout/upset_popout.html and passes data via postMessage (needed for file://); allow pop-ups. The Heatmap tab uses the same pattern with popout/heatmap_popout.html. The plot toolbar buttons (PNG/SVG/dump/VEGA) remain the default UpSet.js controls.

Presence rule

Present if value is a finite non-zero number (0 counts as missing)

Present if intensity is strictly greater than a threshold (low values = low confidence; 0 still counts as missing):

Threshold Default 0 ⇒ same as >0

Set combinations (UpSet plot)

Same idea as the UpSet.js App; applies only to the UpSet bar view (see UpSet.js generateCombinations).

Ordering

Mode

Minimum set members (degree)

Maximum set members (degree)

Max # combinations (after sort)

Include empty combinations

Hover a region in any view to highlight the same intersection in the others (linked selection).

UpSet

Venn diagram

Karnaugh map

Differential analysis

Compare groups using a meta table column (join matrix columns to Sample_ID). Two groups: Welch / Student t-test; optional fudge factor volcano (SAM-style s₀, curved cutoff). Multiple groups: one-way ANOVA or Kruskal–Wallis. Requires matrix + meta (edit under Data PreProcess → Meta Table).

Comparison mode Group by (meta column)

Group A Group B

Volcano Y-axis

Top N features by raw p-value; matrix shows group means on the tested scale (optional row z-score). Run multi-group analysis first.

Showing only Significant Changed

Feature	mean A	mean B	log2FC	t	df	p	FDR	sig

SAINT

Uses the Data Preparation matrix and meta table. Map a meta column to control vs treatment, and choose a Bait column.

Sample grouping

Use Status column as-is (values T / C); Group column (when not using Status as-is); Control value → C; Treatment value → T; Bait column (required)

Input settings

Input level Protein column name Peptide column name Fragment column name

No results yet. Run SAINT after loading matrix + meta.

Enrichr

Open this tab to load libraries.

Plot options

Uses the full Enrichr term list (not only the current results page). Pick a view below; use Redraw plots after changing options.

Results heatmap

Same clustering pipeline and layout as the top-level Heatmap tab, but the plotted matrix comes from the Enrichr Results table: rows = ontology / library terms, columns = per-sample intensities attached to those terms (not the original protein × sample matrix). Statistics below only filter which terms are included.

Row filters

Leave a field empty to skip that filter.

Max adj. P-value; Max P-value; Min combined score; Min odds ratio; Min overlap (genes)

Clustering

Cluster ontology terms (rows) Cluster samples (columns) Distance metric Linkage

Colors

Color scale Reverse scale

Figure & dendrogram %

Title (optional)

Width px

Height px

Row dendro % Column dendro % Margin L Margin R Margin B

From the Results sub-tab, run enrichment with a loaded matrix so each term has per-sample intensities. Adjust filters, then Generate clustered heatmap (ontology terms × samples). Group bars use the same meta checkboxes as the Heatmap tab when columns match Sample_ID.

Document

Outline

Overview

Nebula is the application name for this single-page intensity-matrix QC and visualization tool (browser-based). This tab documents the key hard-coded behaviors in the app for data import, replicate parsing, and group completeness filtering (plus the technical-replicate algorithm where it still applies in code).

SAINT and Enrichr (v2.29): load matrix and meta in Data Preparation, then use SAINT to map control/treatment and bait columns, run analysis, and open Network / Scatter Plot sub-tabs. Enrichr accepts pasted genes, current matrix row labels, or SAINT prey names after a run. D3 v7 for SAINT is loaded on demand and does not replace Clustergrammer’s D3 v3.

Enrichr enrichment plots (v2.69+): After enrichment returns the term table, open the Enrichment plots sub-tab (next to Results table) for Plotly charts built from the full API result (not only the paginated table). The main panel uses tabs for the horizontal bar chart (rank by adjusted P, P-value, or combined score), the bubble plot (overlap count or odds ratio vs term; color −log10(adjusted P) or adjusted P), and the Jaccard heatmap between top terms using overlapping genes from each row. Plot options sit in the left sidebar; use Export results TSV for a tab-separated download of parsed columns (includes one TSV column per data matrix sample when row count matches and gene mapping is available).
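A minimal sketch of the Jaccard similarity behind the term-by-term heatmap, assuming each result row carries its overlapping genes as an array. The function names and the `{ term, genes }` row shape are illustrative and are not taken from the app's code.

```javascript
// Jaccard similarity between the overlapping-gene lists of two terms:
// |A ∩ B| / |A ∪ B|, with duplicates ignored.
function jaccard(genesA, genesB) {
  const a = new Set(genesA);
  const b = new Set(genesB);
  let shared = 0;
  for (const g of a) {
    if (b.has(g)) shared += 1;
  }
  const union = a.size + b.size - shared;
  // Two empty gene lists are reported as 0 here (an arbitrary convention).
  return union === 0 ? 0 : shared / union;
}

// Symmetric term x term matrix for rows shaped like { term, genes }.
function jaccardMatrix(rows) {
  return rows.map((r1) => rows.map((r2) => jaccard(r1.genes, r2.genes)));
}
```

The diagonal of the resulting matrix is 1 for any term with at least one gene, which gives a quick sanity check on the plot.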

Enrichr → Heatmap (v3.01+): Third sub-tab draws a Plotly heatmap of per-sample matrix intensities only (columns = samples). Adjusted P, P-value, combined score, odds ratio, and overlap are row filters in the sidebar; max adj. P defaults to 0.05 (clear to disable). Terms are ranked and capped by max rows. For hierarchical clustering of the full protein × sample matrix, use the top-level Heatmap tab.

Heatmap pop-out (v2.82+): After generating a heatmap, use Pop out in the left sidebar (to the right of Generate heatmap) to open popout/heatmap_popout.html in a new window with the same Plotly figure (full-window autosize), using postMessage and sessionStorage like the UpSet/Venn pop-out. The button appears only after a heatmap exists.

Data QC → Overall (v4.18+): Matrix-wide QC on the current matrix (v4.19+: Plotly per-column summary bars for all samples — mean raw/transformed or detection rate — plus D3 box plots and total Σ log10(1+I); transform / treat-zero follow Data QC → Column correlation shared controls). Use Refresh dashboard after loading or filtering; auto-refresh when Data Filter updates while Overall is active.

Data QC → Column correlation (v4.19+): Inner tabs Overall correlations (subset heatmaps) and Paired correlation (scatter, QQ, Bland–Altman, r vs X) use the same data-filter-inner-tabs-row / data-filter-inner-tab markup and styling as Data PreProcess → Data Filter nested tabs (e.g. Row Filter → Passed). Shared Transform / Treat 0 as missing at the top of the sidebar.

Use the outline on the left to jump to specific sections (including Column profile and Column correlation under Data PreProcess, sections 8–9).

Input Data

- Accepts clean matrix files (TSV/CSV/XLSX) and DIANN protein-group matrices.
- First row is treated as sample columns; first column is row ID.
- After loading a DIANN protein group matrix, use Data PreProcess → DIANN annotations for a read-only table of the first file columns; sort order and pagination match the Data Filter sub-tab.

DIANN Upload Rules

- For DIANN `results.pg*_matrix.tsv`, intensity values are read from column 7 onward; the first six columns are stored as per-row annotation (same file).
- **Data PreProcess → DIANN annotations** lists those columns in a read-only table (same row order, sort, and pagination as Data Filter).
- Default row-ID column is `Protein.Names` (plots and tables use this ID); you can choose `Genes`, `First.Protein.Description`, or `Protein.Group` instead.
- **Data PreProcess → Row Profile** search matches the primary ID plus all stored annotation fields (e.g. gene, protein names, descriptions). **Row label** dropdown chooses which DIANN annotation column labels the checklist and plot legend (`Protein.Group`, `Protein.Names`, `Genes`, `First.Protein.Description`); without DIANN pg annotations, matrix row IDs are used. Sidebar width is enlarged for the checklist. Quick selection: **Select all rows**, **Select top N** by row intensity sum (numeric input, default 5), **Clear selection**.
- If the selected ID column is missing from the header, fallback order is `Protein.Names` → `Genes` → `First.Protein.Description` → `Protein.Group` → first column.
- If selected ID is not `Protein.Group`, rows with blank selected IDs are removed.
- For DIANN `results.gg*_matrix.tsv`, intensity values are read from column 4 onward and row IDs always come from the first column (`Genes`).
- Sample names are simplified from raw file paths to concise informative names.
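The protein-group rules above can be sketched as a small parser. This is an illustration under stated assumptions, not the app's implementation: `parseDiannPgMatrix` and its return shape are hypothetical names, empty cells are read as NaN, and sample-name simplification is left out because its rules are not described here.

```javascript
// Fallback order for the row-ID column, as listed in the rules above.
const ID_FALLBACK = ["Protein.Names", "Genes", "First.Protein.Description", "Protein.Group"];

function parseDiannPgMatrix(tsvText, selectedId = "Protein.Names") {
  const lines = tsvText.split(/\r?\n/).filter((line) => line.length > 0);
  const header = lines[0].split("\t");
  const annotationHeaders = header.slice(0, 6); // first six columns: annotations
  const sampleNames = header.slice(6);          // column 7 onward: intensities

  // Selected column first, then the documented fallback order, then column 1.
  const candidates = [selectedId, ...ID_FALLBACK];
  const found = candidates.find((name) => annotationHeaders.includes(name));
  const idIndex = found === undefined ? 0 : annotationHeaders.indexOf(found);
  const idName = annotationHeaders[idIndex];

  const rows = [];
  for (const line of lines.slice(1)) {
    const cells = line.split("\t");
    const id = (cells[idIndex] || "").trim();
    // Rows with a blank ID are dropped unless the ID column is Protein.Group.
    if (idName !== "Protein.Group" && id === "") continue;
    const values = sampleNames.map((_, j) => {
      const raw = cells[6 + j];
      // Number("") is 0, so empty cells are mapped to NaN explicitly.
      if (raw === undefined || raw.trim() === "") return NaN;
      const v = Number(raw);
      return Number.isFinite(v) ? v : NaN;
    });
    rows.push({ id, annotations: cells.slice(0, 6), values });
  }
  return { idName, annotationHeaders, sampleNames, rows };
}
```

Keeping the six annotation cells on each row is what lets a Row Profile style search match on gene names or descriptions as well as the primary ID.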

Name mapping (Data PreProcess)

Data PreProcess → Name mapping calls the free MyGene.info API (no key) to map matrix row IDs to gene_symbol, protein_name, uniprotkb_ac, and related fields. Choose species (human / mouse / rat / custom NCBI taxid), whether IDs are gene symbols, UniProt accessions, or auto (accession regex vs symbol), then Run mapping. Large tables are queried in batches; Stop keeps partial results.

Results are stored in the session as featureNameMapHeaders / featureNameMapRows and appear in feature label dropdowns (heatmap, Clustergrammer, Differential, column profile, column correlation, Row Profile) and in Row Profile search — similar to DIANN row annotations. Mapping is best-effort; ambiguous or missing hits are noted in mapping_note.

If the browser blocks cross-origin requests (e.g. opening the app as file://), serve the folder over http://localhost or another HTTP server so fetch to MyGene can succeed.
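As a rough sketch of the auto-detect and batching steps, the code below splits row IDs into UniProt accessions and gene symbols, then builds one request body per batch for the MyGene.info batch query endpoint (`POST https://mygene.info/v3/query`). The batch size, the chosen `fields`, and the helper names are assumptions for illustration; the app's actual parameters may differ.

```javascript
// UniProt accession pattern as published in the UniProt documentation.
const UNIPROT_AC =
  /^([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$/;

// "auto" mode: accession regex first, otherwise treat the ID as a gene symbol.
function detectScope(id) {
  return UNIPROT_AC.test(id) ? "uniprot" : "symbol";
}

// Group IDs by scope and cut each group into batches of at most batchSize.
function buildBatches(rowIds, { species = "human", batchSize = 500 } = {}) {
  const byScope = { uniprot: [], symbol: [] };
  for (const rawId of rowIds) {
    const id = rawId.trim();
    if (id !== "") byScope[detectScope(id)].push(id);
  }
  const batches = [];
  for (const [scope, ids] of Object.entries(byScope)) {
    for (let i = 0; i < ids.length; i += batchSize) {
      batches.push({
        q: ids.slice(i, i + batchSize).join(","),
        scopes: scope,
        fields: "symbol,name,uniprot", // assumed field selection
        species,
      });
    }
  }
  return batches;
}

// Send one batch; callers can stop between batches and keep partial results.
async function runBatch(batch) {
  const res = await fetch("https://mygene.info/v3/query", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams(batch).toString(),
  });
  if (!res.ok) throw new Error(`MyGene.info request failed: ${res.status}`);
  return res.json();
}
```

Running the batches one at a time in a loop is what makes a Stop button straightforward: the loop checks a flag between requests and returns whatever has been collected so far.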

Replicate Parsing

- Parser detects replicate tags from sample names.
- Supports the full `R#_T#` form, or a single axis: only a complete `R1..RN` series or only a complete `T1..TN` series.
- A replicate axis is accepted only when all samples contain that axis and indices are contiguous from `1` to `N`.
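The acceptance rule can be sketched as follows. The token pattern (an `R` or `T` followed by digits, delimited by non-alphanumeric characters) and the function name are assumptions; the app's parser may recognize tags differently.

```javascript
// Extract R# / T# indices from each sample name and decide which axes to use.
function parseReplicateAxes(sampleNames) {
  const tags = sampleNames.map((name) => {
    const r = name.match(/(?:^|[^A-Za-z0-9])R(\d+)(?![A-Za-z0-9])/);
    const t = name.match(/(?:^|[^A-Za-z0-9])T(\d+)(?![A-Za-z0-9])/);
    return { name, R: r ? Number(r[1]) : null, T: t ? Number(t[1]) : null };
  });

  // An axis is accepted only if every sample carries it and the distinct
  // indices are exactly 1, 2, ..., N with no gaps.
  const axisOk = (axis) => {
    if (tags.length === 0) return false;
    const indices = tags.map((tag) => tag[axis]);
    if (indices.some((v) => v === null)) return false;
    const distinct = [...new Set(indices)].sort((a, b) => a - b);
    return distinct.every((v, i) => v === i + 1);
  };

  return { tags, useR: axisOk("R"), useT: axisOk("T") };
}
```

For example, `["A_R1", "A_R3"]` is rejected on the R axis because index 2 is missing, while `["A_R1_T1", "A_R1_T2", "A_R2_T1", "A_R2_T2"]` is accepted on both axes.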

Group completeness filter (Data Filter → Row Filter)

- Apply group completeness filter (Data PreProcess → Data Filter → Row Filter): choose one or two metadata columns to define groups (same combination of meta values ⇒ one group).
- A matrix cell counts as valid if it is finite and > 0.
- Threshold: minimum count of valid columns in the group, or minimum percentage — required valid count = ⌈groupSize × P / 100⌉.
- For each row, if a group fails its threshold, all columns in that group for that row are set to `0`. Rows that are then all non-positive across every sample are removed; heatmap/PCA/t-SNE/Clustergrammer caches are reset.
- Filtered out (Data PreProcess, sub-tab next to Data Filter; v3.12: the tab appears only after at least one row has been dropped): lists row IDs removed for that reason, with intensities at the moment of removal and a short note of the last filter run. The list is cleared when you load new data or a session snapshot. Technical replicate filter (console) updates the same list when it drops all-zero rows.
- If Biological_Replicate / Technical_Replicate cells are empty in the meta table but the matrix column id still contains R# / T# tokens (e.g. DIANN paths), those tags are read from the column id for grouping so replicate triplets are not merged into one huge group.
- The legacy T1..TN technical replicate rule is still implemented in code as applyTechnicalReplicateFiltering() (strict complete technical sets on Group + Biological_Replicate) but is not exposed as a sidebar button.
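A minimal sketch of the completeness rule described above, assuming groups are already given as arrays of column indices. The function names and the `{ id, values }` row shape are hypothetical; cache resets and UI refreshes are omitted.

```javascript
// A cell is valid when it is finite and strictly positive.
const isValid = (v) => Number.isFinite(v) && v > 0;

// Count mode uses the threshold directly; percent mode uses ceil(size * P / 100).
function requiredCount(groupSize, mode, threshold) {
  return mode === "percent" ? Math.ceil((groupSize * threshold) / 100) : threshold;
}

// rows: [{ id, values }]; groups: arrays of column indices, one array per group.
function applyGroupCompleteness(rows, groups, mode, threshold) {
  const kept = [];
  const filteredOut = [];
  for (const row of rows) {
    const values = row.values.slice();
    for (const cols of groups) {
      const valid = cols.filter((j) => isValid(values[j])).length;
      if (valid < requiredCount(cols.length, mode, threshold)) {
        // The group failed: zero every column of that group for this row.
        for (const j of cols) values[j] = 0;
      }
    }
    if (values.some((v) => v > 0)) {
      kept.push({ id: row.id, values });
    } else {
      // No positive value remains: drop the row, keeping its original intensities.
      filteredOut.push({ id: row.id, values: row.values.slice() });
    }
  }
  return { kept, filteredOut };
}
```

With two groups of three columns and a minimum count of 3, a row `[5, 0, 7, 1, 2, 3]` becomes `[0, 0, 0, 1, 2, 3]`: the first group has only two valid values and is zeroed, while the second group passes.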

“Ignore groups with fewer than 2 samples”

Each unique key from your one (or two) grouping column(s) defines a group: the matrix columns whose metadata match that key. The checkbox controls whether single-sample groups — only one sample column shares that key — participate in the filter.

When the option is on (default): only groups with at least two columns are kept. Single-sample groups are excluded from validGroups in the implementation: the completeness rule (min count or min percentage of valid >0 values) is not evaluated for those columns. Their intensities are left as they were before you click Apply (they are not zeroed by this rule, and they are not part of any other group’s column set). Use this when you only want to enforce completeness on replicate groups (e.g. R1–R3) and to avoid orphan conditions or odd one-off samples being forced through a rule that expects several values per group. It also avoids pathological cases with mixed designs: if you required e.g. “min count 3” and the smallest group had only one column, the tool would report an error; dropping singletons from the group list can make the remaining “valid” groups all large enough for your threshold.

When the option is off: every meta key counts as a group, including singletons (group size = 1). Then a count threshold greater than 1 cannot be met for that “group”, and the filter will reject the run if the smallest group size is below your threshold. Percent mode still runs, but for a single column the required count of valid values is ⌈1 × P/100⌉, which is 1 for any P from 1–100, so a singleton always “passes” unless the value is invalid (not >0).

Edge case: if all groups are singletons (e.g. every sample has a unique combination in the chosen meta column(s)), with the option on there are no groups of size ≥ 2 and the filter shows an error like No groups match the chosen columns and minimum group size. Uncheck the option or change the grouping so at least one key has two or more samples.
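The effect of the checkbox on the group list can be sketched like this. It is an illustration only: `buildGroups` and `checkCountThreshold` are hypothetical helpers, `metaRows` is assumed to be in the same order as the matrix columns, and the second error message is invented for the example.

```javascript
// Build groups of column indices from one or two meta columns.
function buildGroups(metaRows, groupColumns, ignoreSingletons = true) {
  const byKey = new Map();
  metaRows.forEach((row, columnIndex) => {
    // Same combination of meta values => same key => same group.
    const key = groupColumns.map((c) => String(row[c] ?? "")).join("\u0001");
    if (!byKey.has(key)) byKey.set(key, []);
    byKey.get(key).push(columnIndex);
  });
  const minSize = ignoreSingletons ? 2 : 1;
  const validGroups = [...byKey.values()].filter((cols) => cols.length >= minSize);
  if (validGroups.length === 0) {
    throw new Error("No groups match the chosen columns and minimum group size.");
  }
  return validGroups;
}

// In count mode, reject the run if the smallest group cannot reach the threshold.
function checkCountThreshold(validGroups, minCount) {
  const smallest = Math.min(...validGroups.map((g) => g.length));
  if (minCount > smallest) {
    throw new Error(`Minimum count ${minCount} exceeds smallest group size ${smallest}.`);
  }
}
```

Columns that fall into an excluded singleton group simply never appear in `validGroups`, so a filter that iterates over that list leaves their intensities untouched.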

Post-filter Behavior

- Table and row profile selectors are refreshed.
- Heatmap/PCA/t-SNE/Clustergrammer caches are reset to avoid stale results.

- Data PreProcess → Data Filter: shared matrix preview (Row Filter and Column Filter) uses a scroll area with sticky column headers. ID cells show a native tooltip with DIANN annotation fields (when loaded from a protein group matrix). Column Filter keeps or removes sample columns (per-column checkboxes, optional Choose by group using one or two meta columns—same rules as group completeness—with Check / Uncheck per group); Row Filter includes group completeness, global min-valid, and row ID include actions.

UpSet / Venn / Karnaugh

- Top-level tab UpSet / Venn / K-map uses UpSet.js (lazy-loaded from jsDelivr), following the linked components pattern. The library is licensed under GNU AGPL-3.0; consider license terms if you redistribute or use the app commercially.
- Each set is one matrix sample column. Each row (feature ID) is an element; membership follows the sidebar Presence rule. 0 is always treated as missing/NA. A cell counts as present either when it holds any finite non-zero value, or when its intensity is strictly greater than a user-set threshold (default threshold 0 ⇒ >0; values ≤ threshold are treated as absent).
- Set chooser: checkboxes for columns (default: first five selected), filter box, Select all / Clear, and Refresh plots.
- Set combinations (UpSet plot) (collapsible sidebar block, aligned with the UpSet.js App): ordering, mode (intersections / unions / distinct intersections), min and max set members (degree), max number of combinations after sort, and whether to include empty combinations. These options are implemented with UpSetJS.generateCombinations. Venn and Karnaugh still use the full generated combination lattice for layout.
- Main area (scrollable): UpSet full width on top; Venn and Karnaugh map side by side below (stacks vertically on narrow viewports). Each section title row includes a Pop out button that opens popout/upset_popout.html for that view; membership and UpSet combination options are delivered by postMessage (and sessionStorage when shared, e.g. HTTP). Hover in any view updates linked highlighting in the others (by intersection/set name). The pop-out plot uses the same onHover / selection behavior for that single view and reflows on window resize. The built-in plot share control is hidden so PNG/SVG/dump/VEGA toolbar buttons work reliably. With many sets selected, Venn/Karnaugh can be crowded; a note appears when more than six sets are selected.
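The presence rule and the column-to-set mapping can be sketched in plain JavaScript. The `rule` object, function names, and the `{ name, elems }` shape are assumptions for illustration; how the app hands sets to UpSet.js is not shown here.

```javascript
// rule: { mode: "nonzero" } or { mode: "threshold", threshold: number }
function isPresent(value, rule = { mode: "nonzero" }) {
  // 0 and non-finite values always count as missing.
  if (!Number.isFinite(value) || value === 0) return false;
  return rule.mode === "threshold" ? value > rule.threshold : true;
}

// One set per sample column; elements are the row IDs present in that column.
function buildSets(rowIds, sampleNames, matrix, rule) {
  return sampleNames.map((name, j) => ({
    name,
    elems: rowIds.filter((_, i) => isPresent(matrix[i][j], rule)),
  }));
}

// Row IDs shared by every listed set (a plain intersection).
function intersection(sets) {
  if (sets.length === 0) return [];
  const rest = sets.slice(1).map((s) => new Set(s.elems));
  return sets[0].elems.filter((id) => rest.every((s) => s.has(id)));
}
```

With threshold mode and a threshold of 0, the rule reduces to "finite and greater than 0", which matches the default described above.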

Column profile

- Data PreProcess sub-tab Column profile (after Row Profile): one button per sample column above the plots.
- For the selected column, finite numeric values are sorted high → low. Interactive Plotly views: ranked bars + line (linear intensity y-axis: scientific notation, e.g. 1.2e+6), cumulative % of total, treemap, and pie (top N features + Other; sidebar Max features). Optional log10(1 + intensity) for the bar/line panel uses fixed-decimal y ticks on the transformed scale.
- A full-width panel plots a histogram (probability density) and Gaussian KDE curve of log10(1 + intensity) for every feature in that column (not limited by Max features); KDE uses a Silverman bandwidth and subsamples very large matrices for speed.
- Row Profile bar/line chart: linear Value y-axis uses the same scientific tick style; log Y and Z-score modes keep their own tick formats. The plot uses the full width and height of the Row Profile plot column inside a framed container; sample names get extra bottom space and smaller tick fonts when there are many columns.
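For the density panel, a Gaussian KDE with a Silverman-style bandwidth can be sketched as below. The exact bandwidth variant used by the app is not stated; this sketch assumes the common rule of thumb h = 0.9 · min(sd, IQR/1.34) · n^(−1/5), needs at least two distinct values, and leaves out the subsampling step.

```javascript
// Keep finite non-negative intensities and move to the log10(1 + I) scale.
function toLog10p1(values) {
  return values.filter((v) => Number.isFinite(v) && v >= 0).map((v) => Math.log10(1 + v));
}

// Linear-interpolation quantile on an ascending array.
function quantile(sorted, p) {
  const pos = p * (sorted.length - 1);
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Silverman rule of thumb (assumed variant, see lead-in).
function silvermanBandwidth(x) {
  const n = x.length;
  const sorted = [...x].sort((a, b) => a - b);
  const mean = x.reduce((s, v) => s + v, 0) / n;
  const sd = Math.sqrt(x.reduce((s, v) => s + (v - mean) ** 2, 0) / (n - 1));
  const iqr = quantile(sorted, 0.75) - quantile(sorted, 0.25);
  const spread = iqr > 0 ? Math.min(sd, iqr / 1.34) : sd;
  return 0.9 * spread * Math.pow(n, -0.2);
}

// Density estimate at each grid point: average of Gaussian kernels of width h.
function gaussianKde(x, gridPoints) {
  const h = silvermanBandwidth(x);
  const norm = 1 / (x.length * h * Math.sqrt(2 * Math.PI));
  return gridPoints.map((g) => {
    let sum = 0;
    for (const xi of x) sum += Math.exp(-0.5 * ((g - xi) / h) ** 2);
    return norm * sum;
  });
}
```

Because the histogram is drawn as a probability density, the KDE curve and the bars share the same y-axis scale and can be overlaid directly.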

Column correlation

Data PreProcess sub-tab Column Correlation (after Column profile) explores sample–sample relationships: matrix columns are treated as vectors over rows (features). This is QC / replicate agreement, not differential expression.

Transforms and missing values: Choose none, log10(1 + x), or log2(1 + x). Optional Treat 0 as missing (default on) matches DIANN-style intensity QC. Pair plots and correlations use pairwise-complete rows only (finite values after transform and missing rules).

Nested tab strip: Uses the same data-filter-inner-tabs-row / data-filter-inner-tab styling as Data PreProcess → Data Filter (e.g. Row Filter → Passed). Overall correlations shows subset heatmaps plus a mixed matrix plot: lower triangle = pairwise scatter plots; upper triangle = color-coded correlation cells. Paired correlation shows scatter, QQ, Bland–Altman, and the bar chart of r(column X, j) for each column j in the heatmap subset.

Pair plots: Pick Column X and Column Y (sidebar on the Paired tab); scatter includes an identity line when scales match. QQ compares sorted quantiles. Bland–Altman: difference vs mean, or log2(Y/X) vs geometric mean when both raw values are strictly positive.
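The two Bland–Altman variants can be sketched as follows. The sign convention (Y minus X) and the function name are assumptions; the app may orient the difference the other way.

```javascript
// Returns one point per usable row: x-coordinate "center", y-coordinate "diff".
function blandAltman(x, y, useLogRatio = false) {
  const points = [];
  for (let i = 0; i < x.length; i += 1) {
    const a = x[i];
    const b = y[i];
    // Pairwise-complete: both values must be finite.
    if (!Number.isFinite(a) || !Number.isFinite(b)) continue;
    if (useLogRatio) {
      // log2(Y/X) vs geometric mean, only for strictly positive raw values.
      if (a > 0 && b > 0) {
        points.push({ center: Math.sqrt(a * b), diff: Math.log2(b / a) });
      }
    } else {
      // Difference vs arithmetic mean.
      points.push({ center: (a + b) / 2, diff: b - a });
    }
  }
  return points;
}
```

The log-ratio form is usually the more readable one for intensity data, since proportional disagreement shows up as a constant offset rather than a fan shape.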

Correlation and distance matrices: Pearson or Spearman (ranks with average ties, then Pearson on ranks). Matrices use the first N columns in file/matrix order (Max columns in heatmaps, default 40, cap 80). Distance: 1 − r, √(2(1 − r)), or Euclidean on z-scored column vectors (pairwise-complete rows).
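A compact sketch of the correlation and distance definitions above, using pairwise-complete rows. Function names are illustrative, and the transform and treat-zero-as-missing steps are assumed to have been applied to the inputs already.

```javascript
// Keep only positions where both vectors are finite.
function pairwiseComplete(x, y) {
  const a = [];
  const b = [];
  for (let i = 0; i < x.length; i += 1) {
    if (Number.isFinite(x[i]) && Number.isFinite(y[i])) {
      a.push(x[i]);
      b.push(y[i]);
    }
  }
  return [a, b];
}

function pearson(x, y) {
  const [a, b] = pairwiseComplete(x, y);
  const n = a.length;
  if (n < 2) return NaN;
  const ma = a.reduce((s, v) => s + v, 0) / n;
  const mb = b.reduce((s, v) => s + v, 0) / n;
  let sab = 0;
  let saa = 0;
  let sbb = 0;
  for (let i = 0; i < n; i += 1) {
    sab += (a[i] - ma) * (b[i] - mb);
    saa += (a[i] - ma) ** 2;
    sbb += (b[i] - mb) ** 2;
  }
  return sab / Math.sqrt(saa * sbb);
}

// Ranks starting at 1; tied values share the average of their positions.
function averageRanks(v) {
  const order = v.map((value, index) => ({ value, index })).sort((p, q) => p.value - q.value);
  const ranks = new Array(v.length);
  let i = 0;
  while (i < order.length) {
    let j = i;
    while (j + 1 < order.length && order[j + 1].value === order[i].value) j += 1;
    const avg = (i + j) / 2 + 1;
    for (let k = i; k <= j; k += 1) ranks[order[k].index] = avg;
    i = j + 1;
  }
  return ranks;
}

// Spearman: rank within the pairwise-complete rows, then Pearson on the ranks.
function spearman(x, y) {
  const [a, b] = pairwiseComplete(x, y);
  return pearson(averageRanks(a), averageRanks(b));
}

// The two correlation-based distances named above.
const distanceFromR = {
  oneMinusR: (r) => 1 - r,
  sqrtTwoOneMinusR: (r) => Math.sqrt(2 * (1 - r)),
};
```

Note that √(2(1 − r)) equals the Euclidean distance between two z-scored vectors scaled to unit length, which is why it behaves like a true metric while 1 − r does not.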

Per-column summary (Plotly): Moved to Data QC → Overall — bars for every sample column (mean raw, mean transformed, or detection rate). Uses the same transform and zero-as-missing rules as this tab’s Shared scale block.

Session JSON saves the sidebar control values (colCorr* ids) so imports restore your settings; reopen the sub-tab or use Refresh plots to redraw.

Differential analysis

The top-level Differential tab compares groups on the loaded intensity matrix using a metadata column (all columns except Sample_ID). Matrix column headers are matched to meta rows by Sample_ID (same join as technical-replicate filtering).

Two groups (default): Choose Group A and Group B. Per-feature Welch (default) or Student t-tests. Optional Log2(x + 1) before testing; with it on, log2FC is the difference of group means on that scale. Without log transform, log2FC is log2((meanB + pseudocount)/(meanA + pseudocount)) on linear means. Treat 0 / invalid as missing and Min valid values per group apply per group. Two-sided p-values use the Student t CDF (incomplete beta). Each group needs at least two samples.

Multiple groups: Mode Multiple groups (ANOVA / Kruskal–Wallis) shows a checklist of meta levels (all checked by default). Each included group must have at least two samples after the meta join. Tests run on the same transformed scale as two-group mode. One-way ANOVA uses the classical F-test (equal variances between groups). Kruskal–Wallis is rank-based; raw-p uses a χ² approximation (Wilson–Hilferty) on H with k−1 df. Effect summaries include partial η² (ANOVA: SSB/SST; KW: (H−df)/(N−1) as a simple effect-size style quantity).
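The two multi-group statistics can be sketched as below, leaving out the p-value step. This is a simplified illustration: the Kruskal–Wallis H here has no tie correction (whether the app applies one is not stated), and the function names are hypothetical.

```javascript
const mean = (v) => v.reduce((s, x) => s + x, 0) / v.length;

// Classical one-way ANOVA: F = (SSB / (k-1)) / (SSW / (N-k)), eta^2 = SSB / SST.
function oneWayAnova(groups) {
  const all = groups.flat();
  const N = all.length;
  const k = groups.length;
  const grand = mean(all);
  let ssb = 0;
  let ssw = 0;
  for (const g of groups) {
    const m = mean(g);
    ssb += g.length * (m - grand) ** 2;
    for (const v of g) ssw += (v - m) ** 2;
  }
  const df1 = k - 1;
  const df2 = N - k;
  return { F: ssb / df1 / (ssw / df2), df1, df2, etaSquared: ssb / (ssb + ssw) };
}

// Ranks starting at 1; tied values share the average of their positions.
function averageRanks(v) {
  const order = v.map((value, index) => ({ value, index })).sort((p, q) => p.value - q.value);
  const ranks = new Array(v.length);
  let i = 0;
  while (i < order.length) {
    let j = i;
    while (j + 1 < order.length && order[j + 1].value === order[i].value) j += 1;
    for (let k = i; k <= j; k += 1) ranks[order[k].index] = (i + j) / 2 + 1;
    i = j + 1;
  }
  return ranks;
}

// Kruskal-Wallis H on pooled ranks (no tie correction in this sketch).
function kruskalWallis(groups) {
  const ranks = averageRanks(groups.flat());
  const N = ranks.length;
  let offset = 0;
  let sum = 0;
  for (const g of groups) {
    let rankSum = 0;
    for (let i = 0; i < g.length; i += 1) rankSum += ranks[offset + i];
    sum += rankSum ** 2 / g.length;
    offset += g.length;
  }
  const H = (12 / (N * (N + 1))) * sum - 3 * (N + 1);
  const df = groups.length - 1;
  // Effect-size style quantity as described above: (H - df) / (N - 1).
  return { H, df, effect: (H - df) / (N - 1) };
}
```

In a one-way design SSB/SST and partial η² coincide, because the only other variance component is the within-group error.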

Multi-group plots: Volcano uses η² or the test statistic (F or H) on the x-axis vs −log10(p) or −log10(FDR). Cutoffs in the sidebar (η², optional minimum statistic) combine with p/FDR for point coloring. Mean range (formerly MA for two groups) plots grand mean of group means vs (max − min) group mean. Group heatmap shows the top N features by raw p with optional row z-score. P-value histogram is unchanged.

Multiple testing: None, Benjamini–Hochberg FDR, or Bonferroni on the vector of raw p-values across tested features (same for t-test, ANOVA, and Kruskal–Wallis).
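The two corrections can be sketched on a plain array of raw p-values. This assumes every entry is a valid p-value; how the app handles untested features (for example NaN entries) is not covered here.

```javascript
// Benjamini-Hochberg adjusted p-values, returned in the input order.
function benjaminiHochberg(pValues) {
  const m = pValues.length;
  const order = pValues
    .map((value, index) => ({ value, index }))
    .sort((a, b) => a.value - b.value);
  const adjusted = new Array(m);
  let running = 1;
  // Walk from the largest p-value down, enforcing monotonicity with a running minimum.
  for (let rank = m; rank >= 1; rank -= 1) {
    const { value, index } = order[rank - 1];
    running = Math.min(running, (value * m) / rank);
    adjusted[index] = running;
  }
  return adjusted;
}

// Bonferroni: multiply by the number of tests and cap at 1.
function bonferroni(pValues) {
  return pValues.map((p) => Math.min(1, p * pValues.length));
}
```

For raw p-values `[0.01, 0.04, 0.03, 0.005]`, the Benjamini–Hochberg values are approximately `[0.02, 0.04, 0.04, 0.02]`, while Bonferroni gives `[0.04, 0.16, 0.12, 0.02]`.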

Results table (two-group) — t and df: t is the two-sample t-statistic (difference in group means relative to its standard error). Larger |t| means stronger separation relative to noise. df is the degrees of freedom used by the t-distribution to compute the raw p-value. With the Student test (pooled variance), df = n_A + n_B − 2, where n_A and n_B are the valid sample counts per group. With the Welch test (unequal variance), df uses the Welch–Satterthwaite approximation and may be non-integer.
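How t and df arise in the two tests can be sketched as follows, with the p-value step omitted. The function name and return shape are hypothetical; the difference is taken as mean B minus mean A, matching the log2FC direction described for the two-group mode.

```javascript
const mean = (v) => v.reduce((s, x) => s + x, 0) / v.length;

// Sample variance (n - 1 denominator); needs at least two values.
function sampleVariance(v) {
  const m = mean(v);
  return v.reduce((s, x) => s + (x - m) ** 2, 0) / (v.length - 1);
}

// a, b: valid values of one feature in group A and group B (tested scale).
function twoSampleT(a, b, welch = true) {
  const nA = a.length;
  const nB = b.length;
  const vA = sampleVariance(a);
  const vB = sampleVariance(b);
  const diff = mean(b) - mean(a);
  let se;
  let df;
  if (welch) {
    // Welch: separate variances, Welch-Satterthwaite degrees of freedom.
    const qA = vA / nA;
    const qB = vB / nB;
    se = Math.sqrt(qA + qB);
    df = (qA + qB) ** 2 / (qA ** 2 / (nA - 1) + qB ** 2 / (nB - 1));
  } else {
    // Student: pooled variance, df = nA + nB - 2.
    const pooled = ((nA - 1) * vA + (nB - 1) * vB) / (nA + nB - 2);
    se = Math.sqrt(pooled * (1 / nA + 1 / nB));
    df = nA + nB - 2;
  }
  return { diff, se, t: diff / se, df };
}
```

When both groups have the same size and variance the two tests agree; they diverge as the group variances or sizes become unequal, and the Welch df then drops below n_A + n_B − 2.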

Fudge factor volcano (two-group only): Optional SAM-style joint rule inspired by Giai Gianetto et al., Proteomics 2016 (uses and misuses of the fudge factor). On the tested scale, SE of the mean difference is approximated as |log2FC / t| (when t ≈ 0, the median SE across features is used). User s₀ adds to SE in the denominator: d = |log2FC| / (SE + s₀). The green guide uses median SE and median df: two line traces (negative and positive log2FC, small gap at 0). For |log2FC| ≥ need = t_★(median SE + s₀), y = y₀ = −log10(α) (flat tails). For |log2FC| < need, height rises in a 1/|x| (hyperbolic) way from y₀ at |x| = need to y_cap at the inner edge of the drawn branch, so the silhouette is flat at the sides and curved “wings” toward the center — not a smooth dome. Point colors (volcano, MA, results table) use the same median-based boundary: a feature is red/blue if its plotted y (−log10 p or FDR, same cap as the plot) lies on or above that curve at its log2FC and the fold direction matches the sign of log2FC. p_mod in the tooltip still uses each feature’s own SE and df (diffStudentTTwoTailP(d, df)) as a t-distribution shorthand — not full SAM permutation. Multi-group ANOVA/Kruskal–Wallis keeps the rectangular volcano; fudge controls are hidden in that mode.
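The per-feature part of the rule (SE recovered from log2FC and t, then d with s₀ in the denominator) can be sketched as below. The tolerance used for "t ≈ 0" and the function names are assumptions, and the guide curve itself is not reproduced.

```javascript
// Median of a numeric array (mean of the two middle values for even length).
function median(values) {
  const s = [...values].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// features: [{ log2FC, t }]; s0: user fudge factor on the tested scale.
function fudgeStatistics(features, s0, epsilon = 1e-12) {
  // SE is approximated as |log2FC / t| wherever t is not (near) zero.
  const se = features.map((f) =>
    Math.abs(f.t) > epsilon ? Math.abs(f.log2FC / f.t) : NaN
  );
  // Assumes at least one feature has a usable t; otherwise the median is NaN.
  const medianSe = median(se.filter(Number.isFinite));
  return features.map((f, i) => {
    // Fall back to the median SE across features when t is approximately 0.
    const featureSe = Number.isFinite(se[i]) ? se[i] : medianSe;
    return { se: featureSe, d: Math.abs(f.log2FC) / (featureSe + s0) };
  });
}
```

With s₀ = 0 the statistic d reduces to |t|; raising s₀ penalizes features whose large |t| comes mainly from a very small standard error rather than a large fold change.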

Session JSON: Saves diffAnalysisMode, multi-group test, volcano x-axis and effect cutoffs, heatmap options, diffVolcanoFudgeFactor, diffFudgeS0, and the list of checked ANOVA levels (diffAnovaIncludedLevels) so imports can rebuild the checklist after the meta column is repopulated. formatVersion 3+ also stores the full diffAnalysisLastResult and table sort state so volcano / MA / heatmap / table restore without recomputation.

Not in scope (browser-only tool): Paired tests and full limma empirical Bayes or SAM permutation calibration. DESeq2 / edgeR / limma-voom require count matrices, normalization/dispersion, and an R/Bioconductor runtime — they are not implemented here. For RNA-seq–style DE, export intensities and design to R or a dedicated pipeline.
