Programmable Genomics Laboratory

Deep Learning • DNA Design • Regulatory Genomics

Are you smarter than a design algorithm?

In the Programmable Genomics Laboratory (PGL), we use deep learning models to predict characteristics of DNA sequences, then use design methods to edit those sequences as little as possible until the models predict the characteristics we want. Sometimes the edits these procedures choose are surprising.
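To make "minimal editing" concrete, here is a rough sketch of one simple design strategy: a greedy search that applies, one at a time, the single substitution that most improves a predictor's output. This is an illustration only, not necessarily the method used in the lab. The names `toy_predictor` and `greedy_design` are hypothetical, and the predictor is a stand-in that measures similarity to the string GATAA rather than a trained model.

```python
def toy_predictor(seq, motif="GATAA"):
    """Hypothetical stand-in for a trained model.

    Returns the best fraction of positions matching `motif` over all windows.
    """
    width = len(motif)
    best = 0
    for start in range(len(seq) - width + 1):
        matches = sum(seq[start + i] == motif[i] for i in range(width))
        best = max(best, matches)
    return best / width


def greedy_design(seq, predict, target, max_edits):
    """Greedily apply the single substitution that most improves `predict`.

    Stops when the target is reached, the edit budget is spent,
    or no single substitution improves the prediction.
    """
    seq = list(seq)
    edits = []  # (position, old base, new base)
    current = predict("".join(seq))
    while current < target and len(edits) < max_edits:
        best = None
        for pos in range(len(seq)):
            original_base = seq[pos]
            for base in "ACGT":
                if base == original_base:
                    continue
                seq[pos] = base
                value = predict("".join(seq))
                if best is None or value > best[0]:
                    best = (value, pos, base)
            seq[pos] = original_base  # undo the trial substitution
        if best is None or best[0] <= current:
            break  # no single edit helps
        current, pos, base = best
        edits.append((pos, seq[pos], base))
        seq[pos] = base
    return "".join(seq), edits, current


designed, edits, value = greedy_design("CCGATCACC", toy_predictor, target=1.0, max_edits=3)
print(designed, edits, value)
```

A greedy search like this can stall when no single substitution helps, which is one reason real design procedures sometimes pick edits that look surprising.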

Try it yourself to build your intuition: click any nucleotide below to cycle through A → C → G → T and watch the "neural network" evaluate your design in real time. Right-click any edited nucleotide to restore it to its original base, or use the "reset" button to undo all edits. Note: for now, there is no neural network, only a PWM scan.

Objective: Maximize the score by editing the sequence so that it contains GATA1 binding motifs (MA0035.2). Each motif match is scored with a position weight matrix (PWM) derived from JASPAR2026, and only matches whose score meets a threshold count as hits. Avoid making too many edits, though: each edit subtracts one point from your score, so choose them carefully. Motif hits are underlined, with high-information-content positions in pink and low-information-content positions (usually the flanks) in gray.
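For readers curious how a score of this kind could be computed, the sketch below scans a sequence with a log-odds PWM, keeps the windows that meet a threshold, and subtracts one point per edited position. Several details are assumptions rather than facts about this page: the count matrix is a made-up 5-column example (not the MA0035.2 values), the threshold of 8.0 is arbitrary, only the forward strand is scanned, and the total is taken to be the sum of hit scores minus the number of edits. The names `COUNTS`, `log_odds`, `scan`, and `game_score` are hypothetical.

```python
import math

# Hypothetical count matrix resembling a GATAA core; NOT the JASPAR MA0035.2 values.
COUNTS = {
    "A": [2, 97, 1, 96, 90],
    "C": [1, 1, 1, 1, 3],
    "G": [96, 1, 1, 1, 4],
    "T": [1, 1, 97, 2, 3],
}


def log_odds(counts, pseudocount=1.0, background=0.25):
    """Convert per-position counts to log2 odds against a uniform background."""
    pwm = []
    for i in range(len(counts["A"])):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2((counts[b][i] + pseudocount) / total / background)
            for b in "ACGT"
        })
    return pwm


def scan(seq, pwm, threshold):
    """Return (start, score) for every forward-strand window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        score = sum(pwm[i][seq[start + i]] for i in range(width))
        if score >= threshold:
            hits.append((start, score))
    return hits


def game_score(original, edited, pwm, threshold, edit_penalty=1.0):
    """Assumed scoring rule: sum of hit scores minus one point per edited position."""
    n_edits = sum(a != b for a, b in zip(original, edited))
    hit_total = sum(score for _, score in scan(edited, pwm, threshold))
    return hit_total - edit_penalty * n_edits


PWM = log_odds(COUNTS)
THRESHOLD = 8.0  # arbitrary value chosen for this example
original = "CCCTTCCCTT"
edited = "CCGATAACTT"  # four substitutions create one GATAA window
hits = scan(edited, PWM, THRESHOLD)
print(hits, game_score(original, edited, PWM, THRESHOLD))
```

Under these assumptions, creating a motif from scratch costs several points in edits, so a hit only pays off when its PWM score exceeds the number of substitutions needed to build it.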

What is the highest score you can get with one GATA motif? What is the highest score you can get with one edit? Did you find any unexpected ways to make the score high?

250 bp Template Sequence — click any nucleotide to edit
No PWM hits yet — try editing toward GATAA.