Genome replication progression and
epigenetic regulation in mammalian cells

Sunil Kumar Pradhan


Genome Replication Progression and Epigenetic Regulation in Mammalian Cells 

Fortschreiten der Genomreplikation und epigenetische Regulierung in Säugetierzellen 

 
Dissertation approved by the Department of Biology of the Technische Universität Darmstadt

for the award of the academic degree

Doctor rerum naturalium

by

 
Sunil Kumar Pradhan 
  

Master of Science (Research) in Biological Sciences  

 
First reviewer: Prof. Dr. M. Cristina Cardoso

Second reviewer: Prof. Dr. Heinrich Leonhardt

 
 Darmstadt, Technische Universität Darmstadt 

2025 

 


“Genome Replication Progression and Epigenetic Regulation in Mammalian Cells” / “Fortschreiten der Genomreplikation und epigenetische Regulierung in Säugetierzellen” © 2025 by Sunil Kumar Pradhan is licensed under Creative Commons Attribution-ShareAlike 4.0 International. To view a copy of this license, visit https://creativecommons.org/licenses/by-sa/4.0/

Year of publication of the dissertation on TUprints: 2025

Date of submission: 09.12.2024

Date of oral examination: 31.01.2025

 


“तस्मादसक्तः सततं कार्यं कर्म समाचर।”
“Tasmad asaktah satatam karyam karma samachara” 

-Bhagavad Gita 

 


Preface 

Welcome to my doctoral thesis, “Genome Replication Progression and Epigenetic Regulation 

in Mammalian Cells.” This work explores the intricate processes of genome replication and the role of 

epigenetics in human and mouse cells across different stages of development. A general introduction 

is followed by a section on common methods. Then, the thesis is structured into three chapters, each 

delving into various aspects of the aforementioned subjects. 

For clarity and coherence, each chapter presents an introduction and results alongside a comprehensive discussion and perspectives, allowing for a more integrated and fluid understanding of the findings. Chapter 1 builds extensively upon the research presented in Pradhan et al. (2024), while some methodologies detailed in this thesis draw from both Pradhan et al. (2023) and Pradhan et al. (2024).

The thesis aims to deepen our understanding of how genome replication and epigenetic 

mechanisms interact within mammalian cells.  

Part of the data is available on TUdatalib. The remaining data will be made public in due course, once the corresponding publications have been prepared.

 


Summary 

Among all mammalian cellular organelles, the nucleus receives special attention because it contains the genetic information of the cell, the DNA. It is also the site where all DNA-dependent functions take place, such as genome duplication prior to cell division, DNA transcription, and the repair of damage caused by various DNA breaks. The nucleus also contains the nucleolus, a subcompartment that serves as the assembly site for ribosomes. Among all the DNA-templated events, genome duplication is the most eventful: a large array of proteins simultaneously unwinds the DNA and synthesizes a new DNA double helix at thousands of locations along the same long polymer, yet the process is tightly regulated in time and space. It is an event on a scale hitherto undreamt of.

While the faithful reconstitution of cellular identity requires the accurate inheritance of epigenetic information, the information that sits atop the genetic information, the dynamic nature of epigenetics also influences chromatin structure and regulates various aspects of DNA-templated events, including genome replication programs. The progression of development also triggers changes in the epigenetic landscape, thereby influencing the replication program. During genome replication, the chromatin decondenses, the DNA unwinds, and new DNA is synthesized and rewrapped around the histone core. The spatio-temporal regulation of the replication program also ensures the faithful inheritance of epigenetic features, specifically histone post-translational modifications. However, the intricate interplay between epigenetics and the genome replication program remains poorly understood and awaits further exploration.

The present work first sheds light on the developmental changes in genome replication programs, the features of replication forks, and the underlying chromatin dynamics in human cells. Using repli-FISH, it also uncovers changes in the replication timing of the less-studied tandem and interspersed repeats, which constitute a large part of the genome. As a prerequisite, a novel approach to analyze repli-FISH data was developed. A switch in the replication timing of ribosomal DNA and differential replication programs of Alu, LINE1, and centromeric repeats were observed. Interactions between the genome replication machinery and RNA polymerase I, which is responsible for rDNA transcription, were observed in pluripotent stem cells. Overall, the study complements and expands our understanding of the developmentally regulated genome replication program in human cells.

High-throughput image analysis was used to quantify histone modification marks in different cell cycle phases and thereby characterize the relationship between genome replication and epigenetics. Furthermore, the statistical tool nucim was used to map the dynamic localization of histone marks to different chromatin compaction classes over cell cycle and sub-S phase progression in pluripotent stem cells. This unveiled the dynamic localization of H3K36me3 to constitutive heterochromatin and its regulation of heterochromatin replication timing. In addition, the genome-wide dynamics of H3K27me3 and H3K4me3, which also mark poised/bivalent chromatin in embryonic stem cells, showed a peculiar pattern. Using synthetic biology, the replication program was disrupted by localizing constitutive heterochromatin next to the nuclear lamina, and the effect on epigenetics was studied.

Furthermore, the earliest replication origins and their (epi)genetic and chromatin characteristics were explored to validate the Domino model of replication progression. An approach to identify cells with the earliest origins, in both living and fixed cells, was established to enable detailed exploration. The ribosomal DNA tandem repeat was found to be one of the preferred locations for starting replication across mammalian cell lines and species. To investigate the genetic nature of these origins, methods to identify, collect, and linearly amplify single-cell genomes were established to measure micro-copy number gains.

Finally, a model is presented that describes genome replication progression and the interaction between the epigenetic and replication programs.



Zusammenfassung 

Among all the cellular organelles of mammals, the nucleus receives special attention because it contains the genetic information of the cell, the DNA. It is also the site of all DNA-dependent processes, such as genome duplication before cell division, the transcription of DNA, and the repair of DNA damage caused by various breaks. The nucleus also contains a subcompartment, the nucleolus, which serves as the assembly site for ribosomes. Among the DNA-dependent processes, genome duplication is the most eventful: it involves a multitude of proteins that simultaneously unwind the DNA at thousands of sites along one long polymer and synthesize a new DNA double helix, an event that is tightly regulated in time and space and of a previously unimaginable scale.

 
While the faithful re-establishment of cellular identity requires the precise inheritance of epigenetic information, the information layered on top of the genetic information, the dynamic nature of epigenetics also influences chromatin structure and regulates various aspects of DNA-dependent processes, including genome replication programs. The development of an organism also triggers changes in the epigenetic landscape, which in turn influence the replication program. During genome replication, the chromatin decondenses, the DNA is unwound, and new DNA is synthesized and wrapped around the histone cores again. The spatio-temporal regulation of the replication program also ensures the faithful inheritance of epigenetic properties, in particular the post-translational modifications of histones. Nevertheless, the complex interplay between epigenetics and the genome replication program remains a puzzle that requires further investigation.

 
The present work first sheds light on the developmental changes in genome replication programs, the properties of replication forks, and the underlying chromatin dynamics in human cells. Using repli-FISH, a change in the replication timing of less-studied tandem and interspersed repeat sequences, which make up a large part of the genome, was also examined. Beforehand, a novel approach for the analysis of repli-FISH was developed. A switch in the replication timing of ribosomal DNA and different replication programs of Alu, LINE1, and centromere repeats were observed. In pluripotent stem cells, interactions between the genome replication machinery and RNA polymerase I, which is responsible for the transcription of ribosomal DNA, were detected. Overall, the study complements and expands our understanding of the developmentally regulated genome replication programs in human cells.

 
High-throughput image analyses were used to quantify histone modifications in different cell cycle phases and to characterize the relationship between genome replication and epigenetics. In addition, the statistical tool nucim was used to map the dynamic localization of histone modifications to different compaction classes during cell cycle and sub-S phase progression in pluripotent stem cells. This uncovered a dynamic localization in constitutive heterochromatin as well as the regulation of its replication timing by H3K36me3. Furthermore, the genome-wide dynamics of H3K27me3 and H3K4me3, which also mark poised/bivalent chromatin in embryonic stem cells, displayed a striking pattern. Using synthetic biology, the replication program was disrupted by localizing the constitutive heterochromatin close to the nuclear lamina, and the effects on epigenetics were examined.

 
Furthermore, the earliest replication origins and their (epi)genetic and chromatin-related features were examined to validate the Domino model of replication progression. An approach to identify living and fixed cells with the earliest origins was established to enable a detailed investigation. Ribosomal DNA tandem repeats were found to be preferred starting points for replication across mammalian cell lines and species. To investigate the genetic nature of these origins, methods were developed to identify, collect, and linearly amplify single-cell genomes in order to measure micro-copy number gains.

 
Finally, a model was presented that describes the progression of genome replication and the interaction between epigenetic and replication programs.

 


List of Figures:  
 

Figure 1: A generalized blueprint of a human chromosome.

Figure 2: The rDNA and (peri)centromere placement in mouse and human chromosomes.

Figure 3: Different states of the chromatin.

Figure 4: Genome replication at different resolutions.

Figure 5: The Domino model of genome replication progression.

Figure 6: Metaphase and interphase fluorescence in situ hybridization (FISH) of the repetitive genomic elements.

Figure 7: Image analysis pipeline for RFi detection, characterization, and measurements.

Figure 8: Image analysis pipeline for mapping RFis to chromatin compaction classes using Nucim package on statistical analysis platform R.

Figure 9: Cell cycle and replication dynamics analysis of pluripotent and somatic cells.

Figure 10: Feature analysis of the replication foci (RFi) in different S phases.

Figure 11: Quantification of the number of replicons and fork speed in S phase stages.

Figure 12: Genome-wide replication origins distribution in selected human cell lines based on the SNS-seq origin mapping method.

Figure 13: Quantification of chromatin compaction with replication progression across cell lines.

Figure 14: Replication timing of genomic repeat elements.

Figure 15: Developmental difference in replication timing of rDNA repeats.

Figure 16: The developmental difference in replication timing of Y chromosome.

Figure 17: A summary of the developmental difference in genome replication features in pluripotent stem cells (PSC) and somatic cells.

Figure 18: Cell cycle-dependent dynamics of the histone modification levels.

Figure 19: Mapping the chromatin compaction association in cell cycle stages reveals dynamic subnuclear localization of individual histone marks.

Figure 20: H3K36me3 dynamically localizes to the pericentromeric heterochromatin prior to its replication.

Figure 21: Histone modifications mapped to chromatin compaction classes in human cells.

Figure 22: Knockdown approach for H3K36me3.

Figure 23: Loss of H3K36me3 influences DNA-dependent processes.

Figure 24: The MajSat forward RNA influences the pericentromeric replication program.

Figure 25: The model depicts the role of H3K36me3 in MajSat RNA-led maintenance of the replication program.

Figure 26: Capturing the earliest RFi in single cells.

Figure 27: Earliest RFi are randomly fired throughout the nucleus to create a Domino-like origin firing.

Figure 28: Individual replicons induce a Domino-like replicon cluster.

Figure 29: Mapping individual replicons to chromatin compaction classes reveals higher enrichment in open compartments.

Figure 32: Repli-FISH reveals the repeat elements associated with the earliest origins.

Figure 30: Earliest origins are fired from gene-rich open chromatin.

Figure 31: The image shows the location of the gene-rich elements enriched in the open chromatin.

Figure 33: Feasibility of sequencing the earliest replicating regions.

 
List of Tables:  
Table 1: Overview of the human genome 

Table 2: Comparative overview of replication origins across domains of life 

Table 3: List of all the model cell lines used and their properties 

Table 4: The list of nucleotide and nucleoside analogs 

Table 5: List of the antibodies used for immunostaining 

Table 6: A list of the probes and preparation methods 

Table 7: Image acquisition microscopes 

Table 8: Image analysis, data analysis, and visualization   

Table 9: Publicly available datasets for the replication origin mapping  

 


Contents 

Preface
Summary
Zusammenfassung

1. Introduction
1.1 DNA: The blueprint of life
1.2 Chromatin and the role of epigenetics in chromatin states
1.3 Genome replication program and its regulation
The questions

2. Methods
2.1 Cell culture and transfection
2.2 Doubling time and (sub)S phase duration:
2.3 Genome replication labeling, visualization, and immunostaining
2.4 Probe generation, metaphase spread, repli-FISH, and immuno repli-FISH
2.5 Knockdown experiments, immunostaining, and RNA FISH
2.6 Microscopy
2.6 Image Analysis
2.7 Genome-wide origin mapping
2.8 Single-cell collection, genome amplification, and analysis
2.9 Data analysis, statistical analysis, and visualization

3. Results
3.1 Developmental changes in genome replication progression in human cells*
3.1.1 Introduction
3.1.2 Developmentally Conserved Spatio-Temporal Replication Pattern in Humans
3.1.3 Characterization of Spatio-Temporal RFi Reveals a Change in Late-Replicating RFi Distribution
3.1.4 Replicon Quantification, Fork Efficiency, and Genome-Wide Origin Mapping Unravel Alterations in the Genome Replication Program across Developmental Transitions
3.1.5 Chromatin Compaction Analysis, and RFi- Associated Histone Modification Measurements Reveals Differential Chromatin Dynamics
3.1.6 Repli-FISH Reveals Developmental Changes in the Replication Timing of Tandem and Interspersed Repeats
3.1.7 rDNA Tandem Repeats Show a Switch in Replication Timing and Change in Replication, Transcription Interaction
3.1.8 Sex Chromosome Y Replicates Throughout the S phase and Shows a Developmental Switch in Replication Timing
3.1.9 Conclusions/Discussion

3.2 Correlation of genome replication progression and epigenetics in pluripotent stem cells
3.2.1 Introduction
3.2.2 Histone modification levels are cell cycle dependent
3.2.3 Mapping histone modification to chromatin compaction classes reveals distinct cell cycle-dependent dynamics of individual histone marks
3.2.4 H3K36me3 dependent transcription fidelity of pericentromeric forward RNA regulates chromatin structure and replication features
3.2.5 Conclusions/Discussion

3.3 Mammalian earliest genome replication origins stochastically activate from (in)active nuclear compartments to create a domino-like replication progression
3.3.1 Introduction
3.3.2 Capturing the cells with earliest origins
3.3.3 Mammalian earliest replication origins fire randomly throughout the nucleus and create a Domino like replication progression
3.3.4 Earliest replicons are fired in (in)active nuclear compartments and cluster to form replication foci/timing domain
3.3.5 Earliest RFi fire from conserved gene-rich open chromatin and repetitive elements
3.3.6 Conclusion
3.3.7 Perspectives: sequencing earliest RFi/replication domain and its chromatin structure in a single cell

4. Annex
4.1 Honorary Declaration - Ehrenwörtliche Erklärung
4.2 CV

 


1. Introduction 

Chromatin, composed of DNA and histones, carries hereditary information in eukaryotes and is spatially organized within the nucleus. Chromatin plays a pivotal role in cellular development and differentiation, which makes the study of chromatin duplication and its intricate regulation through epigenetic mechanisms a fundamental area of research. Research in the second half of the last century answered many fundamental questions about chromatin duplication. The Hershey-Chase experiments showed that DNA is the genetic material; X-ray diffraction by Rosalind Franklin and Maurice Wilkins led to the elucidation of the double-helical structure of DNA by Watson and Crick; and the Meselson and Stahl experiment established the semi-conservative nature of DNA replication 1–3. These findings encouraged further discoveries, such as the isolation and purification of the first DNA polymerase. Pioneering work on the organization of DNA replication, using in vivo pulse labeling with radioactive thymidine and DNA fiber analysis, then established the concept that DNA replication proceeds bidirectionally from an opening site, the active origin of replication (“ori”) 4,5. Further work with model systems such as fission yeast and Xenopus egg extract revealed key aspects of the cell cycle regulation of genome replication 6,7. The development of conjugated nucleotides and nucleotide analogs such as BrdU, alongside the advent of confocal microscopy, unraveled some of the key features of spatio-temporal genome replication progression 8,9. The Human Genome Project and the development of DNA sequencing technologies further complemented our understanding of the genome replication program 10. However, the chromatin duplication program and its regulation are yet to be fully understood, especially in mammals.

1.1 DNA: The blueprint of life 
DNA (deoxyribonucleic acid) serves as the hereditary material in all known living organisms and many viruses. Its structure, a double helix, comprises two strands that store genetic information, which is passed on during cell division and across generations. Dissecting the structural and functional units of the genome is crucial for understanding life forms and their development. In the Human Genome Project, the genome was sequenced using bacterial artificial chromosomes (BACs) that were then ordered and oriented along the human genome using radiation hybrid mapping, genetic linkage, and fingerprinting. This led to a quantum leap in our understanding of the human genome and encouraged the sequencing of other species 11–13. With the emergence of long-read sequencing methods and improved alignment algorithms, these genome assemblies have been continuously improved and now represent the telomere-to-telomere (T2T) sequence 14. However, that study was conducted in haploid complete hydatidiform mole (CHM) cells, which retain only the paternal genetic material. This model lacks the full complexity and diversity present in normal diploid cells, particularly in the context of tandem repeats such as ribosomal DNA and centromeres. Furthermore, a fully assembled mouse genome sequence remains elusive. Despite these limitations, fundamental structural and functional insights into the genome have been achieved and are well documented.

Table 1: Overview of the human genome 

Genomic element          Percentage of the genome (%)
protein-coding regions   1.5 - 2.0
segmental duplications   5.0
Alu                      10.9
LINE1 (L1)               20.9
satellite DNA            3
telomere                 0.3
rDNA                     0.32
non-coding               2
other promoters          0.5
other repeats            10

 
A major surprise of the Human Genome Project was that only a small fraction of the human genome is protein-coding, whereas a large fraction consists of non-coding sequences and repetitive elements 13. In the T2T assembly, the number of annotated protein-coding genes increased only slightly (from 19,890 to 19,969), which shows how accurately the previous assembly had already characterized protein-coding regions. However, the repetitive fraction increased further, from 50% to 54% of the total genome 14. Altogether, the long and short interspersed nuclear elements (LINE and SINE) comprise more than 60%, satellite DNA makes up 10%, and the rDNA tandem repeats from the five acrocentric chromosomes account for around 1% of the total repeats. Table 1 summarizes the constituents of the human genome.

Protein-coding regions: These sequences of the genome are transcribed into RNA and translated into proteins. Even though they constitute less than 2% of the total genome, they provide the blueprint for synthesizing proteins, which perform a vast array of functions within the cell 14. Each gene contains a specific sequence of nucleotides that encodes instructions for building a protein or, in some cases, a functional RNA molecule. The coding regions, also known as exons, are often interrupted by intronic sequences or introns (except in some histone genes) and contain an open reading frame with a start and a stop codon that define the region to be translated. Each gene is also flanked by upstream enhancer and repressor sequences, which can be active or inactive and thereby control differential gene expression. Depending on the developmental stage or cellular identity, a combinatorial subset of genes is activated or repressed by epigenetic regulation, which in turn regulates the expression of the proteins required for differentiation or for maintaining cellular identity.

Non-coding regions: A significant proportion of the genome is made up of non-coding sequences, which do not code for proteins yet play crucial roles in regulating gene expression and maintaining genome integrity. These sequences can lie between genes and often contain regulatory elements for various DNA-dependent processes, which are crucial for 3D genome organization. A considerable fraction of the non-coding regions also contains sequences for non-coding RNAs (ncRNAs), RNA molecules that are not translated into proteins yet have regulatory functions, such as microRNAs (miRNAs) and long non-coding RNAs (lncRNAs). The miRNAs act post-transcriptionally by targeting alternative splicing, mRNA stability, and protein translation 15. The lncRNAs play key roles in gene regulation and chromatin organization by (in)activating target genomic regions or a whole chromosome 16.

Tandem repeats: These repeats consist of multiple copies of a homologous DNA sequence arranged head-to-tail to form tandem arrays, and they vary in array size and repeat unit length. Initially dismissed as “junk DNA,” these sequences are now recognized for their crucial role in regulating key structural and functional aspects of basic cell operations. For example, the telomeric repeats cap the ends of individual chromosomes, maintaining genome stability and preventing chromosome degradation 17. This underscores the importance of these once-dismissed genomic elements for our understanding of genome function and stability.

 


Figure 1: A generalized blueprint of a human chromosome. (A) Illustration of a human chromosome representing various repeat elements in both the p and q arms. The magnified region shows the placement of the (peri)centromeric repeats. In most chromosomes, including acrocentric ones, the p arm is followed by the centromeric ɑSat, where the kinetochore is formed and spindle fibers are assembled. This is followed by some non-satellite sequences, mostly transposable elements, before the pericentromeric satellite repeats. (B) Illustration of a generalized (peri)centromeric region showing the different placements of the various HSats with respect to ɑSat higher-order repeats (HOR) in different chromosomes, the size of the repeats, and their genetic features. (C) Scheme showing chromosome 9, which contains the largest HSat repeat. In chromosome 13, the rDNA tandem repeat is sandwiched between HSat1A and HOR. The figure is modified from 18,19.



Figure 2: The rDNA and (peri)centromere placement in mouse and human chromosomes. Illustration showing the differential placement of rDNA with respect to centromeres in the two species. In humans, the centromere is flanked by the rDNA and the q arm, whereas in mouse chromosomes, the rDNA is placed between the centromere and the q arm. Representative FISH images of rDNA and centromere in metaphase (scale bar 10 µm) (modified from 20).



In the mammalian genome, (peri)centromeric repeats form the largest tandem repeats and act as an axis for genome organization, stability, and chromosome segregation. In humans, centromeric regions contain the alpha satellite (ɑSat) repeats, which consist of a 171 bp monomeric unit in large homologous arrays (85.2 Mb genome-wide) (Figure 1A and 1B) 21. In addition, the human satellites (HSats) HSat2 and HSat3 comprise CATTC repeats and form one of the largest contiguous satellite arrays (a 27.6 Mb array of HSat3 in Chr 9) (Figure 1C). The AT-rich HSat1A and HSat1B (specific to the Y and acrocentric chromosomes) are also found in multiple chromosomes. These are further flanked by the pericentromeric regions extending toward the p and q arms 14,18. The ɑSat monomer also has variations, and multiple subtypes often form higher-order repeats (HOR) that lie next to each other, forming large, homogeneous arrays of HORs. The kinetochore proteins are usually associated with a subset of these HOR arrays called the active array 22. In mouse chromosomes, the AT-rich centromeric and pericentromeric structures are formed by tandem repeats of minor and major satellite sequences, respectively 23.

In mice and humans, the rDNA tandem repeats are present in the short-arm/acrocentric 

chromosomes (Chromosomes 12, 15, 16, 18, and 19 in mice and 13, 14, 15, 21, and 22 in humans) 

(Figure 1A, and Figure 2A). The rDNA repeat can be of varied structures, and the number of repeat 

units varies from chromosome to chromosome. In humans, Chr 13 has the highest fraction of rDNA, 

whereas Chr 14 has the fewest units of the rDNA array. In Chr 13, the rDNA is positioned between two 

satellite repeat arrays, the large array of HSat1A and ɑSat 18,20 (Figure 1C). The rDNA repeat is flanked 

by the proximal and distal junctions, together forming the nucleolar organizer regions (NOR) 24.  

The interspersed repeat elements: Unlike tandem repeats, which occur consecutively one after 

another, interspersed repeats are distributed throughout the genome (Figure 1A). Most of these 

interspersed repeats are transposable elements, sequences capable of relocating within the DNA. 

One of the most notable of these is the Alu retrotransposon family, the most abundant short 

interspersed nuclear element (SINE) in primates, including humans. Alu elements are typically around 

300 base pairs long.  

In the human genome, Alu elements make up approximately 11% of the total DNA. These 

elements play crucial roles in the regulation of gene expression and the evolutionary dynamics of the 

genome in all primates 25. SINEs are transcribed into non-protein-coding RNA, which is reverse-transcribed 

and inserted at another location. They are therefore believed to depend on, and to have co-evolved with, long 

interspersed nuclear elements (LINEs) (20% of the human genome), which encode open reading frame 

(ORF) proteins that enable reverse transcription and reintegration into the genome 26. 

The increased activities of LINEs and SINEs are also correlated during early embryo development 27.  



LINE1 (also L1) is the most abundant LINE element and encodes ORF 1, which helps mobilize 

non-autonomous Alu and other SINE elements in humans 28.  
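
As a rough sanity check on the repeat sizes quoted above, the following sketch converts the stated figures into approximate copy numbers. The haploid genome size of 3.1 Gb is an assumption introduced here for illustration and is not taken from the text; the function name is likewise hypothetical.

```python
# Back-of-the-envelope copy-number estimates from the figures quoted in the text.
# ASSUMPTION: a haploid human genome size of ~3.1 Gb (not stated in the text).
GENOME_SIZE_BP = 3.1e9


def approx_copies(total_bp: float, unit_bp: float) -> float:
    """Number of repeat units of length unit_bp that fit into total_bp."""
    return total_bp / unit_bp


# Alu: ~11% of the genome, ~300 bp per element (values from the text).
alu_copies = approx_copies(0.11 * GENOME_SIZE_BP, 300)

# Alpha satellite: 85.2 Mb genome-wide, 171 bp monomer (values from the text).
asat_monomers = approx_copies(85.2e6, 171)

print(f"Alu elements (approx.): {alu_copies:,.0f}")
print(f"alpha-satellite monomers (approx.): {asat_monomers:,.0f}")
```

Under this assumption, both repeat classes come out at the order of half a million to a million units, which only illustrates the scale and should not be read as a measured count.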

1.2 Chromatin and the role of epigenetics in chromatin states 
Chromatin is the highly organized structure of DNA and protein packaged inside the nucleus 

of eukaryotic cells. The compacted structure of DNA wrapped around histones is inherently 

repressive for all DNA-dependent processes. The fundamental unit of chromatin, the nucleosome, 

consists of a segment of DNA wrapped around a core of histone proteins. The histone core is 

formed by a tetramer of two H3:H4 dimers sandwiched between two H2A:H2B dimers. Nucleosomes, 

the fundamental units of DNA and histone octamer, are further folded and coiled into 

hierarchical higher-order structures that ultimately form the chromosomes 29. Modifications of DNA 

and histones dynamically modulate nucleosome compaction, regulating the nature of chromatin 

and, hence, the access to different machinery involved in DNA processes. This information on top of 

the DNA, called epigenetics, also regulates differential gene expression, playing a critical role in 

development. Such epigenetic marks need to be maintained over cell division cycles, and, on the 

other hand, such marks need to be reprogrammed during cellular (retro)differentiation. Hence, despite 

all cells in multicellular organisms having an identical genome, persistent yet plastic epigenetic 

information regulates when and where cells commit to different lineages with distinct phenotypes 30,31. 

Aberrations in epigenetic information lead to disrupted chromatin regulation, causing genome 

instability and leading to various diseases, including cancer 32,33.  

Different states of the chromatin: The plastic nature of epigenetics allows the chromatin to be 

relatively decondensed (euchromatin) or condensed (heterochromatin). Euchromatin is generally 

associated with actively transcribed genes and is more accessible to transcription factors and other 

regulatory proteins. In contrast, heterochromatin is typically transcriptionally repressed and serves to 

protect and stabilize the genome by maintaining structural integrity 34. Furthermore, 

heterochromatin that is inherently repressed is called constitutive heterochromatin, whereas heterochromatin that is 

repressed during development in order to achieve differential gene expression or dosage 

compensation is called facultative heterochromatin (Figure 3A). Each chromatin state is distinguished by its 

epigenetic marks, and even facultative and constitutive heterochromatin carry distinct 

marks 35. Classic examples of constitutive heterochromatin are the ɑSat in human and the 

(peri)centromeric repeats in mouse cells, which are repressed at the very early stages of development 

and remain repressed throughout the lifetime in all types of cells 36. A combinatorial subset of the 

genome gets repressed in order to express/repress a combination of genes during developmental 

stages or during terminal differentiation in order to achieve the tissue-specific identity. The inactivation 



of one of the X chromosomes in female cells is one of the classic examples of facultative 

heterochromatin, where the whole chromosome is repressed 37.  

 
Figure 3: Different states of the chromatin. (A) The mouse myoblast nuclei stained with DAPI represent the 

spectrum of condensation levels of the chromatin in 3D. The left zoomed box represents loose euchromatin, the 

middle represents facultative heterochromatin (inactivated X chromosome), and the right represents constitutive 

heterochromatin (pericentromeric repeats).  
Epigenetics and its role in regulating chromatin states: Epigenetics comprises the stable but 

reversible changes that occur to the chromatin on top of the genetic information. The three pillars of 

epigenetics are DNA modification, histone variants/post-translational modifications, and non-coding 

RNA. While DNA modification is absent in lower eukaryotes such as yeast, in higher eukaryotes it 

plays a prominent role in chromatin regulation.  

The most direct mechanism of epigenetic regulation involves modifications or variants of histone 

proteins. The modifications, which include methylation, acetylation, phosphorylation, and 

ubiquitination, occur on the N-terminal tails of the histones that protrude from the nucleosome. The 

pattern of these modifications constitutes the "histone code," which is read by specific effector 

proteins that alter chromatin structure and regulate gene expression 38. Histone acetylation involves 

the addition of an acetyl group to lysine residues, mediated by histone acetyltransferases (HATs). 

Acetylation neutralizes the positive charge of lysine, reducing the affinity between histones and DNA. 



This results in a more relaxed chromatin structure that is accessible to transcriptional machinery, 

thereby promoting gene expression 39. Histone deacetylases (HDACs) remove these acetyl groups, 

leading to chromatin compaction and transcriptional repression 40. The effect of lysine 

methylation is site-specific. Methylation of histone H3 on lysine 9 (H3K9me) is a hallmark of 

heterochromatin and gene repression, facilitating the binding of proteins that compact chromatin and 

silence gene expression 41. In contrast, trimethylation of H3 lysine 4 (H3K4me3) is usually associated with 

transcriptionally active regions. Histone methyltransferases (HMTs) and demethylases (HDMs) 

regulate these marks. 

The list of canonical histones also includes the linker histone H1 (including its variants). 

Instead of associating with the octamer core, it sits on top of the nucleosome structure, binding the 

DNA at both its entry and exit sites and keeping in place the DNA that is wrapped around the histone 

octamer. By interacting with the linker DNA between nucleosomes, H1 promotes the folding and 

packing of nucleosomal arrays into a more condensed form, which is essential for fitting the vast 

length of DNA into the confined space of the nucleus. This makes H1 an integral factor of the 30 nm 

fiber 42.  

Variants of several canonical histones play crucial roles in regulating various 

specialized chromatin functions in addition to gene regulation. Variants of H2A include H2A.Z, H2A.X, 

and macroH2A. H2A.Z is involved in regulating gene expression, is often found at the promoters 

of active genes, and is associated with both activation and repression of transcription, depending on 

the context. H2A.Z incorporation into nucleosomes alters the stability and structure of chromatin, 

facilitating the binding of transcription factors and chromatin remodelers 43. H2A.X is critical for the 

DNA damage response and is distributed throughout the chromatin. H2A.X is phosphorylated at 

serine 139 (γH2A.X) upon DNA double-strand breaks and serves as a signal for the recruitment of 

DNA repair machinery, playing a crucial role in maintaining genome integrity 44. MacroH2A is known 

for its role in X-chromosome inactivation and the formation of facultative heterochromatin. 

MacroH2A-containing nucleosomes are associated with transcriptional repression and chromatin 

compaction 45. Unlike canonical H3, which is incorporated into chromatin during DNA replication, 

the variant H3.3 is deposited into chromatin independently of DNA synthesis. It is often found at active gene loci 

and regulatory regions, such as enhancers, and is associated with transcriptional activity and the 

maintenance of open chromatin states 46. A centromere-specific H3 variant, CENP-A, replaces H3 in 

nucleosomes at centromeres. This variant is essential for the assembly and function of the 

kinetochore, the structure responsible for chromosome segregation during cell division 47.  

DNA methylation is the methylation of cytosine residues in DNA (5-methylcytosine, 5mC), typically occurring at CpG 

dinucleotides. It is the most studied and prominent DNA modification associated with repressed 

chromatin and plays key roles in gene expression during development, X-chromosome inactivation, 

imprinting, and genome stability 48. High levels of 5mC are often found at 

gene promoters and repetitive sequences, including transposable elements, leading to a compact 

chromatin structure that is less accessible to transcriptional machinery. This results in the stable 

silencing of gene expression and maintains genome stability. In mammalian genomes, approximately 

70-80% of CpG dinucleotides are methylated, and the loss of global DNA methylation is often 

associated with cancerous cells, highlighting the pervasive nature of this modification in maintaining 

global genomic stability 49. The establishment and maintenance of DNA methylation patterns are 

governed by DNA methyltransferases (DNMTs). DNMT1 is primarily responsible for maintaining 

methylation during DNA replication by recognizing hemimethylated DNA and restoring the methylation 

mark on the newly synthesized strand 50. DNMT3A and DNMT3B, on the other hand, are de novo 

methyltransferases that establish new methylation patterns during development 51. Dysregulation of 

these enzymes can lead to aberrant methylation patterns, contributing to developmental disorders or 

diseases such as cancer, where hypermethylation of tumor suppressor genes and hypomethylation of 

oncogenes disrupt normal gene function 49,51. 
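
The maintenance step described above can be illustrated with a minimal sketch. It treats each CpG as a single site rather than tracking both strands base by base, assumes perfect maintenance fidelity, and uses a made-up sequence and starting pattern; all function names are hypothetical and only mimic the logic attributed to DNMT1 in the text.

```python
def cpg_sites(seq: str) -> list[int]:
    """0-based positions of the cytosine in every CpG dinucleotide."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def replicate(parental: set[int]) -> tuple[set[int], set[int]]:
    """Semiconservative replication: the parental strand keeps its marks,
    the nascent strand starts without any (hemimethylated duplex)."""
    return set(parental), set()


def maintain(parental: set[int], nascent: set[int]) -> set[int]:
    """Maintenance step: copy marks at hemimethylated sites onto the nascent strand.
    ASSUMPTION: perfect fidelity, no de novo methylation."""
    hemimethylated = parental - nascent
    return nascent | hemimethylated


seq = "ATCGGACGTTCGA"                     # hypothetical sequence
sites = cpg_sites(seq)
methylated = {sites[0], sites[2]}         # hypothetical starting pattern
parental, nascent = replicate(methylated)
nascent = maintain(parental, nascent)
print("CpG sites:", sites)
print("pattern restored on nascent strand:", nascent == methylated)
```

Because only hemimethylated sites are copied, a CpG that was unmethylated before replication stays unmethylated, which is the sense in which the pattern is inherited rather than newly established.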

In addition to static methylation marks, the DNA methylation landscape is dynamically 

regulated through the cycle of methylation and demethylation, allowing for the fine-tuning of gene 

expression in response to developmental and environmental changes. Hydroxymethylation is the first 

step in the active DNA demethylation pathway, which involves the conversion of 5mC to 

5-hydroxymethylcytosine (5hmC) by the Ten-Eleven Translocation (TET) family of enzymes (TET1, 

TET2, and TET3). 5hmC serves as an intermediate in the demethylation process and is also 

increasingly recognized as an epigenetic mark associated with gene activation and enhancer regions 
52. 5hmC is particularly enriched in the brain and stem cells, suggesting a role in 

neurodevelopment and cellular differentiation 53. Following the formation of 5hmC, TET enzymes can 

further oxidize this intermediate to form 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). These 

oxidized derivatives are substrates for the base excision repair (BER) pathway, which recognizes and 

removes these modified bases, leading to the insertion of an unmodified cytosine and completing the 

demethylation process 54. This mechanism allows for the active removal of methylation marks and the 

dynamic regulation of DNA methylation in response to cellular signals. 
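
The oxidation cascade described above can be written down as a small state sequence. This is a sketch only: it assumes that base excision repair acts on the first eligible substrate it meets (5fC), whereas in cells 5fC may also be oxidized further to 5caC before excision. The names below are hypothetical.

```python
# Oxidation steps catalysed by TET enzymes, as listed in the text.
TET_OXIDATION = {"5mC": "5hmC", "5hmC": "5fC", "5fC": "5caC"}
# Oxidized derivatives that the text names as BER substrates.
BER_SUBSTRATES = {"5fC", "5caC"}


def active_demethylation(base: str) -> list[str]:
    """Return the sequence of cytosine states from `base` to unmodified C.
    ASSUMPTION: BER removes the first BER substrate that is reached."""
    path = [base]
    while base in TET_OXIDATION and base not in BER_SUBSTRATES:
        base = TET_OXIDATION[base]      # one TET oxidation step
        path.append(base)
    if base in BER_SUBSTRATES:
        path.append("C")                # excision and replacement by cytosine
    return path


print(" -> ".join(active_demethylation("5mC")))
```

Starting the function from "5hmC" or "5caC" shows the shorter routes to unmodified cytosine, while an unmodified "C" is returned unchanged.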

The implications of DNA modifications are widespread. For example, (de)methylation of 

cytosine directly affects the stability of the DNA duplex and DNA metabolism 55. DNA modifications 

interact with histone modifications to shape the chromatin landscape. For instance, methylated DNA 

often recruits proteins with methyl-CpG binding domains (MBDs) that further recruit histone 

deacetylases (HDACs) and other chromatin remodelers to establish a repressive chromatin state 56. 



Hydroxymethylation, on the other hand, is associated with open chromatin and active transcriptional 

regions 57,58.  

Non-coding RNA is the third pillar of epigenetics. Despite not encoding any proteins, ncRNAs 

are integral to regulating chromatin dynamics, transcriptional activity, and the maintenance of 

epigenetic memory. Long non-coding RNAs (lncRNAs) are the most prominent transcript class (usually 

longer than 200 nucleotides) and play roles in chromatin modulation and epigenetic memory. They 

can interact with chromatin-modifying complexes and guide them to specific genomic loci, influencing 

histone modifications and chromatin states. For instance, the lncRNA HOTAIR interacts with 

Polycomb Repressive Complex 2 (PRC2), targeting it to HOX gene clusters to mediate histone H3 

lysine 27 trimethylation (H3K27me3) and transcriptional repression 59. They can also facilitate the 

recruitment of chromatin modifiers that establish heritable epigenetic marks, such as DNA methylation 

and histone modifications, ensuring stable gene silencing or activation over time 60. Moreover, the 

orientation of the ncRNA defines its role in regulating the targeted chromatin states, and X-chromosome inactivation (XCI) is a 

prime example of this. XIST (X-inactive specific transcript) and TSIX are two lncRNAs involved in this 

process 16,61. XIST is a sense lncRNA expressed exclusively from the X chromosome that will be 

inactivated. It spreads across the chromosome in cis, attracting silencing complexes like DNMTs and 

PRC2, which deposit H3K27me3 marks to initiate chromatin compaction and transcriptional silencing, 

transforming the whole chromosome into a compacted structure 62,63. XIST’s role is central to initiating 

and maintaining XCI and ensuring dosage compensation. TSIX is an antisense transcript of XIST that 

is transcribed from the same region but in the opposite direction. TSIX counteracts XIST by 

preventing its accumulation and function. It regulates XIST expression through chromatin remodeling 

and transcriptional interference. By binding to the XIST locus, TSIX helps maintain the active state of 

the chromosome from which it is expressed 64. This interplay ensures that only one X chromosome in 

each cell is inactivated, preserving the balance of gene dosage.  

All three components of epigenetics (histone modifications/variants, DNA modification, and 

ncRNA) contribute uniquely to the control of gene expression and the maintenance of cellular identity, 

but their interactions create a dynamic and interconnected regulatory network that fine-tunes the 

epigenome. Mapping and understanding these interactions is crucial for comprehending how cells 

respond to developmental cues and environmental changes or even basic cellular functions.  

1.3 Genome replication program and its regulation 
Despite divergent blueprints, the fundamental process of genome replication is functionally 

conserved across domains of life. Bacteria and archaea possess circular DNA, while eukaryotes have 

linear DNA packed within a nucleus. Unlike the almost naked DNA in bacteria and archaea, the 

eukaryotic genome is packed with histone proteins. Yet all life forms follow the conserved 

principle of genome duplication: in all cases, replication starts from an opening site called 

the origin of replication (Ori), which gives access to DNA polymerase and other replication 

machinery 65. Both bacteria and archaea have circular DNA and typically a single Ori (with a few 

exceptions in archaea having multiple origins). These Oris are sequence-specific; for example, the E. coli 

genome replicates from a single origin known as oriC. These origins often consist of specific DNA 

sequences recognized by initiator proteins, such as DnaA in bacteria, which initiates the unwinding of 

DNA at the origin 65. Most archaeal genomes carry one copy of oriC, yet several genera carry 

multiple oriC copies, which may respond to distinct initiator complexes 66. Eukaryotic cells possess 

multiple replication origins along their linear chromosomes to facilitate the replication of larger 

genomes. Using DNA autoradiography, Huberman and Riggs (1968) studied replication origins 

and bidirectional fork movement in mammalian chromosomes for the first time 5 (Figure 4A). This 

showed that each chromosome has many active origins, allowing simultaneous initiation at several points. 

In simpler eukaryotes like yeast, origins are defined by specific sequences (e.g., ARS in 

Saccharomyces cerevisiae) 67. In metazoans, by comparison, origins are only vaguely defined, more 

flexible, and often influenced by multiple factors, including DNA structural arrangements, regulatory 

proteins, the chromatin environment, and epigenetic marks, rather than one single factor, making the 

process more stochastic and complex 68. A summary can be found in Table 2.  

Table 2: Comparative overview of replication origins across domains of life 

Features | Prokaryotes: Bacteria | Prokaryotes: Archaea | Eukaryotes: Yeast | Eukaryotes: Metazoa 
Chromosome structure | Circular | Circular | Linear | Linear 
Number of origins | Typically one | One/multiple | Multiple | Multiple 
Origin directionality | Bidirectional from one site | Bidirectional from one site | Bidirectional from many sites, sometimes unidirectional | Bidirectional from many sites, sometimes unidirectional 
Initiation proteins | DnaA | Orc1/Cdc6 | Origin Recognition Complex (ORC) | ORC 
Regulatory mechanisms | DnaA activity, methylation | Protein-DNA interactions | Cell cycle kinases | Cell cycle kinases, regulatory proteins, epigenetics 
Origin sequence | Sequence-specific, AT-rich | Sequence-specific | Sequence-specific | Sequence independent 

As the number of replication initiation sites within the cell increases, managing their precise 

activation becomes critically important. The cell must ensure that DNA replication is accurate and 

timely and that each genomic region is duplicated only once per cell cycle. The uncoordinated firing of 

replication origins can lead to re-replication, causing genomic instability and potentially introducing 

harmful mutations. Furthermore, DNA replication occurs within the dynamic environment of the 

chromatin and must be in sync with other chromatin-templated processes, particularly transcription.  

These two processes share the same DNA template and can influence each other in significant ways. 

Transcription can modify chromatin structure, affecting the accessibility of replication origins, while the 

process of DNA replication can impact the transcriptional activity of genes 69. At a global level, the 

eukaryotic genome replicates in a highly organized and non-random fashion, with specific genomic 

regions replicating earlier or later in the S-phase. This concept, initially described over 60 years ago, 

highlights two critical aspects of eukaryotic DNA replication. First, not all replication origins are 

activated simultaneously, and second, origins that do fire together are not evenly distributed across 

the genome 70. These characteristics create distinct replication patterns that evolve in a 

spatiotemporally conserved pattern throughout the S-phase and can be observed under fluorescence 

microscopy 71 (Figure 4B). On a finer scale, at the level of individual replicons, the initiation of 

replication appears to be a stochastic process. This means that not every potential origin is activated 

in every cell cycle. Instead, each origin has a variable firing efficiency, with some origins being more 

likely to initiate replication than others 72–74.  
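
The notion of a per-origin firing efficiency can be made concrete with a toy simulation. The sketch below assumes that origins fire independently of each other, which is a deliberate simplification, and uses made-up origin names and probabilities; it only shows how a population-level efficiency emerges from cell-to-cell stochastic choices.

```python
import random


def fire_origins(efficiencies: dict[str, float], rng: random.Random) -> set[str]:
    """One S phase in one cell: each licensed origin fires with its own probability.
    ASSUMPTION: origins fire independently of each other."""
    return {name for name, p in efficiencies.items() if rng.random() < p}


def observed_efficiency(efficiencies: dict[str, float],
                        n_cells: int = 10_000, seed: int = 0) -> dict[str, float]:
    """Fraction of simulated cell cycles in which each origin fired."""
    rng = random.Random(seed)
    counts = dict.fromkeys(efficiencies, 0)
    for _ in range(n_cells):
        for name in fire_origins(efficiencies, rng):
            counts[name] += 1
    return {name: c / n_cells for name, c in counts.items()}


# Hypothetical origins with high, intermediate, and low firing efficiency.
true_eff = {"ori_1": 0.9, "ori_2": 0.3, "ori_3": 0.05}
for name, eff in observed_efficiency(true_eff).items():
    print(f"{name}: fired in {eff:.1%} of simulated cell cycles")
```

In any single simulated cell, the set of fired origins differs, yet the population average recovers the underlying efficiencies, mirroring the contrast between stochastic firing at the replicon level and reproducible patterns at the population level.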

The regulation and features of DNA replication are dynamic processes, adapting throughout 

development, differentiation, and stress 75–77. Epigenetic modifications, such as DNA methylation, 

histone modifications, and the presence of non-coding RNAs, play crucial roles in shaping the 

chromatin landscape. These modifications are already known to influence transcription and are 

believed to affect any chromatin-based event, including DNA replication. Over the past decades, 

studies have shown correlations between specific epigenetic marks and the timing of replication in 

various organisms 78–80. Furthermore, dynamic changes of the epigenetic profiles during development 

rewire the replication program 76,81. This adaptability and other observations suggest that replication 

control cannot be fully understood through genetic sequences alone. Possible candidates influencing 

it are epigenetics, chromatin state, DNA secondary structures, etc. 68. Despite significant 

advances in the tools and techniques available to study epigenetic regulation, the precise influence of 

epigenetic modifications on the firing of replication origins, particularly in mammals, remains only 

partially understood. Most of the research is driven by the need to unravel how a specific epigenetic 

modification is inherited or affects replication. Yet the challenge lies in dissecting how the epigenome as a 

whole affects the replication process, as individual modifications do not act alone but rather are 

deeply interwoven in complex crosstalk with each other. Naturally, this interconnectedness makes it 

difficult to isolate the effects of individual modifications on chromatin dynamics. This complexity is 

more pronounced in higher metazoans, which possess a more sophisticated and diverse array of 

epigenetic modifications compared to unicellular organisms or simpler metazoans. Additionally, the 

epigenetic landscape provides a flexible regulatory framework that can vary significantly within the 

same cell population. This variability adds another layer of complexity to experimental studies, making 

it challenging to discern consistent patterns or draw definitive conclusions about the role of 

epigenetics in regulating replication origin activity. To effectively tackle these challenges, employing a 

combination of methodologies is crucial. High-throughput approaches can provide broad insights into 

the general principles governing epigenetic influence on replication. However, these methods must be 

complemented with in vivo studies at the single-cell level to capture the nuanced and dynamic nature 

of epigenetic regulation in a living organism. Only through such integrative and multi-scale 

approaches can the role of epigenetics be fully elucidated in the regulation of DNA replication origin 

firing in complex mammalian systems.  

Genome replication program in mammalian cells: The eukaryotic genome is duplicated in a tightly 

regulated replication program. The number of origins, the firing time of these origins, and their fork 

speed are the prominent aspects determining the replication program. Earlier studies on genome 

replication in eukaryote models (like S. pombe or S. cerevisiae) revealed that well-defined replication origins 

fire stochastically yet follow a generally conserved replication timing 72,73. Higher eukaryotes, 

despite having vaguely defined origins, maintain a conserved replication timing, suggesting that 

ordered origin firing, rather than the origin itself, plays a significant role in faithful chromatin duplication 82,83. 

Replication timing was initially visualized in mammalian cells as distinct spatial patterns of replication foci (RFi), 

emerging progressively as the cell advanced through the S phase. Dissection of these RFi by 

super-resolution microscopy reveals that each RFi corresponds to a cluster of replicons (Figure 4D). 

Further insights from fluorescence in situ hybridization (FISH) or hybridization of target sequence on 

blots having BrdU immunoprecipitated DNA demonstrated that specific chromosome regions are 

replicated at defined times within the S phase rather than randomly 8,84. With the human/mouse 

genome project and advances in microarray technology, a broad view of the conserved replication 

timing was revealed 11,28,85–87. These studies showed that gene-rich chromosomes (like human 

chromosomes 22, 19, and 17) replicate earlier in the S phase, while others (for example, human 

chromosomes 18, 21, and Y) replicate later. Furthermore, while the GC and Alu-rich regions were 

replicated earlier, the LINE-rich regions were replicated later. With the advance of next-generation 



sequencing, a comprehensive view of the replication timing was revealed by sequencing the nascent 

DNA 10. It showed that the genome duplicates in megabase-sized replication domains, each a stretch 

of DNA that shares the same replication timing 76. In addition, these studies characterized 

constant and plastic replication domains, where the replication timing of the constant domains 

remains the same for different cell types, and the plastic domains change with developmental stage 

or cell type. However, these studies, being population-based, did not reveal the cell-to-cell variation or 

stability in replication timing until advances in single-cell genome amplification methods and copy 

number analysis paved the way to investigate these replication domains in individual cells 88. By 

collecting cells in the middle of the S phase based on DNA content and measuring the copy 

number gain due to genome duplication, the replication timing was revealed 89,90. These studies 

revealed the genome-wide stability of these replication domains among cells, yet a certain degree of 

stochastic variation from cell to cell exists, especially in the late S phase (Figure 4C).  
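
The copy-number logic behind single-cell replication timing can be sketched as follows. The example assumes diploid cells in which an unreplicated bin has a copy number near 2 and a replicated bin near 4, with a threshold of 3 separating the two; the threshold, the function names, and the small data matrix are all invented for illustration and do not reproduce the published pipelines.

```python
def binarize(copy_number: list[float], threshold: float = 3.0) -> list[int]:
    """Call a genomic bin replicated (1) or unreplicated (0) in one cell.
    ASSUMPTION: diploid cell, ~2 copies before and ~4 copies after replication."""
    return [1 if cn >= threshold else 0 for cn in copy_number]


def replicated_fraction(cells: list[list[int]]) -> list[float]:
    """Per bin, the fraction of mid-S cells in which the bin is already replicated.
    Values near 1 suggest early replication, values near 0 late replication."""
    n_cells = len(cells)
    return [sum(column) / n_cells for column in zip(*cells)]


# Hypothetical copy-number profiles: 4 mid-S cells x 5 genomic bins.
copy_numbers = [
    [4.1, 3.9, 2.2, 2.0, 1.9],
    [3.8, 4.0, 3.7, 2.1, 2.0],
    [4.0, 2.1, 2.0, 2.0, 2.1],
    [4.2, 4.1, 3.9, 3.6, 2.0],
]
states = [binarize(cell) for cell in copy_numbers]
fractions = replicated_fraction(states)
print("replicated fraction per bin:", fractions)
```

Bins with intermediate fractions are those in which cells disagree, which is one simple way to express the cell-to-cell variation mentioned above.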

Parallel to our understanding of replication timing, advancements in characterizing the 

replication origins in mammalian cells also happened using microarrays and next-generation 

sequencing. In Saccharomyces cerevisiae, origin selection is guided by the binding of origin 

recognition complex (ORC) to well-defined DNA sequences near ARS elements 91. However, 

metazoan replication origins do not conform to a conserved consensus sequence, and the ORC from 

higher eukaryotes exhibits no sequence specificity in vitro 92. The development of tools to map these 

origins, such as short nascent strand (SNS) isolation in mammalian cells, supported these findings 93. 

The sequencing of Okazaki fragments (OK-seq) revealed initiation zones located primarily 

within non-transcribed, broad (up to 150 kb) zones that often abut transcribed genes, as well as the 

directionality of these origins and the termination sites 94. Further investigations related origin positioning 

to the transcription start site and the secondary structure of the DNA without revealing shared 

features among the origins 95,96. Isolating a sufficient amount of Okazaki fragments or SNS has 

been challenging, making it difficult to infer origin features at the single-molecule scale, as seen in 

combed fibers, and their cell-to-cell variation 97. Recent advancements in next-generation long-read 

sequencing (e.g., Nanopore and PacBio) and detection of the incorporated nucleotide analog (BrdU) can 

reveal single-molecule information and more features of these origins, especially in highly 

repetitive genomic regions 98,99. Nonetheless, the origin of mammalian replication is still an active and 

challenging research topic, and a consensus has yet to be reached about its features.  

  




Figure 4: Genome replication at different resolutions. (A) The scheme shows a classic pulse-pulse experiment 

on stretched DNA fibers: cells treated with IdU followed by CldU incorporate these nucleoside 

analogs into the nascent DNA; the cells are lysed, the DNA fibers are stretched, and the analogs are detected, revealing the 

replication origins and inter-origin distances. The lower scheme shows the origin progression in a 

semiconservative way after DNA melting in both directions. The leading strand is continuously replicated, 

whereas the lagging strand progresses discontinuously, producing shorter Okazaki fragments. (B) Schematic 

representation of the experimental workflow combining fixed-cell and live-cell microscopy to investigate the 

spatiotemporal progression of genome replication at different time resolutions. Fixed-cell microscopy employs 

short nucleoside analog pulses followed by long chases to resolve spatial patterns within sub-stages of the S 

phase. These patterns are visualized by detecting nucleoside analog incorporation and replication machinery 

components (e.g., PCNA). Live-cell microscopy tracks the dynamics of the replisome machinery (e.g., 

GFP-PCNA), with image registration applied to distinguish pre-existing and newly activated replication foci 

(RFi). This analysis reveals a sequential, Domino-like DNA replication progression, where stochastic or existing 

RFi initiate the firing of nearby RFi. (C) Replication timing profiling by repli-seq methods shows genome-wide 

stability at the population level and cell-to-cell stochastic variation in early and late replicating chromatin 

domains. The repli-seq data plotted were accessed from GEO: GSE108556 90. (D) Correlative microscopy of 

wide-field and superresolution followed by (nano)RFi segmentation reveals multiple replicons corresponding to 

one RFi or stable unit of the replication domain. 

 
 Characterization of the replication timing and origin revealed that while the replication timing 

remains globally conserved, the replication origins are plastic. This suggested that the temporal order 

of firing the initiation sites, rather than the sites themselves, maintains the replication program, yet its 

mechanism and significance were unknown. Overall, the broader understanding of the replication 

program narrowed down to the understanding of the principles behind the progression of the 

chromatin replication. Furthermore, most of these studies did not reveal the information associated 

with the large repetitive elements (due to the inherent limitation of mapping repetitive elements using 

NGS), covering only a small fraction of the genome. The microscopic dissection of the spatiotemporal 

replication program in vivo has bridged many such gaps in our understanding. This approach already 

showed the conserved spatiotemporal propagation of replication foci/RFi in S phase stages 9,100,101. 

Live cell imaging of the GFP-tagged replication protein PCNA showed the dynamic nature of the 

replication foci that assemble and disassemble (rather than moving, merging, or dividing), creating 

differential spatial patterns in respective S phase stages 102. High time-resolution microscopy 

coupled with fluorescence recovery after photobleaching (FRAP) revealed the stable association of the 

replisome machinery with the RFi and showed that replication progresses in a Domino-like, next-in-line fashion 103. 

Quantitative analysis of human genome replication combining DNA combing, which reveals local 

origin firing and replication fork progression on single DNA molecules, with  

29 


Figure 5: The Domino model of genome replication progression. The illustration shows the stochastically fired or 

existing cluster of replicons arranged as replication focus or a megabase-sized replication domain induces firing 

of the nearby origin. Once fired, the origin also blocks the licensed origin present within the loop distance (~ 55 

kb).  
massive sequencing of newly replicated DNA to generate population-averaged replication timing 

profiles showed that origins are activated synchronously in regions of shared replication timing, but 

gradually in temporal transition zones, and that the rate of origin firing increases as replication 

progresses 104. However, origin interference occurs when the distance between the two origins is low, 

usually less than 100 kb, and an average of ~ 40 kb 105. Based on the existing observations, the 

Domino model of replication progression was proposed, where stochastic activation of the first origin 

clusters leads to a chain reaction of sequential activation of later origin clusters depending on the 

relative spatial distribution in the genome within the nucleus. Yet the origin interference kicks in within 

a short distance, usually when the next origin is present in the same chromatin loop 106. The model 

could capture the spatiotemporal replication progression and replication timing as observed in 

microscopic and replication timing analysis (Figure 5). Targeting the late replicating major satellite 

repeats to the nuclear lamin in mouse myoblasts induced the early replication of these repeats, where 

the licensed origins in both lamin-associated facultative heterochromatin and targeted constitutive 

heterochromatin replicated concomitantly, validating the model 107.  
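The chain-reaction idea behind the Domino model can be illustrated with a toy simulation. This is only a sketch under strong simplifying assumptions that are not part of the cited work: origin clusters sit on a one-dimensional line, a fired cluster induces its direct neighbours in the next time step, and origin interference is not modeled. The function name `simulate_domino` and its parameters are hypothetical.

```python
import random


def simulate_domino(n_clusters, seeds, p_spontaneous=0.0, rng=None):
    """Return the time step at which each origin cluster fires.

    n_clusters    -- number of clusters on a 1D toy chromatin fibre (assumption)
    seeds         -- indices of clusters that fire stochastically at step 0
    p_spontaneous -- per-step chance that an unfired cluster fires on its own
    """
    if not seeds and p_spontaneous <= 0:
        raise ValueError("need at least one seed or a spontaneous firing rate")
    rng = rng or random.Random(0)
    fired_at = [None] * n_clusters
    for i in seeds:
        fired_at[i] = 0
    step = 0
    while any(t is None for t in fired_at):
        step += 1
        newly_fired = []
        for i, t in enumerate(fired_at):
            if t is not None:
                continue
            neighbours = [j for j in (i - 1, i + 1) if 0 <= j < n_clusters]
            # Domino effect: a neighbour that fired in an earlier step induces firing.
            induced = any(fired_at[j] is not None for j in neighbours)
            if induced or rng.random() < p_spontaneous:
                newly_fired.append(i)
        # Update synchronously so firing spreads one cluster per step.
        for i in newly_fired:
            fired_at[i] = step
    return fired_at


print(simulate_domino(7, seeds=[3]))  # [3, 2, 1, 0, 1, 2, 3]
```

With a single seed, the firing time of each cluster equals its distance from the seed, which is the one-dimensional analogue of the next-in-line progression described above.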

1.4 Effect of chromatin and developmental state on replication program: It was observed early on that different types of heterochromatin synthesize their DNA at different times than euchromatin 108. With the advances in multicolor fluorescence microscopy and digital image analysis,

a strong correlation between the chromatin nature and replication timing was observed 23,109,110. While 

the correlation suggested the role of the chromatin in regulating the replication timing, the underlying 

mechanism was unclear. Furthermore, earlier studies using repli-seq reported that high gene density, Alu content, and GC content correlate with early replication, whereas LINE1 or (peri)centromeric repeat-rich regions correlate with late replication 85. Early-replicating regions are often associated with open

chromatin states, enriched in histone modifications such as H3K9ac and H3K27ac, indicative of 

transcriptionally active euchromatin. In contrast, late-replicating regions are linked to heterochromatin, 

marked by modifications like H3K27me3 and H3K9me3, reflecting repressive chromatin environments 111. The development of chromatin conformation capture techniques, such as Hi-C, enabled the

identification of topologically associating domains (TADs) 112. TADs are genomic regions where DNA 

sequences preferentially interact within the same domain, organizing chromatin into functional units 

that regulate gene expression and replication timing. These megabase-sized domains are broadly 

categorized into A and B compartments. The A compartment is enriched in active chromatin marks 

and gene-dense regions, while the B compartment contains repressive chromatin and is gene-poor, 

reflecting their roles in chromatin accessibility and functional organization 113. Correlation analysis of 

the replication timing with chromatin architecture has revealed a striking correspondence between early-replicating domains and the A compartment (active, euchromatic regions) and between late-replicating domains and the B compartment (repressive, heterochromatic regions). Hi-C data show that A and B

compartments are formed by segregating TADs into mutually exclusive spatial regions, paralleling the 

spatial patterns of early and late replication foci observed through microscopy 114. 

Diffraction-unrestricted super-resolution nanoscopy has successfully captured various chromatin domains 115. Capturing nascent chromatin and mapping it to these chromatin domains revealed that the temporal order of replication follows the hierarchical organization of chromosome territories, even though heterochromatin undergoes local decompaction during replication 116,117.

The interplay between genetic and epigenetic factors significantly influences the regulation of replication timing, with accumulating evidence pointing to the prominent role of (epi)genetic mechanisms. Multiple observations from manipulating histone acetylation levels, including in mouse fibroblasts, where Trichostatin A (TSA) treatment advanced the replication timing of the typically late-replicating pericentromeric heterochromatin, suggested that epigenetic factors (especially histone acetylation) play a more critical role than genetic ones in fine-tuning replication timing 110,118. This was also consistent with earlier observations in eukaryotes 119,120. Nonetheless, the influence between epigenetics and replication timing appears to go both ways. Apart from epigenetic marks, several chromatin regulators also influence the replication program. One such factor is the replication timing regulatory factor 1 (Rif1), which binds to chromatin during early G1 and acts as a negative regulator of origin firing in late-replicating chromatin by recruiting protein phosphatase 1 (PP1), which dephosphorylates key initiation factors such as the MCM complex 121. Knockout of Rif1 therefore perturbs replication timing 122. Notably, this aberration in the replication program also leads to inefficient epigenetic inheritance, a process key to maintaining genome stability and cellular identity 123. Advancing the replication timing of pericentromeric heterochromatin by repositioning it to the nuclear lamina (as a Domino effect) also disrupted faithful epigenetic inheritance, mainly affecting the heterochromatin constituents 107.

The replication program is closely tied to developmental progression, adapting to shifts in 

chromatin architecture, replication origin dynamics, and fork progression. Early studies of DNA replication in Drosophila and Xenopus embryos revealed rapid cell cycles driven by unique characteristics 124,125. These early divisions lack gap phases (G1 and G2), consisting solely of

alternating S-phases and mitosis. This abbreviated cycle relies on the activation of a high density of 

replication origins, reducing inter-origin distances and decreasing replicon sizes, which enable 

remarkably short S-phases essential for rapid embryonic development 126,127. In contrast, early 

mammalian development exhibits slower division rates. For example, while a Xenopus egg can 

produce up to 20,000 cells within hours, a mammalian zygote undergoes only a single division in the 

same timeframe 128. Despite these differences, early mammalian cells share some features with 

Xenopus and Drosophila, such as shortened gap phases and a nearly transcriptionally inactive state 

during the first zygotic cleavages. The absence or extreme reduction of gap phases in early 

embryonic cycles is a conserved feature across species, emphasizing the reliance on maternal stores 

to drive DNA replication and cell division before zygotic genome activation 129. This streamlined 

replication program is an adaptation tailored to meet the demands of early embryogenesis. 

Nonetheless, mouse embryos already show spatial replication patterns at the 2-cell stage, and a well-defined replication timing emerges at the 4-cell stage; zygotic genome activation and replication timing arise and evolve in parallel with chromatin organization 36,128,130. In later stages of development, replication timing is well established, and short gap phases emerge 131. Both mouse and human embryonic stem cells exhibit replication programs that differ from those of differentiated cells 76,77,132. In mESC, in addition to developmentally regulated genomic loci, the usually late-replicating constitutive heterochromatin (major satellite repeats) shifts its replication timing earlier, to mid-S phase, before the nuclear lamina-associated chromatin replicates in late S phase. Yet upon targeting HDAC1 to the constitutive heterochromatin, its replication is delayed 77.

 Furthermore, apart from replication timing, the number of origins and the fork speed play crucial roles in completing replication during early embryogenesis. Xenopus and Drosophila embryos show higher origin activation and fork speed to achieve a higher rate of cell division, and both the number of origins and the fork rate decrease upon transition to later developmental stages 133. Mammalian zygotes and embryonic stem cells activate comparatively more replication origins than differentiated cells, reflecting the demands of rapid divisions, yet exhibit a slower fork speed 77,134,135.

 Chromatin nature, genome organization, and developmental transitions are critical in shaping 

the replication program and are intricately intertwined. Early developmental cells differ from 

differentiated or somatic cells in many aspects despite sharing the same genome. While epigenetics 

is the prominent player in shaping these trajectories, the chromatin duplication program must adapt 

plastically to this rewiring. It is key to understand these processes in detail, especially for the less studied repetitive elements, which are emerging as prominent players in genome organization and its regulation.

 
The questions 
Our understanding of mammalian genome duplication and its underlying mechanisms has advanced significantly, yet many questions remain unanswered. While recent approaches have revealed the replication program of various cell types, even in single cells during their developmental transitions, little is known about the replication program of the repetitive elements, which constitute more than 50% of the human genome. While the role of histone acetylation in regulating the replication program and the mechanisms of epigenetic inheritance have been addressed comprehensively, it remains unknown whether other epigenetic modifications, such as histone methylation or non-coding RNA, influence replication timing. Furthermore, while the Domino model explains how replication progresses, it remains unclear which genomic regions are replicated earliest in S phase (that is, which Domino pieces fall first). Hence, the questions this thesis aimed to answer are:

 
- What are the spatiotemporal changes in the replication program in developmentally different 

human cells? 

- How do the cells use various epigenetic tools to regulate chromatin duplication? 

- Where are the earliest origins located, and what are their features? 

 

2. Methods 

2.1 Cell culture and transfection 
Table 3 describes the details of the cell lines used. All cells were grown in a humidified atmosphere of 

5% CO2 at 37 °C. Cells were grown in Dulbecco’s modified Eagle medium (DMEM) (Cat.No.: D6429, 

Sigma-Aldrich Chemie GmbH, Steinheim, Germany) supplemented with varied percentages of fetal 

calf serum (Cat.No.: FBS 11A, Capricorn Scientific GmbH, Hessen, Germany), 1x glutamine (Cat.No.: 

G7513, Sigma-Aldrich, St Louis, MO, USA), 1 µM gentamicin (Cat.No.: G1397, Sigma-Aldrich, St 

Louis, MO, USA), and 0.01 mg/ml hygromycin B (Cat.No.: 843555, Roche, Basel, Switzerland). HeLa 

Kyoto, hTERT RPE1, and BJ-5ta were grown in 10% FCS. HeLa Kyoto GFP mPCNA and HeLa 

Kyoto mCherry PCNA cells were grown in media supplemented with 600 µg/ml G418 (Cat.No.: CP11.3, Carl Roth, Karlsruhe, Germany) and 2.5 µg/ml Blasticidin (Cat.No.: anti-bl-1, InvivoGen, Toulouse, France), respectively. To grow the hiPSC A4 and hiPSC B4, surfaces were first

coated with vitronectin (Cat.No.: A14700, ThermoFisher Scientific, MA, USA) for one hour. The hiPSC 

A4 and hiPSC B4 were grown in iPSC Brew Basal medium (Cat.No.: 130-107-086, Miltenyi Biotec) 

supplemented with iPSC-Brew 50x (Cat.No.: 130-107-087, Miltenyi Biotec). The hESC H1 was grown 

in mTeSR™ (Cat.No.: 85850, STEMCELL Technologies, CA) on Matrigel (Cat.No.: 354277, Corning, USA)-coated plates. All hESC and hiPS cells were grown till they started forming colonies before

performing experiments.  

  To study human replication dynamics using live cell time-lapse microscopy, hTERT RPE1 and 

hiPSC A4 cells were transfected with the plasmid pENeGFPCNAL2mut (pc0653, 

https://www.addgene.org/167564/) 102. The hTERT RPE1 cells were transfected with the AMAXA 

Nucleofector II system (Lonza, Cologne, Germany) using a self-made buffer (5 mM KCl, 15 mM 

MgCl2, 120 mM Na2HPO4/NaH2PO4 pH 7.2, 50 mM Mannitol) with the program A024 and seeded 

on a polymer coverslip bottom µ-slide 8 well plate (Cat.No.: 80826, Ibidi, WI, USA). The hiPSC A4 cells were first seeded on a vitronectin-coated polymer coverslip bottom µ-slide 8 well plate, grown until they formed colonies, and then transfected with Lipofectamine™ Stem Transfection Reagent (Cat.No.: L300015, ThermoFisher Scientific, MA, USA) using the manufacturer’s recommended protocol.

The mESC J1 and mESC J1 msTALE GFP cells were grown on gelatin-coated culture dishes (0.2% gelatin; Cat. No.:Sigma-Aldrich Chemie GmbH, Steinheim, Germany) in Dulbecco’s modified Eagle’s medium (DMEM) high glucose (Cat. No.: D6429, Sigma-Aldrich Chemie GmbH, Steinheim, Germany) supplemented with 15% fetal calf serum (FCS), 1× non-essential amino acids (Cat. No.: M7145, Sigma-Aldrich Chemie GmbH, Steinheim, Germany), 1× penicillin/streptomycin (Pen/Strep) (Cat. No.: P4333, Sigma-Aldrich Chemie GmbH, Steinheim, Germany), 1× L-glutamine (Cat. No.: G7513, Sigma-Aldrich Chemie GmbH, Steinheim, Germany), 0.1 mM beta-mercaptoethanol (Cat. No.: 4227, Carl Roth, Karlsruhe, Germany), 1000 U/ml recombinant mouse LIF (Millipore), and 2i (1 µM PD0325901 and 3 µM CHIR99021 (Cat. Nos.: 1408 and 1386, respectively, Axon Medchem, Netherlands)). The culture medium was changed every day, and cells were split every 2 days. To target the chromocenters to the lamina, mESC J1 msTALE GFP cells were transfected with GBP-Lamin B1 (pc1467), an expression vector encoding the sequence of the GFP-binding VHH domain fused to the human Lamin B1 coding sequence 145. As a control for the targeting assay, the GFP-binding VHH domain was removed to establish an expression vector with human Lamin B1 alone (pc2809) 107.

Table 3: List of all the model cell lines used and their properties

| Name | Species | Type | Ploidy | Gender | Reference |
|---|---|---|---|---|---|
| hESC H1 | Homo sapiens | Embryonic | Diploid | Male | 136 |
| hiPSC A4 | Homo sapiens | iPSC from human neonatal foreskin fibroblast (HFF1) | Diploid | Male | 137 |
| hiPSC B4 | Homo sapiens | iPSC from human neonatal foreskin fibroblast (HFF1) | Diploid | Male | 137 |
| hTERT RPE1 | Homo sapiens | hTERT immortalized retinal pigment epithelial cell | Diploid | Female | 138 |
| BJ-5ta | Homo sapiens | hTERT immortalized foreskin fibroblasts | Diploid | Male | 138 |
| HeLa Kyoto | Homo sapiens | Cervical cancer cell derivative | Quasi tetraploid | Female | 139 |
| HeLa Kyoto GFP mPCNA | Homo sapiens | HeLa Kyoto cell line stably expressing GFP PCNA | Quasi tetraploid | Female | 140 |
| HeLa Kyoto mCherry PCNA | Homo sapiens | HeLa Kyoto cell line stably expressing mCherry PCNA | Quasi tetraploid | Female | 140 |
| C2C12 | Mus musculus | Immortalized mouse myoblast | Quasi tetraploid | Female | 141 |
| C2C12 mRFP PCNA | Mus musculus | C2C12 stably expressing PCNA tagged with mRFP | Quasi tetraploid | Female | 107 |
| MEF w8 | Mus musculus | Mouse embryonic fibroblasts | Diploid | Male | 142 |
| mESC J1 | Mus musculus | Embryonic stem cells isolated from the inner cell mass of a mouse | Diploid | Male | 143 |
| mESC J1 msTALE | Mus musculus | mESC J1 stably expressing a dTALE protein targeting the major satellite (msTALE) DNA fused with GFP | Diploid | Male | 144 |

To capture the dynamics of the earliest origins, mitotic cells were collected by shake-off. Cells were cultured for at least 24 hrs before starting the experiment. Multiple 100 mm plates of each cell line were washed with PBS 1×, and 6 ml of media was added to each plate before shaking for 2 minutes on a shaker. The media was pooled from all plates and centrifuged for 5 minutes at 1500 rpm. The excess medium was discarded, and mitotic cells were resuspended in the remaining media before being seeded. For live cell imaging, mitotic cells of HeLa Kyoto GFP PCNA and C2C12 mRFP PCNA were collected and seeded on a high-precision glass bottom plate (self-made).

2.2 Doubling time and (sub)S phase duration:  
For doubling time/cell cycle length quantification, two time points falling within the logarithmic phase 

of cell proliferation (cell confluency between 30 and 70%) were used. First, 1 × 10^5 hTERT RPE1, hiPSC A4, hiPSC B4, and hESC H1 cells were seeded as technical triplicates. Counting started once the cells had become adherent and started forming colonies. Cells were trypsinized and resuspended in 1X PBS. Cell numbers were counted with a Neubauer hemocytometer at multiple time points within a 24-hour interval. The doubling time ($dt$) of the cell culture was then calculated as

$$dt = \frac{\log 2 \times \Delta t}{\log N_2 - \log N_1},$$

where $N_1$ and $N_2$ are the numbers of cells counted at time points 1 and 2, respectively, and $\Delta t$ is the duration between these two time points.
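The doubling-time formula above can be written as a short function. This is a minimal sketch; the function name and the example cell counts are illustrative assumptions, not values from this thesis.

```python
import math


def doubling_time(n1, n2, delta_t):
    """Doubling time from two cell counts taken delta_t apart.

    Implements dt = (log 2 * delta_t) / (log N2 - log N1).
    The result has the same time unit as delta_t.
    """
    if n1 <= 0 or n2 <= n1:
        raise ValueError("counts must be positive and N2 must exceed N1")
    return math.log(2) * delta_t / (math.log(n2) - math.log(n1))


# Hypothetical example: 1e5 cells grow to 4e5 cells in 48 h (two doublings).
print(f"{doubling_time(1e5, 4e5, 48):.1f} h")  # 24.0 h
```

Because the formula uses a ratio of logarithms, the base of the logarithm does not matter as long as it is the same in numerator and denominator.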

To determine the percentage of cells in the S phase, asynchronously growing cell populations 

were pulse-labeled with 10 µM of the nucleoside analog 5-ethynyl-2’-deoxyuridine (EdU) (Cat.No.: 7845.1, ClickIT-EdU cell proliferation assay, Carl Roth, Karlsruhe, Germany) for 15 min and fixed, and EdU was detected together with 4’,6-diamidino-2-phenylindole (DAPI) as described below. High-throughput images were acquired and analyzed as described below. Based on EdU and DAPI intensity, the cell cycle profile was plotted, and the fraction of cells in each cell cycle phase was determined. To determine the duration of the S phase, the fraction of cells in the S phase was multiplied by the cell cycle duration. To determine the percentage of cells in each sub-S phase stage, the cells in each stage were counted manually by scoring the EdU spatial patterns in images acquired using high-throughput microscopy with a 40x objective. The fraction of cells in each S phase stage was multiplied by the doubling time to calculate the duration of that stage.
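The conversion from cell fractions to phase durations described above amounts to multiplying each fraction by the cell cycle length. The sketch below assumes an asynchronous population in which the fraction of cells in a phase is proportional to the time spent in it; the function name and all counts are hypothetical.

```python
def phase_durations(counts, cycle_length_h):
    """Convert per-phase cell counts into phase durations in hours.

    counts         -- mapping of phase name to number of scored cells
    cycle_length_h -- doubling time / cell cycle length in hours
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no cells were scored")
    return {phase: cycle_length_h * n / total for phase, n in counts.items()}


# Hypothetical scoring of 1000 cells with a 20 h cell cycle.
counts = {"G1": 500, "early S": 150, "mid S": 150, "late S": 100, "G2/M": 100}
for phase, hours in phase_durations(counts, 20.0).items():
    print(f"{phase:8s} {hours:.1f} h")
```

Summing the three S phase entries gives the total S phase duration, so the same helper serves both the whole-S and the sub-S calculation.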

 

2.3 Genome replication labeling, visualization, and immunostaining 
A list of all the nucleotide/nucleoside analogs used is described in Table 4. 

Pulse labeling: The cells were seeded on sterilized coverslips with respective media for the 

replication labeling and visualization experiments. The cells were pulse-labeled with 10 µM of EdU for 

15 min before washing with PBS 1× and fixing with 3.7% formaldehyde in PBS 1× for 10 min.  

Table 4: The list of nucleotide and nucleoside analogs

| Name | Application | Detection | Catalog | Company |
|---|---|---|---|---|
| 5-ethynyl-2′-deoxyuridine (EdU) | Labeling of nascent DNA in pulse-chase experiments | ClickIT chemistry | 7845.1 | Carl Roth, Karlsruhe, Germany |
| 5-TAMRA-Azide | Detection of EdU | - | CLK-FA008-1 | Jena Bioscience, Jena, Germany |
| 5-bromo-2′-deoxyuridine (BrdU) | Labeling of nascent DNA in pulse-chase experiments | Antibody detection | B5002 | Sigma-Aldrich Chemie GmbH, Taufkirchen, Germany |
| Biotin-16-dUTP | Labeling of FISH probes | Streptavidin | 11093070910 | Roche Diagnostics Deutschland GmbH, Mannheim, Germany |
| Cy3-dUTP | Labeling of FISH probes | - | ENZ-42501 | Enzo Life Sciences, Lörrach, Germany |
| Thymidine | Labeling of nascent DNA in pulse-chase experiments (added only in the chase period) | - | T1895 | Sigma-Aldrich Chemie GmbH, Taufkirchen, Germany |
 
Pulse-chase–pulse-chase: Cells were seeded on sterilized coverslips with respective media. First, 

cells were incubated with 10 µM of EdU for 15 min (first pulse). The cells were washed twice with 

respective warm media supplemented with 50 µM of thymidine (Cat.No.: T1895, Sigma-Aldrich 

Chemie GmbH, Taufkirchen, Germany) to stop the incorporation of EdU before incubating with fresh 

media for another three hours. Cells were then incubated with 10 µM of 5-bromo-2′-deoxyuridine 

(BrdU) (Cat.No.: B5002, Sigma-Aldrich Chemie GmbH, Taufkirchen, Germany) for 15 min (second 

pulse). The cells were washed twice with warm media supplemented with 50 µM of thymidine and 

incubated in fresh media for another three hours. The cells were washed with PBS 1× before fixing 

with 3.7% formaldehyde in PBS 1× at room temperature for 10 min. After fixation, the cells were 

washed thrice with PBS 1×.  

For a simpler pulse-chase, cells were fixed as above after the first pulse and 2.5/3 hrs of 

chase instead of adding BrdU.  
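The timing of the double pulse-chase labeling described above can be tallied in a few lines. This is only an illustrative sketch: the step list is taken from the text, wash steps are left out because no durations are given for them, and the names `SCHEDULE` and `timeline` are hypothetical.

```python
# Steps of the double pulse-chase protocol as (label, duration in minutes).
# Wash steps are omitted because the text gives no durations for them.
SCHEDULE = [
    ("EdU pulse (10 µM)", 15),
    ("chase in fresh media", 180),
    ("BrdU pulse (10 µM)", 15),
    ("chase in fresh media", 180),
]


def timeline(schedule):
    """Return (label, start_min, end_min) for each step, in order."""
    rows, clock = [], 0
    for label, duration in schedule:
        rows.append((label, clock, clock + duration))
        clock += duration
    return rows


for label, start, end in timeline(SCHEDULE):
    print(f"{start:>4}-{end:<4} min  {label}")
```

In this schedule the BrdU pulse starts 195 min after the start of the EdU pulse, and fixation follows at 390 min, not counting the washes.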

Pulse-chase/pulse-pulse for earliest origins: Multiple 100 mm plates where cells were grown for at 

least 24 hrs and around 60% confluent were used for mitotic shake-off. The old media was removed, 

and the cells were first washed with warm 1xPBS before adding 6 ml of media. The plates were 

shaken on a shaker for 2 min. The media were pooled from all the plates and centrifuged for 5 min at 

1500 rpm. The extra medium was discarded, and the mitotic cells were resuspended in the rest of the 

media before being seeded for 8-9 hrs. The pulse was performed by adding EdU (10 µM) for 10 min. 

Meanwhile, media supplemented with 100 µM thymidine was prepared and placed in a warm bath. 

After the first pulse, the media containing EdU was removed, and the cells were washed thrice with media supplemented with 100 µM thymidine. The cells were further incubated for 12 min with media

supplemented with 100 µM thymidine. Care was taken to perform the washing as fast as possible. 

After the chase of 12 min, the cells were washed once with ice-cold PBS 1× before fixing with 3.7% formaldehyde in PBS 1× at room temperature for 10 min. For a pulse-pulse, the thymidine was replaced with BrdU.

To capture single cells with the earliest origins for further sequencing, mitotic cells were 

collected and seeded for 8-9 hrs in the respective media. The pulse chase was performed as above. 

Immediately after the chase period of 12 min, the cells were washed with prewarmed PBS/EDTA solution, and 2 ml of trypsin was added to the plate and incubated at 37°C for 2 min. Then, 2 ml of warm media was added to the plate to deactivate the trypsin, and the cells were resuspended to obtain a single-cell suspension. The 4 ml of suspension was transferred to a 50 ml tube, and 40 ml of ice-cold PBS 1× was added and vortexed. The cells were centrifuged for 3 min at 1500 rpm at 4°C. The cells were first resuspended in 19 ml of PBS 1×, and 1 ml of 37% formaldehyde was added immediately. The tube was transferred to a rotor, and the cells were fixed at room temperature for 10 min. The tube was centrifuged, and the cells were resuspended in 40 ml of PBS 1×. The cells were pelleted, resuspended in 4% BSA/PBS 1×, and stored at -20°C.
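The fixation step above mixes 19 ml of PBS with 1 ml of 37% formaldehyde. A minimal sketch of the underlying C1·V1 = C2·V2 arithmetic follows; the function name is illustrative and not part of the original protocol, and the calculation ignores the small volume of the cell pellet.

```python
def final_concentration(stock_conc, stock_vol, diluent_vol):
    """Concentration after mixing stock_vol of stock with diluent_vol of diluent.

    Follows C1 * V1 = C2 * V2 with V2 = stock_vol + diluent_vol.
    Volumes must share one unit; the result has the unit of stock_conc.
    """
    total_vol = stock_vol + diluent_vol
    if total_vol <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_vol / total_vol


# 1 ml of 37% formaldehyde added to 19 ml of PBS 1x, as in the step above.
print(f"{final_concentration(37.0, 1.0, 19.0):.2f} %")  # 1.85 %
```

The same helper applies to any other stock dilution in the protocols, as long as the solute is absent from the diluent.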

Immunofluorescence staining: Unless otherwise mentioned, all the immunostaining was performed 

inside a dark, humidified chamber at room temperature. Table 5 lists all the antibodies used. 

 

Table 5: List of the antibodies used for immunostaining

| Reactivity | Host | Clonality | Dilution | Catalog | Company |
|---|---|---|---|---|---|
| Anti-PCNA | Mouse | Monoclonal | 1:200 | ab29 | Abcam, Cambridge, UK |
| Anti-RPA 194 | Mouse | Monoclonal | 1:200 | sc-48385 | Santa Cruz Biotechnology, Dallas, TX, USA |
| Anti-BrdU | Rabbit | Polyclonal | 1:400 | 600-401-C29 | Rockland Immunochemicals, Pottstown, PA, USA |
| Anti-H3K9me3 | Mouse | Monoclonal | 1:200 | 39285 | Active Motif, Waterloo, Belgium |
| Anti-H3K36me3 | Rabbit | Polyclonal | 1:2000 | ab9050 | Abcam, Cambridge, UK |
| Anti-H3K27me3 | Mouse | Monoclonal | 1:200 | 61017 | Thermo Fisher Scientific, Waltham, MA, USA |
| Anti-H3K9ac | Rabbit | Polyclonal | 1:200 | 39917 | Active Motif, Waterloo, Belgium |
| Anti-H3K4me3 | Rabbit | Polyclonal | 1:200 | 39159 | Active Motif, Waterloo, Belgium |
| Anti-H4K20me3 | Rabbit | Polyclonal | 1:500 | ab9053 | Abcam, Cambridge, UK |
| Anti-H3K9me2 | Mouse | Monoclonal | 1:500 | 39683 | Active Motif, Waterloo, Belgium |
| Anti-H4K5ac | Rabbit | Polyclonal | 1:500 | ab51997 | Abcam, Cambridge, UK |
| Anti-H4K8ac | Rabbit | Polyclonal | 1:200 | ab15823 | Abcam, Cambridge, UK |
| Anti-H4K12ac | Mouse | Monoclonal | 1:200 | 61527 | Active Motif, Waterloo, Belgium |
| Anti-H4K16ac | Rabbit | Polyclonal | 1:200 | 39168 | Active Motif, Waterloo, Belgium |
| Anti-H3K56ac | Rabbit | Monoclonal | 1:250 | 2134-1 | Epitomics, Inc., Burlingame, California (Now Abcam) |
| Anti-NSD1 Antibody, clone 1NW-1A10 | Mouse | Ascites | 1:250 | 04-1565 | Sigma-Aldrich Chemie GmbH, Steinheim, Germany |
| Anti-SETD2 | Rabbit | Monoclonal | 1:250 | E4W8Q | Cell Signaling Technology, Inc., MA, USA |
| Anti-RNA polymerase II RPB1 phospho S5 | Mouse | Monoclonal | 1:200 | ab5408 | Abcam, Cambridge, UK |
| Anti-RNA polymerase II RPB1 phospho S2 | Rat | Monoclonal | 1:500 | 61084 | Active Motif, Waterloo, Belgium |
| Anti-mouse IgG Alexa Fluor 488 | Goat | Polyclonal | 1:500 | A11029 | Thermo Fisher Scientific, Waltham, MA, USA |
| Anti-mouse IgG Cy5 | Donkey | Polyclonal | 1:200 | JIM-715-175-150 | Jackson ImmunoResearch Europe Ltd., Cambridge, UK |
| Anti-rabbit IgG Alexa Fluor 488 | Goat | Polyclonal | 1:500 | A-11034 | Thermo Fisher Scientific, Waltham, MA, USA |
| Streptavidin Alexa Fluor 488 | Conjugated | - | 1:500 | S11223 | Thermo Fisher Scientific, Waltham, MA, USA |
| Streptavidin Cy5 | Conjugated | - | 1:500 | PA45001 | Amersham Biosciences, Amersham, UK |
 
After fixation, the cells were permeabilized with 0.5% Triton X-100 (Carl Roth, Karlsruhe, 

Germany) in PBS 1× for 10 min, followed by three washes with 0.05% Tween in PBS 1×. To give 

access to the PCNA epitope, the cells were incubated with ice-cold methanol for 10 min. The cells 

were again washed thrice with a washing buffer (0.05% Tween in PBS 1×) and blocked with 4% BSA 

in PBS 1× for 30 min. 

For the detection of EdU, cells were incubated in Click-IT cocktail mix of 100 mM Tris-HCl pH 

8.5, 10 mM CuSO4, 1 µM 647 Azide (Cat.No.: 259P.1, Carl Roth, Karlsruhe, Germany), and 100 mM 

ascorbic acid diluted in water for 30 min 146. Cells were washed thrice with 0.05% Tween in PBS 1×. 

To detect BrdU, cells were incubated in anti-BrdU primary antibody diluted in 2% BSA, 1× DNase I 

buffer (60 mM Tris/HCl pH 8.1, 0.66 mM MgCl2, 1 mM β-mercaptoethanol), and 0.1 U/mL DNase I 

(Cat.No.: D5025, Sigma-Aldrich Chemie GmbH, Steinheim, Germany) for one hour at 37 °C. For the 

inactivation of DNase I, cells were washed twice with EDTA PBS 1× for 10 min each. For PCNA 

detection, methanol-treated cells were incubated in the primary antibody for two hours and washed 

thrice with 0.05% Tween in PBS 1× before adding suitable secondary antibodies for one hour and 

washing.  

To detect the histone modifications, cells were blocked in the blocking buffer (4% BSA/1% fish 

skin gelatin/PBS 1×) and incubated in the respective primary antibodies overnight at 4 °C. The cells 

were washed five times and incubated in suitable secondary antibodies for one hour before washing 

five times with the washing buffer. EdU was detected after the histone modification detection using 

Click-IT chemistry with TAMRA-Azide.  

All the cells were stained with 10 mg/mL DAPI (4′,6-diamidino-2-phenylindole, Cat.No.: 

D9542, Sigma-Aldrich Chemie GmbH, Steinheim, Germany) for 10 min and mounted on Vectashield 

(Cat.No.: VEC-H-1000, Vector Laboratories Inc., Burlingame, CA, USA). All the coverslips were 

sealed with transparent nail polish and air-dried. 

Immunostaining in suspension for earliest origins: To capture individual cells in suspension, the frozen cells were thawed and pelleted. After resuspension in 2% BSA/PBS 1×, the cells were transferred to a 1.5 ml tube. The EdU Click-IT staining was performed with the Click-IT reaction cocktail as above but with 1% Saponin (Catalog No.: 47036, Sigma-Aldrich Chemie GmbH, Steinheim, Germany). The cells were washed twice in suspension with 2% BSA/PBS 1×. After the final wash, the cells were pelleted, the supernatant was removed, and the cells were resuspended in the leftover solution. To this, 100% MeOH (-20 °C) was added dropwise up to 1 ml, with tapping between drops. The cells were vortexed briefly and placed on ice for 10 min, with the tube inverted intermittently. The cells were pelleted at 2000 rpm for 5 min at 4 °C, the supernatant was removed, and the cells were blocked in 2% BSA/PBS 1× for 30 min on a rotor. The tube was wrapped in aluminum foil. After 30 min, an appropriate amount of primary antibody against PCNA was added to the tube, which was kept on the rotor for 2 hrs. The cells were washed thrice in 2% BSA/PBS 1×, resuspended in the appropriate secondary antibody, and placed on the rotor for 1 hr. The cells were washed thrice and resuspended in 2% BSA/PBS 1×.

2.4 Probe generation, metaphase spread, repli-FISH, and immuno repli-FISH 
Probe generation: The probe generation, fluorescence in situ hybridization, and co-detection of 

replication foci (RFi) and FISH probes experiments were performed as described before 147. All the 

plasmids and primers used are summarized in Table 6. For the genomic DNA (gDNA) preparation, 

hTERT RPE1 was pelleted and incubated overnight in TNES buffer (10 mM Tris; pH 7.5, 400 mM 

NaCl, 10 mM EDTA, 0.6% SDS) supplemented with 1 mg/mL Proteinase K (Cat. No.:BS202505, 

Bio&sell GmbH, Feucht, Germany) at 50 °C. RNA was removed by the addition of 0.5 mg/mL RNase 

A (Cat.No.: 10109169001, Sigma-Aldrich Chemie GmbH, Steinheim, Germany) for 30 min at 37 °C. 

The gDNA was extracted by the addition of 6 M NaCl at a final concentration of 1.25 M and vigorous  

41 


Table 6: A list of the probes and preparation methods 

Target Labeling 
Method 

Primers/Plasmids Reference 

Alu PCR AluF: 5′-GGATTACAGGYRTGAGCCA-3′ 
AluR: 3′-RCCAYTGCACTCCAGCCTG-5′ 
 

148 

Centromere PCR α27: 
5′-CATCACAAAGAAGTTTCTGAGAATGCTTC-3′ 
α30: 
5′-TGCATTCAACTCACAGAGTTGAACCTTCC-3′ 
 

21 

LINE1 Nick 
translation
  

Plasmid pLRE3-eGFP  149 

rDNA (human) Nick 
translation 

Plasmid pUC-hrDNA-12.0  150 

rDNA (mouse) Nick 
translation 

pMr974 151 

MaSat 
(Mouse) 

PCR 5’-AAAATGAGAAACATCCACTTG-3’ 
5’-CCATGATTTTCAGTTTTCTT-3’ 

 
shaking. After centrifugation (15 min, 11,000× g, RT), gDNA was precipitated from the supernatant by 

the addition of 100% ice-cold ethanol followed by incubation at –20 °C for 1 h and subsequent 

centrifugation (10 min, 11,000× g, 4 °C). The pellet was washed with 70% ethanol, air-dried, and 

dissolved in double distilled water. The plasmids containing rDNA and LINE1 probes were labeled 

with Cy3-dUTP (Cat.No.: ENZ-42501, Enzo Life Sciences, Lörrach, Germany) using nick translation. 

To prepare the Alu and centromere probes, the purified gDNA from hTERT RPE1 was used as a 

template to amplify and label with biotin-16-dUTP (Cat.No.: 11093070910, Roche Diagnostics 

Deutschland GmbH, Mannheim, Germany) via PCR using specific Alu primers 

(5′-GGATTACAGGYRTGAGCCA-3′; 3′-RCCAYTGCACTCCAGCCTG-5′) as well as specific 

centromere primers (α27: 5′-CATCACAAAGAAGTTTCTGAGAATGCTTC-3′); (α30: 

5′-TGCATTCAACTCACAGAGTTGAACCTTCC-3′) (refer to Figure 6A). Optionally, probes were 

sheared with a Covaris S220 (Covaris Inc., Woburn, MA, USA) in microTUBEs (50 µL aliquots; 

Cat.No.: 520045, Covaris Inc.) to a final size of ~500 bp when the size distribution of the labeled probes was 

above 2 kb. All probes (~100 ng) except rDNA were precipitated with 1 µg of fish sperm DNA 

(Cat.No.: 10223638103, Roche Diagnostics Deutschland GmbH, Mannheim, Germany), 0.13× NaAc, 



and 2.5× ethanol, before being washed with 70% ethanol, air dried, and dissolved in the hybridization 

solution (50% Formamide/SSC 2×). Around 100 ng of rDNA was co-precipitated with 1 µg of human 

Cot-1 DNA (Cat.No.: 5190-3393, Agilent, Santa Clara, CA, USA), 1 µg of fish sperm DNA, 0.13× 

NaAc, and 2.5× ethanol to reduce non-specific signals. 
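
The Alu primers listed above contain the IUPAC ambiguity codes Y (C or T) and R (A or G), so each primer is a pool of several concrete oligonucleotides. The following sketch enumerates that pool. The function and variable names are illustrative and not part of the original protocol, and the sequences are used exactly as printed, without changing their stated orientation.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*pools)]


ALU_F = "GGATTACAGGYRTGAGCCA"  # as printed for AluF
ALU_R = "RCCAYTGCACTCCAGCCTG"  # as printed for AluR (orientation left unchanged)

for name, primer in (("AluF", ALU_F), ("AluR", ALU_R)):
    variants = expand_degenerate(primer)
    print(name, len(variants), "variants")
    for seq in variants:
        print("  ", seq)
```

Each Alu primer carries one Y and one R position, so each expands to four sequences, whereas the centromere primers contain no ambiguity codes and expand to themselves.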

Metaphase spreads were used to validate the probes. The hTERT RPE1 / BJ-5ta cells were seeded 

for at least 24 h before being treated with 0.1 µg/mL colcemid (N-deacetyl-N-methylcolchicine, 

Cat.No.: 10295892001, Roche Diagnostics Deutschland GmbH, Mannheim, Germany) for three to 

four hours. Cells were then harvested by trypsinization and incubated for 30 min with 75 mM KCl at 

37 °C with intermittent tapping. They were then fixed by dropwise addition of ice-cold methanol/acetic 

acid (3:1) for 30 min on ice, and this fixation was repeated twice. For chromosome spreads, the cell 

suspension was dropped onto an ice-cold, wet microscopy slide from a height of approximately 20 cm. 

The slide was then air-dried overnight. For metaphase FISH, the slides were rehydrated in ddH2O for 

10 min, digested with 0.005% pepsin (165 U/mL, Cat.No.: P6887, Sigma-Aldrich Chemie GmbH, 

Steinheim, Germany) in 0.01 M HCl for 10 min at 37 °C, fixed with 2% formaldehyde for 5 min, 

washed twice with SSC 2×, dehydrated in 70%, 80%, and 100% ethanol for 3 min each and air dried. 

After equilibrating with 10 µL of hybridization solution containing respective probes for 30 min at 37 °C 

inside a sealed hybridization box, the metaphase spreads were co-denatured at 80 °C for 5 min and 

immediately covered in ice for another 5 min. The box was transferred to a humidified chamber (37 

°C) and left overnight. Post-hybridization washes were performed with SSC 2×, and the spreads were blocked with 2% 

BSA/SSC 2× for 30 min. Biotin-labeled probes were detected with a suitable streptavidin-conjugated 

fluorophore, and the spreads were counterstained with DAPI and mounted in Vectashield. The FISH signals were validated 

on metaphase spreads (Figure 6B). 
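
The pepsin digestion above is specified both as a percentage (0.005%) and as an activity (165 U/mL). As a rough sketch, assuming the percentage is weight per volume, the two figures imply a specific activity for the enzyme lot, which can be used to rescale the mass when a lot with a different activity is used. The lot activity in the example is hypothetical.

```python
def percent_wv_to_mg_per_ml(percent_wv):
    """Convert % (w/v) to mg/mL: 1% (w/v) is 1 g per 100 mL, i.e. 10 mg/mL."""
    return percent_wv * 10.0


def implied_specific_activity(units_per_ml, percent_wv):
    """Specific activity (U/mg) implied by an activity and a % (w/v) pair."""
    return units_per_ml / percent_wv_to_mg_per_ml(percent_wv)


def percent_wv_for_target(units_per_ml, lot_units_per_mg):
    """% (w/v) needed to reach a target activity with a given lot."""
    return units_per_ml / lot_units_per_mg / 10.0


# Values from the text: 0.005% pepsin, 165 U/mL (w/v is an assumption).
activity = implied_specific_activity(165.0, 0.005)
print(f"Implied specific activity: {activity:.0f} U/mg")

# Hypothetical lot with a different specific activity.
other_lot = 4000.0  # U/mg, illustrative only
print(f"Required concentration: {percent_wv_for_target(165.0, other_lot):.6f}% (w/v)")
```

Keeping the activity rather than the mass constant is a design choice of this sketch; the text itself does not say which of the two values was held fixed between lots.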

Repli-FISH and immuno-FISH: Cells were treated with 10 µM of EdU for 15 min, washed twice with 

PBS 1×, and fixed with 3.7% formaldehyde in PBS 1×. The cells were permeabilized with 0.5% Triton 

X-100 in PBS 1×, washed, and incubated in 20% glycerol in PBS 1× overnight at 4 °C. The cells were 

snap-frozen in liquid nitrogen and incubated for 2 min in ice-cold 20% glycerol in PBS 1×, and this 

step was repeated two more times. This was followed by RNase treatment (0.1 mg/mL) for 1 h at 37 

°C and washing with washing buffer. When EdU was detected, detection was performed before the FISH 

as above, and the cells were then fixed with 2% formaldehyde in PBS 1× for 10 min. The DNA was depurinated in 

ice-cold 0.1 N HCl/0.5% Triton X-100 for 5 min; the cells were then washed with SSC 2× and fixed with 2% 

formaldehyde for 5 min before dehydration (70%, 80%, and 100% EtOH) and incubating with the 

probes. For Alu and LINE1 co-detection, equal volumes of each probe were pooled, mixed, 

added to the coverslip, and incubated for 15 min at 37 °C. In a water bath, the cells and probes were 

co-denatured at 80 °C for 5 min, immediately placed on ice for 5 min, transferred to the humidified 



hybridization chamber at 37 °C, and le