1
Genomics
  • Chapter 18
2
Mapping Genomes
  • Maps of genomes can be divided into 2 types
  • -Genetic maps
  • -Abstract maps that place the relative location of genes on chromosomes based on recombination frequency
  • -Physical maps
  • -Use landmarks within DNA sequences, ranging from restriction sites to the actual DNA sequence
3
Physical Maps
  • Distances between “landmarks” are measured in base-pairs
  • -1000 basepairs (bp) = 1 kilobase (kb)
  • Knowledge of DNA sequence is not necessary
  • There are three main types of physical maps
  • -Restriction maps
  • -Cytological maps
  • -Radiation hybrid maps
4
Physical Maps
  • Restriction maps
  • -The first physical maps
  • -Based on distances between restriction sites
  • -Overlap between smaller segments can be used to assemble them into a contig
  • -Continuous segment of the genome
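A restriction map is built from the fragment sizes an enzyme produces when it cuts at its recognition sites. The sketch below is only an illustration: the function name `restriction_fragments` and the toy sequence are made up, the molecule is assumed to be linear, and the enzyme is assumed to be EcoRI (site GAATTC, cut after the first G).

```python
def restriction_fragments(sequence, site, cut_offset):
    """Cut a linear sequence at every occurrence of site; return fragment lengths."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        # The enzyme cuts cut_offset bases into the recognition site.
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(boundaries, boundaries[1:])]


# Hypothetical 21 bp sequence with two EcoRI sites.
toy_sequence = "AAAGAATTCCCCCGAATTCTT"
fragments = restriction_fragments(toy_sequence, "GAATTC", cut_offset=1)
print(fragments)
```

Comparing fragment sizes from different clones is what reveals the overlaps used to order them into a contig.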
5
 
6
Physical Maps
  • Cytological maps
  • -Employ stains that generate reproducible patterns of bands on the chromosomes
  • -Divide chromosomes into subregions
  • -Provide a map of the whole genome, but at low resolution
  • -Cloned DNA is correlated with map using fluorescent in situ hybridization (FISH)
7
Physical Maps
8
Physical Maps
  • Radiation hybrid maps
  • -Use radiation to fragment chromosomes randomly
  • -Fragments are then recovered by fusing irradiated cell to another cell
  • -Usually a rodent cell
  • -Fragments can be identified based on banding patterns or FISH
9
Physical Maps
  • Sequence-tagged sites
  • -An STS is a small stretch of DNA that is unique in the genome
  • -Only 200-500 bp
  • -Boundary is defined by PCR primers
  • -Identified using any DNA as a template
  • -STSs essentially provide a scaffold for assembling genome sequences
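As a rough illustration of how a primer pair defines the boundaries of an STS, the sketch below finds the stretch between a forward primer and the reverse complement of a reverse primer in a template. The function name `find_sts`, the template, and both primers are hypothetical toy values; real STSs are 200-500 bp and must also be checked for uniqueness in the genome.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


def find_sts(template, forward_primer, reverse_primer):
    """Return the region bounded by the two primers, or None if absent.

    The reverse primer anneals to the opposite strand, so on the template
    strand we search for its reverse complement downstream of the forward primer.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    target = reverse_complement(reverse_primer)
    end = template.find(target, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(target)]


# Hypothetical toy data, far shorter than a real STS.
template = "TTGACCGATTACAGGCTTAACGGATCCA"
amplicon = find_sts(template, "GACCGA", "GGATCCGT")
print(amplicon, len(amplicon))
```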
10
 
11
Genetic Maps
  • Genetic maps are measured in centimorgans
  • -1 cM = 1% recombination frequency
  • Linkage mapping can be done without knowing the DNA sequence of a gene
  • -Limitations:
  • 1. Genetic distance does not directly correspond to actual physical distance
  • 2. Not all genes have obvious phenotypes
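The 1 cM = 1% rule can be written as a one-line calculation. This is a minimal sketch with made-up offspring counts and a hypothetical function name; it treats the observed recombinant fraction as the map distance, which is only a reasonable approximation for closely linked genes.

```python
def map_distance_cm(recombinant_offspring, total_offspring):
    """Map distance in centimorgans: the percentage of offspring that are recombinant."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return 100.0 * recombinant_offspring / total_offspring


# Hypothetical cross: 36 recombinant offspring out of 400 scored.
distance = map_distance_cm(36, 400)
print(distance)  # 9.0
```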
12
Genetic Maps
  • The most common markers are short repeat sequences called short tandem repeats, or STR loci
  • -Differ in repeat length between individuals
  • -13 form the basis of modern DNA fingerprinting developed by the FBI
  • -Cataloged in the CODIS database to identify criminal offenders
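Because STR alleles differ in repeat length, an allele can be described by how many copies of the repeat unit it carries. The sketch below counts the longest uninterrupted run of a motif; the function name, the GATA motif, and the two allele sequences are illustrative assumptions, not real CODIS data.

```python
import re


def longest_tandem_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Two hypothetical alleles at the same locus, differing only in repeat count.
allele_a = "CCTA" + "GATA" * 7 + "TTGC"
allele_b = "CCTA" + "GATA" * 10 + "TTGC"
print(longest_tandem_run(allele_a, "GATA"), longest_tandem_run(allele_b, "GATA"))
```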
13
Genetic Maps
  • Genetic and physical maps can be correlated
  • -Any cloned gene can be placed within the genome and can also be mapped genetically
14
Genetic Maps
  • All of these different kinds of maps are stored in databases
  • -The National Center for Biotechnology Information (NCBI) serves as the US repository for these data and more
  • -Similar databases exist in Europe and Japan
15
Whole Genome Sequencing
  • The ultimate physical map is the base-pair sequence of the entire genome
16
Whole Genome Sequencing
  • Sequencers provide accurate sequences for DNA segments up to 800 bp long
  • -To reduce errors, 5-10 copies of a genome are sequenced and compared
  • Vectors used to clone large pieces of DNA:
  • -Yeast artificial chromosomes (YACs)
  • -Bacterial artificial chromosomes (BACs)
  • -Human artificial chromosomes (HACs)
  • -Are circular, at present
17
Whole Genome Sequencing
  • Clone-by-clone sequencing
  • -Overlapping regions between BAC clones are identified by restriction mapping or STS analysis
  • Shotgun sequencing
  • -DNA is randomly cut into smaller fragments, cloned and then sequenced
  • -Computers put together the overlaps
  • -Sequence is not tied to other information
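The idea of computers putting together overlaps can be sketched with a toy greedy merge: repeatedly join the two fragments that share the longest suffix-to-prefix overlap. This is an assumption-laden illustration (error-free reads, one strand, no repeats, hypothetical function names and reads), not how production assemblers work.

```python
def overlap_length(left, right, min_overlap=3):
    """Length of the longest suffix of left that equals a prefix of right."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing left overlaps; keep the contigs separate
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Hypothetical reads from a 16 bp toy sequence, given in scrambled order.
reads = ["CCTTGAGC", "ATGGCGTA", "CGTACCTT"]
assembled = greedy_assemble(reads)
print(assembled)
```

Repeated sequences give misleading overlaps, which is one reason shotgun data that is not tied to a map can be hard to assemble.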
18
 
19
The Human Genome Project
  • Launched in 1990 by the International Human Genome Sequencing Consortium
  • Craig Venter formed a private company and entered the “race” in May 1998
  • In 2001, both groups published a draft sequence
  • -Contained numerous gaps
20
The Human Genome Project
  • In 2004, the “finished” sequence was published as the reference sequence (REF-SEQ) in databases
  • -3.2 gigabasepairs
  • -1 Gb = 1 billion basepairs
  • -Contains a 400-fold reduction in gaps
  • -99% of euchromatic sequence
  • -Error rate = 1 per 100,000 bases
21
Characterizing Genomes
  • The Human Genome Project found fewer genes than expected
  • -Initial estimate was 100,000 genes
  • -Number now appears to be about 25,000!
  • In general, eukaryotic genomes are larger and have more genes than those of prokaryotes
  • -However, the complexity of an organism is not necessarily related to its gene number
22
Characterizing Genomes
23
Finding Genes
  • Genes are identified by open reading frames (ORFs)
  • -An ORF begins with a start codon and contains no stop codon for a distance long enough to encode a protein


  • Sequence annotation
  • -The addition of information, such as ORFs, to the basic sequence information
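The ORF definition above translates directly into a scan of each reading frame for a start codon followed by an in-frame stop. This is a minimal sketch: it checks only the given strand, uses the standard stop codons, and sets a hypothetical `min_codons` threshold far smaller than any real gene finder would use.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, orf) tuples for ATG...stop stretches on this strand."""
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if enough codons precede the stop.
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, sequence[start:pos + 3]))
                start = None
    return orfs


# Hypothetical toy sequence containing one short ORF in the third frame.
toy_sequence = "GGATGGCTAAATGCTGACC"
orfs = find_orfs(toy_sequence)
print(orfs)
```

Recording the coordinates of each ORF alongside the raw sequence is a simple form of annotation.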
24
Finding Genes
  • BLAST
  • -A search algorithm used to search NCBI databases for homologous sequences
  • -Permits researchers to infer functions for isolated molecular clones


  • Bioinformatics
  • -Use of computer programs to search for genes, and to assemble and compare genomes
25
Genome Organization
  • Genomes consist of two main regions


  • -Coding DNA
  • -Contains genes that encode proteins


  • -Noncoding DNA
  • -Regions that do not encode proteins
26
Coding DNA in Eukaryotes
  • Four different classes are found:
  • -Single-copy genes: Includes most genes
  • -Segmental duplications: Blocks of genes copied from one chromosome to another
  • -Multigene families: Groups of related but distinctly different genes
  • -Tandem clusters: Identical copies of genes occurring together in clusters
  • -Also include rRNA genes
27
Noncoding DNA in Eukaryotes
  • Each cell in our bodies has about 6 feet of DNA stuffed into it
  • -However, less than one inch is devoted to genes!


  • Six major types of noncoding human DNA have been described
28
Noncoding DNA in Eukaryotes
  • Noncoding DNA within genes
  • -Protein-encoding exons are embedded within much larger noncoding introns
  • Structural DNA
  • -Called constitutive heterochromatin
  • -Localized to centromeres and telomeres
  • Simple sequence repeats (SSRs)
  • -One- to six-nucleotide sequences repeated thousands of times
29
Noncoding DNA in Eukaryotes
  • Segmental duplications
  • -Consist of 10,000 to 300,000 bp that have duplicated and moved


  • Pseudogenes
  • -Inactive genes
30
Noncoding DNA in Eukaryotes
  • Transposable elements (transposons)
  • -Mobile genetic elements
  • -Four types:
  • -Long interspersed elements (LINEs)
  • -Short interspersed elements (SINEs)
  • -Long terminal repeats (LTRs)
  • -Dead transposons
31
Noncoding DNA in Eukaryotes
32
Expressed Sequence Tags
  • ESTs can identify genes that are expressed
  • -They are generated by sequencing the ends of randomly selected cDNAs
  • ESTs have identified 87,000 cDNAs in different human tissues
  • -But how can 25,000 human genes encode three to four times as many proteins?
  • -Alternative splicing yields different proteins with different functions
33
Alternative Splicing
34
Variation in the Human Genome
  • Single-nucleotide polymorphisms (SNPs) are sites where individuals differ by only one nucleotide
  • -Must be found in at least 1% of population
  • Haplotypes are regions of the chromosome that are not exchanged by recombination
  • -Tendency for genes not to be randomized is called linkage disequilibrium
  • -Can be used to map genes
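The SNP definition (a single-nucleotide difference present in at least 1% of the population) can be sketched as a column-by-column comparison of aligned sequences. The function name and the four sample sequences are hypothetical, the sequences are assumed to be pre-aligned and equal in length, and a sample this small cannot really establish a population frequency.

```python
from collections import Counter


def find_snps(aligned_sequences, min_minor_freq=0.01):
    """Return {position: allele counts} for sites where a second allele is common enough."""
    snps = {}
    for pos, column in enumerate(zip(*aligned_sequences)):
        counts = Counter(column)
        if len(counts) < 2:
            continue  # everyone carries the same nucleotide here
        minor_count = counts.most_common()[1][1]
        if minor_count / len(column) >= min_minor_freq:
            snps[pos] = dict(counts)
    return snps


# Hypothetical aligned sequences from four individuals.
samples = ["ACGTTAGC", "ACGTTAGC", "ACATTAGC", "ACGTTAGC"]
snps = find_snps(samples)
print(snps)
```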
35
 
36
Genomics
  • Comparative genomics, the study of whole genome maps of organisms, has revealed similarities among them
  • -For example, over half of Drosophila genes have human counterparts


  • Synteny refers to the conserved arrangements of DNA segments in related genomes
  • -Allows comparisons of unsequenced genomes
37
Genomics
38
 
39
Genomics
  • Organellar genomes
  • -Mitochondria and chloroplasts are descendants of ancient endosymbiotic bacterial cells
  • -Over time, their genomes exchanged genes with the nuclear genome
  • -Both organelles contain polypeptides encoded by the nucleus
40
Genomics
  • Functional genomics is the study of the function of genes and their products
  • DNA microarrays (“gene chips”) enable the analysis of gene expression at the whole-genome level
  • -DNA fragments are deposited on a slide
  • -Probed with labeled mRNA from different sources
  • -Active/inactive genes are identified
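One common way to compare signals from two labeled mRNA sources is a log2 ratio per spot. The sketch below assumes hypothetical gene names, made-up intensities that are already background-corrected, and an arbitrary twofold cutoff; real analyses add normalization and statistics.

```python
import math


def log2_ratios(signal_a, signal_b):
    """Return log2(A/B) per gene for two sets of spot intensities."""
    return {gene: math.log2(signal_a[gene] / signal_b[gene]) for gene in signal_a}


def classify(ratio, threshold=1.0):
    """Label a spot by whether its log2 ratio passes a twofold threshold."""
    if ratio >= threshold:
        return "higher in A"
    if ratio <= -threshold:
        return "higher in B"
    return "similar"


# Hypothetical intensities for three made-up genes in two tissues.
tissue_a = {"gene1": 800.0, "gene2": 150.0, "gene3": 500.0}
tissue_b = {"gene1": 200.0, "gene2": 600.0, "gene3": 500.0}

ratios = log2_ratios(tissue_a, tissue_b)
calls = {gene: classify(value) for gene, value in ratios.items()}
print(calls)
```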
41
 
42
Genomics
  • Transgenics is the creation of organisms containing genes from other species (transgenic organisms)
  • -Can be used to determine whether:
  • -A gene identified by an annotation program is really functional in vivo
  • -Homologous genes from different species have the same function


43
Genomics
44
Proteomics
  • Proteomics is the study of the proteome
  • -All the proteins encoded by the genome


  • The transcriptome consists of all the RNA that is present in a cell or tissue
45
Proteomics
  • Proteins are much more difficult to study than DNA because of:
  • -Post-translational modifications
  • -Alternative splicing
  • However, databases containing the known protein structural motifs exist
  • -These can be searched to predict the structure and function of gene sequences
46
Proteomics
47
Proteomics
  • Protein microarrays are being used to study large numbers of proteins simultaneously
  • -Can be probed using:
  • -Antibodies to specific proteins
  • -Specific proteins
  • -Small molecules


  • The yeast two-hybrid system has generated large-scale maps of interacting proteins
48
Applications of Genomics
  • The genomics revolution will have a lasting effect on how we think about living systems
  • The immediate impact of genomics is being seen in diagnostics
  • -Identifying genetic abnormalities
  • -Identifying victims by their remains
  • -Distinguishing between naturally occurring and intentional outbreaks of infections
49
Applications of Genomics
50
Applications of Genomics
  • Genomics has also helped in agriculture
51
Applications of Genomics
  • Genome science is also a source of ethical challenges and dilemmas
  • -Gene patents
  • -Should the sequence/use of genes be freely available or can it be patented?
  • -Privacy concerns
  • -Could one be discriminated against because their SNP profile indicates susceptibility to a disease?