Genetic variation and sensitivity of wild cabbage

1: Is it true that the genetic variation of kale plants is smaller than wild cabbage?

I think so, because of human selection of the kale plants.

2: Are broccoli and cauliflower more sensitive to changing environmental conditions than wild cabbage?

I think so, because domesticated crops have a lower genetic diversity than wild varieties. Genetic diversity is useful for adaptation.

3: Is it a good recommendation to grow new Sprout plants from the seed of wild cabbage plants?

I don't think this is the best idea, because genetically engineered Sprout plants could be much better.


I am going to assume that kale is derived from wild cabbage and has been selected by humans for certain attributes over thousands of generations.

1 There are two principal reasons why genetic (allelic) diversity might be lower in the kale subgroup than in wild cabbage. Firstly, strong selection imposed by humans may have diminished allelic diversity, because strong selection can drive alleles to fixation. Secondly, if kale came from a small sub-population of wild cabbage, it may have been through a genetic bottleneck. This is likely to reduce allelic diversity because, unless all alleles from the original population are captured in the founder population, some alleles will be lost by genetic drift.

2 Generally, allelic diversity is a "good" thing in an unstable environment. It allows different trait combinations and variation in traits to occur. If the environment changes drastically, it could impose severe selective forces on the population, and populations with high allelic diversity are more likely to contain a combination of alleles that survives.

E.g. Humans only allow cabbages over 18 cm in diameter to reproduce into the next generation. 5 loci affect cabbage size, each having a "large cabbage" (C) allele and a "small cabbage" (c) allele. Having the c allele at all 5 loci gives a cabbage size of 15 cm. Each C allele adds 1 cm. If a population has fixed the c allele at 3 loci (i.e. there is low genetic diversity), then none of the cabbages will be over 18 cm. If a population has both the C and c alleles at all loci, then some of the population will be over 18 cm.
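The arithmetic in this example can be checked by enumerating genotypes. This is a minimal sketch, assuming one allele per locus (the example does not state ploidy) and reading "over 18cm" as strictly greater than 18 cm. The function and variable names are made up for illustration.

```python
from itertools import product

BASE_CM = 15       # size with the c allele at all 5 loci
GAIN_CM = 1        # each C allele adds 1 cm
THRESHOLD_CM = 18  # only cabbages larger than this reproduce

def size_cm(genotype):
    """Size of a cabbage given one allele ('C' or 'c') per locus."""
    return BASE_CM + GAIN_CM * genotype.count("C")

def count_selected(alleles_per_locus):
    """Count possible genotypes above the threshold, given the alleles present at each locus."""
    genotypes = product(*alleles_per_locus)
    return sum(1 for g in genotypes if size_cm(g) > THRESHOLD_CM)

# Low diversity: c is fixed at 3 loci, both alleles remain at the other 2.
low_diversity = ["c", "c", "c", "Cc", "Cc"]
# High diversity: both alleles are present at all 5 loci.
high_diversity = ["Cc"] * 5

print(count_selected(low_diversity))   # 0 of 4 possible genotypes exceed 18 cm
print(count_selected(high_diversity))  # 6 of 32 possible genotypes exceed 18 cm
```

With c fixed at 3 loci the largest possible cabbage is 17 cm, so selection has nothing to act on. With both alleles at every locus, the genotypes carrying 4 or 5 C alleles (19 cm and 20 cm) pass the threshold.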

3 The final question is difficult to answer. It depends on what you define as "a good recommendation," which is very subjective. If the aim is to maximize allelic diversity, then yes; if it is to maximize cabbage size, then probably no (because the "genetically engineered" plants have been selected for large cabbages, so they might have the C alleles fixed in the population!).


Genetic Variation in Taste Sensitivity to Sugars in Drosophila melanogaster

Taste sensitivity plays a major role in controlling feeding behavior, and alterations in feeding habit induced by changes in taste sensitivity can drive speciation. We investigated variability in taste preferences in wild-derived inbred lines from the Drosophila melanogaster Genetic Reference Panel. Preferences for different sugars, which are essential nutrients for fruit flies, were assessed using two-choice preference tests that paired glucose with fructose, sucrose, or trehalose. The two-choice tests revealed that individual lines have differential and widely variable sugar preferences, and that sugar taste sensitivity is polygenic in the inbred population tested. We focused on 2 strains that exhibited opposing preferences for glucose and fructose, and performed proboscis extension reflex tests and electrophysiological recordings on taste sensilla upon exposure to fructose and glucose. The results indicated that taste sensitivity to fructose is dimorphic between the 2 lines. Genetic analysis showed that high sensitivity to fructose is autosomal dominant over low sensitivity, and that multiple loci on chromosomes 2 and 3 influence sensitivity. Further genetic complementation tests for fructose sensitivity on putative gustatory receptor (Gr) genes for sugars suggested that the Gr64a-Gr64f locus, not the fructose receptor gene Gr43a, might contribute to the dimorphic sensitivity to fructose between the 2 lines.

Keywords: Drosophila melanogaster Genetic Reference Panel; genetic variation; inbred line; sugar receptor; taste sensitivity.


Introduction

Crop improvement is needed now more than ever with challenges associated with feeding an ever-expanding population under increasingly variable growth conditions. The ability to produce crops that meet societal needs is enhanced by a thorough understanding of the genome of a species. Genomic resources expand the toolbox available for plant breeding and crop improvement efforts. Various tools have risen in popularity for plant breeding, in some cases as short-lived bandwagons and others as paradigm shifts in crop improvement [1, 2]. Within crop genomics, advances relevant to crop improvement have primarily been in marker (e.g., Illumina single nucleotide polymorphism (SNP) chips, kompetitive allele-specific PCR (KASP) assays, genotyping-by-sequencing (GBS)) and sequencing (e.g., Illumina, PacBio, Nanopore) technology. Recent innovations are driving a paradigm shift in which the extent and relevance of structural variation within the pan-genome of crop species are now being considered.

Access to plant genome assemblies in the early 2000s revolutionized thinking about the biology of crops and plant breeding [3,4,5]. These early assemblies allowed for a deeper understanding of the diversity in plant species, primarily at the level of SNPs [6,7,8,9]. However, after a short while, it became obvious that single-reference assemblies represent only a small fraction of species-wide genomic space [10,11,12,13]. Extensive structural variants (SVs) (e.g., presence-absence variation (PAV), copy number variation (CNV), and chromosomal rearrangements; Fig. 1) were discovered, with the first two classes contributing to the variation in genome content. Within species, genomes vary in both gene content (e.g., tandem duplicated genes, CNVs dispersed throughout the genome, and PAVs of genes) and repetitive portions of the genome (e.g., transposable elements, knob repeats, centromere repeats). In characterizing this variation, the genomic fraction common to all individuals within a species has been termed the “core” genome and the variable fraction the “dispensable” genome.

Fig. 1 Diagrams of structural variants that can be found in crop genomes
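The split into a "core" and a "dispensable" genome can be illustrated with a small presence/absence table. This is a minimal sketch with invented gene and accession names. It assumes PAV calls are already available as sets, which real pipelines have to derive from assemblies or read mapping.

```python
# Hypothetical presence/absence calls: gene -> accessions in which it is present.
pav = {
    "geneA": {"acc1", "acc2", "acc3"},
    "geneB": {"acc1", "acc3"},
    "geneC": {"acc1", "acc2", "acc3"},
    "geneD": {"acc2"},
}
accessions = {"acc1", "acc2", "acc3"}

def split_pan_genome(pav, accessions):
    """Return (core, dispensable) gene lists for the sampled accessions."""
    # Core genes are present in every accession; the rest are dispensable.
    core = sorted(g for g, present in pav.items() if present >= accessions)
    dispensable = sorted(g for g in pav if g not in core)
    return core, dispensable

core, dispensable = split_pan_genome(pav, accessions)
print(core)         # ['geneA', 'geneC']
print(dispensable)  # ['geneB', 'geneD']
```

Both sets depend on which accessions are sampled: adding a line that lacks geneA would move it from the core to the dispensable fraction.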

There are many mechanisms that can generate a structural variation. For example, transposable elements (TEs) can replicate themselves in a genome and can also capture and carry gene sequences to new genomic locations [14,15,16]. This process can cause significant disruption of the coding portion of the genome [15, 16]. Additionally, structural variation can be introduced through errors during meiotic recombination [17], such as non-allelic homologous recombination (unequal recombination [18]) and double-strand break repair via single-strand annealing [19]. Finally, PAVs can be created, especially in plants, through differential genome fractionation across genotypes following a whole-genome duplication event [20], although in maize, a paleopolyploid, it was shown that this phenomenon played a limited role in creating SVs among elite temperate germplasm [21].

The generation of multiple, reference-quality genome assemblies per crop species is now a reality [22,23,24]. Our way of thinking about crop genomics is changing as we gain a deeper understanding of the structural variation within the pan-genome. Initial efforts to dissect the genetic architecture of traits (e.g., quantitative trait locus (QTL) mapping and genome-wide association studies (GWAS)) and genomic prediction efforts have relied primarily on SNP markers. The structural variation that has been uncovered in the pan-genome era necessitates a reevaluation of the determinants of phenotype. To date, structural variation has already been associated with environmental adaptation such as tolerance of abiotic and biotic stress [25,26,27,28] and flowering time ([29, 30]; for an extensive review, see [31]). Additionally, plant domestication traits such as non-shattering [32] and changes in plant architecture [33, 34] are caused by SVs. For example, a TE insertion 60 kb upstream of the maize tb1 gene played an important role in changing maize architecture during its domestication [35]. In fact, SVs in non-coding regions have been shown in many instances to influence gene expression of nearby genes [36, 37]. Given the breadth of traits affected by SV, their characterization is important for crop domestication and improvement and will facilitate future efforts in these areas.

Crop genomics has transitioned from the era of a single reference genome to a time when we now have access to tens or hundreds of reference-quality genome assemblies within a species (Table 1). This article reviews previous crop genomic efforts relevant to crop improvement and expected advances in light of recent progress in characterizing structural variation at the pan-genome level.

Assembly and bioinformatic advances allow characterization of crop pan-genomes

Advances in crop genome assembly technology

Over the last two decades, advances in sequencing technology and assembly algorithms have profoundly affected our understanding of the complexity and structure of genomes. Crops were among the first species with assembled genomes given their economic importance and the relevance of genomic information to breeding. The earliest model crop genomes were assembled with Sanger sequencing, BAC-by-BAC approaches, and overlap-layout-consensus (OLC) assemblers (e.g., rice [3], maize [4], sorghum [55], soybean [56], and grape [57]). Subsequent crop reference genomes increasingly relied on next-generation sequencing (e.g., potato [58]) with some assembled entirely from paired-end and mate-pair Illumina data and de Bruijn graph approaches (e.g., barley, wheat [59, 60]). These crop reference assemblies were, in many cases, rapidly followed by large resequencing studies in which short-read data were generated for additional individuals and mapped to the reference to characterize species-level diversity (e.g., rice [42, 61, 62], maize [6], soybean [63]).

Within the last 5 years, the reduced cost of Illumina sequencing and improved assembly algorithms facilitated de novo assembly of multiple accessions per crop using low-cost short-read data (e.g., maize-PH207 [64], maize-W22 [65], maize-HZS [66], maize-Flint genomes [50], rice genomes [43, 67], soybean genomes [12]). While this approach has generated highly complete and contiguous assemblies of low-copy genic regions, the more repetitive, TE-rich regions of the genome have proven recalcitrant to assembly with short reads, resulting in numerous gaps and partial assembly in these regions.

Recently, the maturation of long-read technology has facilitated much more contiguous and complete assembly of crop genomes [68,69,70,71,72] and, in some cases, multiple long-read-based assemblies within a single species [23, 24]. These assemblies are already facilitating discoveries of the relevance of non-coding and regulatory variation to agronomic traits, among other important discoveries [73, 74]. Sequence data continues to improve rapidly with sequence output increasing steadily and error rates decreasing (e.g., PacBio HiFi libraries), thereby diminishing the cost of assembly and increasing the utility of long-read assemblies for uncovering agronomically relevant variation across lines within crop species.

Characterizing structural variation based on a single reference genome

Methods to detect structural variation began to appear shortly after the publication of the first genome assemblies and have continued to develop as sequencing technologies have advanced (for comprehensive reviews, see [75, 76]). Early efforts to characterize CNV/PAV across species relied on hybridization arrays (e.g., comparative genomic hybridization (CGH)) that were based on probes often designed using only sequence from an initial reference genome assembly [11, 13, 19]. While array-based approaches are relatively inexpensive and high-throughput, they do have limitations. For example, once an array is developed, it is a static instrument, and newly identified loci of interest are not characterized. Additionally, when probes are based on a single reference sequence, ascertainment bias can be observed (i.e., hybridization efficiency diminishes when samples are more divergent from the reference individual).

As short-read resequencing decreased in cost and became commonplace, whole-genome sequencing (WGS) approaches were more frequently used to characterize CNV/PAV in crops [77,78,79]. These approaches for detecting CNV/PAV fall into three main categories: read depth, read pair, and split read [80]. With read-depth methods, short reads are mapped to a reference, and the relative depth of sequence at a locus serves as a proxy for copy number in a given individual [80]. Read-pair methods identify CNV/PAV based on discrepancies in the distance between paired-end sequences relative to their distance in the reference assembly [80]. Split-read methods detect SVs that interrupt the sequence within short reads [80].
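The read-depth and read-pair ideas can be sketched in a few lines. This is an illustration only, with invented numbers. It assumes depth scales linearly with copy number and uses a fixed tolerance for pair distances, whereas real callers also model noise, GC content, and mapping quality.

```python
from statistics import median

def copy_number_from_depth(locus_depth, background_depths, ploidy=2):
    """Read-depth method: scale depth at a locus by the typical genome-wide depth."""
    baseline = median(background_depths)
    return round(ploidy * locus_depth / baseline)

def discordant_pairs(insert_sizes, expected_size, tolerance):
    """Read-pair method: indices of pairs whose mapped distance departs from the expected size."""
    return [i for i, size in enumerate(insert_sizes)
            if abs(size - expected_size) > tolerance]

background = [28, 30, 31, 29, 30, 32, 30]      # hypothetical per-window depths
print(copy_number_from_depth(0, background))   # 0: candidate absence (PAV)
print(copy_number_from_depth(15, background))  # 1: one copy missing
print(copy_number_from_depth(61, background))  # 4: candidate duplication

sizes = [498, 505, 2300, 510, 120]             # hypothetical mapped pair distances (bp)
print(discordant_pairs(sizes, expected_size=500, tolerance=150))  # [2, 4]
```

A pair that maps much farther apart than expected (index 2) is consistent with sequence present in the reference but deleted in the sample. A pair that maps much closer together (index 4) is consistent with an insertion in the sample.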

The use of whole-genome sequencing allowed for characterization of a greater breadth of variants than hybridization arrays, but this approach suffers similar limitations: (1) sequence from loci that are missing in the reference genome due to either incomplete assembly or true biological absence does not map and remains uncharacterized, (2) divergent reads map less efficiently, and (3) uneven coverage bias of short-read sequencing can result in inaccuracies [81]. These shortcomings have been addressed to some extent through the assembly of unmapped reads [10, 48] and through the use of pseudo-references in which line-specific SNPs are introduced into the reference to increase mapping efficiency [82]. Characterization of structural variation in the repetitive fraction of the genome is particularly challenging with short-read resequencing data because mapping and assembly of unmapped reads are particularly inefficient and unreliable in these regions.

New approaches have rapidly developed for CNV/PAV characterization that leverage recently developed library preparation techniques and the maturation of single-molecule, long-read sequencing (comprehensively reviewed in [75]). For example, connected-molecule approaches (10x, Hi-C, Strand-Seq) can characterize long-range information using short reads through the development of specialized libraries of linked reads. Single-molecule approaches (optical maps (e.g., Bionano) and long-read sequencing, such as PacBio and Oxford Nanopore Technologies) allow for alignment of sequences from multiple individuals and, because of read length, enable characterization of sequences missing in the reference genome. Both of these approaches allow for the characterization of small- and intermediate-sized SVs. Large SVs (i.e., > 1 Mb) are more effectively characterized using optical maps (e.g., [83]). Collectively, these innovations have led to the most comprehensive characterization of CNV/PAV to date [84, 85]. However, the underlying data are still relatively expensive and must be generated at high depth for confident calls, making them impractical at the scale in which crop improvement programs often operate.

Characterizing structural variation through creation of a pan-genome reference

Access to multiple reference-quality genome assemblies within a species provides opportunities to identify SVs in a non-reference-biased manner. However, a number of challenges arise in such an approach. First, several crop species have large, complex genomes, making numerous assemblies per taxon cost-prohibitive. To overcome this limitation, a small number of breeding program founder individuals, which capture the majority of segregating haplotypes, can be targeted for genome assembly and identification of relevant SVs. Second, while multiple assemblies will reduce reference bias, assembly errors can lead to the detection of false SVs and compromise downstream analysis, particularly when de novo assemblies are generated using different data types or assembly algorithms. A third challenge is the consolidation of pan-genome variation into a single reference or coordinate system, a useful step for the analyses of the biological significance of SVs in crop species including QTL analysis, GWAS, and genomic prediction.

Several methods exist for summarizing SV information in a pan-genome context. One approach is to map resequencing reads to a reference genome, de novo assemble unmapped reads, and add the assembled contigs to the reference assembly (known as the map-to-pan approach) [48, 86]. This strategy can minimize errors by exploiting the information already available from a high-quality reference genome and limit the coordinate consolidation issue, but the genomic locations of newly assembled contigs remain unknown without further analysis. A second alternative is the construction of a graph-based rather than linear reference genome [87]. In this approach, any variant (SNP or SV) is added to the reference as a node at the genomic location where it is discovered [88, 89]. Recently, a hybrid approach between linear and graph-based reference genomes has been developed to build on the strengths of these methods. In this approach, reads are first mapped to a graph-based genome, and haplotypes are associated with one of the reference genomes used to build the graph. Reads are then realigned to this genome leading to more accurate mapping than the graph-based approach alone [90]. For detailed descriptions of each method, and their advantages and disadvantages, see [91, 92].
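A graph-based reference can be pictured as sequence segments joined by edges, with each variant adding an alternative path. The toy graph below is a sketch under simplifying assumptions: it is acyclic, the sequences are invented, and it does not follow the data model of any particular tool cited above.

```python
# Nodes hold sequence segments; edges list which nodes may follow each node.
nodes = {
    1: "ACGT",
    2: "A",          # reference allele of a SNP
    3: "G",          # alternate allele of the same SNP
    4: "TTC",
    5: "GGGAAACCC",  # insertion (PAV) absent from the linear reference
    6: "GA",
}
edges = {1: [2, 3], 2: [4], 3: [4], 4: [5, 6], 5: [6], 6: []}

def haplotypes(node, prefix=""):
    """Spell out every sequence obtained by walking the graph from `node` to a sink."""
    seq = prefix + nodes[node]
    if not edges[node]:
        return [seq]
    paths = []
    for nxt in edges[node]:
        paths.extend(haplotypes(nxt, seq))
    return paths

for h in haplotypes(1):
    print(h)
```

The graph encodes four haplotypes (two SNP alleles times presence or absence of the insertion). A linear reference would contain only the path through nodes 1, 2, 4, and 6, so reads from the insertion would have nowhere to map.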

Relevance of transposable elements to crop improvement

As pan-genomes become widely available for crop species, TEs, a driver of structural variation, will receive increasing attention in crop improvement. Plant genomes (including crop species) are particularly rife with TEs [93], and the relevance of TEs to crop phenotypes has been repeatedly demonstrated. Transposable elements can be functionally relevant in a number of ways including modifying the structure and amount of gene product that is transcribed (Fig. 2; [14, 23, 35, 37, 94,95,96,97,98,99,100]). For example, in maize, a Harbinger-like DNA transposon represses the expression of the ZmCCT9 gene to promote flowering under long-day conditions [37]. In rice, a Gypsy retrotransposon has been shown to enhance the expression of the OsFRDL4 gene and promote aluminum tolerance [101]. Two Copia retrotransposons independently inserted into the promoter region of the orange Ruby gene, resulting in its enhanced expression and driving convergent evolution of the blood orange trait [102]. Finally, a Copia retrotransposon Rider has created polymorphism in the SUN locus resulting in the oval shape typical of the Roma tomato variety [103, 104]. Despite their prevalence and relevance to agronomic phenotypes, TEs have, until recently, been largely ignored in crop improvement efforts.

Fig. 2 Functional consequences of new transposable element insertions. a Possible effects on gene product structure. b Possible effects on gene product abundance

TEs create the majority of insertions and deletions in many crop genomes. For example, > 75% of large InDels (i.e., ≥ 100 bp) in both tomato [23] and soybean [24] pan-genomes consist of at least one TE. Across four maize lines, there is greater than 1.6 Gb of TE sequence that was found to segregate in just this narrow subset of genotypes [105]. Genome-wide variation in TE content at the species level has, until recently, been difficult to characterize because, as described above, the repetitive fraction of genomes has historically been poorly assembled, and there are challenges with accurate read alignment to these regions. Methods to characterize variation in TE content using short-read data [106] and whole-genome comparisons [105] are emerging and will help provide access to a new level of functional variation underlying agronomic phenotypes.

Once TE sequences are captured in de novo genome assemblies, a critical remaining challenge is an accurate annotation to the family level. Three general approaches are used. The first is homology-based using existing TE databases such as Repbase [107] and P-MITE [108]. This approach is quick because it uses annotations from other species, but is limited by the availability of such information and the extent to which TE sequences are conserved across species [109]. The second approach is based on the copy number of sequences [110,111,112] and is relatively fast and sensitive for the identification of high-copy number repeats. However, the specific annotation of a sequence is unknown (i.e., these could be large gene families, TEs, other types of repeats), and low-copy TEs are often missed. The limited classification information provided by this approach hampers biological inference and utility for crop improvement. The third approach is the de novo identification of TEs based on structural features. Structural annotation does not rely on existing TE libraries and is very sensitive. This method depends critically on knowledge of the diagnostic structural components of TEs and, when this knowledge is incomplete or imprecise, can result in inaccurate annotation [113]. Recently, efforts have been made to combine these approaches into a comprehensive solution for TE annotation. Such pipelines incorporate structural and homology information, repetitiveness, existing TE curations, and extensive filtering to generate high-quality de novo TE annotations. Methods developed based on this approach include EDTA [114] and RepeatModeler2 [112]. Comprehensive TE annotation of high-quality pan-genomes will allow us to further explore their varied roles within crop genomes [115] and to link TE variation, a pervasive form of SV, to phenotypes with agronomic relevance [116].

Advancing QTL mapping and GWAS using crop pan-genomes

Two main approaches are used to identify genomic regions associated with a desired phenotype: QTL mapping with biparental populations and GWAS with panels of diverse individuals. Early crop reference genome assemblies facilitated the development of platforms (e.g., Illumina SNP chips) that allow for rapid, cost-effective genotyping of thousands or millions of SNPs across large sets of individuals. This increase in marker density dramatically increased resolution in mapping studies, which aided in the identification and cloning of QTLs associated with disease resistance, drought tolerance, yield, plant architecture, and other important agronomic traits [117]. With these marker-trait associations identified, breeders can use linked markers to select the best plants in a population without extensive phenotyping, either as functional markers [118] or through marker-assisted selection [119].

One major concern in QTL mapping or GWAS based on a single reference genome is reference bias [120]. If variants associated with a trait are not present in the reference genome, then QTL mapping or GWAS will not be able to detect them (Fig. 3a, b). For example, a maize gene conferring resistance to sugarcane mosaic virus could be identified by GWAS using markers based on the B73, but not the PH207, genome assembly, because the gene was not present in the PH207 assembly [120]. This situation is further exacerbated with more diverse germplasm (i.e., secondary gene pools), making it difficult to identify causative variation and bring it into the germplasm of breeding programs. A further problem is that true deletions relative to a reference genome are indistinguishable from missing data due to technical problems (e.g., low sequence coverage). Imputation of allelic variants across true deletions can result in decreased power to detect a significant association (Fig. 3c).

Fig. 3 Impact of pan-genome representation on dissection of quantitative variation and applications to crop improvement. a Mapping reads to a single reference genome assembly (left) or a pan-genome graph (right) that captures structural variation in the species. b Impact of the read mapping method (single reference assembly vs. pan-genome graph) and subsequent variant calling on the ability to dissect the genetic architecture of a trait. c Causes for lack of identification of significant regions of the genome using variants called by mapping reads to a single reference genome assembly. d Methods breeders can utilize to exploit newly identified variants involve marker-assisted selection (MAS) and/or genomic selection (GS), inserting sequence through a transgene, or making other changes to a causative region with genome editing

QTL and GWAS studies have primarily relied on SNP data to date, but other markers have been useful in linking different types of variation to phenotype. For example, GWAS performed with both read depth variants (RDVs, a proxy for SVs) and SNPs in maize demonstrated that RDVs were enriched for significant GWAS results relative to SNPs for traits such as leaf development and disease resistance [77]. Similarly, in a large-scale GWAS using transcript abundance as a marker, gene associations with maize development traits were identified that were not detected by GWAS using SNPs [10]. While read depth and transcript abundance variants were useful in the initial efforts to assess the importance of SVs to phenotypic variation, they do not capture the complete structural variant landscape within a population. For example, read depth variants can only capture SVs that are present in the reference genome (e.g., insertions relative to the reference are not evaluated), leading to a strong reference bias and an incomplete picture of the relationship between SVs and phenotypes. RNA-seq is focused only on transcribed regions, is dependent on what tissues and developmental stages are sampled, and can be driven by both allelic variation in regulatory regions and true structural variation.

As the crop improvement paradigm shifts to a pan-genome perspective, the contribution of SVs to trait variation is becoming clear. Recently in Brassica napus, GWAS was performed with PAVs identified from eight whole-genome assemblies, and causal associations between SVs and silique length, seed weight, and flowering time were discovered that were not captured by SNP-GWAS. Likewise, GWAS based on the graph-based soybean pan-genome identified a PAV associated with variation in seed luster [24]. In peach, candidate causative SVs for early fruit maturity, flesh color around the stone, fruit shape, and flat shape formation have also been observed [121]. However, our understanding of the importance of SVs to phenotypic trait variation is still in its infancy. As technology and algorithm advancements allow for the complete SV landscape to be characterized at the scale of breeding programs and incorporated into a graph-based framework, it is anticipated that we will see a growing number of SVs underlying phenotypic variation important for crop improvement.

Advancing genomic prediction using crop pan-genomes

A number of important traits for crop improvement are controlled by many QTLs with small effect (e.g., yield). A complex genetic architecture makes it difficult to identify all QTLs underlying a trait, correctly estimate their effects, and introgress them into elite lines using methods such as marker-assisted selection [122,123,124]. Genomic selection is an alternative approach for complex traits, where marker effects are estimated from a training set, the phenotype of an individual is predicted based on the estimated marker effects (i.e., genomic prediction), and selections are made based on the predicted phenotype [125]. Regression and Bayesian approaches for genomic prediction were first described in the early 2000s and revolutionized animal and plant breeding [126]. Using SNPs as predictors, important agronomic traits such as grain yield, grain moisture, grain quality, biomass traits, and stalk and root lodging have been predicted with fairly high accuracy [127,128,129,130,131,132,133].
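The train-then-predict workflow described above can be sketched with ridge regression, one common regression approach to genomic prediction. This is a toy illustration in plain Python with invented genotypes and phenotypes, not the specific models of the cited studies. The penalty, learning rate, and step count are arbitrary assumptions.

```python
def fit_ridge(X, y, lam=0.1, lr=0.01, steps=5000):
    """Estimate marker effects from a training set by ridge regression (gradient descent)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]  # centered markers
    mean_y = sum(y) / n
    beta = [0.0] * p
    for _ in range(steps):
        resid = [mean_y + sum(b * x for b, x in zip(beta, row)) - yi
                 for row, yi in zip(Xc, y)]
        grad = [sum(r * row[j] for r, row in zip(resid, Xc)) + lam * beta[j]
                for j in range(p)]
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return mean_y, col_means, beta

def predict(model, genotype):
    """Predict the phenotype of an unphenotyped individual from its marker genotype."""
    mean_y, col_means, beta = model
    return mean_y + sum(b * (g - m) for b, g, m in zip(beta, genotype, col_means))

# Hypothetical training set: allele counts (0/1/2) at 3 markers, one phenotype each.
X_train = [[0, 1, 2], [1, 0, 2], [2, 1, 0], [1, 2, 1], [2, 0, 1], [0, 2, 0]]
y_train = [8.0, 10.0, 14.0, 11.0, 13.0, 10.0]

model = fit_ridge(X_train, y_train)
candidates = {"lineA": [2, 1, 0], "lineB": [0, 1, 2]}
ranked = sorted(candidates, key=lambda k: predict(model, candidates[k]), reverse=True)
print(ranked)  # candidates ordered by predicted phenotype
```

Selection then acts on the predicted values rather than on measured phenotypes, which is what lets breeders evaluate candidates that have only been genotyped.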

Traditionally, SNPs identified relative to a single reference genome have been used for genomic selection. However, as described above, there are a number of limitations and biases that are introduced with the use of a single reference for such applications. New approaches for identifying markers within a pan-genome framework are needed to improve prediction accuracy. The Practical Haplotype Graph (PHG) is one such method that successfully deals with the complexity of a species’ pan-genome at the scale necessary for complex traits and plant breeding programs [134]. In the PHG approach, existing genomic resources of breeding program founder lines (e.g., whole-genome resequencing data and/or whole-genome assemblies) are loaded into a graph-pan-genome database. Accurate imputation of low-sequence-coverage individuals (as low as 0.01× coverage) in the breeding population is achieved based on consensus haplotypes derived from the graph-pan-genome database. The PHG is a promising strategy for reducing the costs of genotyping, while also capturing a greater breadth of diversity in large breeding populations.

A major issue in genomic prediction is that genotype by environment (G×E) interactions decrease the prediction accuracy for individuals grown in novel environments. Statistical models that account for G×E have been designed to attempt to overcome this limitation [135,136,137]. Incorporation of SV data in such prediction models may further help to address issues of G×E in genomic prediction accuracy, because these variants have been shown to play a particularly important role in adaptation across environments. Not all SVs will be tagged by SNPs [70, 77, 138] and phenotypic variation driven by untagged SVs will be missed by prediction models. For example, Lyra et al. found that predictive ability for maize plant height under low nitrogen increased when adding just a few hundred CNVs to an analysis of 20k SNPs [139]. However, while adding these additional markers may result in higher predictive accuracy, their addition may not be practical in breeding programs at the moment, as they require novel data generation and analysis infrastructure. Breeders need to balance the costs of scoring different markers with the increased efficiency of genomic prediction and genetic gain. For the time being, structural variation information from a pan-genome will be most readily used by breeders if existing SNP genotyping technology includes markers in strong linkage disequilibrium (LD) with phenotypically important SVs. For SVs not tagged by SNPs [70, 77, 138], characterization of these variants using novel approaches is only prudent if the genetic gain is large enough to justify the increased cost.
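Whether an SV is "tagged" by a SNP is usually judged by the LD statistic r² between the two markers. The sketch below computes r² from phased haplotypes coded 0/1. The haplotype data are invented, and any cutoff for calling an SV tagged is a choice left to the analyst.

```python
def ld_r2(snp, sv):
    """Squared correlation (r^2) between two biallelic markers coded 0/1 per haplotype."""
    n = len(snp)
    p_snp = sum(snp) / n
    p_sv = sum(sv) / n
    p_both = sum(1 for a, b in zip(snp, sv) if a == 1 and b == 1) / n
    d = p_both - p_snp * p_sv  # coefficient of linkage disequilibrium
    denom = p_snp * (1 - p_snp) * p_sv * (1 - p_sv)
    if denom == 0:
        return 0.0  # r^2 is undefined for a monomorphic marker; treated here as untagged
    return d * d / denom

# Hypothetical phased haplotypes (1 = alternate allele or SV present).
sv          = [1, 1, 1, 0, 0, 0, 0, 0]
tagging_snp = [1, 1, 1, 0, 0, 0, 0, 0]  # always co-occurs with the SV
other_snp   = [1, 0, 1, 0, 1, 0, 1, 0]  # nearly independent of the SV

print(ld_r2(tagging_snp, sv))  # 1.0
print(ld_r2(other_snp, sv))    # about 0.07
```

An SV with a high r² to a SNP already on a genotyping platform can be followed through that SNP at no extra cost. An SV with low r² to every available SNP would need to be scored directly.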

Future challenges and opportunities in applications of pan-genomics for crop improvement

Beyond the promise that recent genomic advances offer for characterizing diversity in model crop systems and for improvement of trait mapping and prediction, they also present opportunities to tackle difficult and understudied crop genomes and could potentially enable novel, gene-editing approaches to breeding.

Complexity of polyploid genomes

Allopolyploidy (the result of interspecific or intergeneric hybridization and chromosome doubling) and autopolyploidy (the result of whole-genome duplication) are particularly common in plant species [140, 141]. In fact, all angiosperms have undergone at least two rounds of polyploidy in their evolutionary history [142]. Many have returned to a diploid state, bearing remnants of this evolutionary history in their genomes [143]. As a natural mechanism, polyploidization can increase allelic diversity, expand the complement of genes, generate novel phenotypic variation, and aid in adaptation to new environments [144, 145]. Taking advantage of this, plant breeders have also generated artificial polyploids resulting in increased grain yield [146], fruit size [147], and seedless fruit [148].

While polyploid crops are vitally important to sustain human life, genomic studies in these species have traditionally been very challenging for a number of reasons. High-quality genome assembly of polyploid species has been difficult to achieve due to their inclusion of multiple, closely related subgenomes and the associated challenges in discriminating homeologous loci and creating non-mosaic subgenome scaffolds (Fig. 4a). Many have resorted to sequencing diploid progenitors [149] or closely related species [58, 150] of polyploid crops in order to reduce genome complexity when generating initial reference assemblies. However, closely related diploids fail to capture lineage-specific SNPs, SVs, and other forms of variation that have accumulated post-polyploidization [39]. Beyond these difficulties in genome assembly, genomic approaches to polyploid crop improvement face further complications: (1) dissection of the genetic architecture of complex traits can be confounded when variants are not mapped to the correct subgenome [151, 152], a technical limitation, and (2) biologically, the more extensive epistatic interactions in polyploids [153, 154] and regulatory feedback between subgenomes can complicate the accurate prediction of phenotype based on genotype [155](Fig. 4b).

Fig. 4 Impacts of technological advances to facilitate crop improvement in polyploid species. (a) Impact of sequencing technology on polyploid assembly. (b) Example of how understanding of a biological process is facilitated by having structural variation within subgenomes resolved beyond simply characterizing the number of copies in the genome.

Advances in sequencing technologies and assembly algorithms are already addressing technical challenges in crop genomic research in polyploids [156]. Long-read sequencing with low error rates (e.g., PacBio HiFi reads) has made high-quality polyploid genome assembly possible, with recent assemblies containing fewer gaps and resolved homeologous scaffolds (Fig. 4a). Long-read assemblies now exist for polyploid crop species such as peanut [157], wheat [60], oilseed [22], and strawberry [158]. In some instances (e.g., potato [159]), multiple genome assemblies already exist within species. Nascent polyploid pan-genome studies are uncovering substantial diversity across species. For example, the de novo assembly of a single wheat cultivar captured 107,891 genes, and a map-to-pan assembly of 17 additional cultivars captured 30,000 novel genes [53, 60]. As pan-genomic studies expand in polyploid crop species, we expect that, due to genomic redundancy and complexity, the degree of structural variation within polyploid species will be greater than that observed in diploid species, and SVs may be particularly fruitful markers for genomic approaches to polyploid crop improvement. Technical progress in assembling polyploid genomes (e.g., improvements to haplotype and homeolog phasing) should facilitate basic, biological study of the differences in the genotype-to-phenotype map between diploids and polyploids, knowledge of fundamental importance to the future of polyploid crop improvement.

Genomic resources for understudied crop species

For understudied crops, pan-genome-assisted breeding efforts remain limited due to the small size of the research communities for these species and, in some cases, due to the challenges associated with genome complexity. For the majority of understudied crop species, transcriptome assemblies are currently used as a proxy to the genome for improvement efforts. One such example is Silphium integrifolium, a species with a large genome size (2n = 2x = 14; haploid genome size of 9 Gb [160]) that is currently being domesticated into an oil crop. Through transcriptome assembly and resequencing of 68 wild S. integrifolium accessions, several loci associated with adaptation to different climate conditions were identified [161]. While SNP data helped identify loci under selection, structural variation, an important source of local adaptation, remained uncharacterized. Pennycress (Thlaspi arvense) is another species that is currently being domesticated for use as an oil crop [162]. While it has advanced from an initial transcriptome assembly [163] to a full genome assembly [164], access to pan-genome variation is not yet available, despite the relatively small size (539 Mb) and simple genome structure of the species. Turfgrass and forage crops are further examples of understudied crops with limited genomic resources. Perennial ryegrass (Lolium perenne) has a fragmented draft genome [165], which may not be sufficient to enable pan-genomic research within the species. For other turfgrass species, such as hexaploid hard fescue (Festuca brevipila), long-read sequencing of the transcriptome has been used as a proxy of the reference genome, but it remains difficult to distinguish homeologs using this approach [166].

While pan-genomic studies may be in their infancy in non-model crops, it is anticipated that rapid advances in sequencing, assembly algorithms, and analysis pipelines in model systems, together with diminishing costs, will very quickly enable this research. The time from publication of the first rice genome assembly to release of the first rice pan-genome was over a decade ([3]; Table 1). We anticipate that the development of genomic resources, including pan-genomes, will now be much more rapid. Indeed, pan-genomic studies have already been published in Capsicum (pepper) and Juglans (walnut) species [46, 51], and others will soon follow.

Rapid domestication of new and existing species

The recent availability of high-quality genomes and pan-genomes has enabled a new era of crop domestication. With pan-genome information, breeders can more effectively identify causal genetic variants (e.g., SNPs, CNV, PAV) underlying domestication traits and apply gene-editing tools to rapidly achieve desirable agronomic traits in wild plants. For example, the tomato pan-genome has revealed that variation at the fruit weight QTL fw3.2 is caused by tandem duplication of the cytochrome P450 gene SlKLUH [23] rather than a SNP in the gene’s promoter as proposed earlier [167]. CRISPR/Cas9 gene editing to reduce the copy number of the SlKLUH gene successfully altered fruit weight, a crop domestication phenotype [23]. Similarly, by using resequencing data and a map-to-pan approach, Gao et al. conducted a comparative analysis of 725 cultivated tomatoes and close wild relatives, uncovering gene loss during tomato domestication [49]. Further enrichment analysis suggested that defense response genes and nearly 1200 promoter sequences were targeted by selection during domestication and improvement [49]. A non-reference 4 kb substitution in the TomLoxC promoter region was also discovered that modifies fruit flavor [49]. These variants that distinguish crops from their wild relatives are prime targets for gene editing for rapid domestication.

Domestication has greatly reduced the genetic diversity of crops compared to their wild relatives [168]. Identifying and utilizing genetic diversity from crop wild relatives has been a major focus in crop improvement [169, 170]. Together, pan-genome information and CRISPR/Cas9 technologies enable de novo domestication of wild plants and can reduce barriers to the use of genetic variation from secondary and tertiary gene pools (wild relatives) [171, 172]. For example, Zsögön et al. edited six loci in wild tomato (Solanum pimpinellifolium) and significantly increased its yield, productivity, and nutritional value resulting in de novo domestication of tomato [173].

In summary, the complete catalog of variation that has been made possible by recent genomic technology and a pan-genome approach presents a substantial opportunity for crop improvement. We can not only move beyond single-reference-based resequencing in model crops to a full understanding of structural variation and its link to phenotype, but also tackle complex, polyploid genomes, rapidly move understudied crops into the genomic era, and bring down barriers between crops and their wild relatives so that breeders can more easily expand their tool kit to include exotic germplasm. While further infrastructure and method development is necessary to fully realize this potential, there is a paradigm shift in the making.


Discussion

Insights into the Identities of Wild and Feral Populations

The identification of crop wild relatives allows for direct comparisons between wild and crop forms and provides an important source of genetic material for crops (Honnay et al. 2012; Dempewolf et al. 2014), which often have relatively limited genepools due to diversity bottlenecks from selection (Olsen and Gross 2008). Feral relatives can be morphologically indistinguishable from wild material (Wang et al. 2017), and the wild or feral nature of B. rapa populations has been contested (Andersen et al. 2009), hindering domestication research, breeding, and conservation in the species. In our analyses, weedy B. rapa samples from the Caucasus, Siberia, and Italy (“weedy CSI”) consistently segregate in a lineage distinct from weedy samples from the Americas and Europe outside of Italy, a lineage that has not been recovered in previous studies. With the exception of a single accession from Abruzzo in central Italy, and an accession from Tomsk in south-central Siberia, the weedy CSI clade consists of samples in or close to the Caucasus mountains in Georgia and northeastern Turkey. Both mid-Holocene and contemporary niche models suggested high habitat suitability in the Caucasus and Italy, but generally low suitability across Siberia except in the extreme southern region. The lack of suitability in the latter case could be due to a lack of occurrence data from Siberia used in the niche model or erroneous passport data for the Siberian accession. Our analyses present the possibility that this lineage represents either truly wild relatives of some or all B. rapa crops or populations deriving from an early feralization event.

The hypothesis that the weedy CSI clade represents truly wild populations is consistent with Sinskaya’s (1969) suggestion that wild forms still exist in the Caucasus mountains and Siberia, and our niche models indicate that habitat for B. rapa in the Caucasus, Siberia, and Italy would be suitable under a mid-Holocene climate model (fig. 4A). The position of weedy CSI samples as sister to all crops and weeds in the RAxML and NJ analyses, and their high nucleotide diversity, support this interpretation. This lineage is also located at the juncture of all the crop subspecies in the PCA and contains ancestry common to each of the major crop clades in fastSTRUCTURE, which may indicate that the other crops and weeds were subsampled from the diversity present in weedy populations in the CSI range. The placement of weedy CSI in the SNPhylo and TreeMix trees as sister to Central and East Asian crops suggests the possibility that weedy CSI is truly wild, but that an independent domestication event gave rise to the European crops.

An alternative interpretation would be that weedy CSI represents a highly admixed feral lineage. Many crops are known to exist in wild-domesticated-feral hybrid swarms (e.g., Beebe et al. 1997; Wang et al. 2017; Allaby et al. 2008). As mentioned above, the diverse ancestry for these individuals recovered by fastSTRUCTURE and the central position in the PCA could be explained by admixture between crop and/or weedy types. Several patterns traditionally associated with wild progenitors in analyses may instead indicate admixed feral populations (Wang et al. 2017). For example, high levels of admixture could also have inflated nucleotide diversity and caused the clade to be placed at an artificially deep position in phylogenetic analyses. Demographic models tested in moments were inconclusive regarding the early splits of the major lineages, but the models that performed best allowed for extensive early gene flow, supporting this hypothesis. TreeMix recovered migration edges supported by f4-statistics that would suggest gene flow between CSI and toria, and the TreeMix and SNPhylo trees are consistent with weedy CSI populations deriving from an ancient feral lineage that branched off from an ancestor of Central Asian oilseeds and East Asian crops.
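
For readers unfamiliar with f4-statistics, the following minimal sketch shows the quantity involved: for four populations A, B, C, and D, f4(A, B; C, D) is the average over SNPs of (pA - pB)(pC - pD), where p is the allele frequency. Under an unadmixed tree ((A, B), (C, D)) its expectation is zero, so a value far from zero points to gene flow. The allele frequencies below are made up for illustration, and the significance test used in practice (typically a block jackknife) is not shown.

```python
def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean over SNPs of (pA - pB) * (pC - pD)."""
    n = len(p_a)
    if n == 0 or not (n == len(p_b) == len(p_c) == len(p_d)):
        raise ValueError("need four non-empty frequency lists of equal length")
    total = sum((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d))
    return total / n


# Hypothetical allele frequencies at two SNPs in four populations.
pop_a = [0.1, 0.5]
pop_b = [0.3, 0.5]
pop_c = [0.2, 0.4]
pop_d = [0.6, 0.4]
print(f4(pop_a, pop_b, pop_c, pop_d))
```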

Our analyses recovered strong evidence for the feral nature of the weedy populations outside of the CSI clade. Non-CSI European and American weedy samples emerged in association with European crops in PCA, fastSTRUCTURE, RAxML (99% BS), SNPhylo, and NJ analyses and had relatively low differentiation from European crops in the FST analysis. Despite lower levels of diversity among the feral samples, their local adaptation to biotic and abiotic stresses could make them a valuable source of breeding material, as in feral rice (Li et al. 2017).
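
To make "low differentiation in the FST analysis" concrete, here is a minimal sketch of a two-population FST computed from allele frequencies as (HT - HS) / HT, summed across loci before taking the ratio. This simple Wright-style estimator is an assumption chosen for illustration; the source does not state which estimator was used, and the frequencies are invented.

```python
def fst_two_pops(p1, p2):
    """Wright-style FST for two populations from per-locus allele frequencies."""
    if len(p1) != len(p2):
        raise ValueError("frequency lists must have the same length")
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        p_bar = (a + b) / 2.0
        h_total = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity, pooled
        h_within = (2.0 * a * (1.0 - a) + 2.0 * b * (1.0 - b)) / 2.0  # mean within
        num += h_total - h_within
        den += h_total
    return num / den if den > 0 else 0.0


# Hypothetical frequencies: similar populations give a value near zero.
feral = [0.50, 0.20, 0.80]
crop = [0.55, 0.25, 0.75]
print(round(fst_two_pops(feral, crop), 4))
```

Values near 0 indicate that two groups share most of their variation, which is the pattern reported for feral samples and European crops; values near 1 indicate nearly fixed differences.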

Due to the geographic disjuncture of the weedy CSI samples, it is likely that there are related populations in the intervening areas. Several possibilities could explain the geographic disjuncture of the samples from Italy and the Caucasus in the CSI clade. Weedy B. rapa from the intervening area (e.g., Slovakia, Austria, and Serbia) were associated with other European weeds and crops in the analyses, but the TreeMix analysis suggested that introgression had occurred between the CSI clade and the European weeds, consistent with an exoferal origin of most of the European weedy accessions. Truly wild populations could have been more widespread in Europe previously, but may have been largely replaced by successful feral populations that are preadapted to agroecosystems. Alternatively, the sole Italian sample could have been introduced relatively recently as a seed contaminant in imported crops. Weedy populations of B. rapa have recently been reported in Southwest China (Dong et al. 2018), Japan (Aono 2011), and Algeria (Aissiou et al. 2018), despite previous reports of absence from East Asia (De Candolle 1886; McGrath and Quiros 1992). Sampling these populations could further clarify the evolutionary history of B. rapa.

Domestication Center, Timing, and Initial Crop Type

The location and timing of domestication, as well as the initial domesticated form of B. rapa that was subsequently selected for other crop types, have been debated. Our analyses support an initial domestication of turnips and/or oilseeds in Central Asia 3,430–5,930 years before present (YBP) with subsequent diffusion to Europe and East Asia.

Turnips from Central Asia appear to be most closely associated with the putatively wild CSI populations in our FST analysis and had the highest levels of nucleotide diversity of any population. This is consistent with the findings of Qi et al. (2017), who found Central Asian and European turnips to have relatively higher values than East Asian vegetables, and much higher values than yellow sarsons and toria. Given our use of different reduced representation markers, the absolute comparison of values is not possible. Also as in Qi et al. (2017), we estimated highest Ne for turnips, intermediate values for East Asian vegetables, and the lowest values for yellow sarsons and toria. The turnip samples from the Hindu Kush mountains of Afghanistan, Pakistan, and Tajikistan are sister to the rest of the B. rapa samples from West Asia and Europe in our TreeMix, RAxML, and NJ trees and demographic model and are located in an area of very high habitat suitability in the Mid-Holocene species distribution model.
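
Nucleotide diversity (π), the statistic compared across populations here, is the average number of differences per site between all pairs of sequences in a sample. The sketch below computes it for a toy alignment; the sequences are invented, and real analyses of reduced-representation data would also need to handle missing data and invariant sites, which this illustration ignores.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site across aligned, equal-length sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(x != y for x, y in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical four-site alignment of three haplotypes.
sample = ["AAAA", "AAAT", "AATT"]
print(nucleotide_diversity(sample))
```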

East Asian turnips are also associated with the putatively wild samples and Central Asian turnips, as seen in the PCA, fastSTRUCTURE, and FST analyses. East Asian turnips also emerge as sister to other East Asian crops (with the exception of the Japanese leafy crops, which are intermingled with East Asian turnips) in our tree-based analyses. An early Central Asian origin is supported by linguistic evidence for turnips in the Pontic Steppe north of the Caucasus mountains between approximately 6,430 and 4,278 YBP and in S.W. Asia nearly 3,000 YBP (supplementary table 1, Supplementary Material online).

It has been hypothesized that Central Asian oilseeds arose from an independent domestication event (Zhao et al. 2005; Warwick et al. 2008) or from European oilseeds (Song et al. 1988). Although our TreeMix, SNPhylo, NJ, and RAxML trees show Central Asian oilseeds from the mountains of Afghanistan, Pakistan, and India as sister to East Asian crops, they are associated with West, Central, West, and East Asian turnips in fastSTRUCTURE. These results support either an independent domestication for oilseeds or selection on Central Asian turnips before the diversification of other crop forms across Europe and Asia. Literary evidence for oilseed crops in Northern India almost 3,000 YBP supports their antiquity in the area (supplementary table 1, Supplementary Material online). Apparent introgression between the Central Asian oilseed toria and Central Asian turnips as recovered by TreeMix and f4-statistics is consistent with the geographic proximity of these crops. In our tree-based analyses, toria is consistently sister to yellow sarsons.

In the demographic models we tested with weedy CSI as sister to the remaining crops and weeds, the split was estimated at between 3,430 and 5,930 YBP. If the weedy CSI populations are truly wild, this provides a rough temporal estimate of the domestication of B. rapa. Although this prediction is sensitive to our estimates of initial effective population size and generation time, this finding is consistent with archaeological and linguistic evidence (supplementary table 1, Supplementary Material online).
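
The sensitivity to effective population size and generation time follows from how scaled split times are converted to years. Demographic inference tools such as moments report time in units of 2 × N_ref generations, so the calendar estimate is the scaled time multiplied by 2 × N_ref and by the generation time. The sketch below shows that conversion; the function name and all numeric inputs are hypothetical and are not the values used in the study.

```python
def split_time_years(t_scaled, n_ref, generation_time_years):
    """Convert a split time in units of 2*N_ref generations to calendar years."""
    if n_ref <= 0 or generation_time_years <= 0:
        raise ValueError("n_ref and generation_time_years must be positive")
    generations = 2.0 * n_ref * t_scaled
    return generations * generation_time_years


# Hypothetical inputs: the same scaled time under two generation-time assumptions.
t = 0.1
n_ref = 10_000
for gen_time in (1.0, 2.0):
    print(gen_time, split_time_years(t, n_ref, gen_time))
```

Because the conversion is a simple product, doubling either the assumed reference population size or the generation time doubles the estimated age of the split.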

Convergent Phenotypes in Leafy and Oilseed Crops Deriving from Turnip Crops

Our inclusion of diverse leafy crop types from Europe and Asia allowed us to identify distinct evolutionary lineages for at least three different types of leafy B. rapa crops—grelos, rapini, and East Asian greens—all with likely origins in different turnip populations. None of our tree-based analyses recovered these three leafy crops together as a common monophyletic clade and they do not cluster with one another in the PCA. Instead, turnip crops from geographically similar regions emerged as sister to each one.

North African turnips emerged as sister to the Italian leafy crop rapini in the TreeMix, RAxML (97% BS), SNPhylo, and NJ analyses, and Turkish turnips were sister to rapini in the moments demographic model. Our findings differ from those of Qi et al. (2017), who found rapini to be sister to the Central Asian oilseeds, possibly because the majority of the rapini samples used in Qi et al. (2017) were later suspected to be polyploids in Bird et al. (2017) or had Central Asian provenance. Our findings suggest a possible trans-Mediterranean introduction of turnips to Italy followed by selection for leafy forms, or domestication for a leafy type in North Africa, where turnip leaves are also eaten (Hammer and Perrino 1985). A different Mediterranean turnip population, Spanish turnips, however, emerged as sister to the Spanish leafy crop grelos in our RAxML, SNPhylo, and NJ analyses. PCA results also indicate an association of the two Mediterranean leafy crops with distinct turnip populations. This provides the first evidence that grelos were independently selected from local turnip crops instead of sharing a common origin with rapini (Francisco et al. 2009).

A similar pattern was recovered in East Asia, where leafy crops may have an origin in a distinct population of turnips. East Asian turnips emerged as sister to East Asian leafy crops in the TreeMix, RAxML, SNPhylo, and NJ trees consistent with the results from Bird et al. (2017) and Cheng et al. (2016). The origin of East Asian leafy types in turnips is also consistent with Takuno et al.’s (2007) hypothesis that an unimproved proto-crop from Europe or Central Asia was introduced to East Asia where it was subsequently selected into diverse leafy morphotypes. The origins of napa cabbage as a result of intermixing of bok choy and East Asian turnips is supported in our demographic model as in Qi et al. (2017) but not recovered in the TreeMix migration edges supported by four-population tests. Most Japanese leafy greens are also clustered with the East Asian turnips suggesting that they too may share a common origin in turnips and may have been selected from an early introduction of turnips or a proto-vegetable type. Although the East Asian turnips sampled had elevated LD compared with bok choy and napa cabbage, Cheng et al. (2016) found the opposite, perhaps as a result of their broader sampling.

Although these patterns could suggest parallel selection for leafy types out of distinct turnip populations, they may also be explained by a single origin of the leafy morphotype with subsequent introgression resulting in the leafy phenotype arising in other turnip lineages, although the latter possibility is not supported by our TreeMix and f4-statistic analyses. The enlarged root-hypocotyl of turnips is controlled by relatively few genes and may have therefore been lost multiple times through human selection and/or ferality (McGrath and Quiros 1992). The pattern of selection for leafy crops out of conspecific crops with swollen stems or taproots does not seem to be common in analogous crops. A similar trend has been observed, however, in B. juncea, as it appears that root mustard (B. juncea ssp. napiformis) was the initially selected domesticate, followed by leafy, stem, and oilseed types (Yang et al. 2018). This differs from B. oleracea, B. napus, and Beta vulgaris, where leafy or oilseed types appear to predate the swollen stem/root types (Zohary and Hopf 2000; Maggioni 2015; An et al. 2019).

Our limited sampling of turnip rape oilseeds from East Asia and Europe restricts our inferences, but as in Bird et al. (2017) and Qi et al. (2017), turnip rape did not form a monophyletic clade. Our data support an origin of European turnip rape from European turnips, and of East Asian turnip rape from turnips and/or bok choy, consistent with Reiner et al. (1995).

Taxonomic Implications

Our findings indicate that an infraspecific taxonomic recircumscription of B. rapa is warranted to avoid confusion. Several classification systems are currently in use with a combination of subspecies, varieties, and cultivar groups as infraspecific designations (McAlvay et al. 2018). Our findings highlight several paraphyletic lineages that are typically considered to be the same taxon, including turnip rape (B. rapa ssp. oleifera), weedy forms (B. rapa ssp. sylvestris), and rapini/grelos (B. rapa ssp. sylvestris var. esculenta). In the case of the latter name, the “ssp. sylvestris” epithet is misleading given its likely (multiple) origins in turnips. Furthermore, our results indicate that turnips may also have multiple origins or a single origin with reversions to turnip types, with the former scenario suggesting the need for a reexamination of nomenclature. A phylogenetically and morphologically informed recircumscription of B. rapa to create an international standard for delineating these lineages would be a valuable future contribution to Brassica research and breeding.


Climate change affects the genetic diversity of a species

What effects does climate change have on the genetic diversity of living organisms? In a study led by Charité -- Universitätsmedizin Berlin, an international team of researchers studied the genome of the alpine marmot, an ice-age remnant that now lives in large numbers in high-altitude Alpine meadows. The results were unexpected: the species was found to be the least genetically diverse of any wild mammal studied to date. An explanation was found in the marmot's genetic past. The alpine marmot lost its genetic diversity during ice-age related climate events and has been unable to recover it since. Results from this study have been published in the journal Current Biology.

A large rodent from the squirrel family, the alpine marmot lives in the high-altitude mountainous terrain found beyond the tree line. An international team of researchers has now successfully deciphered the animal's genome and found the individual animals tested to be genetically very similar. In fact, the animal's genetic diversity is lower than that of any other wild mammal whose genome has been genetically sequenced. "We were very surprised by this finding. Low genetic diversity is primarily found among highly endangered species such as, for instance, the mountain gorilla. Population numbers for the alpine marmot, however, are in the hundreds of thousands, which is why the species is not considered to be at risk," explains Prof. Dr. Markus Ralser, the Director of Charité's Institute of Biochemistry and the investigator with overall responsibility for the study, which was co-led by the Francis Crick Institute.

As the alpine marmot's low genetic diversity could not be explained by the animal's current living and breeding habits, the researchers used computer-based analysis to reconstruct the marmot's genetic past. After combining the results of comprehensive genetic analyses with data from fossil records, the researchers came to the conclusion that the alpine marmot lost its genetic diversity as a result of multiple climate-related adaptations during the last ice age. One of these adaptations occurred during the animal's colonization of the Pleistocene steppe at the beginning of the last ice age (between 110,000 and 115,000 years ago). A second occurred when the Pleistocene steppe disappeared again towards the end of the ice age (between 10,000 and 15,000 years ago). Since then, marmots have inhabited the high-altitude grasslands of the Alps, where temperatures are similar to those of the Pleistocene steppe habitat. The researchers found evidence to suggest that the marmot's adaptation to the colder temperatures of the Pleistocene steppe resulted in longer generation time and a decrease in the rate of genetic mutations. These developments meant that the animals were unable to effectively regenerate their genetic diversity. Overall results suggest that the rate of genome evolution is exceptionally low in alpine marmots.

Commenting on the significance of their results, Prof. Ralser says: "Our study shows that climate change can have extremely long-term effects on the genetic diversity of a species. This had not previously been shown in such clear detail. When a species displays very little genetic diversity, this can be due to climate events which occurred many thousands of years ago." He adds: "It is remarkable that the alpine marmot managed to survive for thousands of years despite its low genetic diversity." After all, a lack of genetic variation can mean a reduced ability to adapt to change, rendering the affected species more susceptible to both diseases and altered environmental conditions -- including changes in the local climate.

Summarizing the study's findings, Prof. Ralser explains: "We should take the results of the study seriously, as we can see similar warnings from the past. In the 19th century, the passenger pigeon was one of the most abundant species of land birds in the Northern Hemisphere, yet, it was completely wiped out within just a few years. It is possible that low genetic diversity played a role in this." Outlining his plans for further research, he adds: "An important next step would be to study other animals more closely which, like the alpine marmot, managed to survive the ice age. These animals might be trapped in a similar state of low genetic diversity. Currently, estimates of a particular species' extinction risk are primarily based on the number of animals capable of breeding. We ought to reconsider whether this should be the only criterion we use."

Prof. Dr. Markus Ralser was appointed Einstein Professor for Biochemistry at Charité in May 2018. An expert in metabolism, Prof. Ralser came to Charité after spending time at the Francis Crick Institute in London and the University of Cambridge, where he led teams involved in this study. Other researchers involved in the research hailed from the University of Sheffield, Bielefeld University, the Max Planck Institute for Molecular Genetics and other institutions. The researchers originally set out to study the alpine marmot's genome in order to gain a better understanding of the animal's lipid metabolism.



Raymond, M. & F. Rousset, 1995. genepop (Version 1.2): population genetics software for exact tests and ecumenicism. J. Hered. 86: 248–249.

Rether, B., G. Delmas & A. Laouedj, 1993. Isolation of polysaccharide-free DNA from plants. Pl. Mol. Biol. Rep. 11: 333–337.

Rousset, F. & M. Raymond, 1995. Testing heterozygote excess and deficiency. Genetics 140: 1413–1419.

Saghai Maroof, M.A., R.M. Biyashev, G.P. Yang, Q. Zhang & R.W. Allard, 1994. Extraordinary polymorphic microsatellite DNA in barley: species diversity, chromosomal locations, and population dynamics. Proc. Natl. Acad. Sci. U.S.A. 91: 5466–5470.

Slatkin, M., 1985. Gene flow in natural populations. Ann. Rev. Ecol. Syst. 16: 393–430.

Slatkin, M., 1993. Isolation by distance in equilibrium and nonequilibrium populations. Evolution 47: 264–279.

Slatkin, M., 1995. A measure of population subdivision based on microsatellite allele frequencies. Genetics 139: 457–462.

Snogerup, S., M. Gustafsson & R. von Bothmer, 1990. Brassica sect. Brassica (Brassicaceae). Willdenowia 19: 271–365.

Song, K., T.C. Osborn & P.H. Williams, 1990. Brassica taxonomy based on nuclear restriction fragment length polymorphisms (RFLPs). 3. Genome relationships in Brassica and related genera and the origin of Brassica oleracea and B. rapa (syn. campestris). Theor. Appl. Genet. 79: 497–506.

Streiff, R., T. Labbe, R. Bacilieri, H. Steinkellner, J. Glossl & A. Kremer, 1998. Within population genetic structure in Quercus robor L. and Quercus petraea (Matt.) Liebl. assessed with isozymes and microsatellites. Mol. Ecol. 7: 317–328.


Abstract

Understanding the relationship between genetic variation and biological function on a genomic scale is expected to provide fundamental new insights into the biology, evolution and pathophysiology of humans and other species. The hope that single nucleotide polymorphisms (SNPs) will allow genes that underlie complex disease to be identified, together with progress in identifying large sets of SNPs, are the driving forces behind intense efforts to establish the technology for large-scale analysis of SNPs. New genotyping methods that are high throughput, accurate and cheap are urgently needed for gaining full access to the abundant genetic variation of organisms.


Introduction

The Solanaceae or nightshade family consists of more than 3000 species with great diversity in terms of habit, habitat and morphology. Its species occur worldwide, and range from large forest trees in wet rain forests to annual herbs in deserts (Knapp, 2002). Solanum is the largest genus in the family, and includes tomato (Solanum lycopersicum) and various other species of economic importance. Tomato breeding over recent decades has focused on higher productivity and adaptation to different cultivation systems. Its economic success is reflected by the fact that, on a global scale, tomato is one of the most important vegetable crops, with a worldwide production of 161 million tonnes covering some 4 800 000 ha (http://faostat.fao.org/site/567/DesktopDefault.aspx?PageID=567#ancor). However, because it selected for a limited set of traits, including fruit shape and size, domestication of tomato is clearly distinct from species divergence by natural selection (Rodríguez et al., 2011). As a result, its genetic basis has been seriously narrowed, a phenomenon known as the ‘domestication syndrome’ (Hammer, 1985; Doebley et al., 2006; Bai and Lindhout, 2007; Bauchet and Causse, 2012). In more recent times, tomato has been adapted to different growing systems by adjustment of a small number of traits, including self-pruning, plant height, earliness, fruit morphology and fruit color (Rodríguez et al., 2011; Bauchet and Causse, 2012). The relatively small genetic variation became apparent in the face of rapidly changing environmental conditions, competing claims for arable lands, and new consumer requests. These challenges have pushed tomato breeding efforts towards better biotic and abiotic stress tolerance, higher productivity, and increased sensory and nutritional value. However, the reduced genetic variation that resulted from extensive inbreeding has decelerated tomato crop improvement. To enlarge the genetic basis, breeders now focus on introgression of desirable genes from wild relatives into the elite cultivars, but so far, this has been quite limited (Bai and Lindhout, 2007; Singh, 2007).

The first step of introgressive hybridization involves crosses of the cultivated tomato with heirloom species, wild relatives or more distant species of the tomato clade. Introgression breeding is possible as cultivated tomato and related wild species are intra-crossable, and most of the wild species are also inter-crossable (Rick, 1979, 1986; Spooner et al., 2005), despite the fact that diverse mating systems have evolved, varying from allogamous self-incompatible (SI) and facultative allogamous, to autogamous self-compatible (SC). Especially at the geographic margins of the distributions, inter-species changes in incompatibility systems that promote inbreeding over out-crossing have been documented (Peralta et al., 2008; Grandillo et al., 2011). Species boundaries and genetic diversity have been extensively studied in tomato using a wide range of molecular data (Peralta et al., 2008; Grandillo et al., 2011). For example, RFLP analysis showed that genetic diversity for SI species far exceeds that of SC species, estimated at 75% versus 7% (Miller and Tanksley, 1990). Furthermore, ‘within-accession’ genetic variation was estimated at 10% of the ‘between-accession’ variation, in contrast to the genetic variation of the modern cultivars, which was estimated at <5%. This further illustrates the dramatic erosion of genetic diversity in cultivated tomato crops.

Selection of crossing parents for inter-specific hybridization requires insight into phylogenetic relationships for the tomato clade, but trees based on morphological and molecular data have been disputed. Four informal species groups have been proposed for the tomato clade (Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon; Peralta et al., 2008), which are thought to have evolved from the most recent common ancestor approximately 7 million years ago (Nesbitt and Tanksley, 2002; Spooner et al., 2005; Moyle, 2008; Peralta et al., 2008). Despite these studies, evolutionary relationships between the 13 species in the Lycopersicon clade have not been fully resolved; an example is the dichotomy between Solanum pennellii and Solanum habrochaites (Peralta et al., 2005, 2008; Spooner et al., 2005). The evolutionary history of Solanum genomes has also been investigated from the perspective of chromosome organization. The study by Szinay et al. (2012) involving cross-species BAC FISH painting of Solanum species revealed few large rearrangements in the short arm euchromatin of chromosomes 6, 7 and 12, whereas Anderson et al. (2010) demonstrated pairing loops, multivalents and kinetochore shifts in synaptonemal complex spreads of hybrids between members of the tomato clade, suggesting paracentric and pericentric inversions and translocations between the homeologous chromosomes. Furthermore, comparative genomics suggest a Solanum genome landscape in which chromosome evolution for the majority of the 12 chromosomes has been far more dynamic than currently appreciated (Peters et al., 2012). Collectively, these findings demonstrate that evolutionary relationships among the wild relatives should be considered provisional (Peralta et al., 2008).

The availability of high-throughput sequencing technologies has provided unprecedented power to determine genome variation across entire clades, at both the structural and genotype level. Initiatives such as the 1001 Genomes Project for Arabidopsis thaliana, the Drosophila sequence project, and the 1000 Genomes Project for human have demonstrated the existence of a vast amount of intra-specific polymorphic sequence features such as insertion/deletion events (InDels), repeats and single-nucleotide polymorphisms (SNPs) for hundreds of genes (Weigel and Mott, 2009; The 1000 Genomes Project Consortium, 2010; MacKay et al., 2012), and have illustrated that there is no such thing as ‘the genome’ for a particular species. Rather, the range of physiological and developmental traits appears to be reflected in the tremendous amount of sequence variants contributing to intra-specific variation. Considering the overwhelming inter-species genetic variability, tomato germplasm collections represent a gene pool with unprecedented possibilities to address new breeding demands imposed by climate change, world population increase, and consumer needs. Here we have studied this genetic variation by genome sequencing a selection of representative tomato accessions, which has become possible through the recent development of the S. lycopersicum Heinz 1706 reference genome (The Tomato Genome Consortium, 2012). In addition to this reference genome for the Lycopersicon species, we describe construction of reference genomes for three other related species representing the Arcanum, Eriopersicon and Neolycopersicon groups, respectively, providing an expanded resource for detailed comparative genomic studies in the near future. We also present results for robust, high-confidence detection and identification of sequence polymorphisms, heterozygosity levels and introgressions, and assess the genetic diversity within the tomato clade from a phylogenetic and evolutionary perspective. This study provides an invaluable dataset for advanced ‘omics’ studies on sequence-trait relationships and the molecular mechanisms of tomato genome evolution, as well as the development of genotyping-by-sequencing breeding approaches.


Genetic Variation In Sensitivity To Estrogen May Mask Endocrine Disruption

Genetically different strains of laboratory mice vary dramatically in their sensitivity to estrogen, report researchers at the University of California, Davis, in the Aug. 20 issue of the journal Science.

The findings by Jimmy Spearow, a reproductive geneticist, and Marylynn Barkley, a reproductive endocrinologist, call into question the validity of current laboratory-animal-based safety tests of estrogen-like chemicals and suggest that an individual's genetic makeup should be considered when prescribing estrogen and related hormones for medical purposes.

"The use of laboratory animals that genetically are quite resistant to estrogen for the evaluation of possible reproductive effects of various chemicals might be misleading and may mask our appreciation of how global exposure to estrogen-like chemicals threatens wildlife, domestic animals and humans," said Spearow, a research geneticist in UC Davis' Neurobiology, Physiology and Behavior Section.

Estrogen is a naturally occurring hormone that is mimicked by other chemicals dubbed "endocrine disruptors" because they appear to hinder reproduction in fish, wildlife and other mammals by interfering with the normal function of the endocrine system. Such chemicals are found in certain pesticides, plastics, detergents and estrogens derived from plants.

The U.S. Environmental Protection Agency is preparing to screen thousands of pesticides and industrial chemicals for several endocrine-disrupting effects. Previous studies have indicated that estrogen-like endocrine disruptors found in the environment can cause decreased sperm counts, deformed genitals, aberrant mating behavior and sterility in wildlife.

Spearow and Barkley, who study reproductive hormones using mice as a research model, became interested in the possible genetic control over susceptibility to endocrine disruption by estrogen.

"Many commercial outbred lines of laboratory animals have been bred for large litter size and vigor," Spearow explained. "As a result, the males from these strains tend to have larger testes and a decreased sensitivity to the estrogen-triggered mechanism that temporarily 'turns off' the reproductive system."

He theorized that the process of breeding mice and rats that are genetically predisposed to producing large litters of offspring would also result in animals that are less sensitive to estrogen.

"Our concern was that the use of laboratory animals selected for large litter size in product-safety testing might underestimate the role of those estrogen-like chemicals in disrupting reproductive development and function," he said.

To test that notion, Spearow decided to study the effects of estradiol -- a common form of estrogen found in fish, amphibians, reptiles, birds and mammals -- on young male mice of different strains. He examined several strains of mice, including C57BL/6J (B6) mice, which are widely used in producing genetically customized mice for biomedical research; C17 mice, which were developed by random selection followed by inbreeding; S15 mice, which were developed by selection for large litters followed by inbreeding; and CD-1 mice, which produce large litters and are frequently used in toxicological and pharmacology studies.

When the mice were all 22 or 23 days old, the researchers surgically placed tiny tubules filled with increasing doses of estradiol under their skin. The implants were prepared in such a way as to gradually release estradiol.

When the mice were 43 days old, the researchers checked for possible endocrine-disrupting effects resulting from the estradiol by measuring the weight of the mice's testes. They discovered that the testis weight of mice receiving empty control implants differed among strains.

More importantly, while estradiol treatments suppressed testis weight in all strains of mice, strains differed dramatically in their sensitivity to estradiol. Of the treated mice, the B6 mice appeared to be most sensitive, experiencing a 60 percent suppression of testis weight even at the lowest dose of estradiol. C17 and S15 mice were almost as sensitive as B6 mice to the suppression of testis weight in response to estradiol. The CD-1 strain of mice, known for large litters, showed a high resistance to estrogen, exhibiting only a 30 percent suppression of testis weight even with the highest estradiol doses.
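Because testis weight in the untreated controls already differed among strains, the suppression percentages quoted above are most naturally read as reductions relative to each strain's own control group. The short Python sketch below illustrates that arithmetic only. The function name `percent_suppression` and all weights are hypothetical, chosen so that the results match the 60 percent and 30 percent figures in the text; they are not data from the study.

```python
def percent_suppression(control_mg: float, treated_mg: float) -> float:
    """Percent reduction in testis weight relative to the same-strain control."""
    if control_mg <= 0:
        raise ValueError("control weight must be positive")
    return 100.0 * (control_mg - treated_mg) / control_mg


# Hypothetical (control, treated) weights in mg; NOT measurements from the study.
# They are picked only to reproduce the percentages quoted in the article.
examples = {
    "B6": (100.0, 40.0),
    "CD-1": (150.0, 105.0),
}

for strain, (control, treated) in examples.items():
    value = percent_suppression(control, treated)
    print(f"{strain}: {value:.0f} percent suppression")
```

Normalizing to a strain-matched control keeps a strain with naturally larger testes, such as a line selected for large litters, from looking artificially resistant or sensitive.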

Testes of several mouse strains also were examined to see if sperm development and production were affected by the estradiol treatments. Spearow found that low doses of estradiol eliminated sperm development in both the B6 and C17 strains. Sperm maturation in CD-1 mice, however, was not inhibited by low doses of estradiol and showed little or no inhibition in response to the highest doses of estradiol. This provided further evidence that the highly prolific CD-1 strain of mice is much more resistant to the endocrine-disrupting effects of estrogen.

"It is clear that CD-1 is over 16 times more resistant to endocrine disruption by estrogen than B6 and C17 strain mice," Spearow said. "Furthermore, extrapolation of the CD-1 data suggests that this line of mice is about 100 times more resistant than those other strains.

"This study and a related study in rats potentially explain why doses of estrogenic chemicals resulting in endocrine disruption in fish and wildlife failed to disrupt reproductive development in previous laboratory animal studies," Spearow added. "The laboratory-animal endocrine-disruption studies to date seem to have used estrogen-resistant lines of mice and rats for product-safety testing."

He suggested that this demonstration of major genetic differences in sensitivity to the disruption of reproductive development and sperm formation in young male mice has widespread implications.

"Because genes controlling prolificacy are also associated with differences in estrogen sensitivity, there is likely to be a broad variation in estrogen sensitivity in various animal populations and species, including humans," he said. "Accurate monitoring of endocrine disruption will require that we consider an animal's genetic sensitivity to estrogen as well as its environmental exposure to estrogen-like chemicals."

Spearow contends that the issue of genetic variation in susceptibility to endocrine disruption should not be ignored by the Environmental Protection Agency in its testing of thousands of chemicals for this activity.

"Considering these genetic variations in the estrogen sensitivity of an individual or species will be important not only when testing for endocrine-disrupting properties in industrial chemicals and pesticides, but also when determining therapeutic doses of estrogen and related steroid compounds in human medicine," Spearow and Barkley emphasized.

For example, an individual's genetically controlled response to estrogen should be considered when determining the appropriate dose of hormones used in contraceptives, hormone replacement therapy, and prevention and treatment of breast and prostate cancer, they explained.

In other studies, Spearow has discovered major differences between strains of mice in how females respond to fertility drugs to produce estrogen and ovulate. Furthermore, he has mapped genes controlling hormone-induced ovulation rate and ovarian estrogen production to specific chromosomal regions. Information on these genes could help optimize fertility drug treatments and improve the hormonal induction of reproduction in humans, farm animals and an increasing number of captive-bred endangered species.

Collaborating on this study were UC Davis undergraduate students Paul Doemeny, Robyn Sera and Rachael Leffler.

