
Under scenarios of stabilizing or disruptive selection, we can add a quadratic component to our model of phenotype and fitness like so.

Specifically, I am not clear on where the 1/2 comes from, nor do I understand the derivation of gamma, which is defined as follows:

Here x is the phenotype of interest, w(x) is the fitness of that phenotypic value, and uBS is the mean phenotype before selection. Any help would be greatly appreciated.

The coefficient $1/2$ is a matter of definition and convenience. One could have written $\beta x_i + \gamma x_i^2 + \bar{w}$, but then a factor of $2$ would surface in the other formulas.
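As a short worked step, and assuming the quadratic model has the usual form with intercept $\bar{w}$ (the question's original equation is not reproduced here, so this form is an assumption), differentiating twice shows what the factor buys:

$$
w(x) = \bar{w} + \beta x + \tfrac{1}{2}\gamma x^{2}
\quad\Longrightarrow\quad
\frac{dw}{dx} = \beta + \gamma x,
\qquad
\frac{d^{2}w}{dx^{2}} = \gamma .
$$

With the $1/2$ in place, $\gamma$ is exactly the curvature of the fitness function. Without it, the curvature would be $2\gamma$, and that factor of $2$ would have to be carried through every later expression.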

The other two equations are adapted to the one-dimensional case (one trait) from Lande and Arnold's paper, "The measurement of selection on correlated characters". In particular, the equation with the covariance is their equation (13a), which follows from their more general equation (4) for quadratic selection. The last equation is a restatement of their (14b).

**Update**

Here I reuse the notation of the above-mentioned article by Lande and Arnold, but adapt it to the case of a single trait. The variance matrix $P$ is then just a number (the variance). Equations (14a) and (13b) define
$$
\gamma=\frac{1}{P^2}C, \qquad C=\mathrm{Cov}\left[w, (z-\bar{z})^2\right], \tag{A}
$$
which is the definition of $\gamma$ in the OP. It is essential here to use the correct expression for the covariance:
$$
\mathrm{Cov}\left[f(z), g(z)\right] = \int \left[f(z)-\bar{f}\right]\left[g(z)-\bar{g}\right]p(z)dz = \int f(z)g(z)p(z)dz - \bar{f}\bar{g}
$$
In our case we thus have:
$$
C = \int w(z)(z-\bar{z})^2p(z)dz - \bar{w}\,\overline{(z-\bar{z})^2},
$$
where $\bar{w}=1$ by definition, whereas $\overline{(z-\bar{z})^2}=P$ is the variance of the trait. We thus have
$$
C = \int w(z)(z-\bar{z})^2p(z)dz - P, \tag{B}
$$
where
$$
p(z)=\frac{1}{\sqrt{2\pi P}}e^{-\frac{(z-\bar{z})^2}{2P}}.
$$

Let us now consider the integral
$$
\int \frac{\partial^2 w(z)}{\partial z^2}p(z)dz
= \left.\frac{\partial w(z)}{\partial z}p(z)\right|_{-\infty}^{+\infty} - \int \frac{\partial w(z)}{\partial z}\frac{\partial p(z)}{\partial z}dz
= \left.\frac{\partial w(z)}{\partial z}p(z)\right|_{-\infty}^{+\infty} - \left.w(z)\frac{\partial p(z)}{\partial z}\right|_{-\infty}^{+\infty} + \int w(z)\frac{\partial^2 p(z)}{\partial z^2}dz
$$
The first two terms vanish, since $w(z)$ is bounded, whereas $p(z)\rightarrow 0$ as $z\rightarrow \pm\infty$. We thus have
$$
\int \frac{\partial^2 w(z)}{\partial z^2}p(z)dz
= \int w(z)\frac{\partial^2 p(z)}{\partial z^2}dz
= \int w(z)\frac{\partial^2}{\partial z^2}\frac{1}{\sqrt{2\pi P}}e^{-\frac{(z-\bar{z})^2}{2P}}dz
= \int w(z)\left[\frac{(z-\bar{z})^2}{P^2}-\frac{1}{P}\right]\frac{1}{\sqrt{2\pi P}}e^{-\frac{(z-\bar{z})^2}{2P}}dz
$$
$$
= \int w(z)\left[\frac{(z-\bar{z})^2}{P^2}-\frac{1}{P}\right]p(z)dz
= \frac{1}{P^2}\int w(z)(z-\bar{z})^2p(z)dz - \frac{\bar{w}}{P}
= \frac{1}{P^2}\left[\int w(z)(z-\bar{z})^2p(z)dz - P\right]
= \frac{C}{P^2}=\gamma,
$$
where we used $\bar{w}=1$, and equations (A) and (B) above.
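The identity can also be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a Gaussian fitness function with hypothetical optimum `theta` and width `omega`, rescales it so that mean fitness is 1, and compares the mean second derivative of $w$ with $\mathrm{Cov}[w,(z-\bar{z})^2]/P^2$ using a plain Riemann sum.

```python
import math


def gaussian_pdf(z, mean, var):
    """Normal density p(z) with the given mean and variance P."""
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def check_gamma(mean=0.0, var=1.0, theta=0.5, omega=2.0, half_width=12.0, n=20001):
    """Return (mean curvature of w, Cov[w, (z - mean)^2] / P^2).

    theta and omega are hypothetical parameters of an assumed Gaussian
    fitness function; they are illustrative, not taken from the source.
    """
    sd = math.sqrt(var)
    lo = mean - half_width * sd
    dz = 2.0 * half_width * sd / (n - 1)
    grid = [lo + i * dz for i in range(n)]

    def raw_w(z):
        # Unscaled fitness; it is divided by its mean below so that w-bar = 1.
        return math.exp(-(z - theta) ** 2 / (2.0 * omega ** 2))

    def raw_w_second_derivative(z):
        # Analytic second derivative of raw_w with respect to z.
        return raw_w(z) * ((z - theta) ** 2 / omega ** 4 - 1.0 / omega ** 2)

    def expectation(f):
        # Riemann-sum approximation of the integral of f(z) p(z) dz.
        return sum(f(z) * gaussian_pdf(z, mean, var) for z in grid) * dz

    w_bar = expectation(raw_w)
    mean_curvature = expectation(raw_w_second_derivative) / w_bar
    # Equation (B): C = integral of w (z - mean)^2 p dz - P, with w-bar = 1.
    cov = expectation(lambda z: raw_w(z) / w_bar * (z - mean) ** 2) - var
    gamma = cov / var ** 2  # equation (A)
    return mean_curvature, gamma


curvature, gamma = check_gamma()
print(curvature, gamma)  # both are approximately -0.19 for the defaults
```

The two numbers agree to many decimal places for the default parameters, as the integration by parts predicts. The agreement relies on $p(z)$ being Gaussian, which is the assumption built into the derivation.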

Coevolutionary interactions depend upon a phenotypic interface of traits in each species that mediate the outcome of interactions among individuals. These phenotypic interfaces usually involve performance traits, such as locomotion or resistance to toxins, that comprise an integrated suite of physiological, morphological and behavioral traits. The reciprocal selection from species interactions may act directly on performance, but it is ultimately the evolution of these underlying components that shapes the patterns of coevolutionary adaptation in performance. Bridging the macroevolutionary patterns of coevolution to the ecological processes that build them therefore requires a way to dissect the phenotypic interface of coevolution and determine how specific components of performance in one species exert selection on complementary components of performance in a second species. We present an approach for analyzing the strength of selection in a coevolutionary interaction where individuals interact at random, and for identifying which component traits of the phenotypic interface are critical to mediating coevolution. The approach is illustrated with data from a predator-prey arms race between garter snakes and newts that operates through the interface of tetrodotoxin (TTX) and resistance to it.

Because performance traits are operationally defined in terms of meeting some organismal-level challenge, they are typically considered to exist on the “frontline” of selection (Arnold, 1983; Garland *et al.*, 1990; Garland and Carter, 1994). At first glance, then, performance traits would be expected to evolve relatively quickly because of the strength of selection they experience. However, performance traits are complex characters that comprise a variety of underlying components (Arnold, 1983). Any performance trait, such as prey-handling time or thermal tolerance, is the culmination of a variety of physiological, behavioral and morphological factors interacting within an individual. Although selection may target performance directly, it is through the evolution of these underlying component traits that performance itself evolves (Arnold, 1983; Geffeney *et al.*, 2002).

Understanding the evolution of performance therefore requires a truly integrative approach to dissecting the mechanistic relationships among component traits and among these components and performance itself. The heuristic and statistical model described by Arnold (1983) to reveal these relationships has taken us a long way toward understanding how selection is filtered through performance to influence the evolution of underlying traits. This path-based approach remains one of the more powerful tools in integrative biology and continued refinements of the tool have increased the breadth of problems and types of interactions among traits that can be considered under its mantle (Arnold, 1988; Kingsolver and Schemske, 1991; Scheiner *et al.*, 2000).

Nonetheless, the application of the trait-performance-fitness paradigm remains primarily the realm of ecological and evolutionary physiology. As a result, performance traits are typically considered in the context of interactions between individual organisms and their environments. Many of the most critical challenges, and consequently the strongest selection, that an organism meets come from ecological interactions with other species. When a performance trait exists with respect to another species, the selective environment of importance might itself be a performance trait made up of other underlying characters. Such is the case for coevolutionary interactions.

Coevolution is driven by reciprocal selection that results from ecological interactions among individuals of two or more species ( Thompson, 1994). Such reciprocal selection occurs at a phenotypic interface that comprises the trait (or traits) that determine fitness outcomes of the interaction ( Brodie and Brodie, 1999*b*). The phenotypic interface of coevolution will almost invariably involve complementary performance traits, because the critical aspect of any species interaction is how well one species “performs” against the other ( Fig. 1). For example, in a plant-herbivore interaction, the phenotypic interface might involve the induction of chemical defenses by the plant and the resistance to those same chemicals by the herbivore. The plant performance trait would include not only baseline levels of defensive compounds, but also mechanisms of recognition of herbivory and biochemical regulation of defensive compound production. Herbivore performance could include both physiological and behavioral components of resistance to the effects of defensive compounds along with behavioral traits that allow avoidance of defenses.

It should be clear that the reciprocal nature of such interactions presents an immediate problem for understanding the evolution of performance at the phenotypic interface of coevolution. Such performance traits do double duty as both targets and agents of selection. In other words, the performance traits at the phenotypic interface simultaneously cause selection on and experience selection by one another. Dissecting the relative importance of the underlying components as targets of selection on the one hand and agents of selection on the other requires simultaneous consideration of complementary performance traits in each of the two interacting species.

Our goal in this paper is to suggest an approach to quantifying and visualizing selection on performance traits at the phenotypic interface of coevolution. This challenge requires understanding how traits in one species influence performance and consequently fitness in an interaction, as well as understanding which traits of the coevolutionary partner influence fitness through interactions with specific traits in the focal species. We suggest that fitness in one species can be expressed as a function of not only the traits present in those individuals, but also as a function of the traits of the coevolutionary partner with which it interacts. The basic technique builds on the covariance approach to understanding selection and allows expression of functions as surfaces so that reciprocal selection can be visualized and more intuitively interpreted.

We first outline the general theoretical approach to the problem and then suggest a method for determining the critical fitness functions from empirical data. The approach is based on direct observation of fitness or performance in interactions between pairs of individuals in both species for which phenotypic values are known. Because we are unaware of the existence of such data sets, we illustrate the approach with hypothetical data based on a well-studied empirical system, an arms race between garter snakes (*Thamnophis sirtalis*) and toxic newts (*Taricha*) (Brodie and Brodie, 1999*b*; Brodie *et al.*, 2002). Some *a priori* knowledge of the phenotypic interface, such as is available for the newt-snake system, is needed to identify the performance traits of interest and their underlying components. Using this sort of information, expected reciprocal selection functions can be deduced and data simulated to illustrate the estimation of these functions.
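To give a sense of what simulating such paired data might look like, here is a minimal sketch. It is not the authors' method: the logistic survival rule, the standardized trait distributions, and all function names are assumptions chosen for illustration. Snakes and newts are paired at random, the outcome of each encounter depends on the difference between snake resistance and newt toxicity, and covariance-based linear gradients are then estimated for each species.

```python
import math
import random


def simulate_pairs(n_pairs=5000, seed=1):
    """Simulate random snake-newt encounters (hypothetical data)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_pairs):
        resistance = rng.gauss(0.0, 1.0)  # snake trait, standardized
        toxicity = rng.gauss(0.0, 1.0)    # newt trait, standardized
        # Assumed rule: the snake does better as resistance exceeds toxicity.
        p_snake = 1.0 / (1.0 + math.exp(-(resistance - toxicity)))
        snake_fitness = 1.0 if rng.random() < p_snake else 0.0
        newt_fitness = 1.0 - snake_fitness  # the newt survives if the snake fails
        records.append((resistance, toxicity, snake_fitness, newt_fitness))
    return records


def linear_gradient(trait, fitness):
    """Cov(relative fitness, trait) / Var(trait) for a single trait."""
    n = len(trait)
    w_bar = sum(fitness) / n
    relative = [w / w_bar for w in fitness]
    z_bar = sum(trait) / n
    cov = sum((z - z_bar) * (w - 1.0) for z, w in zip(trait, relative)) / n
    var = sum((z - z_bar) ** 2 for z in trait) / n
    return cov / var


def estimate_reciprocal_selection(n_pairs=5000, seed=1):
    resistance, toxicity, snake_w, newt_w = zip(*simulate_pairs(n_pairs, seed))
    return {
        # Selection on each species' own trait.
        "snake_resistance": linear_gradient(resistance, snake_w),
        "newt_toxicity": linear_gradient(toxicity, newt_w),
        # Effect of the partner's trait on snake fitness (toxicity as agent).
        "toxicity_on_snake": linear_gradient(toxicity, snake_w),
    }


print(estimate_reciprocal_selection())
```

Under these assumptions each species experiences positive directional selection on its own trait, while the partner's trait lowers fitness. That is the sense in which a trait at the interface acts as both a target and an agent of selection.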

## Optimization Problem Types - Smooth Nonlinear Optimization

A smooth nonlinear programming (NLP) or nonlinear optimization problem is one in which the objective or at least one of the constraints is a **smooth nonlinear function** of the decision variables. An example of a smooth nonlinear function is:

. where X_{1}, X_{2} and X_{3} are decision variables. Nonlinear functions may be convex or non-convex, as described below. A quadratic programming (QP) problem is a special case of a smooth nonlinear optimization problem, but it is usually solved by specialized, more efficient methods. Nonlinear functions, unlike linear functions, may involve variables that are raised to a power or multiplied or divided by other variables. They may also use transcendental functions such as exp, log, sine and cosine.

NLP problems and their solution methods require nonlinear functions that are **continuous**, and (usually) further require functions that are **smooth** -- which means that **derivatives** of these functions with respect to each decision variable, i.e. the function **gradients**, are continuous.

A continuous function has no "breaks" in its graph. The Excel function =IF(C1>10,D1,2*D1) is discontinuous if C1 is a decision variable, because its value "jumps" from D1 to 2*D1. The Excel function =ABS(C1) is continuous, but nonsmooth -- its graph is an unbroken "V" shape, but its derivative is discontinuous, since it jumps from -1 to +1 at C1=0.
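The distinction can be illustrated with a small numerical sketch. The Python functions below are stand-ins for the two Excel formulas (the value `d1=3.0` is an arbitrary assumption); the code compares values just to the left and right of the breakpoint, and then compares one-sided slopes.

```python
def excel_if(c1, d1=3.0):
    """Stand-in for =IF(C1>10, D1, 2*D1); d1 is an arbitrary constant."""
    return d1 if c1 > 10 else 2 * d1


def one_sided_values(f, x, h=1e-6):
    """Function values just to the left and just to the right of x."""
    return f(x - h), f(x + h)


def one_sided_slopes(f, x, h=1e-6):
    """Finite-difference slopes approaching x from the left and the right."""
    left = (f(x) - f(x - h)) / h
    right = (f(x + h) - f(x)) / h
    return left, right


# Discontinuous: the value itself jumps at C1 = 10.
print(one_sided_values(excel_if, 10.0))   # (6.0, 3.0)

# Continuous but nonsmooth: the values agree at 0, the slopes do not.
print(one_sided_values(abs, 0.0))
print(one_sided_slopes(abs, 0.0))         # slopes close to -1 and +1
```

A gradient-based solver that samples either function near its breakpoint receives slope information that says nothing about the other side, which is why such functions cause trouble for smooth NLP methods.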

An NLP problem where the objective and all constraints are **convex** functions can be solved efficiently to global optimality, up to very large size; interior point methods are normally very effective on the largest convex problems. But if the objective or any constraints are **non-convex**, the problem may have multiple feasible regions and multiple locally optimal points within such regions. It can take time **exponential** in the number of variables and constraints to determine that a non-convex NLP problem is infeasible, that the objective function is unbounded, or that an optimal solution is the "global optimum" across all feasible regions.

Although functions can be non-smooth but convex (or smooth but non-convex), you can expect much better performance with most Solvers if your problem functions are all **smooth and convex**.

#### Solving NLP Problems

There are a variety of methods for solving NLP problems, and **no single method is best for all problems**. The most widely used and effective methods, used in Frontline's solvers, are the Generalized Reduced Gradient (GRG) and Sequential Quadratic Programming (SQP) methods, both called *active-set* methods, and the Interior Point or Barrier methods.

NLP solvers generally exploit the smoothness of the problem functions by **computing gradient values** at various trial solutions, and moving in the direction of the negative gradient when minimizing (or the positive gradient when maximizing). They usually also exploit second derivative information to follow the curvature as well as the direction of the problem functions. To solve constrained problems, NLP solvers must take into account feasibility and the direction and curvature of the constraints as well as the objective.

As noted above, if the problem is non-convex, NLP solvers normally can find only a **locally optimal solution**, in the vicinity of the starting point of the optimization given by the user. It is frequently possible, but considerably more difficult, to find the globally optimal solution. To learn more about this issue, see Global Optimization Methods.
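The dependence on the starting point is easy to see in a toy example. The sketch below uses plain fixed-step gradient descent, which is far simpler than the GRG, SQP or interior point methods named above, on an assumed one-variable non-convex function that has two local minima.

```python
def f(x):
    """A smooth, non-convex example objective (chosen for illustration)."""
    return x ** 4 - 3 * x ** 2 + x


def grad(x):
    """Analytic derivative of f."""
    return 4 * x ** 3 - 6 * x + 1


def gradient_descent(x0, step=0.01, iters=5000):
    """Move repeatedly against the gradient from the starting point x0."""
    x = x0
    for _ in range(iters):
        x -= step * grad(x)
    return x


x_left = gradient_descent(-2.0)   # converges to the minimum with x < 0
x_right = gradient_descent(2.0)   # converges to the minimum with x > 0
print(x_left, f(x_left))
print(x_right, f(x_right))
```

Both runs stop at a point where the gradient is essentially zero, but only the run started at `-2.0` reaches the lower of the two minima. Nothing in the local gradient information tells the second run that a better solution exists elsewhere.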

## Results

### Paternity analysis and male reproductive fitness

We quantified MRS from 10 replicate groups, each of which contained six males and eight females that were housed together for 7 days (see Methods). MRS was calculated as the proportion of offspring sired by each male (*n*=60) over the total offspring produced in his replicate group. Seventy-two of the eighty females used in the experiment produced broods (*n*=532; mean brood size±s.d.=7.39±3.59; range=1–16) from which 530 offspring could be assigned unambiguously to one sire (99.6%). Four females produced a single offspring and therefore multiple paternity could not be assessed in these broods. Our microsatellite paternity analyses confirmed that 47 females (∼69%) produced offspring sired by two or more males (mean±s.d. sires=2.18±1.01; range 1–5). Because female guppies are able to store sperm for several months, some females (*n*=54) produced second broods, which we also genotyped for parentage analysis (see Methods). Although we do not consider these second broods for estimating MRS (see below), they were helpful in identifying females that had demonstrably mated with more males than those that were identified only through paternity analysis of the first brood (see Methods). The analysis of paternity in these second broods revealed an additional four females that had mated with two or more males during the mating trials (that is, females producing offspring in the first brood that were sired by just one male but who subsequently produced at least one offspring from a different male). With the inclusion of these data in our analysis, our revised estimate of female mating rate increased to 2.40±1.1 s.d. mates per female (range 1–5). The number of sires per brood was positively correlated with brood size (Pearson's *r*=0.246, *P*=0.044, *n*=68; Fig. 1a).

(**a**) Correlation (Pearson’s *r*=0.246, *P*=0.044, *n*=68) between number of sires per brood and brood size, where different sizes for circles correspond to different number of cases. Frequency distributions of male (**b**) postcopulatory success (PCS, *n*=54), (**c**) mating success (MMS, *n*=60) and (**d**) reproductive success (MRS, *n*=60), respectively.

The mean male PCS, corrected for the number of males competing in a single brood (see Methods), was 0.38±0.24 s.d. (range=0.00–0.81, Fig. 1b). Males sired offspring with an average of 2.53±1.94 s.d. females (range 0–7) but mated with an average of 2.88±1.85 s.d. females (range 0–7). Since the number of females that produced a brood varied across tanks (mean±s.d.=7.2±0.79, range=6–8), estimates of MMS were expressed as the number of females with which the male mated over the total number of females that produced offspring (mean±s.d.=0.40±0.26, range=0–1.00, Fig. 1c). The mean proportion of offspring sired by males (MRS) was 0.17±0.15 s.d. (range=0–0.75, Fig. 1d).

### Partition of MRS variance into its MMS and PCS components

Our variance-partitioning analysis revealed that ∼40% of the variance in MRS was explained by MMS, 38% by PCS and 41% by the covariance between MMS and PCS (Table 1). As pointed out in ref. 26, the sum of these components can exceed 100% because the total variance is not simply the sum of its component variances and covariances; higher-order terms (the products of variances and covariances) and skewness in the data also contribute to the total. MMS and PCS were positively correlated (Pearson's *r*=0.581, *n*=54), indicating that males that do well in obtaining mates also do well when competing for fertilization. The observed correlation coefficient was significantly higher than the expected (simulated) correlation coefficient due to the estimation of MMS from paternity data (mean simulated *r*=0.108, *P*=0.0001, Monte Carlo simulation based on 10,000 replicates; see Methods), indicating that the observed covariation between MMS and PCS is greater than expected by chance.

### Multivariate selection analysis and fitness surfaces

We detected significant positive linear (*β*) selection on gonopodium length (Table 2) and significant negative nonlinear selection (*γ*) on iridescent area and gonopodial thrust frequency (see *γ* coefficients on diagonal in Table 2). We also found evidence for positive correlational selection on gonopodium length and iridescent colouration (see Table 2). We conducted canonical rotation of the *γ* matrix, which generates a matrix of new composite trait scores (eigenvectors, **m1, m2, …m7**, in which trait representation is similar to that of a principal component analysis), each describing a major axis of selection in the fitness surface (refs 27, 28). Following this, we detected nonlinear selection on four **m** vectors, revealing significant disruptive (**m2**) and stabilizing selection (**m4**, **m5** and **m6**; see Table 3 and Figs 2 and 3). The **m2** vector was primarily loaded by gonopodium length (+) and secondarily by body size (+), sperm velocity (−) and iridescent area (+). The **m4** vector yielded a negative eigenvalue and was primarily loaded by courtship display rate (+) and body size (+), while the (negative) **m5** vector was loaded by orange colouration (+), gonopodium length (+), body size (−) and iridescent colouration (−). Finally, **m6**, with the highest negative significant eigenvalue, was strongly associated with gonopodial thrust rate (+) and sperm velocity (−). Fitness surfaces were obtained by fitting thin-plate splines on the significant major axes of selection (**m2**, **m4**, **m5** and **m6**). We illustrate the strongest pattern of disruptive selection with vectors **m2**–**m6** (Fig. 2) and stabilizing selection with vectors **m4**–**m5** (Fig. 3). Other possible combinations (that is, **m2**–**m4**, **m2**–**m5** and so on) yielded little further information (see Supplementary Figs 1–4).
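Canonical rotation amounts to an eigen-decomposition of the symmetric *γ* matrix. As a minimal sketch, the code below does this in closed form for a two-trait case; the matrix entries are hypothetical and are not the values from Table 2, and a real seven-trait analysis would use a numerical linear algebra routine instead.

```python
import math


def canonical_axes_2x2(g11, g12, g22):
    """Eigen-decompose the symmetric matrix [[g11, g12], [g12, g22]].

    Returns [(eigenvalue, unit eigenvector), ...] with the largest
    eigenvalue first.
    """
    if g12 == 0.0:
        # Already diagonal: the original traits are the canonical axes.
        pairs = [(g11, (1.0, 0.0)), (g22, (0.0, 1.0))]
        return sorted(pairs, key=lambda pair: pair[0], reverse=True)
    centre = (g11 + g22) / 2.0
    radius = math.hypot((g11 - g22) / 2.0, g12)
    axes = []
    for eigenvalue in (centre + radius, centre - radius):
        # (g12, eigenvalue - g11) solves (gamma - eigenvalue * I) v = 0.
        v1, v2 = g12, eigenvalue - g11
        norm = math.hypot(v1, v2)
        axes.append((eigenvalue, (v1 / norm, v2 / norm)))
    return axes


# Hypothetical gamma matrix for two standardized traits.
for eigenvalue, loadings in canonical_axes_2x2(-0.2, 0.15, 0.1):
    curvature = "concave up (disruptive)" if eigenvalue > 0 else "concave down (stabilizing)"
    print(round(eigenvalue, 4), loadings, curvature)
```

Each eigenvector gives the trait loadings of one composite axis, and the sign of its eigenvalue gives the direction of curvature along that axis. In this hypothetical case one axis is concave up and the other concave down, which corresponds to a saddle-shaped surface.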

Three-dimensional (**a**) and contour (**b**) fitness surfaces. The vectors **m2** and **m6** represent the strongest axes of disruptive and stabilizing selection, respectively; **m2** is positively loaded by gonopodium length and iridescent and body area and negatively by sperm velocity; **m6** is loaded positively by gonopodial thrust rate and negatively by sperm velocity. Standardized fitness is shown.

Three-dimensional (**a**) and contours (**b**) fitness surfaces. The **m4** vector is mainly positively loaded by display behaviour (and partially by body area), while **m5** is loaded positively by orange colouration (and weakly by gonopodium length, body area and iridescent area). Standardized fitness is shown.

## How strong is selection in nature?

Numerous studies have measured phenotypic selection in natural populations using the methods described above (Endler 1986). We are therefore in a position to synthesize these studies and look for more general patterns of selection. Such a synthesis has been undertaken recently. Kingsolver and colleagues (2001) reviewed selection studies published between 1984 and 1998 and identified 63 studies of 62 species involving a wide range of taxa, geographic areas, and types of traits. These studies yielded 993 estimates of directional selection (β_{σ}). Positive and negative values of β_{σ} occur with equal frequency, so it is more informative to consider the absolute value, |β_{σ}|, as an indicator of the magnitude of directional selection.

A frequency distribution of |β_{σ}| shows a wide range of values, with small values most common but with a long “tail” of higher values (figure 2; Kingsolver et al. 2001). For example, the median value was 0.16, and 13% of the values were greater than 0.5, indicating very strong selection. To put this in perspective, imagine a population in which a heritable trait (*h*² = 0.5; see box 2) experiences persistent directional selection of median magnitude (β_{σ} = 0.16). In fewer than 50 generations, the population mean would shift by 3 standard deviations, thereby exceeding the initial range of variation in the population. Thus, phenotypic selection in many natural populations is strong enough to cause substantial evolutionary changes in tens to hundreds of generations, which is a very short timescale in evolutionary terms (Reznick et al. 1997, Hendry and Kinnison 1999, Hoekstra et al. 2001).
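The arithmetic behind this thought experiment can be sketched as follows. The sketch assumes the single-trait response to selection is heritability times the standardized gradient, in standard deviations per generation, and that heritability, trait variance and the strength of selection all stay constant. These are simplifying assumptions, not claims from the studies reviewed.

```python
import math


def generations_to_shift(heritability, beta_sd, target_shift_sd):
    """Generations needed for the mean to move target_shift_sd standard deviations.

    Assumes a constant per-generation response of heritability * beta_sd,
    measured in phenotypic standard deviations.
    """
    response_per_generation = heritability * beta_sd
    return math.ceil(target_shift_sd / response_per_generation)


# Median gradient from the text (0.16) with heritability 0.5:
print(generations_to_shift(0.5, 0.16, 3.0))  # 38
```

With a response of 0.08 standard deviations per generation, a shift of 3 standard deviations takes 38 generations, consistent with "fewer than 50" in the text.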

Several complications temper this important conclusion, however (Kingsolver et al. 2001, Hereford et al. 2004, Hersch and Phillips 2004). First, studies that fail to detect strong or significant selection are less likely to be published, particularly if the study has a small sample size. This leads to a publication bias, in which studies with larger effects are more likely to be reported than those with smaller effects. There is some indication of such publication biases in the selection data, slightly inflating the average magnitude of selection detected (figure 2; Kingsolver et al. 2001, Hersch and Phillips 2004). Second, many selection studies have small sample sizes that limit their statistical power. For example, as illustrated in figure 2, only 25% of the individual values of β_{σ} are significantly different from zero at the 95% significance level (one would expect 5% of the values to be significant as a result of chance alone). Consequently, most studies have insufficient statistical power to detect selection of typical magnitude (Hersch and Phillips 2004). Thus, selection is potentially potent, albeit typically difficult to detect. A third limitation is that most studies measure selection in terms of one or more components of fitness (e.g., aspects of an individual's survival, mating success, or fecundity) rather than total lifetime fitness (e.g., the total number of surviving offspring that an individual produces). Indeed, less than 5% of the available measurements of phenotypic selection involve total lifetime fitness, which is difficult to measure in most natural field populations (Kingsolver et al. 2001). This is important because the magnitude and even the direction of selection on a trait may differ for different components of fitness.
On the other hand, a recent statistical analysis by Knapczyk and Conner (forthcoming) indicates that sampling error does not bias estimates of the average strength of phenotypic selection, and suggests that publication bias is detectable only for selection estimates with very small sample sizes.

A recent alternative approach to assessing the magnitude of selection is to standardize the selection gradient using the mean value of the trait rather than the standard deviation (Hereford et al. 2004). The mean-standardized gradient, β_{μ}, has a useful and natural interpretation: Selection on fitness itself would produce a β_{μ} of 1. A recent survey of selection studies from 1984 through 2003 reported a bias-corrected median value for β_{μ} of 0.31, and more than 20% of the values exceeded 1, indicating that selection on these traits was stronger than selection on fitness itself (Hereford et al. 2004). As Hereford and colleagues (2004) note, such large values “cannot be representative of selection on all traits.” However, there are a number of limitations to the use of mean-standardized measures of selection. First, the interpretation of β_{μ} is valid only for traits that represent true ratios and where the zero value is not arbitrary. This limitation excludes many interesting phenotypic traits, such as phenology and seasonal timing, and composite traits, such as principal components (Kingsolver et al. 2001). A second, practical issue is that because the information needed to compute β_{μ} is not always reported in published studies, this approach excludes up to 70% of the available data on phenotypic selection. Third, analyses indicate that large values of β_{μ} are consistently associated with small values of the coefficient of variation (CV), the ratio of the standard deviation to the mean of the trait. For example, for values of β_{μ} greater than 1, the median value of CV was 0.10, that is, a mean 10 times greater than the standard deviation. In contrast, for values of β_{μ} less than 1, the median value of CV was 0.26. There is no obvious biological reason for very strong selection to be associated with small CV values (i.e., with traits that show small variation relative to the mean), and a statistical explanation for this pattern is more likely.
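One way to see why a statistical explanation is plausible is to write out how the two standardizations relate. In the sketch below, $b$ denotes the unstandardized gradient of relative fitness on the trait, and $\beta_{\sigma}$ the standard-deviation-standardized gradient. These symbols are introduced here for illustration and are assumptions about notation rather than quotations from the studies cited.

$$
\beta_{\sigma} = b\,\sigma, \qquad
\beta_{\mu} = b\,\bar{z}
\quad\Longrightarrow\quad
\beta_{\mu} = \beta_{\sigma}\,\frac{\bar{z}}{\sigma} = \frac{\beta_{\sigma}}{\mathrm{CV}} .
$$

For a fixed $\beta_{\sigma}$, a smaller CV mechanically produces a larger $\beta_{\mu}$. For example, a gradient of median magnitude, $\beta_{\sigma} = 0.16$, combined with $\mathrm{CV} = 0.10$ gives $\beta_{\mu} = 1.6$, whereas the same $\beta_{\sigma}$ with $\mathrm{CV} = 0.26$ gives $\beta_{\mu} \approx 0.62$.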

Given the enormous diversity of organisms, we are usually interested not in average selection but rather in differences in selection among different components of fitness, agents of selection, and targets of selection. One important issue to resolve is whether the relative magnitude of phenotypic selection due to variation in survival or fecundity (natural selection) is greater than that due to variation in mating success (sexual selection). The data on directional selection gradients (β_{σ}) indicate that sexual selection is significantly stronger than natural selection (figure 3). For example, the median magnitude of sexual selection is more than twice as great as that of natural selection, a pattern that holds for diverse plant and animal taxa. This result suggests that competition for mates may be important for rapid evolution in nature. Many people view evolution as a “struggle for existence.” Yet the struggle for existence may often be less important than the struggle to mate.

## Synthetic analyses of phenotypic selection in natural populations: lessons, limitations and future directions

There are now thousands of estimates of phenotypic selection in natural populations, resulting in multiple synthetic reviews of these data. Here we consider several major lessons and limitations emerging from these syntheses, and how they may guide future studies of selection in the wild. First, we review past analyses of the patterns of directional selection. We present new meta-analyses that confirm differences in the direction and magnitude of selection for different types of traits and fitness components. Second, we describe patterns of temporal and spatial variation in directional selection, and their implications for cumulative selection and directional evolution. Meta-analyses suggest that sampling error contributes importantly to observed temporal variation in selection, and indicate that evidence for frequent temporal changes in the direction of selection in natural populations is limited. Third, we review the apparent lack of evidence for widespread stabilizing selection, and discuss biological and methodological explanations for this pattern. Finally, we describe how sampling error, statistical biases, choice of traits, fitness measures and selection metrics, environmental covariance and other factors may limit the inferences we can draw from analyses of selection coefficients. Current standardized selection metrics based on simple parametric statistical models may be inadequate for understanding patterns of non-linear selection and complex fitness surfaces. 
We highlight three promising areas for expanding our understanding of selection in the wild: (1) field studies of stabilizing selection, selection on physiological and behavioral traits, and the ecological causes of selection; (2) new statistical models and methods that connect phenotypic variation to population demography and selection; and (3) availability of the underlying individual-level data sets from past and future selection studies, which will allow comprehensive modeling of selection and fitness variation within and across systems, rather than meta-analyses of standardized selection metrics.


## Author contributions

TY conceived the idea, performed the experiments and analysed the data; TY and OP wrote the manuscript; TY designed the research with contributions from OP, MN and NT.

**Fig. S1** Methodology for the field survey of the root anatomical traits of the wild Poaceae species.

**Fig. S2** Root tissue areas of the wild Poaceae species.

**Fig. S3** Numbers and average areas of the xylem and aerenchyma in the roots of the wild Poaceae species.

**Fig. S4** Differences in plant height among the wild Poaceae species

**Fig. S5** Principal component analyses of the root anatomical traits of the wild Poaceae species.

**Fig. S6** Linear and nonlinear regression analyses of the soil water content and root tissue ratio of the wild Poaceae species.

**Fig. S7** Response of the root tissue ratio of the wild Poaceae species to the increased soil water content.

**Table S1** Soil water content in the surrounding of the wild Poaceae species after three nonrainy days or after three intermittent rainy days.

**Table S2** Principal component analyses of the root anatomical traits of the wild Poaceae species.

Please note: Wiley Blackwell are not responsible for the content or functionality of any Supporting Information supplied by the authors. Any queries (other than missing material) should be directed to the *New Phytologist* Central Office.


## Projecting Evolutionary Change

Table 5 presents the projected changes on the assumption that conditions will continue to mirror the averages encountered by this population over the past 60 years. In the next 10 generations mean TC among women is projected to decline from the average of 224 over the past 60 years to 216 (209.3–222.5) mg/100 mL (95% C.I.; see *Methods*). Because the environment has changed over the past 60 years (Fig. 2) and will continue to change, these results suggest that, whatever changes in environment occur, evolutionary changes will lead to mean cholesterol levels among women that are ≈0.8 (0.14–1.46) mg/100 mL lower in the next generation than they would be in the absence of evolution.

Similarly, we expect that as a result of evolution, in the next generation mean body WT among women will increase by 0.2 (−0.20 to 0.62) kg then stabilize; HT will decrease a bit, ≈0.2 (0.03–0.39) cm; SBP will decrease by ≈0.25 (−0.05 to 0.53) mmHg; DBP will remain essentially unchanged; blood GLU will decrease slightly by 0.8 (−0.09 to 0.29) mg/100 mL; age at menopause will increase by ≈1.0 (−0.23 to 2.15) months; and age at first birth will decrease by ≈0.5 (−0.6 to 1.7) months. The rates of projected evolution in haldane units (SD per generation) range from 0.032 (HT) to 0.002 (DBP), slower than those estimated for Galapagos finches and Trinidadian guppies but comparable to those estimated for New Zealand chinook salmon and Hawaiian mosquitofish (13). In sum, as a result of evolution future generations of women in this population are predicted to be slightly shorter and stouter, to have lower values for TC and SBP, to have their first child slightly earlier, and to reach menopause slightly later than they would have otherwise. These are small, gradual evolutionary changes in the middle to lower range of those observed in contemporary populations of nonhuman species.

## Sparse evidence for selection on phenotypic plasticity in response to temperature

Phenotypic plasticity is frequently assumed to be an adaptive mechanism by which organisms cope with rapid changes in their environment, such as shifts in temperature regimes owing to climate change. However, despite this adaptive assumption, the nature of selection on plasticity within populations is still poorly documented. Here, we performed a systematic review and meta-analysis of estimates of selection on thermal plasticity. Although there is a large literature on thermal plasticity, we found very few studies that estimated coefficients of selection on measures of plasticity. Those that did do not provide strong support for selection on plasticity, with the majority of estimates of directional selection on plasticity being weak and non-significant, and no evidence for selection on plasticity overall. Although further estimates are clearly needed before general conclusions can be drawn, at present there is not clear empirical support for any assumption that plasticity in response to temperature is under selection. We present a multivariate mixed model approach for robust estimation of selection on plasticity and demonstrate how it can be implemented. Finally, we highlight the need to consider the environments, traits and conditions under which plasticity is (or is not) likely to be under selection, if we are to understand phenotypic responses to rapid environmental change.

This article is part of the theme issue ‘The role of plasticity in phenotypic adaptation to rapid environmental change’.

### 1. Introduction

Rapid changes to the global climate in the Anthropocene are generating similarly rapid responses in phenotypic traits of many taxa [1], much of which may be driven by phenotypic plasticity: the change in the expression of phenotype by a given genotype as the environment it experiences changes [2]. Once simply regarded as random noise [3,4], phenotypic plasticity and its contribution to evolutionary dynamics are now the focus of a continually expanding research area [5–8] that aims to determine how both wild and domestic populations of plants and animals might respond to environmental change (e.g. [9–11]). However, the nature of selection on phenotypic plasticity in response to changing environmental conditions in general, and to changing climate in particular, is less well understood. Here, we use a systematic review of published studies on plasticity in response to temperature to assess the evidence to date for quantitative selection on thermal plasticity.

If phenotypic plasticity (hereafter ‘plasticity’) improves a genotype's fitness when environmental change occurs, it can be considered to be *adaptive* [2,12] (studies of phenotypic plasticity abound with diverse terminology, hence we set out the definitions we use here in table 1). This enticing concept has propagated a frequent ‘adaptationist’ assumption (see [2,13]) that any observed plasticity should be adaptive. However, as we discuss below, plasticity could obviously also be *non-adaptive*, in other words not related to fitness, or even *maladaptive*, whereby it reduces fitness [2,14,15]. Determining whether plasticity is likely to be adaptive or not requires an understanding of the patterns of selection acting on it—namely whether variation in plasticity is related to variation in fitness [2,16–20]. However, despite the long-standing realization that the adaptive nature of plasticity may be complex and the burgeoning interest in the role that plasticity plays in eco-evolutionary dynamics, understanding the nature of selection on plasticity across contexts is still a challenging and open research area [2,5,6,8].

Table 1. Definitions of terms used in studies of plasticity and selection on plasticity.

Here, we examine the strength of the evidence for selection on phenotypic plasticity, with a particular focus on phenotypic responses to varying temperatures. While anthropogenic environmental change involves many abiotic factors (e.g. water availability, CO₂, extreme or aseasonal weather events) that might affect phenotypic traits, temperature is arguably the most prominent variable to be shifting under the changing climate [21–23], and certainly one of the most important environmental parameters determining fundamental life-history rates of reproduction and survival, as well as distributions and dispersal of biota [24]. Shifts in the mean, variability, and extremes of temperatures around the globe are affecting the phenotypic responses of diverse species and threatening the stability and persistence of terrestrial and aquatic ecosystems from the tropics to the poles [25–31]. Experimental studies on plasticity and acclimation therefore often assess changes in organism phenotype and performance across temperatures to determine whether ‘at risk’ populations are likely to be able to respond to predicted rates of climate change within their lifetimes [32–34]. The obvious need to understand species' responses to global temperature changes, together with the accelerating research interest in phenotypic plasticity, generates a clear need to understand whether plasticity in response to temperature change is under selection.

Our aims with this review are therefore threefold. We consider: (i) the assumptions and assessment of whether plasticity is under selection; (ii) a review of published literature of empirical estimates of the nature of selection on phenotypic plasticity to temperature; and (iii) methods and statistical approaches to guide future estimation of selection on plasticity.

### 2. The assessment of selection on plasticity

#### (a) The spectrum of the adaptive nature of plasticity

There is good evidence for adaptive benefits of plasticity in some traits. Classic examples include the development of defensive structures in the presence of predators, which increases survival probability (e.g. protuberance of dorsal spines in water fleas *Daphnia pulex* [35] and increased shell thickness in the freshwater snail *Physa acuta* [36]), or plasticity in the growth of plants to avoid conspecific shading (e.g. in the orange jewelweed *Impatiens capensis*, in which more elongated plants have higher fitness at high density, whereas shorter plants have higher fitness at low density [37]). An example of selection for increased plasticity in response to temperature is found in an Australian herb, the waxy bluebell *Wahlenbergia ceracea*, in which low-elevation populations (which experience warmer and more variable temperatures than those at high elevations) have greater plasticity in height in response to growth temperature, which results in a greater number of seed capsules [38]. In this system, increased plasticity in height allows the plants to optimize growth habit and light interception based on the relative suitability of the conditions they experience. The term ‘acclimation’ is often used to refer to plasticity that encompasses short term physiological changes in response to a changed environment and there is a large body of work on acclimation testing whether these responses are likely to be adaptive [26,39–42]. For example, in the snow gum *Eucalyptus pauciflora*, alteration of pigment complexes in response to cold improves recovery of photosynthetic performance in spring [43], and hence is most likely adaptive. Similarly, reversible alteration of gut morphology in response to seasonal variation in food availability in laboratory mice *Mus musculus* and white rats *Rattus norvegicus* leads to better energy balance over the course of the year [44].

However, there are other situations where a change in phenotype may not represent an adaptive response, or may even be maladaptive, for example when competition (e.g. in cases of density-dependent population regulation) or resource limitation hinder development and reproduction [6,15,45,46]. As an extreme generic example, reduced food availability will probably result in loss of individual condition, and hence lower rates of both survival and fecundity. Such a change in phenotype is in line with the standard definition of plasticity (table 1), but is unlikely to be adaptive: what may be considered as plasticity to an evolutionary biologist could be seen as density-dependence by a population ecologist. The likelihood of plastic responses being maladaptive may also increase when the environment to which an individual is exposed differs markedly from that in which its ancestors evolved [47], as could occur when the environment changes rapidly. For example, exposing high elevation genotypes of the alpine herbs *W. ceracea* (see above) and *Campanula thyrsoides* to warm conditions typical of lower elevation sites elicits phenotypic and phenological shifts, but is accompanied by significant fitness reductions [38,48].

Further, both costs and limits of plasticity may constrain its dynamics [6,15,49–52]. Even adaptive plastic responses may come with costs. Costs may be owing to maintaining the ‘machinery’ that confers the ability to be plastic or the costs of producing a plastic response (so that the plastic genotype has lower global fitness over multiple environments). Plasticity may also incur costs if it results in the ‘wrong’ phenotype being produced in a new environment [49,50]. Limits refer to developmental, physiological, temporal, and ecological constraints on the expression of plasticity beyond limits to the expression of the phenotype itself, which includes trade-offs between linked traits. But while costs and limits are frequently invoked in theoretical models of plasticity (e.g. [9,53]) and when anticipating species' responses to climate change [54], detecting the highly variable constraints on plasticity remains a significant challenge [6,15,55].

As the above examples illustrate, the fitness implications of plasticity may range from maladaptive through to adaptive, but conclusions as to where along this spectrum a given scenario falls require more than just subjective inference of likely benefits. If there is variation in plasticity among individuals within a population, then a quantitative analysis of selection on plasticity in a population at a given time can provide valuable insights into its adaptive nature. For plasticity to be under selection requires variation in plasticity (e.g. in reaction norm slopes; see table 1) to be related to variation in fitness [2,16–20]. Natural selection can only directly ‘see’ phenotypic trait values that are expressed in a given environment (individual points on a reaction norm), rather than the plasticity itself (reaction norm slope). However, selection on plasticity will summarize the net effect of selection on the change in trait values expressed, which will reflect combined benefits and costs of plasticity [6]. A genotype's average trait value (reaction norm intercept) and trait plasticity (reaction norm slope) can indeed be strongly correlated, and when this is the case, plasticity may be under indirect selection when the trait value is under selection [18,49,50]. As we outline below, these processes can be investigated within the statistical framework of a Lande-Arnold [56] selection analysis. Thus, overall selection on plasticity will be determined by selection on the expressed trait values themselves combined across the continuum of environments [9,56–59]. While any such analysis of current selection obviously cannot provide a full picture of the pressures that have shaped plasticity in the past, it can indicate the current nature of selection: evidence that selection favours increased plasticity might indicate adaptive benefits to plasticity, whereas evidence that selection acts against plasticity would indicate the opposite.

#### (b) Perspectives on plasticity from previous meta-analyses

To date, several meta-analyses have aimed to evaluate the spectrum of the adaptive nature of plasticity, and have largely indicated that plasticity cannot always (and indeed not necessarily often) affect fitness or be considered as an adaptive response [55,60–62]. These meta-analyses each took different approaches. Acasuso-Rivero *et al.* [60] compared estimates of coefficient of variation of trait expression across environments of life-history (close to fitness) versus non-life-history (further from fitness) traits, and concluded that both categories of traits are similarly plastic. By contrast, Davidson *et al.* [61] assessed the relationship between plasticity and fitness proxies in invasive versus non-invasive plant species, and concluded that although invasive species are generally more plastic, the plasticity itself did not confer a fitness benefit. Palacio-López *et al.*'s [62] meta-analysis focused on reciprocal transplant experiments and found that about one-third of all trait responses appeared to be adaptive, where plants could alter their phenotype to match the non-resident environment.

To our knowledge, the only assessment to date of estimates of selection on plasticity has been van Buskirk & Steiner's [55] review of selection gradient coefficients for the effect of plasticity on fitness. They found 27 studies that contained suitable data from which they were able to estimate selection gradients on plasticity. Their analysis showed, remarkably, exactly equal frequency of positive and negative selection gradient coefficients (262 positive, 262 negative and 12 zero-slope) of fitness against trait plasticity across environments. This is obviously precisely the expected outcome of regressing a random variable against fitness. However, of the 27 studies in [55], only three related to temperature. As it is now 10 years since this review, and given the increased interest in the effects of warming temperatures on biological populations, we aimed here to determine whether additional empirical studies had been conducted or could be identified. Hence, we performed a systematic review and meta-analysis that explicitly targeted selection on plasticity in the context of rapid climate change, to address the following question: is there evidence for selection on plasticity in response to temperature?

#### (c) Quantifying selection on plasticity

Analysis of selection on plasticity requires estimation of the association between a genotype's plasticity and its fitness. There are many different methods to quantify and model plasticity [63–65]. The simplest conceptual method is to regress the phenotypic trait value against the environments in which it was measured to visualize a reaction norm, the slope of which provides a measure of plasticity for each genotype or individual. A selection analysis then typically tests for associations between these measures of plasticity and a genotype's overall (‘global’) fitness, measured across the different environments experienced. Measuring fitness across individuals' entire lifetimes is challenging but not impossible: for example, lifetime reproductive success (LRS) has been measured for several wild animal populations [66]. In cases where fitness is not readily measured directly (e.g. in long-lived trees), then components of fitness such as fecundity in animals, or number of seeds produced or survival of seedlings in one season in plants, may be suitable substitutes as proximal fitness estimates [67].

Selection on plasticity can then be estimated from the regression of global fitness (either the average or summed fitness across all environments) on the respective plasticity values (e.g. [15,68,69]). This approach is effectively a selection gradient analysis on reaction norms [56,70,71], from which the direction and strength of selection on plasticity can be assessed by the relationship between global fitness and plasticity of the different genotypes. However, it has the disadvantage of requiring a two-step approach (first extracting estimates of plasticity, and second associating them with fitness, but typically without accounting for the error inherent in the first step [72]). In §4 below, we consider alternative approaches that circumvent this problem. Given the potential for correlations between a genotype's average trait value (the elevation or intercept of their reaction norm) and their plasticity (the slope of the reaction norm [49,73]), it is also important to separate direct selection on slopes from selection acting indirectly through associations with trait value. This is most efficiently dealt with by estimating selection gradients from an analysis that also considers genotypes' intercepts [49,55].
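The two-step approach described above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the method of any study cited here: the function names and the simulated data are hypothetical, reaction norms are assumed to be linear, fitness is converted to relative fitness, and both predictors are standardized to unit variance. Like the approach it illustrates, it ignores the estimation error in the slopes from the first step.

```python
import numpy as np


def reaction_norms(env, traits):
    """Step 1: fit trait ~ environment separately for each genotype.

    env has shape (n_env,); traits has shape (n_genotypes, n_env).
    Returns (intercepts, slopes), each of shape (n_genotypes,).
    """
    env = np.asarray(env, dtype=float)
    traits = np.asarray(traits, dtype=float)
    # Centring the environment makes the intercept the trait value
    # expressed at the mean environment.
    centred = env - env.mean()
    design = np.column_stack([np.ones_like(centred), centred])
    coef, *_ = np.linalg.lstsq(design, traits.T, rcond=None)
    return coef[0], coef[1]


def selection_gradients(intercepts, slopes, fitness):
    """Step 2: regress relative fitness on standardized intercepts and slopes."""
    fitness = np.asarray(fitness, dtype=float)
    relative_fitness = fitness / fitness.mean()

    def standardize(x):
        return (x - x.mean()) / x.std(ddof=1)

    design = np.column_stack([
        np.ones_like(relative_fitness),
        standardize(intercepts),
        standardize(slopes),
    ])
    beta, *_ = np.linalg.lstsq(design, relative_fitness, rcond=None)
    return {"intercept": beta[1], "slope": beta[2]}


# Hypothetical data: five genotypes measured at four temperatures.
temps = np.array([10.0, 15.0, 20.0, 25.0])
true_slopes = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
true_intercepts = np.array([10.0, 12.0, 9.0, 11.0, 10.0])
traits = true_intercepts[:, None] + true_slopes[:, None] * (temps - temps.mean())
global_fitness = 2.0 + true_slopes  # fitness depends on plasticity only

intercepts, slopes = reaction_norms(temps, traits)
gradients = selection_gradients(intercepts, slopes, global_fitness)
```

Fitting intercepts alongside slopes in the second step is what separates direct selection on plasticity from selection acting through the mean trait value; in this simulated example the gradient on the intercept is zero by construction.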

Care also needs to be taken in interpretation of the resulting selection gradients given their dependence on the average direction of plasticity. We set out the alternative, potentially confusing, scenarios in figure 1. Importantly, when the average reaction norm slope (e.g. of the trait against temperature regression) is positive (figure 1*a*), then if plasticity is under positive selection, the most plastic genotypes will have higher fitness and the selection gradient will be positive (figure 1*b*). Likewise, a selection gradient around zero indicates a lack of selection on plasticity (figure 1*c*), and a negative selection gradient indicates selection against plasticity (figure 1*d*). However, when the reaction norm slope (e.g. of the trait against temperature regression) is on average negative (figure 1*e*), then the converse is true: a negative selection gradient on reaction norm slopes indicates plasticity is selected to increase (figure 1*f*), a selection gradient around zero again indicates a lack of selection on plasticity (figure 1*g*), and finally a positive selection gradient indicates plasticity is selected to decrease (figure 1*h*). These issues are pertinent to the meta-analysis we present below.

Figure 1. Interpretation of selection on plasticity will be dependent on the direction of average plasticity. The two halves of the figure represent the contrasting scenarios of positive (top half, panels (*a*–*d*)) or negative (bottom half, panels (*e*–*h*)) plasticity in a trait in response to temperature, and the corresponding selection analysis when plasticity is selected to increase, stay constant, or decrease. The scenarios on the right side of the figure (*b*–*d*, *f*–*h*) illustrate how fitness changes with plasticity. Note that for simplicity we do not include variation between genotypes in average trait values here. *b*_{0}, *b*_{1} and *b*_{2} are the slopes of the three genotypes' reaction norms, and *β* is the selection gradient from the regression of fitness (*w*) on slope values. (*a*) The scenario where the average slope of the plastic response is positive (e.g. the trait increases with increasing temperature), illustrated by three genotypes that vary in reaction norm slope (*b*_{*n*}) of phenotypic trait (*x*) across a continuous gradient of temperatures (*t*). The large arrow with yellow-red gradient indicates least to most plasticity, here and elsewhere. (*b*) Where plasticity is positive and selected to increase, the genotype with the greatest plasticity across temperatures (yellow; *b*_{2} = +2) has the highest fitness (*w*). Here, the linear selection gradient coefficient for the fitness ∼ plasticity relationship is positive (*β* > 0). (*c*) Where plasticity is selected to stay constant, plasticity does not affect fitness (all genotypes have equal fitness) and the selection gradient is zero (*β* = 0). (*d*) Where plasticity is positive but selected to decrease, the genotype with the least trait plasticity across temperatures (black; *b*_{0} = 0) has the highest fitness and the selection gradient is negative (*β* < 0). (*e*) The converse scenario to that described in (*a*): here, the reaction slope of plasticity is negative (e.g. the trait value declines with increasing temperature). (*f*) Where plasticity is negative and selected to increase, the genotype with the greatest plasticity across temperatures (yellow; *b*_{2} = −2) has the highest fitness and the selection gradient is negative (*β* < 0). (*g*) The same outcome as the scenario described in (*c*) above (*β* = 0). (*h*) Where plasticity is negative but selected to decrease, the genotype with the least plasticity across temperature (black; *b*_{0} = 0) has the highest fitness and the selection gradient is positive (*β* > 0).

As an alternative to the reaction norm approach of describing the shape of the phenotypic response across multiple environments, plasticity can also be modelled with a ‘character-state’ approach, which considers the phenotypic values expressed in each discrete environment as different traits [70]. In the same way that changes in variance across environments in a character-state model are equivalent to variance in the slope of reaction norms [74,75], selection on reaction norm slopes (via a covariance between slope and fitness) will generate changes in selection across environments (i.e. changes in the covariance between trait and fitness in each environment [76,77]). As such, the abundance of evidence of selection on phenotypic traits changing with time and environments [78,79] can arguably be taken as indirect evidence of selection on plasticity, and of the kind that evolutionary theory predicts will shape the evolution of adaptive plasticity [9]. However, such patterns could also be driven by changes in the variance in fitness between environments and so could occur without the variance in reaction norm slopes that is required for selection on plasticity. The character-state inference also provides no indication of the nature of selection on reaction norms.
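The link between the two views can be made concrete with a short Python sketch. It assumes linear reaction norms of the form trait = intercept + slope × environment, in which case the among-genotype variance at environment *e* is Var(intercept) + 2*e*·Cov(intercept, slope) + *e*²·Var(slope). The function and parameter names are illustrative and do not come from the studies cited above.

```python
def trait_variance_at(env, var_intercept, var_slope, cov_int_slope):
    """Among-genotype trait variance expressed at one environment value."""
    return var_intercept + 2.0 * env * cov_int_slope + env ** 2 * var_slope


def trait_covariance_between(env_1, env_2, var_intercept, var_slope, cov_int_slope):
    """Among-genotype covariance between the character states at two environments."""
    return (
        var_intercept
        + (env_1 + env_2) * cov_int_slope
        + env_1 * env_2 * var_slope
    )


# With no variance in slopes (and no intercept-slope covariance), the
# character-state variance is the same in every environment.
flat = [trait_variance_at(e, 1.0, 0.0, 0.0) for e in (0.0, 1.0, 2.0)]

# With variance in slopes, the variance changes across environments.
fanned = [trait_variance_at(e, 1.0, 0.25, 0.0) for e in (0.0, 1.0, 2.0)]
```

In this parameterization, variance that differs between environments requires either variance in slopes or an intercept-slope covariance, which is the sense in which the two models describe the same variation.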

### 3. Review of the evidence for selection on thermal plasticity

#### (a) Motivation and literature search

We conducted a systematic review of the literature, with the aim of identifying empirical studies that have quantitatively assessed the nature of *selection on plasticity* across environmental gradients that have a temperature basis. To this end, we employed the PRISMA framework [80] by searching the Web of Science with the following search terms: topic: (*selection* near/3 *plasticity* or *selection* near/3 *reaction norm* or *selection* near/3 *genotype* near/1 *environment* or *selection* near/3 *G* × *E* or *selection* near/3 ‘*G×E*’) and topic: (*temperature* or *thermal* or ‘*climate change*’ or ‘*climate-change*’ or ‘*global warming*’ or ‘*warming world*’ or *heat* or *hot* or *cold*) in July 2018. The Boolean operator ‘near/*n*’ allows *n* words to appear between the topic words (e.g. ‘selection near/3 plasticity’ will capture phrase variants such as selection on/for/of thermal/phenotypic plasticity). Our search was thus explicitly targeted at those studies that investigated *selection on plasticity*.
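To illustrate the proximity operator as described above, here is a small Python sketch of a ‘near/*n*’ match that allows at most *n* words between two terms. It follows only the description given in the text: the word tokenization, the order-independent matching and the function name `near` are assumptions, and the sketch does not reproduce the database's actual matching rules.

```python
import re


def near(text, first, second, n):
    """True if `first` and `second` occur with at most n words between them."""
    words = re.findall(r"\w+", text.lower())
    positions_first = [i for i, w in enumerate(words) if w == first.lower()]
    positions_second = [i for i, w in enumerate(words) if w == second.lower()]
    # n intervening words corresponds to an index distance of n + 1.
    return any(
        0 < abs(i - j) <= n + 1
        for i in positions_first
        for j in positions_second
    )


hit = near("Selection on thermal plasticity in birds", "selection", "plasticity", 3)
miss = near(
    "Selection acts on many different traits, including plasticity",
    "selection", "plasticity", 3,
)
```

The first phrase has two words between the terms and matches; the second has six and does not.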

Our initial search resulted in 139 articles. We then screened the titles and abstracts of these articles to determine which met all of the following five criteria: (i) analysed empirical data; (ii) included a measure of trait plasticity; (iii) included a measure or proxy of thermal environment; (iv) reported a measure of fitness or some component or close proxy of fitness (reproduction or survival); and finally (v) assessed the relationship between the trait plasticity across thermal environment and the fitness component. Screening at this level reduced the number of articles that matched these criteria to 47. We found five papers that presented data that were eligible for qualitative assessment in that they contained estimates of selection coefficients on one or more measures of plasticity, and three additional papers from previously known sources. From each study, we extracted the following details: class and species of study organism, type of selection (e.g. directional, stabilizing), type of data collection (e.g. laboratory or field, wild population or transplant), type of environmental gradient (e.g. temperature, year as temperature proxy), plastic phenotypic trait, sample size, the type of analysis, selection gradient coefficient and associated standard error, whether the selection gradient was standardized (only standardized gradients with errors could be compared in the meta-analysis) and whether it was reported as significant (*p* < 0.05 in the original study). Where the average slope of the reaction norm between the trait and environment was negative (figure 1*e*), we reversed the sign of the selection coefficient so that a positive *β* indicated selection for steeper reaction norms (more plastic genotypes) and a negative *β* indicated selection for less steep reaction norms (less plastic genotypes). We also recorded relevant details on the context of the study and the authors' interpretation of their findings.
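The sign convention described above can be written as a one-line rule. The following Python sketch uses a hypothetical function name and made-up values; it only illustrates the reversal applied when the average reaction norm slope is negative.

```python
def orient_gradient(beta, mean_reaction_norm_slope):
    """Sign beta so that positive values mean selection for steeper reaction norms."""
    if mean_reaction_norm_slope == 0:
        raise ValueError("direction of plasticity is undefined for a zero mean slope")
    return -beta if mean_reaction_norm_slope < 0 else beta


# Laying date advances with temperature (negative mean slope), so a negative
# raw gradient means selection for more plasticity and is reported as positive.
oriented_phenology = orient_gradient(-0.2, -1.5)

# Plant height increases with temperature (positive mean slope): unchanged.
oriented_height = orient_gradient(0.1, 2.0)
```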

#### (b) Qualitative systematic review summary

The eight studies that explicitly tested for selection on plasticity [38,68,81–86] are summarized in table 2. These contained a total of 42 estimates of selection coefficients: 40 tests for directional selection and two for stabilizing selection, spanning two major taxa (four species of plants and three species of birds). All the studies on birds were field studies and the thermal environments of these studies were all indirect substitutions for temperature (e.g. a climate index, or year). The phenotypic traits that were quantified for plasticity were either size (e.g. plant height) or growth in the plant studies, or phenology (e.g. laying date) in the bird studies. All of the size-based traits in plants had a positive correlation with environment (e.g. plant height increased as temperature increased), whereas all phenological traits in birds had a negative correlation with thermal environment (i.e. laying date occurred earlier in the year as temperature increased). Fitness measures were all close proxies for reproduction and three studies used measures of LRS (‘total’ fitness).

Table 2. Qualitative summary of studies that investigated selection on plasticity in thermal environments identified by our systematic review. (NAO = North Atlantic Oscillation; cond. = measurement conditions; the average slope of plasticity on environment is the sign of the regression of plasticity against temperature; the average slope of fitness on plasticity is such that positive slopes indicate directional selection for more plasticity and negative slopes indicate directional selection for less plasticity; the original author's interpretation is given as non-significant when the associated *p*-value was reported as non-significant in the author's analysis; otherwise, if significant, the coefficient direction and their interpretation of selection on plasticity are given in bold.)

a Indicates mutant lines of *A. thaliana* (see [68] for further details).

b Year substituted for temperature where temperature increases significantly over time (see [84] for further details).

Across the eight studies, there were 19 negative, one zero and 20 positive linear selection coefficients (table 2). Of these, there were two significant negative coefficients (indicating selection for less steep reaction norms and less plasticity, in flowering time of *Arabidopsis* [68] and *Wahlenbergia* [38]), and seven significant positive coefficients (indicating selection for steeper reaction norms and more plasticity, again in *Arabidopsis* flowering time [82], and also in *Wahlenbergia* height, rosette diameter, and leaf number [38], and breeding time in collared flycatchers [83] and great tits [84]). There were two nonlinear selection coefficients, of which one was significant and indicated stabilizing selection on plasticity (favouring intermediate plasticity, in breeding time of common guillemots [85]). We therefore have evidence for three different types of selection on plasticity from a relatively small sample of significant coefficients, and fourfold more examples finding no evidence of any selection on plasticity (i.e. non-significant selection gradients). The two estimates in opposing directions on *Arabidopsis* flowering time highlight just how inconsistent the pattern of selection on plasticity can be, although considerable spatial, temporal and genetic differences between the sources of lines used in the two studies could obviously also be contributing to this difference [68,82]. We also note that the two collared flycatcher estimates [83] involve measures of fitness that are highly correlated, so do not represent independent points.

Our findings are therefore qualitatively congruent with previous reviews (especially van Buskirk & Steiner [55]) that plasticity is apparently inconsequential for fitness more often than not. To quantitatively test this assertion, we then conducted a meta-analysis on that subset of these studies with suitable coefficients.

#### (c) Meta-analysis of directional selection on plasticity in response to temperature

The dataset used for the qualitative systematic review was subset to those studies that reported estimates of standardized linear directional selection gradients and their associated standard errors, so that the selection gradient coefficient *β* could be used as the measure of effect size (following [87]). This reduced dataset contained 22 standardized selection gradients across four species from five studies: two on plants [68,81] and three on birds [83–85]. All five studies had also included trait means (or intercepts) in their models of *fitness ∼ plasticity*, to account for correlation between the trait mean and plasticity [49]. We conducted a multi-level meta-analysis using the *metafor* package [88] in R v. 3.5.1 [89]. Random effects of study and observation within study were included in the analysis to control for potential non-independence of data and to estimate residual variance [90]. All estimates are means with 95% confidence intervals (CI), and measurement error variance was the squared selection coefficient standard error as in [87]. We further quantified heterogeneity between selection gradients by calculating modified $I^2$ statistics for multi-level models (the ratio of true heterogeneity to the total variance including sampling error [90,91]): $I^2_{\text{between}}$ represents variation between studies and $I^2_{\text{within}}$ represents variation within studies.

The mean standardized selection gradient coefficient (*β*) across all included studies was weakly positive (figure 2); however, there was no evidence for selection on plasticity (*β*_{mean} = 0.06 (95% CI: −0.02 to 0.13), *p* = 0.136). Heterogeneity between selection gradients was low by conventional standards [91] both between and within studies ($I^2_{\text{between}} = 18.8\%$, $I^2_{\text{within}} = 9.8\%$). The overall weak positive selection gradient from these 22 estimates of plasticity comprising both size and phenology related traits across thermal environments supports the null model: plasticity is not under significant directional selection. It is also worth noting that, as predicted by theory [49], there was evidence for stronger directional selection on the mean trait value (intercept) than on plasticity (slope) in several of the studies in our review [83–85]. In Brommer *et al*. [83], the expected evolutionary response of the population of collared flycatchers to increased mean annual temperature would be earlier laying dates (stronger selection on the intercept), but no substantial change in laying date plasticity (weaker selection on reaction norm slopes).
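The logic of pooling selection gradients and quantifying heterogeneity can be sketched in Python. Note that this is a deliberately simplified, single-level random-effects model using the DerSimonian-Laird estimator, not the multi-level *metafor* model used for the analysis above; the function name and the input values are hypothetical. As in the text, the sampling variance of each estimate is taken to be its squared standard error.

```python
import numpy as np


def random_effects_meta(beta, se):
    """Single-level random-effects pooling (DerSimonian-Laird sketch)."""
    beta = np.asarray(beta, dtype=float)
    variance = np.asarray(se, dtype=float) ** 2  # squared standard errors
    weights = 1.0 / variance

    # Fixed-effect mean and Cochran's Q measure the spread around it.
    fixed_mean = np.sum(weights * beta) / np.sum(weights)
    q = np.sum(weights * (beta - fixed_mean) ** 2)
    df = len(beta) - 1

    # Between-estimate variance (tau^2), truncated at zero.
    scale = np.sum(weights) - np.sum(weights ** 2) / np.sum(weights)
    tau2 = max(0.0, (q - df) / scale)

    # Random-effects weights add tau^2 to each sampling variance.
    re_weights = 1.0 / (variance + tau2)
    mean = np.sum(re_weights * beta) / np.sum(re_weights)
    se_mean = np.sqrt(1.0 / np.sum(re_weights))

    # I^2: share of total variation attributed to true heterogeneity.
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0

    return {
        "mean": mean,
        "ci": (mean - 1.96 * se_mean, mean + 1.96 * se_mean),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical standardized gradients and standard errors, for illustration.
result = random_effects_meta([0.0, 0.2], [0.1, 0.1])
```

A confidence interval for the pooled mean that spans zero, as reported above, is what leads to the conclusion of no detectable directional selection. A multi-level model would additionally split the heterogeneity into between-study and within-study components.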

Figure 2. Forest plot showing among-study heterogeneity in mean standardized linear selection gradient coefficients (*β*) of fitness in relation to trait thermal plasticity. Number of selection gradients reported for each study is shown by *n* and by the relative size of the data points, species used in the study are represented by silhouettes, and mean estimates are shown ±95% confidence intervals (CIs). The 95% CIs are calculated from the *metafor* model and do not match with standard error estimates from each individual study. The overall meta-analysis *β*_{mean} ± 95% CIs is represented by the dashed black vertical line and grey shaded area, and the solid red vertical line is centred at *β* = 0 (no selection on plasticity). These studies collectively indicate no evidence for significant selection on thermal plasticity. (Online version in colour.)

However, we recognize that this is a very small sample from which to interpret anything (other than the distinct need for additional data); we therefore do so with caution. The unambiguous outcome of our meta-analysis is that we require more direct tests for selection on plasticity in response to climate change, from more taxonomic groups, if we are to reach any informed conclusion about selection on thermal plasticity. Quantifying selection on plasticity is clearly challenging, but in the hope of encouraging more studies, we conclude by setting out recent developments in relevant statistical methods.

### 4. A multivariate mixed model approach to analysing selection on plasticity

We outlined above common approaches taken to assess selection on plasticity, but these are not without their problems. Several issues with the analysis of plasticity have been raised in recent years, including, but not limited to, the problems of multi-step analyses, misleading conclusions when other covariates such as mean trait values are not included [49,72], the oversimplification of reaction norms across just two environments [64], and consideration of only single traits rather than multivariate phenotypes [92]. In this final section, we outline a mixed model approach to analysing selection on plasticity that avoids these potential drawbacks.

The inference of selection on plasticity requires measures of individual plasticity and individual fitness. As outlined above, in the case of a linear reaction norm, a straightforward approach to estimate selection on a plastic response is to regress a genotype's global fitness against the slope of plasticity (e.g. figure 1*b*) using selection gradient analyses [38,56,68]. The simplest implementation of this approach is to use linear regressions for each individual of *trait* ∼ *environment* to provide estimates of the linear slope of plasticity (reaction norm slope; figure 1*a*), which can then be standardized and used as predictor variables for modelling individual fitness in a separate model: *fitness* ∼ *slope of plasticity* (e.g. figure 1*b*). Random regression mixed models can also provide estimates of plasticity slopes from best linear unbiased predictors, and fitness can then be regressed on these to estimate the selection gradient on plasticity. With both approaches, reaction norm intercepts (elevations) also need to be fitted to account for correlated selection [49]. However, both methods require two steps of models and thereby an undesirable reliance on ‘statistics-on-statistics’ [72,93]. Deriving estimates of selection on plasticity in this way neglects the uncertainty associated with estimates of plasticity, which could generate misleading levels of statistical confidence [93].
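To make the two-step procedure concrete, here is a minimal Python sketch. The data, the individual labels, and the helper `ols_slope` are all hypothetical, and the intercept term that the text says should also be fitted is left out for brevity. It illustrates exactly the 'statistics-on-statistics' structure criticized above: step 2 treats the slopes from step 1 as if they were known without error.

```python
from statistics import mean, pstdev

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical data: one trait measured at three temperatures per individual.
temps = [10.0, 15.0, 20.0]
traits = {
    "a": [5.0, 6.0, 7.0],
    "b": [5.0, 5.5, 6.0],
    "c": [5.0, 7.0, 9.0],
    "d": [5.0, 6.5, 8.0],
}
fitness = {"a": 2.0, "b": 1.0, "c": 4.0, "d": 3.0}  # absolute fitness

# Step 1: per-individual reaction norm slope (trait ~ environment).
ids = sorted(traits)
slopes = [ols_slope(temps, traits[i]) for i in ids]

# Step 2: regress relative fitness on the standardized slopes.
slope_mean, slope_sd = mean(slopes), pstdev(slopes)
slopes_std = [(s - slope_mean) / slope_sd for s in slopes]
w_bar = mean(fitness[i] for i in ids)
w_rel = [fitness[i] / w_bar for i in ids]  # relative fitness

beta_plasticity = ols_slope(slopes_std, w_rel)
print(round(beta_plasticity, 4))
```

Standardizing with the population standard deviation (`pstdev`) is a choice made for this sketch; a sample standard deviation would give a slightly different scaling.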

These potential pitfalls can be avoided by using multivariate random regression mixed models of trait and fitness, such that selection is assessed directly from estimates of the covariance of fitness with reaction norm slopes within a single model [73,93]. First, consider a random regression mixed model of the variation between individuals in their change of the trait *x* across environments, with *x*_{i,j} being the measurement of each individual *i* at time *j* in environment *t*_{j}.

Selection on both trait values and plasticity can then be assessed from the covariance of individuals' intercepts and slopes with their fitness, by extending to a bivariate model that also includes fitness.
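One way to write down the two models just described is sketched below. The notation is an assumption assembled from the symbols used in this section (individual intercept and slope deviations, **P**_{ind}, **S**), with $b$ as a hypothetical symbol for the population-level slope and $w_i$ for the fitness of individual $i$, so it may differ in detail from the numbered equations (4.1) and (4.2):

$$
\begin{aligned}
x_{i,j} &= \mu_x + b\,t_j + \mathrm{ind}_{x,i} + \mathrm{ind}_{x\sim t,i}\,t_j + e_{x,i,j},\\
w_i &= \mu_w + \mathrm{ind}_{w,i},
\end{aligned}
$$

$$
\begin{bmatrix} \mathrm{ind}_{x,i} \\ \mathrm{ind}_{x\sim t,i} \\ \mathrm{ind}_{w,i} \end{bmatrix}
\sim N\!\left(\mathbf{0}, \mathbf{P}_{ind}\right),
\qquad
\mathbf{P}_{ind} =
\begin{bmatrix}
\sigma^2_{x} & \sigma_{x,x\sim t} & \sigma_{x,w} \\
\sigma_{x,x\sim t} & \sigma^2_{x\sim t} & \sigma_{x\sim t,w} \\
\sigma_{x,w} & \sigma_{x\sim t,w} & \sigma^2_{w}
\end{bmatrix}.
$$

Under this layout, the upper-left 2 × 2 block plays the role of **P**_{2}, and the two covariances with fitness, $(\sigma_{x,w}, \sigma_{x\sim t,w})$, form the vector **S**.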

The selection differentials in **S** represent the total selection on reaction norm intercepts and slopes, incorporating both direct and indirect selection. These can then be transformed to give a vector of selection *gradients* **β** on intercepts and slopes, via $\boldsymbol{\beta} = \mathbf{P}_2^{-1}\mathbf{S}$, where **P**_{2} is the 2 × 2 variance–covariance matrix for intercepts and slopes of *x* (i.e. a subset of **P**_{ind}). The selection gradients in **β** are then the direct selection on intercept and slope respectively, correcting for the covariance between them [56]. Where relatedness information is available for individuals, such analyses can also be extended to consider the additive genetic components of the relevant variances and covariances [99]. To our knowledge, adding additive genetic components of (co)variance to a multivariate random regression model with fitness has not yet been attempted. It offers promising potential, but the demands on the data in doing so will be substantial.
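The conversion from selection differentials to selection gradients is a small linear solve. The Python sketch below illustrates it for the 2 × 2 case using Cramer's rule; the function name `selection_gradients` and all numerical values are hypothetical, chosen only to show the arithmetic.

```python
def selection_gradients(p2, s):
    """Solve P2 * beta = S for a 2x2 matrix P2 using Cramer's rule.

    p2 : [[var_intercept, cov], [cov, var_slope]]
    s  : [S_intercept, S_slope] (covariances with relative fitness)
    """
    (a, b), (c, d) = p2
    det = a * d - b * c
    if det == 0:
        raise ValueError("P2 is singular; gradients are not identifiable")
    s_int, s_slope = s
    beta_int = (d * s_int - b * s_slope) / det
    beta_slope = (a * s_slope - c * s_int) / det
    return beta_int, beta_slope

# Hypothetical (co)variances for intercept and slope, and their
# covariances with relative fitness.
p2 = [[1.0, 0.5],
      [0.5, 2.0]]
s = [0.3, 0.4]

beta_int, beta_slope = selection_gradients(p2, s)
print(beta_int, beta_slope)
```

With a positive covariance between intercept and slope, both gradients come out smaller than the corresponding differentials divided by their variances, which is the "correcting for the covariance between them" step in numerical form.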

On a technical note, analyses of fitness are rarely straightforward. Selection differentials or gradients should be calculated using relative fitness (absolute fitness divided by the population mean [56]), and models are typically fitted assuming Gaussian errors (see [93,96] for examples of selection on reaction norms). However, where the fitness measure follows a non-Gaussian distribution, as is typically the case with skewed distributions of fitness, a generalized linear mixed model (GLMM) of absolute fitness will be preferable [95,100]. The resulting covariances returned by the model will then be between the trait on the data scale and fitness on a ‘latent’ (link-function) scale. These estimates need to be transformed if data-scale estimates of selection are required [101]. However, in the case of a GLMM with a log-link function (e.g. Poisson, over-dispersed Poisson, or negative binomial distribution), it is possible to exploit the fact that the latent-scale covariance with absolute fitness is equivalent to the data-scale covariance of relative fitness [102]: consequently, and conveniently, the covariance components of **P**_{ind} on the latent scale can simply be treated as selection differentials **S**. By extension, estimates of **β** as indicated above will also provide data-scale selection gradients.
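The log-link equivalence can be checked numerically. The Python simulation below is a sketch under stated assumptions: the trait and latent log-fitness are jointly Gaussian, the coefficients `a` and `b` are hypothetical, and expected fitness `exp(latent)` is used directly without adding Poisson sampling noise. It compares the covariance of the trait with latent fitness against the covariance of the trait with data-scale relative fitness.

```python
import math
import random

random.seed(1)
n = 100_000
a, b = 0.3, 0.5  # hypothetical: latent = a * trait + b * noise

def cov(u, v):
    """Population covariance of two equal-length sequences."""
    u_bar = sum(u) / len(u)
    v_bar = sum(v) / len(v)
    return sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v)) / len(u)

# Trait values and latent (log-scale) fitness, jointly Gaussian.
z = [random.gauss(0.0, 1.0) for _ in range(n)]
latent = [a * zi + b * random.gauss(0.0, 1.0) for zi in z]

# Data-scale expected fitness under a log link, then relative fitness.
expected_w = [math.exp(l) for l in latent]
w_bar = sum(expected_w) / n
rel_w = [w / w_bar for w in expected_w]

cov_latent = cov(z, latent)   # covariance on the latent scale
cov_relative = cov(z, rel_w)  # covariance with relative fitness

print(round(cov_latent, 3), round(cov_relative, 3))
```

Both quantities should land close to the true value of `a` (the trait has unit variance here), differing only by sampling error.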

The mixed model framework also offers a powerful means of extension beyond equations (4.1) and (4.2). In an ideal experiment or analysis, the range of environments would have at least three levels that go beyond historical averages, so that the assumption of linear reaction norms can be directly tested, and if required, nonlinear, higher-order reaction norm components can be estimated [64]. The models outlined above can also be extended to include reaction norms in response to more than one environmental variable (say *t*_{1} and *t*_{2}), such that equation (4.1) would contain random regression terms of (*t*_{1,j} : ind_{x,i}) and (*t*_{2,j} : ind_{x,i}), and **P**_{ind} would be a 4 × 4 matrix with additional terms of $\sigma^{2}_{x \sim t_2}$ and $\sigma_{x,\,x \sim t_2}$. For example, the models used by Hayward *et al.* [96] contain random regressions of sheep weight on both parasite load and age. It is also possible to consider an additional response variable *y*, so that the model becomes an analysis of the random regressions of both *x* and *y* and their relationship with fitness, although, again, data demands will be high. In the electronic supplementary material for this paper, we set out the implementation of a bivariate model in MCMCglmm [95] with R code and an example dataset, with the aim of encouraging use of this approach.

### 5. Conclusion and future directions

A search of the Web of Science on topic: (*temperature* or *thermal*) and topic: (*plasticity* or *acclimation*) refined to biologically-relevant categories returns more than 4400 articles in the last 5 years alone. How is it that the literature abounds with studies reporting responses to temperature, with adaptive interpretation for a wide array of traits across diverse organisms, and yet there are so few quantitative tests of whether thermal plasticity is under selection? Quantifying selection on plasticity in heterogeneous environments is clearly challenging, but its contribution to our understanding of the role of plasticity in response to rapid environmental change will be substantial.

We have outlined here the reasons why phenotypic plasticity may not always be adaptive, and why analysis of the selection on plasticity can inform our understanding of the adaptive nature of plasticity. We have also shown that there appear to be very few published estimates of selection on plasticity across thermal environments. Those few selection estimates for plasticity in response to temperature support the equivocal evidence for the adaptive nature of, or for selection on, plasticity found by other generalized meta-analyses that considered a wide range of environmental types. However, given we found only a handful of studies that examine this question, we consider it premature to conclude that there is no selection on plasticity in response to temperature. We are also very aware that our systematic review may have failed to identify all published estimates of selection coefficients. Despite the various challenges inherent in estimating these parameters, they will be invaluable for basic and applied fields as rapid environmental change continues to drive mean annual temperatures upwards and increase the frequency of extreme temperature and weather events.

Global patterns of advancement in spring events (earlier onset of reproductive-related traits or behaviours) have been documented for decades across diverse species and geographical regions [103,104]. Shifts in phenology owing to plasticity may help individuals respond to variation in their present environment across time, but this plasticity does not seem to alter fitness substantially enough for selection to act on this variation. The lack of substantial evidence for selection on plasticity in response to thermal environments does not necessarily mean that plasticity plays no role in evolutionary responses to environmental change.

The evolutionary dynamics of wild populations in response to current environmental changes will reflect the interplay between genetic and environmental variation and phenotypic plasticity, among other factors. Although selection may have previously favoured plasticity, it may not be sufficient to match environmental conditions that have large inter-annual variation because optimal reaction norms (and selection on them) are inconsistent across time and space [105]. For example, in the bush brown butterfly *Bicyclus anynana*, when previously consistent signals for wet–dry seasonal transitions are disrupted by climate stochasticity, plasticity that was once adaptive for optimizing growth and behaviour to match resource availability may no longer confer fitness benefits because of a mismatch between phenotype and the altered environment [106]. Environmental conditions that natural populations are exposed to are obviously not static, and crucially, extreme climatic events (e.g. heatwaves, frosts, droughts) are now occurring more frequently and with greater intensity or duration [107]. Although incorporating thermal variation and extreme events as treatments in experimental designs can be challenging [108], these dimensions of environmental stochasticity can have a disproportionate impact on selection and the evolution of plasticity relative to shifts in mean environmental conditions [109–111]. Thus, empirical studies that estimate selection on plasticity in response to climate variability and extreme events will be especially valuable. These data are required if we are to discern the conditions under which plasticity is adaptive or not, and for which taxa, phenotypic traits, environments, and contexts selection operates on plasticity. A key objective now is to apply appropriate statistical models to obtain robust estimates of selection on plasticity, as these will be fundamental to understand and predict phenotypic responses to rapid environmental change. 
For now, thousands of articles on thermal plasticity notwithstanding, the trail of selection on thermal plasticity remains fairly cold.

### Data accessibility

The dataset supporting this article has been uploaded as part of the electronic supplementary material.

## 17 Answers

fmincon(), as you mentioned, employs several well-known nonlinear optimization strategies that attempt to find a local minimum without much regard for whether the global optimum has been found. If you're okay with this, then I think you have phrased the question correctly (nonlinear optimization).

The best package I'm aware of for general nonlinear optimization is IPOPT[1]. Apparently Matthew Xu maintains a set of Python bindings to IPOPT, so this might be somewhere to start.

[1]: Andreas Wachter is a personal friend, so I may be a bit biased.

I work in a lab that does global optimization of mixed-integer and non-convex problems. My experience with open source optimization solvers has been that the better ones are typically written in a compiled language, and they fare poorly compared to commercial optimization packages.

If you can formulate your problem as an explicit system of equations and need a free solver, your best bet is probably IPOPT, as Aron said. Other free solvers can be found on the COIN-OR web site. To my knowledge, the nonlinear solvers do not have Python bindings provided by the developers; any bindings you find would be third-party. In order to obtain good solutions, you would also have to wrap any nonlinear, convex solver you found in appropriate stochastic global optimization heuristics, or in a deterministic global optimization algorithm such as branch-and-bound. Alternatively, you could use Bonmin or Couenne, both of which are deterministic non-convex optimization solvers that perform serviceably well compared to the state-of-the-art solver, BARON.
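To illustrate what "wrapping a local solver in a stochastic global heuristic" can look like, here is a minimal multistart sketch in plain Python. Everything in it is an assumption made for illustration: the one-dimensional objective is a made-up nonconvex function, and fixed-step gradient descent stands in for a real local NLP solver such as IPOPT. It shows only the outer loop, not a production method.

```python
import math
import random

def objective(x):
    # Hypothetical nonconvex test function with more than one local minimum.
    return x * x + 10.0 * math.sin(x)

def gradient(x):
    # Analytic derivative of the objective above.
    return 2.0 * x + 10.0 * math.cos(x)

def local_descent(x, step=0.01, iters=5000):
    """Stand-in for a local solver: fixed-step gradient descent."""
    for _ in range(iters):
        x -= step * gradient(x)
    return x

def multistart(n_starts=20, lo=-10.0, hi=10.0, seed=0):
    """Run the local solver from random starts and keep the best result."""
    rng = random.Random(seed)
    best_x = None
    for _ in range(n_starts):
        x = local_descent(rng.uniform(lo, hi))
        if best_x is None or objective(x) < objective(best_x):
            best_x = x
    return best_x, objective(best_x)

best_x, best_f = multistart()
print(round(best_x, 3), round(best_f, 3))
```

A single run of `local_descent` started to the right of the local maximum near x ≈ 2 ends in the poorer minimum near x ≈ 3.8; the random restarts are what make finding the better minimum near x ≈ −1.3 likely. Unlike branch-and-bound, this gives no guarantee of global optimality.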

If you can purchase a commercial optimization solver, you might consider looking at the GAMS modeling language, which includes several nonlinear optimization solvers. Of particular mention are the interfaces to the solvers CONOPT, SNOPT, and BARON. (CONOPT and SNOPT are convex solvers.) A kludgey solution that I've used in the past is to use the Fortran (or Matlab) language bindings to GAMS to write a GAMS file and call GAMS from Fortran (or Matlab) to calculate the solution of an optimization problem. GAMS has Python language bindings, and a very responsive support staff willing to help out if there's any trouble. (Disclaimer: I have no affiliation with GAMS, but my lab does own a GAMS license.) The commercial solvers should be no worse than fmincon; in fact, I'd be surprised if they weren't a lot better. If your problems are sufficiently small in size, then you may not even need to purchase a GAMS license and licenses to solvers, because an evaluation copy of GAMS may be downloaded from their web site. Otherwise, you would probably want to decide which solvers to purchase in conjunction with a GAMS license. It's worth noting that BARON requires a mixed-integer linear programming solver, and that licenses for the two best mixed-integer linear programming solvers, CPLEX and GUROBI, are free for academics, so you might be able to get away with just purchasing the GAMS interfaces rather than the interfaces and the solver licenses, which can save you quite a bit of money.

This point bears repeating: for any of the deterministic non-convex optimization solvers I've mentioned above, you need to be able to formulate the model as an explicit set of equations. Otherwise, the non-convex optimization algorithms won't work, because all of them rely on symbolic analysis to construct convex relaxations for branch-and-bound-like algorithms.

UPDATE: One thought that hadn't occurred to me at first was that you could also call the Toolkit for Advanced Optimization (TAO) and PETSc using tao4py and petsc4py, which would have the potential added benefit of easier parallelization, and leveraging familiarity with PETSc and the ACTS tools.

UPDATE #2: Based on the additional information you mentioned, sequential quadratic programming (SQP) methods are going to be your best bet. SQP methods are generally considered more robust than interior point methods, but have the drawback of requiring dense linear solves. Since you care more about robustness than speed, that trade-off works in your favor. I can't find a good SQP solver out there written in Python (and apparently, neither could Sven Leyffer at Argonne in this technical report). I'm guessing that the algorithms implemented in packages like SciPy and OpenOpt have the basic skeleton of some SQP algorithms implemented, but without the specialized heuristics that more advanced codes use to overcome convergence issues. You could try NLopt, written by Steven Johnson at MIT. I don't have high hopes for it because it doesn't have any reputation that I know of, but Steven Johnson is a brilliant guy who writes good software (after all, he did co-write FFTW). It does implement a version of SQP; if it's good software, let me know.

I was hoping that TAO would have something in the way of a constrained optimization solver, but it doesn't. You could certainly use what they have to build one up; they have a lot of the components there. As you pointed out, though, it'd be much more work for you to do that, and if you're going to that sort of trouble, you might as well be a TAO developer.

With that additional information, you are more likely to get better results calling GAMS from Python (if that's an option at all), or trying to patch up the IPOPT Python interface. Since IPOPT uses an interior point method, it won't be as robust, but maybe Andreas' implementation of an interior point method is considerably better than Matlab's implementation of SQP, in which case, you may not be sacrificing robustness at all. You'd have to run some case studies to know for sure.

You're already aware of the trick to reformulate the rational inequality constraints as polynomial inequality constraints (it's in your book). The reason this would help BARON and some other nonconvex solvers is that they can use term analysis to generate additional valid inequalities, which serve as cuts to improve and speed up solver convergence.
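As a small illustration of that reformulation, the Python sketch below uses a made-up constraint, x / (1 + y) ≤ 2 with y ≥ 0 so that the denominator is positive, and checks that multiplying through by the denominator gives an equivalent polynomial constraint. The constraint itself and the function names are hypothetical, not taken from the original problem.

```python
def rational_ok(x, y):
    # Original rational form: x / (1 + y) <= 2, valid for 1 + y > 0.
    return x / (1.0 + y) <= 2.0

def polynomial_ok(x, y):
    # Multiply both sides by (1 + y) > 0: x - 2 * (1 + y) <= 0.
    return x - 2.0 * (1.0 + y) <= 0.0

# Check agreement on a grid where the denominator is strictly positive.
grid = [(float(x), float(y)) for x in range(-5, 16) for y in range(0, 6)]
agree = all(rational_ok(x, y) == polynomial_ok(x, y) for x, y in grid)
print(agree)
```

The equivalence relies on the sign of the denominator being known; if 1 + y could be negative, the inequality would flip and the two forms would no longer describe the same feasible set.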

Excluding the GAMS Python bindings and the Python interface to IPOPT, the answer is no, there aren't any high quality nonlinear programming solvers for Python yet. Maybe @Dominique will change that with NLPy.

UPDATE #3: More wild stabs at finding a Python-based solver yielded PyGMO, which is a set of Python bindings to PaGMO, a C++ based global multiobjective optimization solver. Although it was created for multiobjective optimization, it can also be used for single-objective nonlinear programming, and has Python interfaces to IPOPT and SNOPT, among other solvers. It was developed within the European Space Agency, so hopefully there's a community behind it. It was also released relatively recently (November 24, 2011).