Explanation of ecological parameters with some model examples


I think an ecological parameter is:

A variable, measurable property whose value is a determinant of the characteristics of an ecosystem.


But what could these parameters be? In other words, what are some examples of 'parameters'?

I assume they possibly vary with ecosystem.

Here are some examples of ecological parameters:

  • temperature
  • yearly maximal temperature
  • rain intensity
  • Longest period without precipitation
  • soil acidity
  • salinity
  • nitrogen pollutants
  • average daytime
  • maximal wind speed
  • solar irradiance

You can look at the description of any biome of interest on Wikipedia to get an idea of the parameters most commonly used to describe a biome. Precipitation, temperature, and their seasonality are very typical descriptors of an ecosystem.
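As a rough illustration of how parameters like those listed above can be derived from raw observations, the following sketch summarizes a short series of daily records. The function names, the dictionary keys, and the sample values are hypothetical and chosen only for this example.

```python
def longest_dry_spell(daily_precip_mm, dry_threshold_mm=0.0):
    """Length in days of the longest run of days with precipitation <= threshold."""
    longest = current = 0
    for amount in daily_precip_mm:
        if amount <= dry_threshold_mm:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def summarize_site(daily_temp_c, daily_precip_mm):
    """Collapse daily records into a few descriptive ecological parameters."""
    return {
        "mean_temperature_c": sum(daily_temp_c) / len(daily_temp_c),
        "max_temperature_c": max(daily_temp_c),
        "total_precipitation_mm": sum(daily_precip_mm),
        "longest_dry_spell_days": longest_dry_spell(daily_precip_mm),
    }


# Made-up one-week record, for illustration only.
temps = [12.0, 15.0, 19.0, 22.0, 18.0, 14.0, 12.0]
rain = [0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 2.0]
summary = summarize_site(temps, rain)
print(summary)
```

In practice the same summaries would be computed over a full year or longer, which is what makes seasonality visible.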

Ecological Isolation Explained With Examples

What is ecological isolation and how does it prevent the occurrence of inter-species hybrids? In this BiologyWise article, we will answer these questions, and at the same time provide examples of this isolation mechanism, to make it easier for you to understand the same.


Ecological Speciation Definition

When ecologically-based divergent selection creates reproductive barriers between populations, it is referred to as ecological speciation.


Ecological isolation is one of the five pre-zygotic isolation mechanisms that prevent interbreeding between species. Interbreeding does little or no good to the population of the species in question, as most hybrids are sterile. On the contrary, it expends energy of these species on a process that is far from fruitful. Mechanisms of reproductive isolation are not just important because they prevent the birth of sterile offspring, but also because they play a crucial role in speciation.

The conceptual model

The development of a conceptual model can be an integral part of designing and carrying out any research project. Conceptual models are generally written as diagrams with boxes and arrows, thereby providing a compact, visual statement of a research problem that helps determine the questions to ask and the part of the system to study. The boxes represent state variables, which describe the state or condition of the ecosystem components. The arrows illustrate relationships among state variables, such as the movement of materials and energy (called flows) or ecological interactions (e.g., competition). Shoemaker (1977) provides an excellent discussion about how to develop conceptual models.

The development of a conceptual model is an iterative process. The skeleton of a conceptual model begins to take shape when a general research question is formulated. For example, suppose the goal of a research project is to determine the relationship between different strategies for stocking exotic salmon in the Great Lakes and the concentrations of potentially toxic contaminants in the salmon and their alewife prey. The initial conceptual model might consist of two linked boxes labeled “alewife” and “chinook salmon,” with an additional arrow labeled “stocking” pointing to the salmon's box (Figure 2a). We have chosen to place two-way arrows between the boxes to reflect the flow of biomass and contaminants from alewife to salmon and the effect of salmon on the alewife; an alternative model might have used only one arrow, since the flow of material between boxes is the result of predation by salmon on alewife. Details would then be added to the conceptual model based on the answers to questions such as: Are there other important species besides alewife and chinook salmon? What mechanistic processes should be included? What environmental factors influence each species? What currency should be used to describe compartment interactions (e.g., elements, biomass, individuals, energy)?

After making refinements driven by such questions, the conceptual model might have alewife, chinook salmon, rainbow smelt, and lake trout (Figure 2b), although the research interest might still be with the original two species. The next round of refinements to the conceptual model might be based on available data or consultation with ecologists who have studied the interactions of the four species shown in Figure 2b. For example, if contaminant concentrations are a function of prey body size, and if predators seek certain size classes of prey, then size structure might be added to the model to more accurately reflect these ecological features and to better simulate contaminant intake by predators (Figure 2c). Depending on the nature of the research question, the addition of size structure might be made for just the alewife and chinook salmon. This simple example assumes that there are changes only in the state variables, but there could also be changes in the relationships among the state variables.
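One way to keep track of such a box-and-arrow model during its iterations is to store it as a small directed graph. The sketch below is only an illustration: the data layout and helper names are hypothetical, and the arrows added for rainbow smelt and lake trout are placeholder assumptions, since the text does not specify them.

```python
# Boxes are state variables; arrows are labeled flows or interactions.
initial_model = {
    "boxes": {"alewife", "chinook salmon"},
    "arrows": {
        ("alewife", "chinook salmon"): "biomass and contaminant flow",
        ("chinook salmon", "alewife"): "effect of predation",
        ("stocking", "chinook salmon"): "stocking",  # external driver, not a box
    },
}


def add_box(model, box, arrows):
    """Return a refined copy of the model with one more box and its arrows."""
    refined = {"boxes": set(model["boxes"]) | {box},
               "arrows": dict(model["arrows"])}
    refined["arrows"].update(arrows)
    return refined


def inputs_to(model, box):
    """Sorted list of sources whose arrows point at the given box."""
    return sorted(src for (src, dst) in model["arrows"] if dst == box)


# Second iteration: two more species (links below are assumed, not from the text).
refined = add_box(initial_model, "rainbow smelt",
                  {("rainbow smelt", "chinook salmon"): "assumed prey link"})
refined = add_box(refined, "lake trout",
                  {("alewife", "lake trout"): "assumed prey link"})

print(sorted(refined["boxes"]))
print(inputs_to(refined, "chinook salmon"))
```

Because each refinement returns a new copy, earlier versions of the model remain available for comparison in later discussions.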

In general, a parsimonious approach is best for creating an appropriate conceptual model. The model should strike a balance between incorporating enough detail to capture the necessary ecological structure and processes and being simple enough to be useful in generating hypotheses and organizing one's thoughts. Creating a good conceptual model forces an ecologist to formulate hypotheses, determine what data are available and what data are needed, and assess the degree of understanding about key components of the system. Because outside viewpoints and questions often force clarification of biases and assumptions, discussing the evolving conceptual model with colleagues can be helpful. Group construction of a conceptual model can also be a useful consensus-building tool in collaborative research (Walters 1986, Carpenter 1992). Conceptual models should therefore be included in dissertation and grant proposals, especially in the early stages of project development. Revisions of the initial conceptual model then become focal points for discussion in subsequent meetings of the dissertation committee or research planning group.

Biological homeostasis

Homeostasis is one of the fundamental characteristics of living things. It is the maintenance of the internal environment within tolerable limits.

In multicellular animals, the internal environment of the body consists of the body fluids. These include blood plasma, tissue fluid and intracellular fluid. The maintenance of a steady state in these fluids is essential to living things, as the lack of it harms the genetic material.

With regard to any parameter, an organism may be a conformer or a regulator. Regulators try to maintain the parameter at a constant level, regardless of what is happening in their environment. Conformers allow the environment to determine the parameter. For instance, endothermic animals maintain a constant body temperature, while ectothermic animals exhibit wide variation in body temperature.

This is not to say that conformers may not have behavioral adaptations that allow them to exert some control over the parameter in question. For instance, reptiles often sit on sun-heated rocks in the morning to raise their body temperatures.

An advantage of homeostatic regulation is that it allows the organism to function more effectively. For instance, ectotherms tend to become sluggish at low temperatures, whereas endotherms are as active as always. On the other hand, regulation requires energy. One reason snakes are able to eat just once a week is that they use much less energy for maintaining homeostasis.
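The contrast between conformers and regulators can be written down as a toy model. In this sketch the set point of 37 °C and the linear energy cost are simplifying assumptions made only for illustration; real thermoregulation is far more complex, and the function names are hypothetical.

```python
STRATEGIES = ("regulator", "conformer")


def body_temperature(ambient_c, strategy, set_point_c=37.0):
    """Regulators hold the set point; conformers track the environment."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    return set_point_c if strategy == "regulator" else ambient_c


def regulation_cost(ambient_c, strategy, set_point_c=37.0, cost_per_degree=1.0):
    """Toy energy cost: proportional to the gap the regulator must close."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    if strategy == "conformer":
        return 0.0
    return cost_per_degree * abs(set_point_c - ambient_c)


for ambient in (5.0, 20.0, 35.0):
    for strategy in STRATEGIES:
        print(ambient, strategy,
              body_temperature(ambient, strategy),
              regulation_cost(ambient, strategy))
```

The output shows the trade-off described above: the regulator keeps a constant temperature at every ambient value but pays more the colder it gets, while the conformer pays nothing and cools with its surroundings.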

Homeostasis in the human body

All sorts of factors affect the suitability of the human body fluids to sustain life; these include properties like temperature, salinity, and acidity, and the concentrations of nutrients such as glucose, various ions, oxygen, and wastes, such as carbon dioxide and urea. Since these properties affect the chemical reactions that keep bodies alive, there are built-in physiological mechanisms to maintain them at desirable levels.

However, it should be noted that homeostasis is not the reason for these ongoing unconscious adjustments. Homeostasis should be thought of as a general characterization of many normal processes in concert, not their proximal cause per se. Moreover, there are numerous biological phenomena which do not conform to this model, such as anabolism.

Description and Functionality

Processes implemented

This package implements three crucial phases of ENM: calibration, final model creation and evaluation, and extrapolation risk analysis (Fig. 1). Model calibration is performed in two steps: creation of large numbers of candidate models, and evaluation and selection of best models. Candidate models are created using Maxent, with different values of Maxent’s regularization multiplier parameter, combinations of feature classes, and distinct sets of environmental predictors. For each parameter setting, two models are created: one based on the complete set of occurrences, and the other based on the training data only (see data set description in Requirements and Dependencies). Model selection is based on significance, predictive ability, and complexity, in that order of priority: i.e., models are filtered first to detect those that are statistically significant; the omission rate criterion is applied to this reduced set of models; finally, among the significant and low-omission candidate models, those with values of delta AICc lower than two are selected. Significance and omission rates are calculated on models created with training data, using separate testing data subsets; model complexity is calculated on models created with the complete set of occurrences (excluding independent records, see below). We note that the full set of results of this three-part evaluation is provided, so users are able to apply their own sets of criteria.
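The three-step selection described above can be sketched as a sequence of filters. This is a minimal illustration in Python, not the package's own code: the class, the function name, the significance level of 0.05, and the omission cutoff of 0.05 are assumptions, as is the choice to compute delta AICc relative to the best model remaining after the first two filters.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    p_value: float        # significance from partial ROC
    omission_rate: float  # omission rate on testing data
    aicc: float           # complexity-penalized fit


def select_best(candidates, alpha=0.05, max_omission=0.05, delta_cutoff=2.0):
    """Filter by significance, then omission rate, then delta AICc."""
    significant = [c for c in candidates if c.p_value <= alpha]
    low_omission = [c for c in significant if c.omission_rate <= max_omission]
    if not low_omission:
        return []
    best_aicc = min(c.aicc for c in low_omission)
    return [c for c in low_omission if c.aicc - best_aicc < delta_cutoff]


# Made-up candidate results, for illustration only.
candidates = [
    Candidate("A", 0.01, 0.04, 100.0),
    Candidate("B", 0.01, 0.03, 101.5),
    Candidate("C", 0.20, 0.00, 90.0),   # fails significance
    Candidate("D", 0.00, 0.10, 95.0),   # fails omission rate
    Candidate("E", 0.01, 0.05, 103.0),  # delta AICc too large
]
selected = [c.name for c in select_best(candidates)]
print(selected)
```

Note that model C has the lowest AICc overall but is never considered, because the filters are applied in order of priority.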

Figure 1: Schematic description of the ecological niche modeling process, and steps that can be performed using the kuenm package.

Creation of final models in Maxent and transfers to other times or regions can be performed using the parameters selected during calibration. Final models can be created with three options of extrapolation: free extrapolation, extrapolation with clamping, and no extrapolation. Under free extrapolation settings, responses in areas environmentally different from the calibration area follow trends in the calibration environmental data. With the extrapolation and clamping setting, the response in areas with environments distinct from those in the calibration area is clamped to levels presented at the periphery of the calibration region in environmental space. Finally, under the no extrapolation setting, the response is set to zero if the environments in transfer areas are more extreme than those in areas across which the models were calibrated. Final models are evaluated based on statistical significance and omission rates using independent data (see below, in Requirements and Dependencies) when such data are available (Table 1). This evaluation, performed after model calibration, is not yet common in ENM; however, it can be useful, especially when other independent data (e.g., information on species distributions generated in explorations after creation of models) can be used to test models.
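The difference between the three extrapolation options is easiest to see for a single environmental variable. The sketch below is a simplified, one-dimensional illustration; the function name, the mode labels, and the linear toy response are assumptions and do not reproduce how Maxent computes its responses.

```python
def transfer_response(x, response, cal_min, cal_max, mode):
    """Response at value x under one of three extrapolation settings.

    cal_min and cal_max bound the variable within the calibration area.
    """
    if mode == "free":
        # Follow the fitted trend beyond the calibration range.
        return response(x)
    if mode == "clamp":
        # Hold the response at its value on the edge of the calibration range.
        return response(min(max(x, cal_min), cal_max))
    if mode == "none":
        # Zero wherever conditions are more extreme than in calibration.
        return response(x) if cal_min <= x <= cal_max else 0.0
    raise ValueError(f"unknown mode: {mode}")


def toy_response(temperature_c):
    """Hypothetical linear response, used only for this example."""
    return temperature_c / 50.0


# Calibration covered 10 to 30 degrees; the transfer site is at 40 degrees.
results = {mode: transfer_response(40.0, toy_response, 10.0, 30.0, mode)
           for mode in ("free", "clamp", "none")}
print(results)
```

Inside the calibration range all three settings return the same value; they diverge only where the transfer conditions are novel.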

Function: Description
kuenm_start: Generates an R Markdown file that serves as a guide to perform the main processes implemented in kuenm. This file contains a brief description of each process and chunks of code that will help beginner users in performing each of the analyses. This file can be saved in distinct formats (e.g., HTML, DOCX, and PDF) to record all the code to be used and other user comments, making the research more sharable and reproducible.
kuenm_cal: Creates Maxent candidate models. These models are created with multiple combinations of regularization multipliers, feature classes, and sets of environmental predictors. For each combination, it creates one Maxent model with the full set of occurrences, and another with training occurrence data only. Inputs are names of files and folders present in the working directory. Outputs include a folder containing all of the models and a file with Java codes for running candidate models (batch in Windows or bash in Unix); these files are written in the working directory and not stored in memory to avoid RAM limitations.
kuenm_ceval: Completes the process of calibration by evaluating candidate model performance and selecting the best ones, based on significance (partial ROC; Peterson, Papeş & Soberón, 2008), omission rates (derived from thresholded models based on E = user-specified omission percentage; see Anderson, Lew & Peterson, 2003), and complexity (AICc; Warren, Glor & Turelli, 2010). Inputs are names of files and folders present in the working directory. Outputs are written directly to the working directory, and include a file with the complete table of evaluation results, a summary of the model selection process, a table containing the evaluation metrics for only the best models, a figure of model performance across all models, and an HTML file reporting all of the results of the process to guide interpretation.
kuenm_mod: Takes the result of model evaluation and creates final models with the parameter sets selected as best. Model projections are allowed, and are called by defining the folder in which subdirectories with transfer environmental data are located; these transfers are performed automatically. Inputs are names of files and folders present in the working directory. Three options of extrapolation are facilitated using this function when transfers are performed (free extrapolation, extrapolation and clamping, and no extrapolation; see Owens et al., 2013), and more than one of these options can be performed in a single run. Final models and their transfers are written directly to the working directory.
kuenm_feval: Evaluates final models based on partial ROC statistics and omission rates as assessed with independent occurrence data. Models created with the best parameter settings can be evaluated if independent data are available, to assess and evaluate their quality. Inputs are names of files and folders in the working directory; the output of this evaluation (a table with the results) is written directly to the directory.
kuenm_mmop: Calculates the mobility-oriented parity (MOP; Owens et al., 2013) metric for comparing sets of environmental conditions between the calibration area (M) and multiple areas or scenarios to which models are transferred (G). Inputs are names of files and folders in the working directory. The output maps represent the degree of similarity between conditions in M and G, wherein values of zero correspond to areas of strict extrapolation. All results are written to the working directory.
kuenm_omrat: Calculates omission rates of single models based on single or multiple threshold values (E; see Anderson, Lew & Peterson, 2003) specified by the user. Inputs and outputs are objects stored in memory; results indicate the rate of omission of independent occurrence data used for evaluating models created with training data.
kuenm_proc: Calculates statistical significance of single models based on the partial ROC and a threshold value (E; see Peterson, Papeş & Soberón, 2008) specified by the user. Inputs and outputs are objects stored in memory; outputs include a table with the partial ROC summary and the outcomes of the iterated analyses.
kuenm_mop: Calculates the MOP metric for comparisons of environmental conditions between a calibration area and a single area or scenario to which models will be transferred. Inputs and outputs are objects stored in memory; output includes a map resulting from this analysis.

Although Maxent allows assessing extrapolation via the multivariate environmental similarity surface metric (MESS; Elith, Kearney & Phillips, 2010), the mobility-oriented parity (MOP) index implemented in kuenm is a metric proposed by Owens et al. (2013) that offers more robust measures of extrapolative conditions in final model transfers. In addition, the kuenm package allows users to use an optional function (kuenm_start) that creates an R Markdown file containing a brief guide to perform the main analyses implemented. This file records all user comments and lines of code used for running analyses, and can be saved in various formats, so users can share and reproduce their research easily (Table 1).

Statistics of model performance and extrapolation risk

The statistics of model performance implemented in this package are partial ROC as a measure of statistical significance, omission rates, and AICc. Partial ROC is calculated instead of the full area under the ROC curve because the latter is not appropriate in ENM (Lobo, Jiménez-Valverde & Real, 2007; Jiménez-Valverde, 2012), and partial ROC represents a more suitable indicator of statistical significance (Peterson, Papeş & Soberón, 2008). Statistical significance is determined by a bootstrap resampling of 50% of testing data, and probabilities are assessed by direct count of the proportion of bootstrap replicates for which the AUC ratio is ≤1.0 (Peterson, Papeş & Soberón, 2008). Model evaluation, however, must go beyond significance, to measure performance as well. Performance here is measured using omission rates, which indicate how well models created with training data predict test occurrences; these rates are calculated by default at a threshold of E = 5% (Anderson, Lew & Peterson, 2003), but this threshold can be changed depending on user choice. Finally, to evaluate model complexity, AICc, delta AICc, and AICc weights are calculated; AICc values indicate how well models fit the data while penalizing complexity to favor simple models (Warren & Seifert, 2011).
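Two of these statistics are simple enough to sketch directly. The Python below is an illustration under stated assumptions, not the package's implementation: the rule used to turn E into a threshold (the training suitability found at position floor(n × E / 100) of the sorted values) is one plausible reading, and all function names are hypothetical. The AICc formula itself is the standard small-sample correction, AICc = −2 ln L + 2k + 2k(k + 1)/(n − k − 1). Partial ROC is not shown.

```python
import math


def omission_rate(test_suitability, train_suitability, e_percent=5.0):
    """Fraction of test records predicted below the E% training threshold."""
    ranked = sorted(train_suitability)
    index = int(len(ranked) * e_percent / 100.0)  # assumed thresholding rule
    threshold = ranked[index]
    omitted = sum(1 for value in test_suitability if value < threshold)
    return omitted / len(test_suitability)


def aicc(log_likelihood, k, n):
    """Akaike information criterion corrected for small sample size."""
    return -2.0 * log_likelihood + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def aicc_weights(aicc_values):
    """Delta AICc and normalized AICc weights for a set of models."""
    best = min(aicc_values)
    deltas = [value - best for value in aicc_values]
    raw = [math.exp(-0.5 * d) for d in deltas]
    total = sum(raw)
    return deltas, [r / total for r in raw]


# Made-up suitability values, for illustration only.
train = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
test = [0.05, 0.15, 0.3, 0.9]
rate_e10 = omission_rate(test, train, e_percent=10.0)
score = aicc(log_likelihood=-100.0, k=3, n=24)
deltas, weights = aicc_weights([100.0, 102.0])
print(rate_e10, score, deltas, weights)
```

Lower omission rates and lower AICc values are better; the weights express how much relative support each model has within the compared set.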

Users are able to assess extrapolation risks in transfer areas with the MOP metric. The package calculates multivariate environmental distances between sites across the transfer region (G) and the nearest portion of the calibration region (M, or accessible area; Soberón & Peterson, 2005) to identify regions that present situations of strict or combinational extrapolation. MOP is a metric better suited to estimating extrapolation risks in ecological niche modeling because it assesses environmental difference from the nearest part of the M region, whereas the MESS metric implemented within Maxent evaluates difference from the centroid of the M region in environmental space. Given the irregular nature of most environmental spaces, then, MOP is a more appropriate metric of extrapolation in niche model transfers.
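The core idea, distance from a transfer site to the nearest portion of M rather than to its centroid, can be sketched in a few lines. This is a simplified illustration only: it returns a raw mean Euclidean distance rather than the rescaled similarity described for the package outputs, it skips standardization of variables, and the function names and the `nearest_fraction` parameter are assumptions.

```python
import math


def nearest_portion_distance(g_point, m_points, nearest_fraction=0.1):
    """Mean distance from a G site to the nearest fraction of M sites."""
    distances = sorted(math.dist(g_point, m) for m in m_points)
    k = max(1, math.ceil(len(distances) * nearest_fraction))
    return sum(distances[:k]) / k


def is_strict_extrapolation(g_point, m_points):
    """True if any variable at the G site falls outside its range in M."""
    for j, value in enumerate(g_point):
        column = [m[j] for m in m_points]
        if value < min(column) or value > max(column):
            return True
    return False


# Made-up two-variable environmental space, for illustration only.
m_sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
novel_site = (2.0, 1.0)
inside_site = (0.5, 0.5)

print(nearest_portion_distance(novel_site, m_sites, nearest_fraction=0.25))
print(is_strict_extrapolation(novel_site, m_sites))
print(is_strict_extrapolation(inside_site, m_sites))
```

A centroid-based comparison would treat every part of M as equally distant from its center, which is misleading when M has an irregular shape in environmental space; comparing against the nearest sites avoids that.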

Requirements and dependencies

To maintain simplicity and avoid memory limitations in using this package owing to the large file sizes involved in partial and final outcomes of the analyses developed by this package, a data organization structure is needed (Fig. 2). This structure allows users to run functions from a single directory per species that contains all input data needed and that is where the results will be written directly when performing model calibration, final model creation, and MOP analyses for transfer scenarios. Input data necessary to start analyses include (1) the complete set of occurrences for calibration (i.e., species occurrence records that have been filtered and thinned adequately); (2) training occurrences (part of the complete set of occurrences set aside for creating candidate models to be evaluated with testing data); (3) a set of occurrences for testing candidate models (the other part of the complete set of records); and (4) one or more sets of environmental variables to be used in creating candidate models. Occurrences for training and testing models can be subsetted in multiple ways (see partition methods in Muscarella et al., 2014), but some degree of independence is desired. In addition, an entirely independent set of occurrence data (i.e., data not used during calibration that ideally come from other sources and are not spatially autocorrelated with calibration data) can be used to test final models when available. Other sets of environmental data representing distinct scenarios are required when model transfers are desired. Rtools (in Windows), Java Development Kit, and Maxent are necessary for using kuenm; R libraries imported are listed in Table S1. Additional information and a step by step guide for using the main functions of this package can be found in its GitHub repository (
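As a minimal sketch of how the complete set of occurrences could be divided into training and testing subsets, the following Python performs a seeded random split. The function name, the 75/25 proportion, and the sample coordinates are assumptions for illustration; a purely random split provides little spatial independence, so block or other structured partitions may be preferable in practice.

```python
import random


def split_occurrences(records, train_fraction=0.75, seed=42):
    """Randomly partition occurrence records into training and testing sets."""
    rng = random.Random(seed)  # seeded so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up (longitude, latitude) records, for illustration only.
occurrences = [(-99.1, 19.4), (-98.2, 19.0), (-97.5, 18.6), (-96.9, 19.5),
               (-100.3, 20.1), (-101.0, 19.8), (-99.7, 18.9), (-98.8, 20.4)]
training, testing = split_occurrences(occurrences)
print(len(training), len(testing))
```

Together, the two subsets make up the complete set of occurrences, matching items (1) to (3) in the list of inputs above.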

Figure 2: Directory structure and data for starting (A) and when finished (B) using kuenm R package functions.

Process of Ecosystem Succession

Ecological succession is a complex process that may take thousands of years. Frederic Clements first proposed the sequential phases of ecological succession in 1916. Succession proceeds through a series of sequential steps, as given below:

(1). Nudation
(2). Invasion
(3). Competition and Co-action
(4). Reaction
(5). Stabilization (climax)

(1). Nudation

Ø Definition: Nudation is the development of a bare area (an area without any life form).

Ø It is the first step in ecological succession.

Ø The causes of nudation are:

(a). Topographic: Soil or topography related causes such as soil erosion, sand deposits, landslides and volcanic activity result in the formation of a bare area.

(b). Climatic: Destruction of the community due to glaciers, dry periods and storms.

(c). Biotic: It includes forest destruction, agriculture and disease epidemics, which result in the total destruction of the population in an area.

(2). Invasion

Ø Definition: Invasion is the successful establishment of a species in the bare area.

Ø It is the second step in ecological succession.

Ø A new species reaches the newly created bare area and tries to establish itself there.

Ø The process of invasion is completed in THREE steps:

(A). Migration

(B). Ecesis

(C). Aggregation

(A). Migration (Dispersal):

$ Seeds, spores, propagules of a species reach the bare area due to migration.

$ The migration can be achieved through air or water medium.

(B). Ecesis:

$ Ecesis is the process of successful establishment of a species in the bare area.

$ The seeds or spores that reached the new area due to migration will germinate, grow and reproduce.

$ Only a few of the progeny will survive due to the harsh environmental conditions prevailing in the area.

(C). Aggregation:

$ After ecesis, the individuals of a species increase their number and they stay close to each other. This process is called aggregation.

(3). Competition and Co-action

Ø Aggregation results in an increase in the number of individuals within a limited space.

Ø This results in competition between individuals for food and space.

Ø The competition may be intra-specific (between individuals of the same species) or inter-specific (between individuals of different species).

Ø Individuals of a species affect each other’s life in various ways and this is called co-action.

Ø Competition and co-action result in the survival of fit individuals and the elimination of unfit individuals from the ecosystem.

Ø Only a species with high reproductive capacity and wide ecological amplitude will survive.

(4). Reaction

Ø Reaction is the most important stage in the ecological succession.

Ø It is the modification of the environment through the influence of the living organisms present in it.

Ø Reaction causes changes in the soil, water, light and temperature of the area.

Ø Due to these modifications, the environment becomes unsuitable for the existing community.

Ø Such communities will be quickly replaced by another community.

Ø The whole sequence of communities that replace one another in the given area is called a sere (sera).

Ø The various communities that make up a sere are called seral communities or seral stages.

(5). Stabilization (Climax)

Ø It is the last stage in ecological succession.

Ø The final or terminal community becomes more or less stabilized for a long period of time.

Ø This community can maintain its equilibrium with the climate of the area.

Ø This final community is called the Climax Community (climax stage).

Ø The climax community is not immediately replaced by other communities.

Ø Climax community is determined by the climate of the region.

Ø Example of climax community: Forest, Grassland, Coral Reef

Different Types of Climaxes in an Ecological Succession:

(1). Climatic climax:

Ø In this climax, the climax community of the succession is determined by the climate of the region.

Ø The climatic climax will have only one climax community.

(2). Edaphic climax

Ø Here the climax community in the succession is determined by the soil (edaphic factor) of the region.

Ø The edaphic factors may include soil moisture, topography, soil texture and soil nutrients.

(3). Catastrophic climax:

Ø Here the climax community is vulnerable to many catastrophic events such as wildfire, snowfall and floods.

Ø The catastrophic factors replace the climax community completely and this area will be immediately invaded by new species.

Characteristics of a Climax Community

The climax community in a succession shows the following characteristics:

Ø The vegetation of the climax community will have high ecological amplitude.

Ø They possess high tolerance towards the environmental conditions.

Ø They show rich diversity in species composition.

Ø The species composition remains constant for many years.

Ø The community possesses a complex food chain system.

Ø The ecosystem will be balanced and self-sustainable.

Ø There will be equilibrium between gross primary productivity and respiration.

Ø The energy used from the sunlight and energy released after decomposition will be balanced.

Ø The uptake of nutrients from the soil and the release of nutrients back to the soil by decomposition will be in equilibrium.

Ø Individuals of the community lost through death are replaced by individuals of the same species.

Ø The climax community is considered a manifestation of the climate prevailing in the area.

Review Questions

Ø What is ecological succession?
Ø Explain the process of ecological succession.
Ø Differentiate community and population.
Ø What is meant by pioneer community? Give example
Ø What is seral community?
Ø Define sere
Ø What is meant by nudation? What are the causes of nudation?
Ø Define ecesis
Ø Describe competition in ecological succession.
Ø What is meant by climax community? Give examples
Ø What are the characteristics of climax community?

Ecological factors

In ecology, ecological factors are variables in the environment that affect organisms and contribute to their characteristic modes of behavior. These factors, which drive dynamic change in a population or species in a given environment, are usually divided into two groups: abiotic and biotic.

Abiotic factors are geological, geographical, hydrological, and climatological parameters. A biotope is an environmentally uniform region characterized by a particular set of abiotic ecological factors. Specific abiotic factors include:

  • Water, which is at the same time an essential element to life and a milieu
  • Air, which provides oxygen, nitrogen, and carbon dioxide to living species and allows the dissemination of pollen and spores
  • Soil, at the same time a source of nutriment and physical support
    • Soil pH, salinity, nitrogen and phosphorus content, ability to retain water, and density are all influential

    Biocenose, or community, is a group of populations of plants, animals, and microorganisms. Each population is the result of procreation between individuals of the same species and cohabitation in a given place and for a given time. When a population consists of an insufficient number of individuals, that population is threatened with extinction; the extinction of a species can approach when all biocenoses composed of individuals of the species are in decline. In small populations, consanguinity (inbreeding) can result in reduced genetic diversity, which can further weaken the biocenose.

    Biotic ecological factors also influence biocenose viability; these factors are considered as either intraspecific or interspecific relations.

    • Intraspecific relations are those that are established between individuals of the same species, forming a population. They are relations of cooperation or competition, with division of the territory, and sometimes organization in hierarchical societies.

    An antlion lies in wait under its pit trap, built in dry dust under a building, awaiting unwary insects that fall in. Many pest insects are partly or wholly controlled by other insect predators.

    • Interspecific relations (interactions between different species) are numerous, and usually described according to their beneficial, detrimental, or neutral effect (for example, mutualism (relation ++) or competition (relation --)). The most significant relation is the relation of predation (to eat or to be eaten), which leads to the essential concept in ecology of food chains (for example, the grass is consumed by the herbivore, itself consumed by a carnivore, itself consumed by a carnivore of larger size). A high predator to prey ratio can have a negative influence on both the predator and prey biocenoses in that low availability of food and high death rate prior to sexual maturity can decrease (or prevent the increase of) populations of each, respectively. Selective hunting of species by humans that leads to population decline is one example of a high predator to prey ratio in action. Other interspecific relations include parasitism, infectious disease, and competition for limited resources, which can occur when two species share the same ecological niche.

    The existing interactions between the various living beings go along with a permanent mixing of mineral and organic substances, absorbed by organisms for their growth, their maintenance, and their reproduction, to be finally rejected as waste. This permanent recycling of the elements (in particular carbon, oxygen, and nitrogen), as well as of water, takes place through biogeochemical cycles. These cycles guarantee a durable stability of the biosphere (at least when unchecked human influence and extreme weather or geological phenomena are left aside). This self-regulation, supported by negative feedback controls, ensures the persistence of ecosystems. It is shown by the very stable concentrations of most elements in each compartment. This is referred to as homeostasis. The ecosystem also tends to evolve to a state of ideal balance, called the climax, which is reached after a succession of events (for example, a pond can become a peat bog).

    The ecological framework

    The ecological framework is based on evidence that no single factor can explain why some people or groups are at higher risk of interpersonal violence, while others are more protected from it. This framework views interpersonal violence as the outcome of interaction among many factors at four levels—the individual, the relationship, the community, and the societal.

    • At the individual level, personal history and biological factors influence how individuals behave and increase their likelihood of becoming a victim or a perpetrator of violence. Among these factors are being a victim of child maltreatment, psychological or personality disorders, alcohol and/or substance abuse and a history of behaving aggressively or having experienced abuse.
    • Personal relationships such as family, friends, intimate partners and peers may influence the risks of becoming a victim or perpetrator of violence. For example, having violent friends may influence whether a young person engages in or becomes a victim of violence.
    • Community contexts in which social relationships occur, such as schools, neighbourhoods and workplaces, also influence violence. Risk factors here may include the level of unemployment, population density, mobility and the existence of a local drug or gun trade.
    • Societal factors influence whether violence is encouraged or inhibited. These include economic and social policies that maintain socioeconomic inequalities between people, the availability of weapons, and social and cultural norms such as those around male dominance over women, parental dominance over children and cultural norms that endorse violence as an acceptable method to resolve conflicts.

    The ecological framework treats the interaction between factors at the different levels with equal importance to the influence of factors within a single level. For example, longitudinal studies suggest that complications associated with pregnancy and delivery, perhaps because they lead to neurological damage and psychological or personality disorder, seem to predict violence in youth and young adulthood mainly when they occur in combination with other problems within the family, such as poor parenting practices. The ecological framework helps explain the result (violence later in life) as the interaction of an individual risk factor, the consequences of complications during birth, and a relationship risk factor, the experience of poor parenting. This framework is also useful for identifying and clustering intervention strategies based on the ecological level at which they act. For example, home visitation interventions act at the relationship level to strengthen the bond between parent and child by supporting positive parenting practices.

    Ecological Niche Modeling Algorithms and Tools

    Modeling algorithms in ecological niche modeling have been described elsewhere (47, 72), generating starting points for new modelers. Algorithms to develop ecological niche models can be divided into three categories: presence-absence, presence-background, and presence-only. Presence-absence algorithms need a set of localities where the organism occurs (i.e., presence) and a set of localities where the organism does not occur (i.e., absence). Presence-absence models are calibrated by comparing environmental conditions where the organism is present vs. where it is absent. They are generally useful for reconstructing the distribution of diseases at fine scales and over short periods, and therefore need accurate localities and high-resolution environmental variables. These models, however, have limited capacity to be projected to different areas or periods; their signals are specific to the space and time of calibration. Many algorithms are available, including regression algorithms (e.g., Generalized Linear Models and Generalized Additive Models) (Figure 1) and classification algorithms (e.g., Boosted Regression Trees, Random Forest, and Support Vector Machines) (Figure 9A), with protocols described in detail elsewhere (75).
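
    The calibration step of a presence-absence regression can be sketched as follows. This is a minimal illustration, not the protocol of any cited package: it fits a logistic regression (a Generalized Linear Model with a binomial response) by plain gradient descent. The data are synthetic, and the two predictors (standardized temperature and precipitation), the sample sizes, and the function names `fit_logistic` and `predict_suitability` are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit a logistic regression (binomial GLM) by batch gradient descent.

    X: list of environmental vectors, y: 1 for presence, 0 for absence.
    Returns the coefficient list and the intercept.
    """
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_suitability(w, b, x):
    """Predicted probability of presence at environmental conditions x."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Synthetic, hypothetical data: (standardized temperature, standardized precipitation).
rng = random.Random(42)
presences = [(rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)) for _ in range(50)]
absences = [(rng.gauss(-1.0, 0.5), rng.gauss(-1.0, 0.5)) for _ in range(50)]
X = presences + absences
y = [1] * len(presences) + [0] * len(absences)

w, b = fit_logistic(X, y)
correct = sum((predict_suitability(w, b, xi) > 0.5) == bool(yi) for xi, yi in zip(X, y))
accuracy = correct / len(X)
print("coefficients:", w, "intercept:", b, "training accuracy:", accuracy)
```

    Because the fitted coefficients describe the contrast between these particular presences and absences, the sketch also shows why such a model is tied to the area and period in which it was calibrated.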

    Figure 9. Classification and hypervolume models. (A) Classification algorithms require environmental values where the species is present (red points) and absent (black points). Presence and absence data are linked to environmental values (arrows) to quantify the probability (question mark) of identifying a locality (gray point) as suitable or unsuitable. Classification algorithms use these data to inform a series of rules (dashed lines) that vary among algorithms. Temp, temperature; Precip, precipitation. (B) Hypervolume algorithms quantify the density or clustering of presence records of the organism in environmental dimensions. Hypervolumes measure the distance (gray circles) among occurrences (red points) in an environmental space (arrows) to determine a best-fit model (red buffer).

    Occurrence data are generally robust, while absence data are largely questionable in quality and of limited availability [discussed in (14)]. To solve this problem, researchers generally “simulate” absence data so that presence-absence algorithms can be used. A common approach is to generate random points across the study area. Presence-absence models that use simulated (i.e., fake) absence data during calibration are termed presence-background models. Presence-background algorithms thus use the same regression and classification algorithms as presence-absence models; the only difference is philosophical and concerns the interpretation of absences vs. background points. Also, because the background corresponds to the study area, calibration of these algorithms is highly sensitive to variations in the extent of the study area selected.
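
    A rough sketch of background sampling, and of why the chosen extent matters, is given below. It assumes a rectangular study area and draws points uniformly in longitude and latitude (which is not equal-area sampling). The function `sample_background`, the extents, and the linear `toy_temperature` gradient are hypothetical and serve only to illustrate the idea.

```python
import random


def sample_background(extent, n_points, seed=0):
    """Draw random background points inside a rectangular study area.

    extent: (min_lon, min_lat, max_lon, max_lat) in decimal degrees.
    Sampling is uniform in lon/lat, a simplification that ignores
    the shrinking area of grid cells toward the poles.
    """
    min_lon, min_lat, max_lon, max_lat = extent
    rng = random.Random(seed)
    return [
        (rng.uniform(min_lon, max_lon), rng.uniform(min_lat, max_lat))
        for _ in range(n_points)
    ]


def toy_temperature(lat):
    """Hypothetical gradient: 30 degrees C at the equator, 0.5 C cooler per degree of latitude."""
    return 30.0 - 0.5 * abs(lat)


# Two candidate study areas around the same occurrences: narrow vs. broad.
narrow = sample_background((-10.0, 0.0, 10.0, 10.0), 1000, seed=1)
broad = sample_background((-10.0, 0.0, 10.0, 60.0), 1000, seed=1)

mean_narrow = sum(toy_temperature(lat) for _, lat in narrow) / len(narrow)
mean_broad = sum(toy_temperature(lat) for _, lat in broad) / len(broad)
print("mean background temperature, narrow extent:", mean_narrow)
print("mean background temperature, broad extent:", mean_broad)
```

    The same occurrences would be contrasted against a much cooler background under the broad extent, so the fitted model changes even though no occurrence record did.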

    Maxent is a popular ecological niche modeling algorithm based on logistic-like regressions comparing densities of occurrences (presences), densities of random points (background), and continuous environmental variables using diverse sets of parameters in the calibration process (47). Maxent protocols have been implemented in several R packages, including Wallace (76), dismo (75), ENMeval (77), and KUenm (78). Wallace is a user-friendly analytical environment to calibrate Maxent models, making it a good starting point for new users since it contains detailed instructions (76). Dismo provides fewer details regarding the different assumptions and complementary scientific literature, but it is a good starting point for new users interested in modeling in programming environments (75). ENMeval is essentially the programming environment of Wallace and allows more detailed parameterization and evaluation of models (77). KUenm allows detailed, reproducible ecological niche models using Maxent and provides detailed model calibration and selection not available in the other packages (78), overcoming some of the perils of niche model applications for infectious diseases regarding differentiation between good and bad models (46). The KUenm package would be an ideal choice for advanced users, since its parameterization and installation require advanced programming skills.

    Presence-only algorithms focus solely on the environmental values linked to each occurrence record for calibration. As a result, calibration of these models is insensitive to changes in the extent of the study area. Classic presence-only methods include environmental envelopes, which are ellipsoids, rectangles, or convex hulls that surround the occurrences in an environmental space (Figure 9), with algorithms that include Bioclim (75) and NicheA (71). Emerging presence-only methods include hypervolumes estimated using estimators of density (79) and clustering of occurrences in the environmental space (80). Protocols for hypervolume estimations have been described elsewhere (34, 74), and their use is expected to become common for NR estimations due to the automation of their workflows and computational optimization.
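
    The simplest environmental envelope can be sketched as a rectangle in environmental space. The example below is a simplified illustration in the spirit of Bioclim-type envelopes, not the implementation in dismo or NicheA: it uses the plain minimum and maximum of each variable, whereas practical implementations often work with percentiles. The occurrence values (temperature in degrees C, annual precipitation in mm) and the function names are hypothetical.

```python
def fit_envelope(occurrences):
    """Return the (min, max) range of each environmental variable
    across the occurrence records (a rectangular envelope)."""
    n_vars = len(occurrences[0])
    return [
        (min(o[j] for o in occurrences), max(o[j] for o in occurrences))
        for j in range(n_vars)
    ]


def in_envelope(envelope, site):
    """True if every environmental value of the site falls inside the envelope."""
    return all(lo <= value <= hi for (lo, hi), value in zip(envelope, site))


# Hypothetical occurrences: (mean temperature in C, annual precipitation in mm).
occurrences = [
    (18.0, 900.0),
    (22.5, 1200.0),
    (20.0, 1500.0),
    (24.0, 1100.0),
]

envelope = fit_envelope(occurrences)
print("envelope:", envelope)
print("site (21.0, 1000.0) inside:", in_envelope(envelope, (21.0, 1000.0)))
print("site (30.0, 1000.0) inside:", in_envelope(envelope, (30.0, 1000.0)))
```

    Only the occurrence records enter the calculation, which is why enlarging or shrinking the study area leaves this kind of model unchanged.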


    We are grateful to Kristina Anderson-Teixeira and Mingkai Jiang for their reviews which improved this manuscript to a great extent. We further thank Gab Abramowitz, Veronika Eyring, Michael Fienen, Andy Fox, Yuan Gao, Birgit Hassler, Xin Huang, Randall Hunt, Lifen Jiang, and Jeremy White for their helpful overview on example cyberinfrastructure tools. The PEcAn project which organized the workshop where the authors of this paper came together is supported by the NSF (ABI no. 1062547, ABI no. 1458021, ABI no. 1457897, ABI no. 1062204, DIBBS no. 1261582), NASA Terrestrial Ecosystems, the Energy Biosciences Institute, and an Amazon AWS education grant. We would also like to thank Boston University for providing the venue for the workshop that inspired this article. I.F. and T.V. acknowledge funding from the Strategic Research Council at the Academy of Finland (decision 327214), the Academy of Finland (decision 297350), and Business Finland (decision 6905/31/2018) to the Finnish Meteorological Institute. T.Q. is funded by the UK NERC National Centre for Earth Observation. J.B.F. contributed to this work from the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration. California Institute of Technology. J.B.F. was supported in part by NASA programs: CARBON and CMS. S.P.S. was partially supported by NASA CMS (grant #80NSSC17K0711), and through the DOE Reducing Uncertainties in Biogeochemical Interactions through Synthesis and Computation Science Focus Area (RUBISCO SFA), which is sponsored by the Earth & Environmental Systems Modeling (EESM) Program in the Climate and Environmental Sciences Division (CESD), and the Next-Generation Ecosystem Experiments (NGEE-Arctic and NGEE-Tropics) supported by the Office of Biological and Environmental Research in the Department of Energy, Office of Science, as well as through the United States Department of Energy contract no. DE-SC0012704 to Brookhaven National Laboratory. M.D.K. acknowledges funding from the Australian Research Council (ARC) Centre of Excellence for Climate Extremes (CE170100023), the ARC Discovery Grant (DP190101823) and support from the NSW Research Attraction and Acceleration Program. F.M.H. was partially supported by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, which is managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725. Additional support was provided by the Data Program, by the Reducing Uncertainties in Biogeochemical Interactions through Synthesis and Computation Science Focus Area (RUBISCO SFA) in the Earth & Environmental Systems Modeling (EESM) Program, and by the Next-Generation Ecosystem Experiments (NGEE-Arctic and NGEE-Tropics) Projects in the Terrestrial Ecosystem Science (TES) Program. The Data, EESM, and TES Programs are part of the Climate and Environmental Sciences Division (CESD) of the Office of Biological and Environmental Research (BER) in the U.S. Department of Energy Office of Science.