Biodiversity management and unplanned fires

Edit: Missed my seminar? You can listen to a recording with a copy of the slides:

http://unimelb.adobeconnect.com/p913xeq1mjx/

I’m doing a seminar today at Creswick on fire and biodiversity (9:30 a.m., Melbourne time). My talk will discuss unplanned fires, how their incidence can be modelled to predict impacts on biodiversity, and then how those impacts can be managed. A key point I wish to make is that unplanned fires should be anticipated and that relatively simple models can help understand their impacts.

 

I’ll draw on research from my PhD (20 years ago!), plus some more recent work such as this, this, and this.

If you’d like to listen, I believe the seminar will be streamed live via the link on this page.

Creswick is one of The University of Melbourne’s regional campuses. QAECO is here for a retreat, but it is a bit of a nostalgia trip for me – I spent two years of my undergraduate degree on this campus.

The University of Melbourne campus at Creswick is delightful, and a complete nostalgia trip for me (photo by Peter Vesk; https://twitter.com/pveski/status/540247445121167360/photo/1)

Brunswick in the Victorian 2014 election

I do like an election analysis! After having some fun predicting the result in Indi for the last federal election, I thought I’d look at my home seat of Brunswick in the 2014 Victorian state election.

Jane Garrett, the sitting member for Labor, won Brunswick in 2010 with a 2-candidate preferred (2CP) vote of 19545 votes against Cyndi Dawes, the Greens candidate, on 16974; Garrett won with a margin of 3.5%.

In 2014, the Greens candidate is Tim Read. With 54.08% of the vote counted, the 2CP count gives Garrett 53.9% of the 2CP vote and a lead of 1897 – on the face of it a slight swing to her. Garrett might be re-elected quite safely. However, I’m not so sure it is clear cut just yet.

Jane Garrett and Tim Read are contesting the election for the seat of Brunswick in the 2014 Victorian election (photos from their Twitter accounts).

In 2010, Garrett won 56.3% of the ordinary votes from the 16 booths in Brunswick plus the postal votes. If these are the votes counted to date in the 2CP (it seems they are), then Garrett has actually suffered a swing of 2.4% against her.

In 2010, Garrett won 45.6% of early, provisional and absent votes; she didn’t get a majority of any of those vote types on a 2CP basis, and these are the votes yet to be counted. The question is whether Read, the Greens candidate in this election, can claw back enough votes to overcome the current deficit of almost 2000 votes. It might seem unlikely, but let’s look at whether it is possible.

Of the 9407 2CP early, provisional and absent votes in 2010, the Greens won by 827 votes. If we add the -2.4% swing seen in the 16 booths and postal votes to 45.6%, we might predict Garrett would get 43.2% of these remaining votes. If there were 10,000 remaining votes, Garrett might get 4320 votes, and Read would then get 5680, a net gain of 1360 – not enough to overcome Read’s current deficit of 1897.

However, there were many more postal and early votes in this election. Antony Green reports that almost 30% of Brunswick’s electors cast postal or early votes in 2014. With an electoral roll of 46954, that is almost 14000 votes. Take out the currently listed postals that have already been counted, and we’d expect 12000 early votes, almost 10000 more than in 2010.

In 2010, these early votes favoured the Greens, but it is very difficult to predict where they might fall in 2014 – many more people have taken advantage of the early voting in 2014, so it is hard to guess their preferences.

But add on almost 7000 absent and provisional votes (there were 6773 formal absent votes in 2010; only 400 provisionals), and we might expect 19000 further votes in the election for the seat of Brunswick. If Read wins 55% of these, he will claw back 1900 votes from Garrett – enough for him to win. That seems within the realm of possibilities if we see a swing of 2.4% against Garrett in these votes, noting that the Greens won 52% of the early votes and 55% of the absent votes in 2010.
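The arithmetic behind these two scenarios can be checked with a short sketch. The function names are only illustrative, and the vote totals and shares are the rough guesses from the text, not official figures.

```python
def net_gain(remaining_votes, challenger_share):
    """Net votes the challenger gains on the leader from the remaining votes."""
    challenger = remaining_votes * challenger_share
    leader = remaining_votes - challenger
    return challenger - leader


def share_needed(remaining_votes, lead):
    """Share of the remaining 2CP votes the challenger needs to erase the lead."""
    return 0.5 + lead / (2 * remaining_votes)


LEAD = 1897  # Garrett's 2CP lead on the votes counted so far

# Scenario 1: 10,000 votes to come, Read on 56.8% (Garrett on 43.2%)
print(round(net_gain(10_000, 0.568)), "votes gained against a lead of", LEAD)

# Scenario 2: 19,000 votes to come, Read on 55%
print(round(net_gain(19_000, 0.55)), "votes gained against a lead of", LEAD)

# Break-even share for Read under each guess at the number of remaining votes
print(round(share_needed(10_000, LEAD), 3), round(share_needed(19_000, LEAD), 3))
```

The more votes that remain, the smaller the share Read needs: with 19,000 votes to come, the break-even share is just under 55%.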

The election in Brunswick is not over yet. It comes down to where the early and absent votes fall. If they move away from Garrett, Read might spring a surprise.

Edit (3:00 p.m., 30 November): Kevin Bonham doesn’t expect the early votes to be quite as different from the ordinary votes this year as compared to 2010. I’d tend to agree with his assessment. It is hard to see such a large swag of early votes being radically different from those cast on the day. Still, it is worth keeping an eye on this one.

Edits: I corrected some typos in the original version where I typed the wrong figures for the percentage of early and absent votes.

Posted in Communication | 6 Comments

Effects of stand age on fire severity

Our new paper shows that the probability of crown fire in mountain forests under extreme weather conditions is greatest when trees are about 15 years old. This has implications for debates about how timber harvesting influences the risk of fire. Please email me if you would like a copy.

Probability of canopy consumption versus stand age during extreme weather on Black Saturday. The different zones represent data from sites in different areas and different time zones relative to the wind change. The points are clustered into age classes, showing that the results are relatively consistent across zones. The solid line is the mean fitted relationship and the dashed lines are 95% credible intervals (reproduced from Taylor et al. in press).

Crown fire is a major driver of the dynamics of mountain ash forest, so changes in the chance of crown fire with stand age are important. Crown fire also has a major impact on risk to humans. Crown fire typically has much greater intensity than fires that remain on the forest floor, partly because more fuel is being consumed in the crown. And the probability of houses being burnt and people dying increases with fire intensity.

The relationship between stand age and the incidence of crown fire is also somewhat controversial. Some studies suggest that the probability of crown fire in mountain ash forests decreases with stand age, while others suggest it increases. These previous studies have tended to only look for monotonic relationships, and have sometimes not controlled for weather conditions at the time of fire.

Our recent paper shows that, under the most extreme fire weather conditions, the probability of crown fire is very low in the youngest forests (<5 years), but then increases rapidly until around 15 years of age. After that point, the probability of crown fire decreases substantially with age.

This pattern occurs in response to changes in the forest structure and fuel availability. In mountain ash forests, fuel loads approach their maximum levels at around 15-20 years of age. Beyond that age, fuel loads remain high, but various factors will reduce the risk of crown fire. The most obvious effect is that the crowns are further from the ground, so the fires are less easily able to climb into the crown. Secondly, moister forest elements (e.g., rainforest plant species) can become more prevalent over time, and these can reduce fire intensity.

One of the other interesting aspects of this paper is our method of analysis. We used a spatially-correlated probit regression model to model the probability of fire. This is a form of the multi-variate probit model used in our recent joint species distribution model. The difference is that in the fire paper we modelled the correlation as a simple (negative exponential) function of distance. That means that fire was perfectly correlated in its incidence for points immediately adjacent to each other (i.e., at zero distance apart, pairs of points either both burnt or both remained unburnt). As distance between points increased, the correlation decayed toward zero (i.e., at large distances between points, the incidence of fire was independent).
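A negative exponential correlation function of this kind can be sketched as follows. The parameter `scale` (the distance over which correlation decays) and its value are assumptions made for illustration; they are not estimates from the paper.

```python
import math


def spatial_correlation(distance, scale):
    """Correlation in fire incidence between two points a given distance apart."""
    return math.exp(-distance / scale)


def correlation_matrix(points, scale):
    """Pairwise correlations for a list of (x, y) coordinates."""
    return [
        [spatial_correlation(math.dist(a, b), scale) for b in points]
        for a in points
    ]


# Three hypothetical sites (coordinates in metres) and a hypothetical 1 km scale
sites = [(0.0, 0.0), (500.0, 0.0), (5000.0, 0.0)]
for row in correlation_matrix(sites, scale=1000.0):
    print([round(r, 3) for r in row])
```

The correlation is 1 at zero distance and falls toward zero as the distance grows, matching the behaviour described above.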

The controversy has hit the press, largely focusing on the consequence for fire risk. Essentially, if an area was logged or burnt a decade or three previously, then the risk of crown fire is substantially increased. In areas where the Black Saturday fires burnt, many of these younger areas were places where 1939 regrowth had been harvested. Therefore, we see headlines such as “Study finds logging increased intensity of Black Saturday fires”.

If you’d like to know more, please read the paper – email me if you don’t have access.

Taylor, C., McCarthy, M.A., and Lindenmayer, D.B. (in press). Non-linear effects of stand age on fire severity. Conservation Letters. [Online]

Posted in CEED, Communication, New research, Probability and Bayesian analysis | 1 Comment

My talk at #ISEC2014

I’m speaking tomorrow on the last afternoon of the International Statistical Ecology Conference in Montpellier. I’ll be arguing that usual metrics (e.g., AIC) to measure the performance of species distribution models (SDMs) might not actually be relevant for selecting models that are to be used for management. There are two reasons for this.

The first reason is that the metrics are usually based on the ability to fit the calibration data, not the quality of their predictions where they will be used.

Secondly, most metrics of SDMs consider statistical fit (e.g., AIC can be thought of as a bias-corrected estimate of deviance), which also might not be relevant to management. Managers will want to select the model that will provide the best management outcome. The link between the usual metrics and management outcomes is indirect at best.

Instead, I develop a metric that is relevant when using SDMs to optimize eradication or searches of species. This metric aims to reflect the number of occurrences of a species that will remain undetected when using a species distribution model for spatial allocation of search effort.
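To give a feel for what such a metric might look like, here is a minimal sketch. It is not the metric from the talk: it assumes (hypothetically) that search effort is allocated in proportion to the model's predicted probability of occurrence, and that the chance of missing an occupied site declines exponentially with effort at a rate `detect_rate`.

```python
import math


def expected_undetected(presence, predicted, total_effort, detect_rate=1.0):
    """Expected number of occupied sites missed when effort follows the model.

    presence: observed 0/1 occupancy at each evaluation site
    predicted: model-predicted probability of occurrence at each site
    """
    total_pred = sum(predicted)
    missed = 0.0
    for y, p in zip(presence, predicted):
        effort = total_effort * p / total_pred  # effort proportional to prediction
        missed += y * math.exp(-detect_rate * effort)  # chance an occupied site is missed
    return missed


# Hypothetical evaluation data: sites 1 and 3 are occupied
presence = [1, 0, 1, 0]
good_model = [0.9, 0.1, 0.8, 0.2]  # ranks the occupied sites highly
poor_model = [0.1, 0.9, 0.2, 0.8]  # ranks them poorly

print(expected_undetected(presence, good_model, total_effort=4.0))
print(expected_undetected(presence, poor_model, total_effort=4.0))
```

A model that directs effort to the occupied sites leaves fewer occurrences undetected, which is the management outcome of interest here, regardless of how the two models compare on AIC.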

The paper by Lawson et al. (2014) suggests that metrics need to reflect the use, but I think falls short of stating that the metrics that are used are actually not as relevant as they could be.

Slides from my talk are available here.

Relevant references that I build on are:

  • Guillera-Arroita, G., Hauser, C. E. and McCarthy, M. A. (2014) Optimal surveillance strategy for invasive species management when surveys stop after detection. Ecology and Evolution, 4: 1751–1760. doi: 10.1002/ece3.1056
  • Hauser, C. E. and McCarthy, M. A. (2009) Streamlining ‘search and destroy’: cost-effective surveillance for invasive species management. Ecology Letters, 12: 683–692. doi: 10.1111/j.1461-0248.2009.01323.x
  • Lawson, C. R., Hodgson, J. A., Wilson, R. J., Richards, S. A. (2014) Prevalence, thresholds and the performance of presence–absence models. Methods in Ecology and Evolution, 5: 54–64. doi: 10.1111/2041-210X.12123
  • McCarthy, M. A., Thompson, C. J., Hauser, C., Burgman, M. A., Possingham, H. P., Moir, M. L., Tiensin, T. and Gilbert, M. (2010) Resource allocation for efficient environmental management. Ecology Letters, 13: 1280–1289. doi: 10.1111/j.1461-0248.2010.01522.x
Posted in CEED, Communication, Detectability, Ecological indices, Ecological models, New research, Probability and Bayesian analysis | 1 Comment

Triage does not mean abandoning the most threatened species

Species triage has been in the news lately. The reports might create the impression that we should give up trying to save the most threatened species. Let me be clear:

The science underpinning species triage does not say saving the most threatened species is pointless.

So, what does the science of species triage say? Well, you can read about some of it here and here and here. These links provide copies of the papers, but the one-sentence summary is:

Species triage says we should spend resources in a way that saves the most species.

The slightly longer overview of triage is:

  1. With a limited budget, spend resources on species where the benefits are largest;
  2. Determine what is likely to be lost with the current budget even if it were spent as efficiently as possible; and
  3. Determine how many more species could be saved with increases in the budget.

Species triage recognises that conservation requires choices. If we have some money available that could be spent on, for example, two species, triage answers the question of how best to spend that money on each.

Triage also helps answer questions regarding other choices. For example, should we aim to prevent the extinction of all Australia’s bird species at the cost of approximately $10 million per year, or use that money for another four hours of defence spending? Which of those options would more efficiently protect Australia’s heritage?

$10 million per year is predicted to save all of Australia’s bird species from extinction over 80 years. $10 million is equivalent to 4 hours of defence spending in Australia, or approximately the cost of maintaining one fighter bomber for half a year. Species triage is similar, but requires us to consider trade-offs in allocation of effort among species rather than between species and aircraft.

The scientific literature tells us that the following attributes influence the optimal allocation of resources:

  1. The efficiency of management for each species. For example, with a given effort, what is the expected reduction in extinction risk for each species, taking into account the chance of success?
  2. The specific objective. For example, are we trying to minimize the number of extinct species or the number of threatened species?
  3. The relative value we place on species. Would we mourn the loss of some species more than others, or do we consider all species equal?
  4. The size of the budget; the optimal allocation of resources to species depends on what we have to spend.

I’ll illustrate the above four points with our analysis of Australia’s birds (McCarthy et al. 2008).

1. Using data compiled by Stephen Garnett on money spent on species and their changes in conservation status, we estimated how investment in each species changed their probabilities of decline and improvement. The findings were revealing.

First, the probability of extinction could be reduced with relatively modest investment in each species (e.g., the dashed line in the figure below). So it is not all doom and gloom; some investment in conservation in Australia has been remarkably successful. While we could surely do better, there are at least some clear benefits of management.

Example of probabilities of transition among conservation classes for a species versus total investment over an 8 year period. This shows how investment changes the probability that an endangered species becomes vulnerable (increases in abundance), becomes critically endangered (declines) or goes extinct. Approximately $50,000 per year ($400,000 over 8 years) reduces extinction risk of endangered species to near zero. Redrawn from McCarthy et al. (2008).

Second, and less happily, the data indicated that the probability of improving the conservation status of a species increased very slowly (if at all) with increased investment – the thick line in the graph above. So environmental managers in Australia seem quite good at reducing extinction of birds, but we need to improve our recovery efforts or make greater efforts.

2. The analysis also showed that the choice of objective influenced where best to focus effort. If the objective is to minimize the number of extinct bird species, it is actually optimal to invest in the most endangered species. However, we should invest in species with lower extinction risks for other objectives, such as when minimizing the number of threatened species.

3. Not all species are equal. Some species are valued more by society. It seems unlikely that Australians would spend as much money preventing the extinction of a plant louse as is spent on Tasmanian devils, even if I think the plant louse is the most important species in the world. Different people might value species depending on their taxonomic distinctiveness (i.e., how much evolutionary history they represent), their role in ecosystems, their charisma, or their potential to generate tourism revenue. These are all reasons why we might mourn the loss of one species more than another. And these relative values that are placed on species will influence how effort is allocated.

4. The size of the budget influences how money should be spent. Our analysis shows that even the relative amount of money spent on each species will change with the budget. For example, when aiming to minimize the number of threatened species (weighted by the level of threat), we should invest the most money in either endangered species or critically endangered species, depending on the size of the budget.

With Australia spending almost $30 million on bird conservation over an 8 year period, our analysis suggests that investing most heavily in the endangered and critically endangered species can be rational. However, changing the objective changes the optimal allocation among species.
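The way the budget changes the split among species can be illustrated with a toy example. This is not the analysis in McCarthy et al. (2008): it assumes, purely for illustration, that each species' extinction risk declines exponentially with spending, and it uses a simple greedy rule that gives each increment of money to the species with the largest resulting drop in risk. The species names and parameter values are made up.

```python
import math


def allocate(species, budget, step=1.0):
    """Greedily allocate a budget to minimise the expected number of extinctions.

    species maps a name to (baseline_risk, efficiency); the assumed risk after
    spending x on a species is baseline_risk * exp(-efficiency * x).
    """
    spend = {name: 0.0 for name in species}

    def risk(name, x):
        baseline, efficiency = species[name]
        return baseline * math.exp(-efficiency * x)

    remaining = budget
    while remaining >= step:
        # Give the next increment to the species with the largest drop in risk
        best = max(species, key=lambda n: risk(n, spend[n]) - risk(n, spend[n] + step))
        spend[best] += step
        remaining -= step
    return spend


# Hypothetical species: A is at high risk but costly to help; B is cheap to help
toy = {"A": (0.5, 0.1), "B": (0.2, 0.5)}
for budget in (1.0, 5.0, 20.0):
    print(budget, allocate(toy, budget))
```

With a very small budget all the money goes to the species that is cheapest to help, whereas larger budgets shift most of the spending to the species at higher risk, so the relative allocation depends on the budget.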

Optimal investment in critically endangered (CR), endangered (EN) and vulnerable (VU) bird species when aiming to minimize the number of threatened species (weighted by extinction risk) in 8 years (solid lines) or 40 years (dashed lines) time. Results are shown versus the total budget available in each 8 year period. From McCarthy et al. (2008).

So, what does this all mean? Species triage is not straightforward. The optimal way to spend resources depends on numerous factors. One critical factor is the efficiency of management.

A second factor is what level of extinction we are willing to accept. As Stephen Garnett says, “the extent to which we conserve species says a great deal about who we are as a civilised society”. Our grandchildren would be justifiably appalled if a wealthy country permitted the preventable extinction of iconic species.

See also David Lindenmayer’s response to triage.

But species triage is confronting. It typically highlights that with current efforts, extinctions are inevitable*. That might seem like applying cold hard economics to decide which species we aim to conserve, and those we let go. However, in fact it is warm cuddly economics, because it allows us to maximize the number of species saved with the resources available. Indeed, doing anything else but applying triage is cold and hard, because it would allow more extinctions.

Triage compels us to determine what level of investment would be required to achieve acceptable levels of extinction. That allows us to make informed choices about trade-offs between conservation of species and other areas of national spending. In short, species triage is simply rational decision making (Bottrill et al. 2008).

 

* Edit: If we don’t like that answer, we need to increase the effort, not spend resources less efficiently!

References

Bottrill M. C., Joseph L. N., Carwardine J., Bode M., Cook C., Game E. T., Grantham H., Kark S., Linke S., McDonald-Madden E., Pressey R. L., Walker S., Wilson K. A. and Possingham H. P. (2008) Is conservation triage just smart decision making? Trends in Ecology & Evolution, 23: 649-654. doi: 10.1016/j.tree.2008.07.007 [PDF here]

Joseph L. N., Maloney R. F. and Possingham H. P. (2009) Optimal allocation of resources among threatened species: a project prioritization protocol. Conservation Biology, 23: 328–338. doi: 10.1111/j.1523-1739.2008.01124.x [PDF here]

McCarthy M. A., Thompson C. J. and Garnett S. T. (2008) Optimal investment in conservation of species. Journal of Applied Ecology, 45: 1428–1435. doi: 10.1111/j.1365-2664.2008.01521.x [PDF here]

Posted in CEED, Communication, Ecological models | 19 Comments

Joint Species Distribution Models

Update: The paper is now available (free) from Methods in Ecology and Evolution.

Update 2: In the tutorial on how to use this method, we refer to the file “fit.JSDM.r”. This is the R script on the journal website named, somewhat opaquely, “mee312180-sup-0004-Rcode.R”. So save that R script, and re-name it before running the tutorial.

Species might tend to occur together, or they might tend to occur apart. Factors driving these patterns can include environmental variables or species interactions. Species distribution models can predict the probability of occurrence of species, but they rarely account for the joint occurrence of multiple species.

I had been working on this idea of modelling co-occurrence with Kirsten Parris using her frog data, and also with Laura Pollock and Peter Vesk using Laura’s eucalypt occurrence data. We were aiming to model co-occurrence within species distribution models. Well, it has taken a few years to figure out (to be precise, it took a few years to find out that someone else had figured it out), but we now have species distribution models that account for the joint occurrence of species.

To help understand what we are trying to do, imagine a simple case with two species each having a 50% chance of being at a site. If the species occur independently of each other, then the probability they are both present would be 25% (50% × 50%), and the probability that both are absent would also be 25%. The remaining 50% is made up of one species being present and the other absent or vice versa.

However, if the species only occurred together, then the probability they are both present would be 50%, as would the probability they are both absent. One species would never occur without the other.

In the other extreme (of “exclusion”), the probability the first species is present and the other absent is 50%, while the probability the first is absent and the other present is also 50%; the species would never occur together and never both be absent from a site.
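These three cases can be reproduced with a few lines of code. The sketch below uses the standard relationship between the joint probability of two binary events and the correlation `r` between their 0/1 indicators; note that this is the correlation of the binary outcomes themselves, and the function name is only illustrative.

```python
import math


def joint_table(p1, p2, r):
    """Joint presence/absence probabilities for two species.

    p1, p2: probabilities of occurrence; r: correlation between the 0/1 indicators.
    """
    both = p1 * p2 + r * math.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    only_1 = p1 - both
    only_2 = p2 - both
    neither = 1 - both - only_1 - only_2
    return {"both": both, "only_1": only_1, "only_2": only_2, "neither": neither}


for label, r in [("independent", 0.0), ("always together", 1.0), ("exclusion", -1.0)]:
    print(label, joint_table(0.5, 0.5, r))
```

With both probabilities at 50%, a correlation of 0 gives 25% in each cell, a correlation of 1 gives 50% both present and 50% both absent, and a correlation of -1 puts all of the probability on exactly one species being present.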

Two frog species, each occurring in two of four sites. In the left-hand figure, they occur independently of each other. In the middle, they only occur together, and in the right-hand figure they occur exclusively of each other. These different patterns of co-occurrence can be described by different residual correlations in the binary occurrence of the species.

Intermediate levels of correlation lie between these extremes of association. Some degree of association can be explained by relationships with shared explanatory variables (e.g., two species of tree might tend to be found on rocky hilltops), but residual correlation might exist. Species might exclude each other competitively, or by other processes. Alternatively, other factors that have not been considered might make the species more positively associated with each other than predicted by their response to measured variables. If factors that influence co-occurrence are not included in the model, then residual correlation in the occurrence of species will arise.

On top of these residual correlations, we also need to model spatial variation in the occurrence of species. We don’t want to assume the probability of occurrence of each species is 50% everywhere! It seems to be getting complicated, but a solution exists.

A few years ago, I worked on simulating correlated events – in that case it was a model of spatial correlation in the occurrence of fire (McCarthy and Lindenmayer 1998, 2000). The approach worked by simulating correlated normal variates, and then converting these to Bernoulli variates (zeroes and ones). This was achieved by setting the Bernoulli variate to one (fire occurred at the site) if the normal variate exceeded zero and setting the Bernoulli variate to zero otherwise (there was no fire at the site). The means of the normal distribution reflected the required probability of the event occurring.
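The thresholding idea can be sketched for a pair of sites as follows. This is not the original simulation code: the probabilities, the correlation `rho` of the underlying normal variates, and the function name are assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist


def simulate_pair(p1, p2, rho, rng):
    """Draw two correlated 0/1 events by thresholding correlated normals at zero."""
    # Means chosen so that P(normal variate > 0) equals the required probability
    mu1 = NormalDist().inv_cdf(p1)
    mu2 = NormalDist().inv_cdf(p2)
    # Standard normal deviations with correlation rho
    e1 = rng.gauss(0.0, 1.0)
    e2 = rho * e1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
    # The event occurs (1) when the normal variate exceeds zero
    return int(mu1 + e1 > 0), int(mu2 + e2 > 0)


rng = random.Random(42)
draws = [simulate_pair(0.3, 0.3, 0.9, rng) for _ in range(20_000)]
freq_site_1 = sum(a for a, _ in draws) / len(draws)
freq_both = sum(a * b for a, b in draws) / len(draws)
print(freq_site_1, freq_both)
```

Each site still burns with its required probability (about 0.3 here), but the two sites burn together much more often than the 0.09 expected if they were independent.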

I realised that we could use this same framework to model the occurrence of species. The only difference was that we wanted to estimate the correlation structure, while previously I had assumed a particular correlation structure and then simulated it.

Here’s how the method works. But before considering multiple species, it is first necessary to understand how the occurrence of species can be modelled using a latent normal random variate. Rather than modelling the probability of occurrence of a species, and then determining the presence or absence as a draw from a Bernoulli distribution (1 for presence, 0 for absence), we can model the species as being present at a site when a draw from a normal distribution is greater than zero.

Assuming the normal distribution has a standard deviation of 1, the mean of the normal distribution solely controls the probability of occurrence. A large negative mean translates to a low probability of occurrence, a mean of zero translates to a probability of 0.5, and a large positive mean translates to a high probability of occurrence. This is simply a representation of probit regression using a latent variable (probit regression is very similar to logistic regression but with a different link function in the generalized linear model).

The probability of occurrence of two frog species (j = 1, the tree frog, or j = 2, the toad) at a particular site i depicted using probability density functions of the latent normal variate Zij. The species would occur at the site when the latent random variable, which has a standard deviation of 1, is greater than 0. Thus, the means of the latent variables determine the probability of occurrence of each species, which equal the shaded areas under the density functions greater than zero (0.69 and 0.16). These representations of individual species ignore patterns of co-occurrence.
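The link between the latent mean and the probability of occurrence is easy to check numerically. In the sketch below, the means of 0.5 and -1 are assumed values; they are not stated in the post, but they reproduce (to two decimal places) the probabilities of 0.69 and 0.16 given in the caption.

```python
from statistics import NormalDist


def prob_presence(mean, sd=1.0):
    """Probability that a latent normal variate with this mean and sd exceeds zero."""
    return 1.0 - NormalDist(mean, sd).cdf(0.0)


print(round(prob_presence(0.5), 2))   # tree frog, assumed latent mean of 0.5
print(round(prob_presence(-1.0), 2))  # toad, assumed latent mean of -1
print(prob_presence(0.0))             # a mean of zero gives a probability of 0.5
```

With a standard deviation of 1, this is the probit link: the probability of presence is the standard normal cumulative distribution function evaluated at the mean.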

Considering co-occurrence of n species requires an n-dimensional normal distribution. I’ll illustrate the approach with the simplest case of two species, which then requires a bivariate normal distribution. The probability of occurrence of each species is again controlled by the mean of its underlying normal distribution. If this latent variate is greater than zero, then the species is present, and it is absent otherwise. Residual correlation in the occurrence of the two species is controlled by the correlation coefficient of the bivariate normal.

Thus, the location (mean) of the bivariate normal distribution controls the mean occurrence of each species, while the correlation in the distribution controls the patterns of residual co-occurrence. Regression equations with estimated coefficients are used to model the means for each species (hence influence the probability of occurrence), and the residual correlation is controlled by the correlation matrix, which is also estimated.
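To see how the correlation changes the joint probabilities without changing the marginal ones, the probability that both latent variates exceed zero can be computed by simple numerical integration. This is a sketch only: the latent means of 0.5 and -1 are assumed values (chosen to give marginal probabilities of about 0.69 and 0.16), and the midpoint rule used here is a convenience, not the method used in the paper.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def prob_both_present(mu1, mu2, rho, steps=4000):
    """P(Z1 > 0 and Z2 > 0) for a bivariate normal with unit variances, |rho| < 1."""
    lower = -mu1                 # Z1 > 0 when its deviation from the mean exceeds -mu1
    upper = abs(mu1) + 8.0       # far enough into the tail to ignore the remainder
    width = (upper - lower) / steps
    s = math.sqrt(1.0 - rho * rho)
    total = 0.0
    for k in range(steps):
        e1 = lower + (k + 0.5) * width
        # Density of the first deviation times P(Z2 > 0 | first deviation)
        total += STD.pdf(e1) * STD.cdf((mu2 + rho * e1) / s) * width
    return total


def quadrants(mu1, mu2, rho):
    """Joint probabilities for the four presence/absence combinations."""
    p1, p2 = STD.cdf(mu1), STD.cdf(mu2)
    both = prob_both_present(mu1, mu2, rho)
    return {"both": both, "only_1": p1 - both, "only_2": p2 - both,
            "neither": 1.0 - p1 - p2 + both}


for rho in (0.0, 0.75, -0.75):
    print(rho, {k: round(v, 3) for k, v in quadrants(0.5, -1.0, rho).items()})
```

In each case the probabilities for species 1 sum to about 0.69 and those for species 2 to about 0.16; only the way they are shared among the four quadrants changes with the correlation.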

Co-occurrence patterns of two species, modelled using a bivariate normal distribution represented as contour plots of probability density, with correlation 0.0 (left), 0.75 (middle) and –0.75 (right). The numbers on the contours (the concentric ellipses) are the probability densities that encompass 0.1, 0.3, 0.5, 0.7 and 0.9 of the volume under the bivariate normal distribution. Each species occurs at the site when the corresponding random variate is greater than 0. Thus, species 1 (the tree frog) occurs when Zi1 is greater than zero (the right-hand quadrants), and species 2 (the toad) occurs when Zi2 is greater than zero (the upper quadrants). The joint probabilities of occurrence are indicated by the values in the corners. In all cases shown, the probability of occurrence of species 1 is 0.69 (the sum of the probabilities in the right-hand quadrants) because the mean of Zi1 remains unchanged. Similarly, the probability of occurrence of species 2 remains 0.16 because the mean of Zi2 remains unchanged. The correlation changes the probabilities of co-occurrence, but not the unconditional probabilities of occurrence for each species.

However, the difficulty in using this approach lay in estimating the correlation structure. Enter Bayesian methods. MCMC methods allow estimation of multivariate normal distributions. So, that was a logical way of approaching this particular problem. Multivariate normal distributions are defined by a set of means (one mean for each variate), and a variance-covariance matrix.

However, the sticking point was that when converting the normal distribution to a Bernoulli, I needed to use a multivariate distribution with standard deviations equal to one (the variance-covariance matrix needed to be a correlation matrix) – or at least so I thought. I couldn’t figure out how to constrain the matrix to be a correlation matrix and still estimate the parameters for a useful size of problem in a reasonable amount of time.

I muddled away on this problem for a few years, without much success. Eventually, however, I googled “multi-variate probit regression”, after realising that would be an appropriate name for my model. Lo and behold, there it was – Chib and Greenberg (1998) simply re-scaled the regression coefficients rather than re-scaling the variance-covariance matrix. The problem was solved; essentially we needed to re-scale the linear predictor of the probit regression (the mean of the normal distribution) rather than re-scaling the covariances. The approach even has a Wikipedia page (although that needs some work, including a reference to Chib and Greenberg 1998!).

Compared to the previous figure, the mean and standard deviation of the latent variable for the tree frog have changed by the same proportion (both multiplied by two). Therefore, the probability that the latent variable exceeds zero (the probability of presence) is unchanged.

You can see why this rescaling works by comparing the figure above to the previous one. In the figure above, the standard deviation and the mean of the latent variable for the tree frog have both been doubled. The probability that the latent variable exceeds zero depends only on the ratio of the mean to the standard deviation, so it is the same in both cases.
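
To make the rescaling concrete, here is a small Python sketch of the idea. It is an illustration under my own assumptions rather than the code from the paper: the function names `prob_presence` and `rescale`, and the numerical values, are hypothetical. It divides each latent mean by its standard deviation and converts the variance-covariance matrix to the matching correlation matrix.

```python
from math import sqrt
from statistics import NormalDist


def prob_presence(mean, sd):
    """P(Z > 0) for a latent variable Z ~ Normal(mean, sd)."""
    return 1.0 - NormalDist(mean, sd).cdf(0.0)


def rescale(means, cov):
    """Rescale latent means and a covariance matrix to unit variances.

    Each mean is divided by its standard deviation, and the covariance
    matrix becomes the corresponding correlation matrix.
    """
    n = len(means)
    sds = [sqrt(cov[i][i]) for i in range(n)]
    scaled_means = [m / s for m, s in zip(means, sds)]
    corr = [[cov[i][j] / (sds[i] * sds[j]) for j in range(n)]
            for i in range(n)]
    return scaled_means, corr


# Illustrative values only: species 1 has sd 2, species 2 has sd 1.
means = [1.0, -1.0]
cov = [[4.0, 1.5],
       [1.5, 1.0]]
scaled_means, corr = rescale(means, cov)

# The probability of presence is the same before and after rescaling.
print(prob_presence(means[0], sqrt(cov[0][0])),
      prob_presence(scaled_means[0], 1.0))
print(scaled_means, corr)
```

Because only the ratio of mean to standard deviation matters for each species, the rescaled model implies the same probabilities of presence as the original, and the off-diagonal entries can be read directly as correlations.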

Having found a solution, a group of us from QAECO banded together to write a paper. However, we first needed to see if anyone else in ecology had discovered this idea. Well, Bob O’Hara had, referencing Chib and Greenberg. So, I contacted Bob, and it turned out that Nick Golding had also been working on the same model, and had developed an R package. So we all combined forces, driven by Laura Pollock and Reid Tingley with Will Morris and his R programming grunt, to write a paper.

Our paper aims to make the approach of Chib and Greenberg (1998) accessible to ecologists, providing R and BUGS code to run the analyses. And I’m pleased to say that it has just been accepted in Methods in Ecology and Evolution (Pollock et al. in press).

In the meantime, Clark et al. (in press) published a paper in Ecological Applications that used the method, and we had already found a variant based on logistic regression by Otso Ovaskainen and colleagues (Ovaskainen et al. 2010). Plus, Dave Harris is working on the same topic using a different approach. It seems everyone is converging on the same idea. We hope you like it; you can read the paper here (no subscription required).

References

Chib, S. & Greenberg, E. (1998) Analysis of multivariate probit models. Biometrika, 85, 347-361.

Clark, J.S., Gelfand, A.E., Woodall, C.W. & Zhu, K. (in press) More than the sum of the parts: Forest climate response from Joint Species Distribution Models. Ecological Applications, http://dx.doi.org/10.1890/13-1015.1.

McCarthy, M.A. & Lindenmayer, D.B. (1998) Multi-aged mountain ash forest, wildlife conservation and timber harvesting. Forest Ecology and Management, 104, 43-56.

McCarthy, M.A. & Lindenmayer, D.B. (2000) Spatially-correlated extinction in a metapopulation of Leadbeater’s possum. Biodiversity and Conservation, 9, 47-63.

Ovaskainen, O., Hottola, J. & Siitonen, J. (2010) Modeling species co-occurrence by multivariate logistic regression generates new hypotheses on fungal interactions. Ecology, 91, 2514-2521.

Pollock, L.J., Tingley, R., Morris, W.K., Golding, N., O’Hara, R.B., Parris, K.M., Vesk, P.A. & McCarthy, M.A. (in press) Understanding co-occurrence by modelling species simultaneously with a Joint Species Distribution Model (JSDM). Methods in Ecology and Evolution, http://dx.doi.org/10.1111/2041-210X.12180.

Posted in CEED, Ecological models, New research, Probability and Bayesian analysis | 5 Comments

The yellow-bellied sapsucker & I

I work in The University of Melbourne’s School of Botany, yet I study a range of organisms, not just plants. I’m particularly pleased when I get to work on invertebrates, because they are critical to ecosystem function on Earth, yet they are sorely under-appreciated. However, my work on invertebrates needs to be in collaboration with experts, because I have no special expertise there. One invertebrate expert I work with is the awesome Melinda Moir.

Mel publishes prolifically. On top of her work on co-extinction and the links between threatened plants and the threatened invertebrates that live on them, Mel also makes major contributions to taxonomy.

Images of Acizzia mccarthyi from the paper by Gary Taylor and Melinda Moir: http://booksandjournals.brillonline.com/content/journals/10.1163/1876312x-00002107

In a world of research that is overly infatuated with impact factors of journals, taxonomy has it rough. Naming and classifying species provides the critical foundation of ecology and many other areas of biology, and it is particularly important for the under-appreciated invertebrates, yet the impact factors of taxonomy journals do not reflect this importance.

Well, I’m going to do my bit to help that. At the slightest opportunity I’ll do my best to cite one of her most recent papers in which, with Gary Taylor, Mel has described three new species of plant lice.

I’ll admit to a major conflict of interest here. In addition to naming species in honour of Sarah Barrett (who works on threatened flora for the Western Australian government) and Lesley Hughes, they named a third species Acizzia mccarthyi.

Now, it might have been named in recognition of a prominent proboscis and loud shirts, or there might have been references to yellow-bellied sapsuckers. However, the dedication notes that the honour is for my research, and for supporting “the investigation into the coextinction of Australian insects that resulted in the discovery of this new species”. I also like the fact that Acizzia mccarthyi is described as being “atypical”, and hangs out on Acacia veronica with Acizzia veski.

Mel, thanks. It is an absolute pleasure.

Now, everyone, get out there and cite some taxonomy papers!

Posted in New research | 4 Comments