Simian AIDS was discovered after human AIDS. How do they know the jump didn't occur in reverse?
Here is the first report of simian AIDS:
"Examination of the species-specific annual mortality rates of macaques at the center during the previous 4 yr showed a significant increase in deaths in 1980 and 1981"
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC393899/
Also, it is convenient that they do not share the MCMC traces. Based on these settings, I bet they were ugly:
>"For each data set, at least 3 MCMC chains of 250 million steps were computed. Parameters and trees were sampled every 50,000th step. Samples were combined with LogCombiner (77) and between 10 to 30% of each MCMC chain was discarded as burn-in. MCMC mixing was diagnosed using visual trace inspection and calculation of effective sample sizes in Tracer (77). We report the posterior mean and 95% Bayesian credible intervals for evolutionary parameters."
"Our estimated location of pandemic origin explains the observation that Kinshasa exhibits more contemporary HIV-1 genetic diversity than anywhere else"
This is totally circular. As shown in their figure S2, they used genetic diversity in a region as an indication of earlier presence.
My limited understanding is that, once SIV was discovered, scientists found families of simian viruses whose evolution and divergence cannot be explained by transmission in the other direction, or within the timeframe of human HIV.
Thanks. I wonder if this analysis will be reproducible though (they don't mention any parameters...):
>"Phylogenetic analysis
Phylogenetic relationships were estimated from comparisons of predicted protein sequences. Sequences were aligned using CLUSTAL (Higgins and Sharp, 1988, 1989). Evolutionary distances between all pairs of sequences were computed using Kimura's empirical method (Kimura, 1983; eqn. 4.8) to estimate the number of superimposed amino acid replacements; sites at which there was a gap in any sequence in the alignment were excluded from all comparisons. Phylogenetic relationships were estimated from these distances by the neighbor-joining method (Saitou and Nei, 1987). The reliability of branching orders was estimated by the bootstrap approach (Felsenstein, 1985). These methods were implemented using CLUSTAL V (Higgins et al., 1992).
Nucleotide accession numbers All sequences were submitted to GenBank and are available under accession numbers U03994-U04018."
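For anyone attempting the reproduction, the distance step in the quoted methods is simple to sketch. The correction below, d = -ln(1 - p - 0.2p^2), is the form of Kimura's empirical formula that I understand CLUSTAL to use for protein distances; treat that, the function names, and the toy alignment as assumptions rather than the paper's exact code.

```python
import math

GAP = "-"


def strip_gapped_columns(alignment):
    """Drop every column that has a gap in ANY sequence, as the quoted methods describe."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    kept = [col for col in zip(*alignment) if GAP not in col]
    return ["".join(residues) for residues in zip(*kept)]


def kimura_protein_distance(seq_a, seq_b):
    """Kimura's empirical correction applied to the observed fraction of differing sites."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0:
        raise ValueError("sequences are too divergent for this correction")
    return -math.log(inner)


# Hypothetical toy alignment, not the GenBank sequences from the paper.
toy_alignment = ["MKV-LA", "MKVTLA", "MRVTIA"]
cleaned = strip_gapped_columns(toy_alignment)
for i in range(len(cleaned)):
    for j in range(i + 1, len(cleaned)):
        print(i, j, kimura_protein_distance(cleaned[i], cleaned[j]))
```

The resulting distance matrix is what would be handed to neighbor-joining; the tree building and bootstrap steps are not shown here.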
It looks like those sequences are a mixture of the various proteins (env, pol, etc.), so it would take some time to figure out which is which. I don't feel like it right now. It seems like getting the sequences from GenBank and then aligning them with ClustalW using default parameters would be closest to their method. If someone wants to do it:
http://www.ncbi.nlm.nih.gov/genbank/
http://www.genome.jp/tools/clustalw/
> Simian AIDS was discovered after human AIDS. How do they know the jump didn't occur in reverse?
I'm not sure we (that is, humans) realized that SIV was a thing until HIV prompted us to look in those places. It's the classic example of not knowing what to look for until you've found it.
I agree, the time of discovery proves nothing. But I asked how they can tell either way since there is no data from earlier. However, I would think captive animals were monitored more closely for strange diseases than people in rural mid-1900s Africa.
You'd probably be wrong about that. Even today, in most of these places, a captive animal that dies is just 'sick' and disposed of, or often sold off as food if the symptoms aren't obvious, as would be the case with HIV. There is no functioning FDA in Central Africa.
>"Our estimated location of pandemic origin explains the observation that Kinshasa exhibits more contemporary HIV-1 genetic diversity than anywhere else"
>This is totally circular. As shown in their figure S2, they used genetic diversity in a region as an indication of earlier presence.
It seems like they also used phylogenetic history. Diversity alone is not sufficient to identify the origin.
>A very high genetic diversity of HIV-1 has been reported, not only in Kinshasa and the north and south of the DRC (12, 13, 31, 32), but also in Brazzaville in the RC and, to a lesser extent, in the Mayombe area of RC near Pointe-Noire, all of which have been suggested as potential source locations of the pandemic (22, 33, 34). We therefore performed phylogeographic analyses of viruses collected in both the DRC and RC (table S1) and compared sequence sampling locations with phylogenetic history to formally test hypotheses concerning the location of ancestral viral lineages (30). Our analyses robustly place the spatial origin of the HIV-1 group M pandemic in Kinshasa [posterior probability (PP) = 0.99]
They could have used "consistent with" rather than "explains", though that may be a bit pedantic and debatable. We wouldn't be in the sorry state of science education/trust if all researchers communicated like Neil deGrasse Tyson.
Well, I know that information about genetic diversity was used to generate the phylogenetic tree, since the same sequences are used to determine both; therefore there is data leakage. So I am sure it is circular; it is only a matter of how circular. I think totally circular.
Perhaps I am misunderstanding their model, but look at figure S2. It shows that the less closely related the sequences at a location are, the more likely that location is to be the originating region relative to the others. See how the leaves of the right tree corresponding to location B are more widely distributed (indicating diversity) than those for location A or C (which are more clustered within the tree)?
These seem to be fairly standard phylogeographic methods applied to a mostly uncontroversial dataset. It's basically ancestral state reconstruction where geography is the trait being reconstructed: https://en.wikipedia.org/wiki/Ancestral_reconstruction#Trait... . Although there are concerns with inferring ancestral state using these sorts of methods, the fact that the authors had some historical data means that the inferred ancestral region is far more accurate than would be possible with extant data alone.
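To make "geography as the reconstructed trait" concrete, here is a minimal sketch using Fitch parsimony on a toy tree. This is a deliberately simpler, swapped-in technique, not the Bayesian phylogeographic model the paper used, and the trees, location labels, and function name are all hypothetical.

```python
def fitch_states(node):
    """Bottom-up Fitch pass: return the set of most-parsimonious locations at `node`.

    A leaf is a location string; an internal node is a (left, right) tuple.
    """
    if isinstance(node, str):
        return {node}
    left, right = (fitch_states(child) for child in node)
    shared = left & right
    # Keep the intersection if the children agree on something, otherwise the union.
    return shared if shared else left | right


# Location B is sampled on both sides of the root (its leaves are spread out).
spread_tree = (("B", ("A", "A")), ("B", ("C", "C")))
# Location B forms a single tight cluster instead.
clustered_tree = (("B", "B"), (("A", "A"), ("C", "C")))

print(sorted(fitch_states(spread_tree)))
print(sorted(fitch_states(clustered_tree)))
```

In the first tree the root is assigned to B alone, while in the second the root is ambiguous between all three locations. That is the intuition behind leaves from one location being scattered across the tree: the inference comes from where the lineages sit in the tree, not from a diversity statistic computed separately.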
Several of the authors are also pretty hardcore Bayesian methods people in the field of phylogenetics, specifically as applied to the evolution of diseases. It's unlikely that their long MCMCs were due to some kind of coverup; it's possible that the large number of sequences made convergence more difficult, and/or that they wanted to be absolutely certain of their results by running their analyses longer than usual. These sorts of phylogeographic methods (especially with an asymmetric movement model and lots of different localities) tend to have likelihood surfaces with many ridges, making MCMC quite difficult for all but the most trivial datasets.
They thinned 250 million steps to 5,000 samples per chain, then it sounds like they dropped the first 500-1,500 of those post hoc (the 10-30% burn-in range gives an impression of manual tinkering).
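The sample counts implied by the quoted settings (250 million steps, sampled every 50,000th step, 10 to 30% burn-in) can be worked out directly. A minimal sketch; the function and variable names are my own, and only the three settings come from the quoted methods.

```python
def retained_samples(chain_steps, sample_every, burn_in_percent):
    """Return (samples logged per chain, samples left after discarding burn-in)."""
    logged = chain_steps // sample_every
    discarded = logged * burn_in_percent // 100
    return logged, logged - discarded


CHAIN_STEPS = 250_000_000   # from the quoted methods
SAMPLE_EVERY = 50_000       # from the quoted methods

for percent in (10, 30):    # the quoted burn-in range
    logged, kept = retained_samples(CHAIN_STEPS, SAMPLE_EVERY, percent)
    print(f"burn-in {percent}%: {logged} logged, {kept} kept per chain")
```

With "at least 3" chains combined, the pooled posterior sample would be three or more times the per-chain figure.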
Obviously this is going to bring up questions regarding convergence, so reviewers should have asked for the diagnostic charts. This would probably have been better suited to ABC:
BEAST (which I assumed they used) is not very fast. 250 million generations on BEAST with the number of parameters they estimated and the size of their dataset would likely take about 2-4 weeks to run. Assuming they ran all 3 chains in parallel this is still a really long time to wait for an analysis. (edit: just checked, looks like they used 700+ sequences, so this is closer to 1-3 months of compute time)
The standard number of samples for phylogenetic analyses is 1,000-10,000, trending towards the smaller end as the phylogenies get larger. Since they did 3 independent analyses and combined their samples, this is likely fine as long as they assessed convergence and ensured that all 3 chains had converged. Excluding burn-in is also quite typical, since the MCMC move operators for phylogenetics are not very good due to the high dimensionality of the search space. ABC is also challenging to work with due to the high dimensionality (though I admit I haven't worked much with these methods in phylogenetics).
Typically in this field it is not enough to say "you didn't run your MCMC chains for long enough" or anything like that. Of course the chains should be run for longer; theoretically speaking, they should be run for an infinite amount of time, but there are routine disagreements over whether 100 or 1000 effective samples are sufficient for phylogenetics. But unless there is a serious model inadequacy that the authors haven't addressed, there's typically no reason to nitpick about these sorts of things. The authors have covered a lot of their bases by trying a number of different models and priors, and I don't see any reason to doubt this particular study based on that.
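For readers unfamiliar with "effective samples": the idea is to discount the number of logged samples by how autocorrelated they are. Below is a minimal sketch of one simple estimator, N / (1 + 2 * sum of positive-lag autocorrelations), truncated at the first non-positive lag. It is an assumption-laden illustration and not necessarily the exact estimator Tracer uses.

```python
import random


def effective_sample_size(samples):
    """Estimate ESS from the autocorrelation of a single parameter trace."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    variance = sum(c * c for c in centered) / n
    if variance == 0:
        raise ValueError("trace is constant; ESS is undefined")
    rho_sum = 0.0
    for lag in range(1, n):
        autocov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = autocov / variance
        if rho <= 0:        # stop at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


# Hypothetical traces: independent draws versus a sticky AR(1) chain.
rng = random.Random(1)
independent = [rng.gauss(0.0, 1.0) for _ in range(2000)]
sticky = [0.0]
for _ in range(1999):
    sticky.append(0.9 * sticky[-1] + rng.gauss(0.0, 1.0))

print("independent:", round(effective_sample_size(independent)))
print("sticky:     ", round(effective_sample_size(sticky)))
```

The sticky chain has the same number of logged samples but far fewer effective ones, which is why the 100-versus-1000 argument is about ESS rather than raw sample counts.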
Disclaimer: I don't know anything about this stuff.
Would they be able to run these MCMC chains in a way that perhaps progressively renders results at greater and greater accuracy? Then they could get some initial "low resolution" results and they wouldn't necessarily have to wait for weeks, but the full data will eventually be available.
In principle, there would be nothing stopping them from looking at the output every step, or every n steps. I don't know if the software they used supports this though.
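As a toy illustration of looking at the output every n steps, here is a sketch of a Metropolis sampler that reports a running estimate at regular checkpoints. The target (a standard normal), the step size, and all names are hypothetical; this says nothing about what the authors' software does or does not support.

```python
import math
import random


def metropolis_running_mean(n_steps, report_every, step_size=1.0, seed=0):
    """Yield (step, running mean) at every checkpoint of a random-walk Metropolis chain."""
    rng = random.Random(seed)
    x = 0.0
    total = 0.0
    for step in range(1, n_steps + 1):
        proposal = x + rng.gauss(0.0, step_size)
        # Log acceptance ratio for a standard normal target density.
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        total += x
        if step % report_every == 0:
            yield step, total / step


for step, estimate in metropolis_running_mean(n_steps=10_000, report_every=2_500):
    print(f"after {step} steps: running mean = {estimate:.3f}")
```

Early checkpoints give a rough "low resolution" estimate that tightens as the chain runs, with the caveat that an early estimate from an unconverged chain can be misleading rather than merely imprecise.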
It is not that 5,000 samples per chain is too low that concerns me; it is that they do not show the posteriors for the parameters used (which it sounds like may simply be the custom in this area). I would think that if people had such trouble using MCMC, common practice would be to publish diagnostic charts to reassure each other.
As for any circular reasoning, they're just reiterating the claim in a different way. It's saying "Based on X data, we deduce Y. Y would explain why X happened."
They get that date by assuming humans didn't bring it to the island though, they don't do anything to rule that out:
>"Barring the possibility that humans introduced multiple species-specific SIV lineages to the wild monkey populations of Bioko, the mainland and island SIVdrl variants must have been evolving independently since Bioko became isolated ~10,000 yr B.P., and perhaps longer given the high levels of genetic diversity seen within local SIV populations."
Interesting stuff. I hadn't thought of this until now, but from these few papers I don't think anyone has tried very hard to rule out the human to simian transmission idea.
Chimps and gorillas do get AIDS from SIV. IIRC they're relatively recent infections (though older than human HIV), and SIVgor actually descends from SIVcpz. Rhesus macaques also get AIDS from SIV (SIV was originally discovered in AIDS-suffering macaques).
Workers who were short of the quotas might have their hands severed. Leopold II is a forgotten monster. Arguably it birthed the international human rights movement.
Clickbait titles don't mean the article is crap. It's a good article with a clickbait title, plain as that. Clickbait is when some information is left out of the title in an obvious way, such that readers will click in pursuit of that piece of information rather than to expand on the headline.
I don't disagree. The story is interesting. I just realized that anyone who read the headline would want to know what city. I probably came across snarkier than intended.
Definitely, highly recommended book. I learned of it from Quammen's "Spillover"[0] (a great, if somewhat terrifying, book in its own right), which mentions it as a source in the AIDS chapter, and it was a fantastic read. Note that Quammen released his own AIDS-history book this year, "The Chimp and the River"[1] (haven't read it yet, though I intend to).
Belgian control of the Congo was rife with pretty extreme abuse -- the obvious, often horrifying consequences of conquest. But the influx of ambitious businesspeople and capital, which they claim started the wider outbreak of HIV, is a curious case. Often people will hand-wring about vague cultural factors, but this seems like one of the most concrete major international disasters caused by rapid gentrification and international investment.
In many ways, whether or not you support e.g. more foreign direct investment in India and China, I'd prefer the conversations about costs to be as well formulated as this (I also suspect free trade and foreign investment in China and India have done a lot more good than HIV has done bad, even if you just count lives saved).
I think that any comparison of the Belgian rape of the Congo in the 1900s with rapid gentrification and international investment is completely unwarranted.
http://dx.doi.org/10.1126/science.1256739