In recent work, we (Yarkoni et al., 2011) developed an automated means to obtain activation coordinate data (like those contained in BrainMap) from the full text of published articles; currently, the database contains data from 3,489 articles from 17 different journals. These data (which are available online at http://www.neurosynth.org) provide a less-biased means to quantify base rates
of activation (though biases clearly remain due to the lack of complete and equal coverage of all possible mental states in the literature). Figure 1 shows a rendering of base rates of activation across the studies in this database. What is striking is the degree to which some of the regions that are the most common targets of informal reverse
inference (e.g., anterior cingulate and anterior insula) have the highest base rates and therefore are the least able to support strong reverse inferences.

A thorough analysis of reverse inference using meta-analytic data is difficult because it requires manual annotation of each data set in order to specify which mental processes are engaged by the task. Databases such as BrainMap rely upon relatively coarse ontologies of mental function, which means that although one can assess the strength of inferences for broad concepts such as “language,” it is not possible to perform these analyses for finer-grained concepts that are likely to be of greater interest to many researchers. An alternative approach relies upon the assumption that the words used in a paper should bear a systematic relation to the concepts that are being examined. Yarkoni et al. (2011) used the automatically extracted activation coordinates for 3,489 published articles, along with the full text of those articles, to test this form of reverse inference: instead of asking how predictive an activation map is for some particular mental process (as manually annotated by an expert), this analysis asked how well one can predict
the presence of a particular term in the paper given activation in a particular region. Although there are clearly a number of reasons why this approach might fail, Yarkoni et al. (2011) found that for many terms it was possible to accurately predict activation in specific regions given the presence of the term (i.e., forward inference), as well as to predict the likelihood of the term in the paper given activation in a specific region (i.e., reverse inference). We also found that it was possible to classify data from individual participants with reasonable accuracy, as well as to classify the presence of words in individual studies against as many as ten alternatives, which suggests that these meta-analytic data can provide the basis for relatively large-scale generalizable reverse inference.

A challenge to the use of literature mining to perform reverse inference is that it is based on the language that researchers use in their papers and may thus tend to reify informal reverse inferences.
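
The relation between base rates, forward inference, and reverse inference described above can be made concrete with a small worked example. The sketch below is illustrative only: the study counts are invented, the names `RegionCounts`, `base_rate`, `forward_inference`, and `reverse_inference` are hypothetical, and the uniform prior of 0.5 on the term is an assumption made for the illustration rather than a description of the published analysis. It applies Bayes' rule to two hypothetical regions, one activated in most studies and one activated rarely.

```python
from dataclasses import dataclass


@dataclass
class RegionCounts:
    """Hypothetical study counts for one region and one term."""
    active_with_term: int      # studies using the term that report activation
    n_with_term: int           # all studies using the term
    active_without_term: int   # studies not using the term that report activation
    n_without_term: int        # all studies not using the term


def base_rate(c: RegionCounts) -> float:
    """P(activation): fraction of all studies reporting activation in the region."""
    return (c.active_with_term + c.active_without_term) / (c.n_with_term + c.n_without_term)


def forward_inference(c: RegionCounts) -> float:
    """P(activation | term)."""
    return c.active_with_term / c.n_with_term


def reverse_inference(c: RegionCounts, prior_term: float = 0.5) -> float:
    """P(term | activation) via Bayes' rule, under an assumed prior on the term."""
    p_act_term = c.active_with_term / c.n_with_term
    p_act_other = c.active_without_term / c.n_without_term
    evidence = p_act_term * prior_term + p_act_other * (1.0 - prior_term)
    return p_act_term * prior_term / evidence


# Invented counts: a frequently activated region and a selectively activated one.
high_base = RegionCounts(active_with_term=80, n_with_term=100,
                         active_without_term=600, n_without_term=900)
low_base = RegionCounts(active_with_term=60, n_with_term=100,
                        active_without_term=45, n_without_term=900)

for name, counts in [("high base rate", high_base), ("low base rate", low_base)]:
    print(f"{name}: base rate={base_rate(counts):.3f}, "
          f"forward={forward_inference(counts):.3f}, "
          f"reverse={reverse_inference(counts):.3f}")
```

In this toy case the frequently activated region shows the stronger forward inference (0.8 versus 0.6) but the weaker reverse inference, because it is also active in most studies that do not use the term. This is the pattern the text attributes to regions such as the anterior cingulate and anterior insula.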