repliCATS glossary

Glossary for Participants

Table of contents

Terms relating to the research project

Terms used in the online platform

Terms relating to the reliability of research claims in the social and behavioural sciences

Terms specifically related to statistical concepts

  • p values and statistical significance
  • Type 1 and Type 2 errors
  • Effect sizes
  • Cohen’s d
  • Correlation coefficients (and related measures)
  • Partial Eta squared
  • Confidence Intervals

Terms relating to the research project

SCORE

The overarching research project funded by the US agency DARPA (Defense Advanced Research Projects Agency), which aims to develop automated ways of assessing confidence in research claims within the Social and Behavioural Sciences. The University of Melbourne repliCATS team is undertaking one component of the SCORE research project. Other teams from around the world are undertaking other components of the project.

repliCATS

The team at the University of Melbourne that is participating in the SCORE research program. The repliCATS project is designed to elicit expert judgements on the reliability of research claims within the Social and Behavioural Sciences, to aggregate them into useful measures of reliability, and to understand the reasoning behind these judgements.

IDEA protocol

The process used by the repliCATS team to elicit judgements on the reliability of research claims within the Social and Behavioural Sciences from a diverse group of knowledgeable individuals. IDEA stands for “Investigate”, “Discuss”, “Estimate” and “Aggregate”. In practice, for participants, the IDEA protocol involves three stages: considering a claim and making a first-round judgement; discussing these judgements within a small team; and making a second-round judgement. The IDEA protocol has been found to improve judgements under uncertainty.
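The "Aggregate" step can be pictured with a minimal sketch. The glossary does not say which aggregation rule the repliCATS team uses, so the unweighted mean below is purely an illustrative assumption, and the participant labels and probability values are made up for the example.

```python
from statistics import mean

# Hypothetical judgements: each participant's estimated probability
# that a claim will replicate, before and after group discussion.
round_one = {"participant_a": 0.30, "participant_b": 0.70, "participant_c": 0.55}
round_two = {"participant_a": 0.40, "participant_b": 0.60, "participant_c": 0.50}

def aggregate(judgements):
    """Combine individual estimates into one group estimate.

    Assumption: a simple unweighted mean. The actual aggregation
    method used in the project may differ.
    """
    return mean(list(judgements.values()))

# Only the second-round (post-discussion) judgements are aggregated here.
group_estimate = aggregate(round_two)
print(f"Group estimate: {group_estimate:.2f}")
```

In this sketch the first-round judgements are kept only to show how estimates may shift after discussion.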


Terms used in the online platform

Plausibility

The likelihood that you would assign to the claim based on your background knowledge and experience, before considering the details of the experiment (i.e. its prior plausibility). We know the word ‘plausible’ will mean different things to different people. For some people, almost everything is ‘plausible’, while other people have a stricter interpretation. You could also consider words like ‘possible’ or ‘realistic’ here. We ask you to maintain a consistent standard between different claims and try to let us know if some claims have a higher prior plausibility for you than others. See also the entry in the Training document.

Research Claim

A single major finding from a published study (for example, a journal article), as well as details of the methods and results that support this finding. A Research Claim is not equivalent to an entire article. Sometimes the claim as described in the abstract does not exactly match the claim that is tested. In this case, you should consider the Research Claim to be that which is described in the inferential test, as the next stage of SCORE will focus on testing the replicability of the test results only.

Replication

An independent repeat of an experiment with a specified degree of similarity to the methodological and/or analytic procedures documented in an original study. Replications are typically divided into “direct” (with higher degrees of similarity) and “conceptual” (with lower degrees of similarity). This is not a sharp division. Whether a replication is considered direct or conceptual depends on a range of things including the extent to which the theoretical context of the claim is understood and any use-context for the Research Claim.

Direct Replication

A Replication that follows the methods of the original study with a high degree of similarity, varying aspects only where there is a high degree of confidence that they are not relevant to the research claim. The aim of a direct replication is to improve confidence in the reliability and validity of an experimental finding by starting to account for things such as sampling error, measurement artefacts, and questionable research practices.


Terms relating to the reliability of research claims in the social and behavioural sciences

Social and Behavioural Sciences

Defined within the SCORE research project as research that appears in a specific list of journals. The disciplines that these journals cover include psychology, economics, political science, sociology, law, education, business and marketing.

Conceptual Replication

A Replication that involves independently repeating an original experiment while purposefully altering specific aspects of the original methods (i.e. a research group is able to investigate the claim using methods of their choice). The aim of conceptual replications is to test whether the claim is supported when using different methods, and the extent to which a research claim can generalize to new circumstances. The SCORE project is interested in whether claims are conceptually replicable, but the main task for the repliCATS team is to determine whether claims are directly replicable.

Publication bias

Refers to the bias against publishing statistically non-significant, or negative, results. This bias comes from editors and reviewers, and also from authors who self-select out of submitting non-significant results because they anticipate rejection. The effect of publication bias is to inflate the number of statistically significant results in the published literature compared with the number in studies actually performed. In heavily biased literatures we expect a higher rate of false positives than the baseline. See also the entry in the Training document.
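The inflation described above can be illustrated with a short calculation. This is a minimal sketch under simplified assumptions that are not taken from the glossary: half of the tested hypotheses are true, the significance threshold is 0.05, statistical power is 0.8, and non-significant results are published with a fixed probability (the hypothetical parameter publish_nonsig).

```python
def false_positive_share(prior_true, alpha, power, publish_nonsig):
    """Share of published results that are false positives.

    prior_true:     assumed proportion of tested hypotheses that are true
    alpha:          significance threshold (false positive rate per test)
    power:          probability of detecting a true effect
    publish_nonsig: assumed probability that a non-significant result
                    is published (1.0 means no publication bias)
    """
    false_pos = (1 - prior_true) * alpha   # significant, but no true effect
    true_pos = prior_true * power          # significant, true effect
    # Non-significant outcomes: true negatives plus missed true effects.
    nonsig = (1 - prior_true) * (1 - alpha) + prior_true * (1 - power)
    # All significant results are assumed to be published.
    published = false_pos + true_pos + publish_nonsig * nonsig
    return false_pos / published

# No bias: every study is published.
baseline = false_positive_share(0.5, 0.05, 0.8, publish_nonsig=1.0)
# Extreme bias: only significant results are published.
biased = false_positive_share(0.5, 0.05, 0.8, publish_nonsig=0.0)

print(f"baseline: {baseline:.3f}, fully biased: {biased:.3f}")
```

Under these assumed numbers, false positives make up 2.5% of an unbiased literature but about 5.9% of a literature that publishes only significant results.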

Questionable Research Practices (QRPs)

A range of practices that are relatively common in experimental research, but which affect the interpretation of statistical results, typically in such a way as to overstate the reported effect. Examples of QRPs include p-hacking (making decisions about data collection and analysis after checking for statistical significance), cherry picking (failing to report statistically non-significant relationships that were tested), and HARKing (Hypothesizing After the Results are Known: presenting ad hoc findings as though they had been predicted all along).

Private or Personal Knowledge

Knowledge about a research claim that is not contained within the public literature, such as one’s own experience with undertaking similar research, or one’s prior assessments of the quality of work from a particular source.


Terms specifically related to statistical concepts

Statistical concepts are particularly relevant to the question of whether a claim will replicate. Moreover, there are many misconceptions about the meanings of such concepts, even among practising researchers, and many misapplications of them within the literature. We will not attempt to provide concise descriptions of these terms here. Instead, we have provided a separate training document that details more precise meanings and proper uses of important terms, with references to the literature. The training document also contains some background information on questionable research practices and replication rates in previous studies, and is available for download.

The training document covers the following terms, amongst others:

  • p values and statistical significance
  • Type 1 and Type 2 errors
  • Effect sizes
  • Cohen’s d
  • Correlation coefficients (and related measures)
  • Partial Eta squared
  • Confidence Intervals