Where are we?
Since the first report on digital artery repair by Harold Kleinert (Kleinert et al., 1963), we have been able to perform: hand and digital replantation; toe to hand transfers; free-flaps for hand reconstruction; hand transplantation; and improved nerve microsurgery. In spite of these technical improvements, the quality of our clinical research is only fair. We still confuse ‘prevalence’ with ‘incidence’ when we write about the occurrence of a hand disease. We still describe our studies as ‘retrospective’, although this word gives no information about their clinical design. A case-series study, a case-control study and a historic cohort study are all retrospective studies, but with different clinical designs and different levels of evidence. The quality of reporting in randomized controlled trials (RCTs) is still poor, with a median score of two points on the modified Jadad scale, which classifies the quality of an RCT from 0 (worst quality) to 5 (best quality); only 41% of RCTs related to upper limb disorders used a patient-reported outcome (PRO) (Gummesson et al., 2004). Despite the increasing use of wrist arthroscopy, the efficacy of arthroscopic wrist interventions has been studied in only four randomized studies, with a median modified Jadad score of 0.5 (range 0–1). This contrasts with the 50 randomized studies assessing interventions performed through shoulder arthroscopy, with a median modified Jadad score of 3.0 (Tadjerbashi et al., 2014).
If we do not have good RCTs, we cannot synthesize the evidence, and until very recently there were very few systematic reviews (SRs) and meta-analyses (MAs) in hand surgery (Schädel-Höpfner et al., 2008). Historically, more than 80% of the articles in the Journal of Hand Surgery European (JHSE) and 67.6% of those in the Journal of Hand Surgery American (JHSA) have been level IV evidence (mostly case series). The percentage of published articles with the highest levels of evidence (high quality RCTs and SRs) in the JHSE (0.9% level I and 5.0% level II) and the JHSA (8.3% level I and 10% level II) is very low (Rosales et al., 2012).
Where can we go?
If we want to improve our clinical research, I would recommend that young researchers and readers of the JHSE keep in mind the following two ‘keystones’: the originality of the purpose of the study (why?); and the method (how?). Other aspects of the scientific article, such as the results (discoveries) and the discussion (meaning), are much less important.
Introduction: purpose (why?)
In this section you should address the following question: ‘Why is your research important?’ Try to highlight the originality of your research. Start the introduction of your article (which should ideally be written in advance of your study) by explaining what we know about the issue. Follow with a few sentences explaining the gap in scientific knowledge (what do we not know?). Finally, establish the research question that you want to answer with your study (purpose and hypothesis). Sometimes, the research question could be an article in itself (McCabe et al., 2007).
Patients & methods (how?)
Here you should include the following main subsections: study population, clinical design, and instruments and measures.
Study population
Clearly define the cases (a set of standard criteria for deciding whether a person has a particular disease or other health-related condition). Criteria for case definition could be clinical criteria and complementary test criteria; they could be classified as inclusion and exclusion criteria. By using a standard case definition, we ensure that every case is diagnosed in the same way, regardless of when or where it occurred, or who identified it. Potential problems in categorizing cases are called information bias. Two important aspects of that bias are: reliability (reproducibility or repeatability) and validity (accuracy, correctness). Reliability should be analysed through the intra-observer and inter-observer agreement, using the Kappa coefficient for dichotomous or categorical variables and the intra-class correlation coefficient for continuous variables. The classical assessment of validity includes sensitivity and specificity. By using the receiver operating characteristic (ROC) curve, the authors can select the cut-off point of any diagnostic criterion that gives the highest sensitivity with the lowest false positive rate, and can even assess the overall accuracy of any case definition criterion.
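As an illustration of the reliability and validity measures mentioned above, the following minimal Python sketch computes Cohen's Kappa for two observers, and the sensitivity and specificity of a test score at a given cut-off. The function names and the data are hypothetical. The choice of Youden's index (sensitivity + specificity - 1) to select the cut-off is an assumption made here for illustration; it is one common way of trading sensitivity against the false positive rate, not necessarily the one a given study should use.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two observers classifying the same cases."""
    n = len(rater_a)
    # Proportion of cases on which the two observers agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each observer's category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


def sensitivity_specificity(scores, diseased, cutoff):
    """Treat score >= cutoff as a positive test; compare with the reference standard."""
    pairs = list(zip(scores, diseased))
    tp = sum(s >= cutoff and d for s, d in pairs)
    fn = sum(s < cutoff and d for s, d in pairs)
    tn = sum(s < cutoff and not d for s, d in pairs)
    fp = sum(s >= cutoff and not d for s, d in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff(scores, diseased):
    """Cut-off with the largest Youden index (an assumed selection rule)."""
    def youden(cutoff):
        sens, spec = sensitivity_specificity(scores, diseased, cutoff)
        return sens + spec - 1
    return max(sorted(set(scores)), key=youden)


# Hypothetical data: two observers rating ten cases (1 = disease present).
observer_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
observer_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(observer_1, observer_2)

# Hypothetical test scores and reference-standard diagnoses for eight people.
scores = [1, 2, 3, 4, 5, 6, 7, 8]
diseased = [False, False, False, False, True, False, True, True]
cutoff = best_cutoff(scores, diseased)
sens, spec = sensitivity_specificity(scores, diseased, cutoff)
```

Each candidate cut-off corresponds to one point on the ROC curve (sensitivity plotted against 1 - specificity), so scanning the cut-offs as above is equivalent to walking along the curve.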
Clinical design
Once you have established the purpose of the research and defined the study population, the appropriate clinical design has to be chosen: observational studies (descriptive study, case-control study and cohort study); experimental studies (RCT); and finally a SR and MA. The level of evidence of the research study depends on the clinical design.
In this way, we have the descriptive studies (case-series and cross-sectional study), with level IV evidence, characterized by a single measurement at one point in time, made retrospectively. The case-control and cohort clinical designs are commonly used to analyse the cause of a disease. They are different clinical designs with different levels of evidence. The case-control design is retrospective, and patients enter the study based on ‘the outcome or dependent variable’ (to have or not to have the disease or pathological condition), e.g. sleep disturbance as a cause of carpal tunnel syndrome (CTS). In a case-control design, we retrospectively compare patients with CTS and patients without CTS, dividing the odds of sleep disturbance in the patients with CTS by the odds of sleep disturbance in those without CTS and giving the result as an odds ratio. In a cohort clinical design, the patients enter the study based on the independent variable (exposed versus not exposed), e.g. sleep disturbance and CTS in a classic cohort study. You have one sample with sleep disturbance (the exposed group) and another sample without sleep disturbance (the not exposed group). You can undertake a longitudinal, prospective follow-up of both samples and divide the incidence of CTS in people with sleep disturbance by the incidence of CTS in people without sleep disturbance, giving the result as a relative risk.
A cohort clinical design can be used for outcome studies or studies of treatment effect and can achieve level II evidence (Rosales et al., 2012). These studies are called ‘before and after treatment studies’ and have at least two measurements of the outcome variables at different points in time: one before treatment and a second after treatment. The treatment effect is analysed based on the effect size (ES) and the standardized response mean (SRM), which are the currencies of treatment effect in any clinical research study. Both are calculated from the mean of the change scores (pre-treatment score minus post-treatment score), divided by the standard deviation of the pre-treatment scores (ES) or by the standard deviation of the change scores (SRM) (Rosales et al., 2009). You can assess the measures prospectively (classic cohort study) or retrospectively (historic cohort study). A cohort clinical design can also be used for prognosis studies.
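The two ratios in the sleep disturbance and CTS example can be sketched in a few lines of Python. All counts below are invented for illustration and the function names are hypothetical. Note that the odds ratio is built from odds (exposed divided by not exposed within each group), whereas the relative risk is built from incidences (new cases divided by the number of people followed).

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Case-control design: odds of exposure in cases over odds of exposure in controls."""
    odds_in_cases = exposed_cases / unexposed_cases
    odds_in_controls = exposed_controls / unexposed_controls
    return odds_in_cases / odds_in_controls


def relative_risk(new_cases_exposed, n_exposed, new_cases_unexposed, n_unexposed):
    """Cohort design: incidence in the exposed over incidence in the not exposed."""
    incidence_exposed = new_cases_exposed / n_exposed
    incidence_unexposed = new_cases_unexposed / n_unexposed
    return incidence_exposed / incidence_unexposed


# Hypothetical case-control study: 100 patients with CTS (60 with sleep
# disturbance, 40 without) and 100 controls without CTS (30 with, 70 without).
or_cts = odds_ratio(60, 40, 30, 70)
print(round(or_cts, 2))  # 3.5

# Hypothetical cohort study: 200 people with sleep disturbance (40 develop CTS)
# and 200 people without sleep disturbance (20 develop CTS) during follow-up.
rr_cts = relative_risk(40, 200, 20, 200)
print(round(rr_cts, 2))  # 2.0
```

A case-control study cannot yield a relative risk directly, because the investigator fixes the number of cases and controls and so the incidence of the disease is not observed.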
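The ES and SRM described above can be computed as in the following minimal Python sketch. The scores are invented, the function names are hypothetical, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption; the scores are taken from a questionnaire on which a higher score means greater disability, so that the change (pre-treatment minus post-treatment) is positive when patients improve.

```python
from statistics import mean, stdev


def change_scores(pre, post):
    """Pre-treatment score minus post-treatment score for each patient."""
    return [before - after for before, after in zip(pre, post)]


def effect_size(pre, post):
    """ES: mean change divided by the standard deviation of the pre-treatment scores."""
    return mean(change_scores(pre, post)) / stdev(pre)


def standardized_response_mean(pre, post):
    """SRM: mean change divided by the standard deviation of the change scores."""
    change = change_scores(pre, post)
    return mean(change) / stdev(change)


# Hypothetical questionnaire scores for five patients before and after treatment.
pre_treatment = [60, 50, 70, 40, 55]
post_treatment = [30, 35, 40, 30, 25]

es = effect_size(pre_treatment, post_treatment)
srm = standardized_response_mean(pre_treatment, post_treatment)
print(round(es, 2), round(srm, 2))  # 2.06 2.36
```

The two indices share the same numerator and differ only in the denominator, so they diverge when the variability of the change differs from the variability of the baseline scores.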
The RCT is the clinical design with the highest level of evidence (level I or II) (Rosales et al., 2012) for comparing the effectiveness of two or more treatments in hand surgery. The key points in any RCT are: randomization, blinding, and withdrawals/dropouts. These are the three factors that constitute the modified Jadad scale for measuring the quality of an RCT (Gummesson et al., 2004). An easy way to optimize the quality of an RCT in hand surgery is to follow the recommendations of the CONSORT statement, a checklist guideline for improving the reporting of RCTs (Moher et al., 2001). Inadequately reported randomization, for example, has been associated with bias in estimating the effectiveness of interventions. For convenience, the checklist and flow diagram together are called ‘CONSORT’. The JHSE has incorporated CONSORT in its guidelines for submitting an RCT (http://www.equator-network.org/reporting-guidelines/consort/).
SRs and MAs are syntheses of reported evidence. This kind of clinical design presents the highest level of evidence in clinical research but is dependent upon the available published articles. We have to differentiate between a narrative review and an SR. An SR is an overview of primary studies that uses explicit and reproducible methods. SRs apply scientific strategies that limit bias by the systematic assembly, critical appraisal and synthesis of all relevant studies on a specific topic. The JHSE requires researchers to follow the PRISMA guidelines.
Instruments and measurements
In evaluating the effectiveness of an intervention, it is important to consider the chosen outcome measures and whether they cover all the important aspects of the expected treatment effect. The International Classification of Functioning, Disability and Health (ICF) was developed by the World Health Organization to be applied to various aspects of health (http://www3.who.int/icf). Outcome measures can be classified according to the ICF as body function and structure, activity, and participation. Traditionally, the measures used to evaluate results in hand surgery have been mainly based on body function and structure, such as radiographs and measures of grip strength and range of motion (Gummesson et al., 2004). More recently, health-related outcome instruments (called PRO instruments), such as the DASH, Michigan Hand questionnaire, Patient Evaluation Measure and PRWE, have become available to assess treatment effects. A common mistake is to translate these questionnaires and apply them to populations outside the countries where they were developed, such as the USA, Canada and the UK. Because different cultures have different values and norms, the mere linguistic translation of an instrument is often not sufficient. Researchers and hand surgeons with no health instruments in their native language have two options: to create a new measure or to adapt one previously developed in another language, which is known as the cross-cultural adaptation process (Rosales et al., 2002). Once the research team has a newly adapted version of the instrument, it still needs to be tested for psychometric properties, such as reliability, validity and responsiveness. All psychometric analyses are important, but the responsiveness analysis (sensitivity to clinical change after treatment) shows which measure should be used to assess the outcome of a treatment in a specific population and allows the sample size for a clinical trial to be calculated (Rosales et al., 2009).
I recommend the Consensus-based Standards for the selection of health status Measurement Instruments (COSMIN) checklist, which is the only tool available at this moment to evaluate, in a standardized way, the methodological quality of studies on measurement properties (Terwee et al., 2012).
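To show how responsiveness can feed into a sample size calculation, the sketch below uses a standard normal-approximation formula for a before and after (paired) comparison, n = ((z for 1 - alpha/2 + z for power) / SRM) squared. This is an assumption chosen for illustration and is not necessarily the method used in the cited studies; the function name and the default values of alpha and power are also hypothetical choices. A trial comparing two parallel groups would need a different formula.

```python
from math import ceil
from statistics import NormalDist


def paired_sample_size(srm, alpha=0.05, power=0.80):
    """Approximate number of patients needed to detect a change of size `srm`.

    Normal approximation for a two-sided paired comparison; `srm` is the
    standardized response mean expected for the instrument in the population.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_beta) / srm) ** 2)


# A more responsive instrument (larger SRM) needs fewer patients.
n_moderate = paired_sample_size(0.5)
n_large = paired_sample_size(1.0)
print(n_moderate, n_large)  # 32 8
```

Because the required number of patients falls with the square of the SRM, choosing the most responsive instrument for the population under study can reduce the size of a trial considerably.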
The future of hand surgery
The best articles in clinical research in hand surgery (Atroshi and Gummesson, 2009; Atroshi et al., 1999, 2006) are published in the journals with the highest impact factors, such as The Lancet, JAMA and the BMJ. Thomson Reuters, who provide the yearly impact factors, reported in 2014: ‘New Report details how US dominance in scientific research is struggling significantly to keep pace with increased output from Europe and Asia’ (http://ip-science.thomsonreuters.com/citationimpactcenter/). The same has happened to hand surgery in Europe. The impact factor of this Journal is now the highest among hand surgery journals and above 2 for the first time. My hope is that the quality of clinical research in our field can improve and that better articles, with greater impact on clinical practice, will be published in our journal.
