Abstract

This new guideline document includes 97 recommendations based on 621 references, a large increase over the 2011 guidelines (76 recommendations based on 321 references) (1). I can imagine that crafting guidelines means constantly walking a fine line between personal sentiment and clinical experience on one side and an objective assessment of scientific evidence on the other. The incorporation of new evidence in the current guidelines has led to several new and/or changed recommendations that will impact clinical care. Although the strength of the evidence for some recommendations is far from perfect, I find that this issue is well recognized and properly reflected by the content of the recommendations. Where only suboptimal evidence is available, treatment recommendations are made with caution (e.g., a low levothyroxine starting dosage for mild thyroid disease). Furthermore, where needed, available evidence is utilized to provide open-ended recommendations that enable physicians to make informed decisions for their specific patients (e.g., by incorporating risk stratification).
One of the most influential changes in the current guidelines is the new recommendation on the definition of a pregnancy-specific thyrotropin (TSH) reference range (recommendation 26). It has always been recommended that each institute determine its own pregnancy-specific, population-based reference ranges for TSH (2,3). In the common case where such reference values were unavailable, the use of an upper TSH limit of 2.5 or 3.0 mIU/L was recommended for the first and second/third trimesters, respectively (2,3). Although many centers started using these fixed cutoffs, subsequent studies reported that their use led to far too high a proportion of women being diagnosed with hyperthyrotropinemia (7.4–27.8%) (4,5), and in the majority of cases, population-based TSH upper limits were well above 2.5 and 3.0 mIU/L (6). In addition to a population-based or fixed cutoff for TSH, the availability of recent studies on reference ranges now facilitates a new approach, which is to adopt pregnancy-specific TSH reference ranges that were obtained using a similar TSH assay and from a similar population. Such an assay-specific approach will most likely improve the diagnostic specificity and reduce overdiagnosis. If the adoption of an assay-specific reference range is not possible, it is now recommended to use a fixed TSH upper limit of 4.0 mIU/L. This is in line with upper TSH limits from large population-based studies in iodine-sufficient areas. Furthermore, it is also consistent with results from a recent American study demonstrating that levothyroxine treatment may reduce the risk of miscarriage in women with a TSH between 4.1 and 10 mIU/L, while levothyroxine treatment may increase the risk of adverse outcomes if the TSH is 2.5–4.0 mIU/L (7).
Another new aspect of the current guidelines is the recommendation that levothyroxine therapy can be considered for women with TSH concentrations >2.5 mIU/L if the thyroid peroxidase antibodies (TPOAb) are positive (recommendation 29). This recommendation is based on recent studies that identified a higher risk of adverse pregnancy outcomes in TPOAb-positive women with a co-occurring high or high-normal TSH concentration (8–11). This treatment consideration is marked as a weak recommendation because studies are still sparse and need to be replicated. However, there is also good circumstantial evidence that supports this recommendation, since TPOAb-positive women were shown to have an abnormal physiological response to human chorionic gonadotropin (hCG) stimulation (12), and two randomized controlled trials also demonstrated that levothyroxine treatment in TPOAb-positive women is beneficial (13,14). The current guidelines also more explicitly mention that for women with mild thyroid dysfunction, 25–50 μg of levothyroxine is a typical starting dose. This will most likely avoid the potential risk of overtreatment, probably the most important risk associated with this new recommendation. I believe that further studies on risk stratification by, for example, high-normal TSH concentrations, TPOAb positivity, hCG, or body mass index will prove valuable.
Among the surge of new publications in the field, there are also important studies demonstrating an association between methimazole (MMI) or propylthiouracil (PTU) treatment and teratogenic effects (15–18). In the case that antithyroid drug treatment is necessary or preferred for gestational Graves’ hyperthyroidism, the 2011 guidelines recommended the use of PTU during the first trimester (3). This was based on studies showing a higher risk of teratogenic effects of MMI. However, new studies have now also identified an association of PTU use with a higher risk of fetal anomalies (15–18). Because PTU is associated with fewer and/or less severe fetal anomalies, the use of PTU during early pregnancy is still the preferred approach, and the recommended duration of treatment has now been extended to the first 16 weeks. In addition, a new and innovative recommendation aims to lower exposure to both MMI and PTU during the critical teratogenic period. Consistent data from studies on non-pregnant individuals show that roughly half of Graves’ disease patients relapse after cessation of antithyroid drug treatment (19,20). Relapse mostly occurs about eight weeks after cessation of antithyroid drugs, and specific risk factors such as smoking and TSH receptor antibody (TRAb) seropositivity seem useful for identifying high-risk patients (19,20). These numbers should also be generalizable to women in very early pregnancy, and if anything, recurrence of Graves’ hyperthyroidism is expected to take longer given the immune tolerance associated with pregnancy. The combination of these physiological considerations creates a window of opportunity to stop antithyroid drugs from the time pregnancy is detected (five weeks) until the end of the critical teratogenic period (approximately weeks 12–16).
Therefore, recommendation 46 states that for newly pregnant women with Graves’ disease who are euthyroid on a low dose of MMI (≤5–10 mg/day) or PTU (≤100–200 mg/day), discontinuation of antithyroid drugs can be considered. Additionally, a further risk-stratification strategy is suggested using risk factors for relapse (goiter size, duration of therapy, TRAb measurement, smoking, etc.) to guide decision making for each individual patient. In my opinion, this new recommendation is a very good example of how clinical and physiological knowledge combined with epidemiology and prediction modeling can ultimately be applied to improve patient care. I look forward to future assessments of new opportunities to improve this part of the field, such as high-dose iodine treatment (21,22) and genetic risk stratification (19,23).
The concept of universal screening remains a controversial topic. It becomes clear from the literature assessment presented in the new guidelines that there is still insufficient evidence to recommend universal screening. The two most important missing elements are (i) the lack of adequate evidence on how to identify at-risk women optimally (i.e., using fixed or population cutoffs, or using a risk profile), and (ii) a paucity of data on the benefits and/or harms of levothyroxine treatment. For these two knowledge gaps, newly available evidence now suggests that some of the recommendations, practice patterns, and study designs preceding these new guidelines may have been premature and too progressive. In hindsight, it seems likely that many physicians who strictly followed the diagnostic criteria set by the previous guidelines overdiagnosed overt and subclinical hypothyroidism, potentially leading to overtreatment (6,7,24). Similarly, randomized controlled trials have been set up in the past without appropriate knowledge of the expected effect size on the outcome of interest (25) or of the treatment dosage (26,27). Of course, there is no such gray area when it comes to overt hypothyroidism, for which the benefits of levothyroxine treatment are broadly accepted. Unfortunately, the combined prevalence of subclinical hypothyroidism and hypothyroxinemia exceeds the prevalence of overt hypothyroidism by a factor of >20. This ratio, in combination with the current knowledge gaps in the field, makes universal screening prone to cause harm rather than benefit for the population as a whole. Perhaps the discussion should focus on creating a binary clinical screening test for overt hypothyroidism until adequate evidence supports or refutes the implementation of screening for thyroid dysfunction. It is now up to physicians and researchers in the field to fill relevant data gaps with solid studies, thereby providing the necessary tools to refine recommendations in the future.
I look forward to such efforts and the new knowledge that we will obtain from them.
