Department of Family Medicine Faculty Papers
Permanent URI for this collection: https://digital.lib.washington.edu/handle/1773/15635
Recent Submissions
The Percent Fragility Index (World Wide Journals, 7/1/2023) Heston, Thomas F
This article proposes the Percent Fragility Index (PFI) as an improved measure of statistical fragility in biomedical research. The PFI quantifies the percentage change in outcomes needed to change a study's statistical significance from positive to negative or vice versa. The PFI improves upon existing indices by providing an intuitive statistic that is easy to grasp and by accommodating both dichotomous and continuous variables. This approach minimizes dependency on sample size, a limitation of the commonly used Fragility Index (FI) and Fragility Quotient (FQ). The FI measures the minimum number of outcome events required to reverse statistical significance, and the FQ divides the FI by the total sample size. The PFI enhances the interpretability and validity of fragility assessments. It facilitates a more critical understanding of research outcomes by offering readers a more precise estimate of study fragility.

Quantifying Uncertainty: Potential Medical Applications of the Heston Model of Financial Stochastic Volatility (B P International, 1/1/2024) Heston, Thomas F
The Heston Model, widely used in financial markets to characterize stochastic volatility, could potentially be useful in accounting for the impact of volatility in the broad field of medicine. This theoretical article highlights the potential uses of the Heston Model to quantify volatility in healthcare, focusing on epidemiology and pharmacology. Conceptually, the ability of the model to quantify unpredictability could provide insight into complex medical processes whose variability itself changes over time. Rigorous testing would be required to determine the feasibility and validity of applying a financial model to biological processes. Nonetheless, the hypothetical connections between financial market volatility and volatility in medicine merit further exploration.
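For readers unfamiliar with the model named in this abstract, the following is a minimal sketch of the standard financial form of the Heston model, in which a level S and its variance v evolve together and the variance reverts toward a long-run mean. It is an illustration only and is not taken from the article: the function name, the parameter values, and the choice of a full-truncation Euler discretization are all assumptions made for this example, and nothing here implies a validated medical application.

```python
import math
import random


def simulate_heston(s0, v0, mu, kappa, theta, xi, rho, dt, n_steps, seed=0):
    """Simulate one path of the Heston model (hypothetical helper).

    dS = mu * S dt + sqrt(v) * S dW1
    dv = kappa * (theta - v) dt + xi * sqrt(v) dW2,  corr(dW1, dW2) = rho

    Assumption: full-truncation Euler scheme, i.e. max(v, 0) is used
    wherever the variance enters the drift or diffusion terms.
    """
    rng = random.Random(seed)
    s, v = s0, v0
    path = [(s, v)]
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        # Build a second shock with correlation rho to the first.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        v_pos = max(v, 0.0)
        # Log-Euler step keeps the level S strictly positive.
        s *= math.exp((mu - 0.5 * v_pos) * dt + math.sqrt(v_pos * dt) * z1)
        v += kappa * (theta - v_pos) * dt + xi * math.sqrt(v_pos * dt) * z2
        path.append((s, v))
    return path


if __name__ == "__main__":
    # Illustrative parameter values only.
    path = simulate_heston(s0=100.0, v0=0.04, mu=0.0, kappa=2.0, theta=0.04,
                           xi=0.3, rho=-0.5, dt=1 / 252, n_steps=252)
    print("final level and variance:", path[-1])
```

The parameter xi (the volatility of the variance) is what separates this model from one with constant volatility; setting xi to zero with v0 equal to theta holds the variance fixed.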
This theoretical article gives a broad overview of possible applications of the Heston Model to the medical field.

The Ochlocratic Trap in Bioethics (Athenaeum, 10/21/2023) Heston, Thomas F
The ochlocratic trap is the tendency to have moral decisions conform to popular majority opinion regardless of their ethical implications. This decision-making method in bioethics can significantly impede moral progress, weakening the foundation for sustainable healthcare systems. Instead of allowing popular opinion to form the basis of our morality, the scientific method can provide a framework for making strong ethical decisions. The consequences of weak morality are profound, resulting in poorly sustainable systems lacking human empathy and economic viability. Treating ethical issues like scientific problems can foster a more rigorous, evidence-based discussion, leading to better medical care globally.

The Cost of Living Index as a Primary Driver of Homelessness in the United States: A Cross-State Analysis (Cureus, 10/13/2023) Heston, Thomas F
Background: Homelessness persists as a critical global issue despite myriad interventions. This study analyzed state-level differences in homelessness rates across the United States to identify influential societal factors to help guide resource prioritization. Methods: Homelessness rates for 50 states and Washington, DC, were compared using the most recent data from 2020 to 2023. Twenty-five variables representing potential socioeconomic and health contributors were examined. The correlation between these variables and the homelessness rate was calculated. Decision trees and regression models were also utilized to identify the most significant factors contributing to homelessness. Results: Homelessness rates were strongly correlated with the cost of living index (COLI), housing costs, transportation costs, grocery costs, and the cigarette excise tax rate (all P < 0.001).
An inverse relationship was observed between opioid prescription rates and homelessness, with increased opioid prescribing associated with decreased homelessness (P < 0.001). Due to collinearity, the combined cost of living index was used for modeling instead of its individual components. Decision tree and regression models identified the cost of living index as the strongest contributor to homelessness, with unemployment, taxes, binge drinking rates, and opioid prescription rates emerging as important factors. Conclusion: This state-level analysis revealed the cost of living index as the primary driver of homelessness rates. Unemployment, poverty, and binge drinking were also contributing factors. An unexpected negative correlation was found between opioid prescription rates and homelessness. These findings can help guide resource allocation to address homelessness through targeted interventions.

Statistical Significance Versus Clinical Relevance: A Head-to-Head Comparison of the Fragility Index and Relative Risk Index (Cureus, 10/26/2023) Heston, Thomas F
BACKGROUND: In biostatistics, assessing the fragility of research findings is crucial for understanding their clinical significance. This study focuses on the fragility index, unit fragility index, and relative risk index as measures to evaluate statistical fragility. The fragility indices assess the susceptibility of p-values to change significance with minor alterations in outcomes within a 2x2 contingency table. In contrast, the relative risk index quantifies the deviation of observed findings from therapeutic equivalence, the point at which the relative risk equals 1. While the fragility indices have intuitive appeal and have been widely applied, their behavior across a wide range of contingency tables has not been rigorously evaluated. METHODS: Using a Python software program, a simulation approach was employed to generate random 2x2 contingency tables.
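As an illustration of the kind of calculation involved, the following is a minimal sketch of the conventional fragility index for a 2x2 table. It is not the study's actual program. It assumes the common convention of converting non-events to events, one at a time, in the arm with fewer events until a two-sided Fisher's exact test is no longer significant at 0.05. All function names are hypothetical, and the unit fragility index and relative risk index are not reproduced because their formulas are not given here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def fragility_index(e1, n1, e2, n2, alpha=0.05):
    """Events e1 of n1 in one arm and e2 of n2 in the other.

    Returns 0 when the table is not significant to begin with.
    """
    if e1 > e2:  # always modify the arm with fewer events
        e1, n1, e2, n2 = e2, n2, e1, n1
    flips = 0
    while e1 < n1 and fisher_exact_two_sided(e1, n1 - e1, e2, n2 - e2) < alpha:
        e1 += 1  # convert one non-event to an event
        flips += 1
    return flips


def fragility_quotient(e1, n1, e2, n2, alpha=0.05):
    """Fragility index divided by the total sample size."""
    return fragility_index(e1, n1, e2, n2, alpha) / (n1 + n2)


if __name__ == "__main__":
    # Illustrative counts only, not data from the study.
    print(fragility_index(5, 100, 20, 100))
    print(fragility_quotient(5, 100, 20, 100))
```

Fisher's exact test is written out with `math.comb` so the sketch needs only the standard library; a statistics package could be substituted.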
All tables under consideration exhibited p-values below 0.05 according to Fisher's exact test. Subsequently, the fragility indices and the relative risk index were calculated. To account for sample size variations, the indices were divided by the sample size to give fragility and risk quotients. A correlation matrix assessed the collinearity between each metric and the p-value. RESULTS: The analysis included 2,000 contingency tables with cell counts ranging from 20 to 480. Notably, the formulas for calculating the fragility indices encountered limitations when cell counts approached zero or when duplicate cell counts hindered standardized application. The correlation coefficients with p-values were as follows: unit fragility index (-0.806), fragility index (-0.802), fragility quotient (-0.715), unit fragility quotient (-0.695), relative risk index (-0.403), and risk quotient (-0.261). CONCLUSION: The fragility indices and fragility quotients demonstrated a strong correlation with p-values below 0.05, while the relative risk index and relative risk quotient exhibited a weak association with p-values below this threshold. This implies that the fragility indices offer limited additional information beyond the p-value alone. In contrast, the relative risk index and risk quotient exhibit independence from the p-value, indicating that they may provide important additional information about statistical fragility by evaluating the divergence of observed results from therapeutic equivalence, irrespective of the p-value-based statistical significance.

Safety of Large Language Models in Addressing Depression (Cureus, 12/18/2023) Heston, Thomas F
Background: Generative artificial intelligence (AI) models, exemplified by systems such as ChatGPT, Bard, and Anthropic, are currently under intense investigation for their potential to address existing gaps in mental health support.
One implementation of these large language models involves the development of mental health-focused conversational agents, which utilize pre-structured prompts to facilitate user interaction without requiring specialized knowledge in prompt engineering. However, uncertainties persist regarding the safety and efficacy of these agents in recognizing severe depression and suicidal tendencies. Given the well-established correlation between the severity of depression and the risk of suicide, improperly calibrated conversational agents may inadequately identify and respond to crises. Consequently, it is crucial to investigate whether publicly accessible repositories of mental health-focused conversational agents can consistently and safely address crisis scenarios before considering their adoption in clinical settings. This study assesses the safety of publicly available ChatGPT-3.5 conversational agents by evaluating their responses to a patient simulation indicating worsening depression and suicidality. Methodology: This study evaluated ChatGPT-3.5 conversational agents on a publicly available repository specifically designed for mental health counseling. Each conversational agent was evaluated twice by a highly structured patient simulation. First, the simulation indicated escalating suicide risk based on the Patient Health Questionnaire (PHQ-9). For the second patient simulation, the escalating risk was presented in a more generalized manner not associated with an existing risk scale to assess the more generalized ability of the conversational agent to recognize suicidality. Each simulation recorded the exact point at which the conversational agent recommended human support. Then, the simulation continued until the conversational agent shut down completely, insisting on human intervention. Results: All 25 agents available on the public repository FlowGPT.com were evaluated.
The point at which the conversational agents referred to a human occurred around the mid-point of the simulation, and definitive shutdown predominantly happened only at the highest risk levels. For the PHQ-9 simulation, the average initial referral and shutdown aligned with PHQ-9 scores of 12 (moderate depression) and 25 (severe depression). Few agents included crisis resources; only two referenced suicide hotlines. Despite the conversational agents insisting on human intervention, 22 out of 25 agents would eventually resume the dialogue if the simulation reverted to a lower risk level. Conclusions: Current generative AI-based conversational agents are slow to escalate mental health risk scenarios, postponing referral to a human to potentially dangerous levels. More rigorous testing and oversight of conversational agents are needed before deployment in mental healthcare settings. Additionally, further investigation should explore if sustained engagement worsens outcomes and whether enhanced accessibility outweighs the risks of improper escalation. Advancing AI safety in mental health remains imperative as these technologies continue to advance rapidly.

Concordance of chest x-ray with chest CT by body mass index (PeerJ, 3/16/2023) Heston, Thomas F; Jiang, John Y
Introduction: Patients with suspected thoracic pathology frequently get imaging with conventional radiography or chest x-rays (CXR) and computed tomography (CT). CXR include one or two planar views, compared to the three-dimensional images generated by chest CT. CXR imaging has the advantage of lower costs and lower radiation exposure at the expense of lower diagnostic accuracy, especially in patients with large body habitus. Objectives: To determine whether CXR imaging could achieve acceptable diagnostic accuracy in patients with a low body mass index (BMI).
Methods: This retrospective study evaluated 50 patients aged 63 ± 12 years, 92% male, BMI 31.7 ± 7.9, presenting with acute, nontraumatic cardiopulmonary complaints who underwent CXR followed by CT within 1 day. Diagnostic accuracy was determined by comparing scan interpretation with the final clinical diagnosis of the referring clinician. Results: CT results were significantly correlated with CXR results (r = 0.284, p = 0.046). Correcting for BMI did not improve this correlation (r = 0.285, p = 0.047). Correcting for BMI and age also did not improve the correlation (r = 0.283, p = 0.052), nor did correcting for BMI, age, and sex (r = 0.270, p = 0.067). Correcting for height alone slightly improved the correlation (r = 0.290, p = 0.043), as did correcting for weight alone (r = 0.288, p = 0.045). CT accuracy was 92% (SE = 0.039) vs. 60% for CXR (SE = 0.070, p < 0.01). Conclusion: Accounting for patient body habitus as determined by either BMI, height, or weight did not improve the correlation between CXR accuracy and chest CT accuracy. CXR is significantly less accurate than CT even in patients with a low BMI.

Foundations of Scholarly Writing (B P International, 1/19/2024) Heston, Thomas F
Academic research, transcending disciplinary boundaries, mandates adherence to established core principles, ensuring ethical rigor and substantial contributions to knowledge. Comprehensive literature reviews are pivotal in unearthing research gaps, laying the groundwork for investigations that yield novel insights. The alignment of research questions with robust methodologies, whether qualitative, quantitative, or hybrid, fortifies the validity of outcomes. Meticulous writing and strategic publication approaches are instrumental in effective dissemination and impact generation. A commitment to transparency, ethical integrity, and the acknowledgment of inherent limitations is fundamental to maintaining credibility.
In an era where research is continually evolving, particularly with the advent of new technologies, agility in ethical reasoning is essential to navigate emerging complexities and ethical quandaries. Efforts to minimize potential harms, honor diverse perspectives, and uphold moral principles remain non-negotiable ethical imperatives. Institutional training programs, established protocols, codes of conduct, and vigilant oversight embed a culture of integrity within academic spheres. Recommendations for future research writing include the development of interdisciplinary teams, ongoing training in critical thinking, and ensuring that ethical standards keep pace with the rapid evolution of technology.

We Thank You Lord (IJCR, 8/30/2023) Heston, Thomas F
This poem, essay, and song are about cultivating gratitude for our gifts, overcoming obstacles to serving others, and how a dying patient's courage reminds us that we can provide comfort and strength even in dire circumstances.

To Mask or Not Mask Correctly - An Empirical Look at Public Masking Behavior (Athenaeum, 2023-08-19) Heston, Thomas F
Introduction: Mask usage was mandated by public health authorities globally to decrease the spread of COVID-19. These recommendations were based on data showing that N95 masks and possibly surgical masks, when worn tight against the face, help slow the transmission of the SARS-CoV-2 virus. However, cloth and loose-fitting surgical masks are greatly inferior. Methods: Mask use among 100 randomly observed people in public indoor facilities was recorded and statistically analyzed. Results: Out of 100 people wearing a mask, 37 wore a cloth mask. Another 36 people wore a loosely applied surgical mask. Only 27 people wore a surgical mask that covered the nose and mouth and was applied firmly against the face at its margins. There were no people seen wearing an N95 mask.
Overall, people were about 70% more likely to wear a surgical mask than a cloth mask (63 vs 37, p < 0.05). Of those wearing a surgical mask, more people wore it loosely than properly (36 to 27, p = 0.17). Overall, people were more likely to wear a cloth mask or improperly applied surgical mask than a properly fitted one (73 vs 27, p < 0.001). Conclusion: In public settings, using cloth or loose-fitting surgical masks was almost 3 times more common than adequately using a tight-fitting surgical mask. Out of the 100 people observed, none wore an N95 respirator mask.

The Robustness Index: Going Beyond Statistical Significance by Quantifying Fragility (Cureus, 2023-08-30) Heston, Thomas F
Statistical significance is widely used to evaluate research findings but has limitations around reproducibility. Measures of statistical fragility aim to quantify robustness against violations of assumptions. However, dependence on sample size and single unit changes restricts indices like the unit fragility index and the fragility quotient. The Robustness Index (RI) is proposed to overcome these limitations and quantify fragility independently of the research study's sample size. The RI measures how altering sample size affects significance. For insignificant findings, the sample size is multiplied until significance is reached; the multiplier is the RI. For significant findings, the sample size is divided until insignificance is reached; the divisor is the RI. Thus, higher RIs indicate greater robustness of both insignificant and significant research findings. The RI provides a simple, interpretable metric of fragility.
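The multiply-or-divide procedure described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's own implementation: it assumes a 2x2 table tested with Pearson's chi-square (no continuity correction), scales all four cells by the same continuous factor, and searches in steps of 0.01. Under those assumptions the chi-square statistic scales linearly with the factor, so the table itself never has to be rebuilt. The function names are hypothetical.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction).

    Assumes every row and column total is nonzero.
    """
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(chi2 / 2.0))


def robustness_index(a, b, c, d, alpha=0.05, step=0.01):
    """Factor by which the sample size must change to flip significance.

    Significant tables are shrunk (the divisor is returned);
    insignificant tables are enlarged (the multiplier is returned).
    """
    base = chi2_2x2(a, b, c, d)
    significant = p_value_1df(base) < alpha
    if base == 0.0:
        return float("inf")  # identical proportions never become significant
    k = 1.0
    while True:
        k += step
        # Scaling every cell by k scales the statistic by k.
        scaled = base / k if significant else base * k
        if (p_value_1df(scaled) < alpha) != significant:
            return round(k, 2)


if __name__ == "__main__":
    # Illustrative counts only, not data from the article.
    print(robustness_index(30, 70, 15, 85))  # significant table: divisor
    print(robustness_index(12, 88, 10, 90))  # insignificant table: multiplier
```

A different significance test, or whole-number rescaling of the counts, would give somewhat different values; the sketch is meant only to make the definition concrete.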
It facilitates comparisons across studies and can potentially increase trust in biomedical research.

Development of a taxonomy to describe massage treatments for musculoskeletal pain (2006) Sherman, Karen J.; Dixon, Marian W.; Thompson, Diana; Cherkin, Daniel C.
Background: One of the challenges in conducting research in the field of massage and bodywork is the lack of consistent terminology for describing the treatments given by massage therapists. The objective of this study was to develop a taxonomy to describe what massage therapists actually do when giving a massage to patients with musculoskeletal pain. Methods: After conducting a review of the massage treatment literature for musculoskeletal pain, a list of candidate techniques was generated for possible inclusion in the taxonomy. This list was modified after discussions with a senior massage therapist educator and seven experienced massage therapists participating in a study of massage for neck pain. Results: The taxonomy was conceptualized as a three-level classification system: principal goals of treatment, styles, and techniques. Four categories described the principal goal of treatment (i.e., relaxation massage, clinical massage, movement re-education, and energy work). Each principal goal of treatment could be met using a number of different styles, with each style consisting of a number of specific techniques. A total of 36 distinct techniques were identified and described, many of which could be included in multiple styles. Conclusion: A new classification system is presented whereby practitioners using different styles of massage can describe the techniques they employ using consistent terminology.
This system could help facilitate standardized reporting of massage interventions.

A survey of training and practice patterns of massage therapists in two US states (2005) Sherman, Karen J.; Cherkin, Daniel C.; Kahn, Janet; Erro, Janet; Hrbek, Andrea; Deyo, Richard A.; Eisenberg, David M.
Background: Despite the growing popularity of therapeutic massage in the US, little is known about the training or practice characteristics of massage therapists. The objective of this study was to describe these characteristics. Methods: As part of a study of random samples of complementary and alternative medicine (CAM) practitioners, we interviewed 226 massage therapists licensed in Connecticut and Washington state by telephone in 1998 and 1999 (85% of those contacted) and then asked a sample of them to record information on 20 consecutive visits to their practices (total of 2005 consecutive visits). Results: Most massage therapists were women (85%), white (95%), and had completed some continuing education training (79% in Connecticut and 52% in Washington). They treated a limited number of conditions, most commonly musculoskeletal (59% and 63%) (especially back, neck, and shoulder problems), wellness care (20% and 19%), and psychological complaints (9% and 6%) (especially anxiety and depression). Practitioners commonly used one or more assessment techniques (67% and 74%) and gave a massage emphasizing Swedish (81% and 77%), deep tissue (63% and 65%), and trigger/pressure point techniques (52% and 46%). Self-care recommendations, including increasing water intake, body awareness, and specific forms of movement, were made as part of more than 80% of visits. Although most patients self-referred to massage, more than one-quarter were receiving concomitant care for the same problem from a physician. Massage therapists rarely communicated with these physicians.
Conclusion: This study provides new information about licensed massage therapists that should be useful to physicians and other healthcare providers interested in learning about massage therapy in order to advise their patients about this popular CAM therapy.

Practice patterns of naturopathic physicians: results from a random survey of licensed practitioners in two US States (2004) Boon, Heather S.; Cherkin, Daniel C.; Erro, Janet; Sherman, Karen J.; Milliman, Bruce; Booker, Jennifer; Cramer, Elaine H.; Smith, Michael J.; Deyo, Richard A.; Eisenberg, David M.
Background: Despite the growing use of complementary and alternative medicine (CAM) by consumers in the U.S., little is known about the practice of CAM providers. The objective of this study was to describe and compare the practice patterns of naturopathic physicians in Washington State and Connecticut. Methods: Telephone interviews were conducted with state-wide random samples of licensed naturopathic physicians and data were collected on consecutive patient visits in 1998 and 1999. The main outcome measures were sociodemographic, training, and practice characteristics of naturopathic physicians; and demographics, reasons for visit, types of treatments, payment source, and visit duration for patients. Results: One hundred and seventy practitioners were interviewed and 99 recorded data on a total of 1817 patient visits. Naturopathic physicians in Washington and Connecticut had similar demographic and practice characteristics. Both the practitioners and their patients were primarily White and female. Almost 75% of all naturopathic visits were for chronic complaints, most frequently fatigue, headache, and back symptoms. Complete blood counts, serum chemistries, lipid panels, and stool analyses were ordered for 4% to 10% of visits. All other diagnostic tests were ordered less frequently.
The most commonly prescribed naturopathic therapeutics were: botanical medicines (51% of visits in Connecticut, 43% in Washington), vitamins (41% and 43%), minerals (35% and 39%), homeopathy (29% and 19%) and allergy treatments (11% and 13%). The mean visit length was about 40 minutes. Approximately half the visits were paid directly by the patient. Conclusion: This study provides information that will help other health care providers, patients and policy makers better understand the nature of naturopathic care.

Complementary and alternative medical therapies for chronic low back pain: What treatments are patients willing to try? (2004) Sherman, Karen J.; Cherkin, Daniel C.; Connelly, Maureen T.; Erro, Janet; Savetsky, Jacqueline B.; Davis, Roger B.; Eisenberg, David M.
Background: Although back pain is the most common reason patients use complementary and alternative medical (CAM) therapies, little is known about the willingness of primary care back pain patients to try these therapies. As part of an effort to refine recruitment strategies for clinical trials, we sought to determine if back pain patients are willing to try acupuncture, chiropractic, massage, meditation, and t'ai chi and to learn about their knowledge of, experience with, and perceptions about each of these therapies. Methods: We identified English-speaking patients with diagnoses consistent with chronic low back pain using automated visit data from one health care organization in Boston and another in Seattle. We were able to confirm the eligibility status (i.e., current low back pain that had lasted at least 3 months) of 70% of the patients with such diagnoses and all eligible respondents were interviewed. Results: Except for chiropractic, knowledge about these therapies was low. Chiropractic and massage had been used by the largest fractions of respondents (54% and 38%, respectively), mostly for back pain (45% and 24%, respectively).
Among prior users of specific CAM therapies for back pain, massage was rated most helpful. Users of chiropractic reported treatment-related "significant discomfort, pain or harm" more often (23%) than users of other therapies (5-16%). Respondents expected massage would be most helpful (median of 7 on a 0 to 10 scale) and meditation least helpful (median of 3) in relieving their current pain. Most respondents indicated they would be "very likely" to try acupuncture, massage, or chiropractic for their back pain if they did not have to pay out of pocket and their physician thought it was a reasonable treatment option. Conclusions: Most patients with chronic back pain in our sample were interested in trying therapeutic options that lie outside the conventional medical spectrum. This highlights the need for additional studies evaluating their effectiveness and suggests that researchers conducting clinical trials of these therapies may not have difficulties recruiting patients.
