HonestBone Osteoporosis Intelligence Engine

Evidence-based insights from FDA approval packages, product labels, clinical trial data, peer-reviewed publications, and OpenFDA entries.

The Mission

HonestBone was created to solve a core inefficiency in drug development: essential clinical and regulatory precedent and data exist but are fragmented across FDA reviews, product labels, clinical trial publications, and research papers, making them slow to access, difficult to operationalize, and hard to translate into actionable strategy. HonestBone is a high-fidelity evidence platform focused on bone and mineral metabolism that transforms static regulatory archives into a structured, AI-driven intelligence layer. It enables clinical development, regulatory, and pharmacovigilance (PV) teams to instantly retrieve cited evidence and apply it to trial design, regulatory strategy, and safety evaluation, reducing time to insight while improving decision confidence.

The Founder/Creator

Anand Narayanan, MD is an endocrinologist and clinical development specialist focused on how technology and AI can improve the interpretation of clinical and regulatory evidence. His research, published in JAMA Network Open and JACC, has explored the use of natural language processing (NLP) to detect hospitalizations for worsening heart failure, and he has written in Medscape on insights from wearable technology in thyroid disease. He founded HonestBone, an AI-driven platform that extracts, structures, and interprets clinical, regulatory, and safety data to support evidence-driven decisions across cross-functional teams. He currently serves as Senior Associate Medical Director of Clinical Development at a biopharmaceutical company focused on musculoskeletal diseases.

The Technology

HonestBone utilizes an advanced Hybrid RAG (Retrieval-Augmented Generation) architecture that combines a curated library of trusted documents, including FDA reviews and summaries, product labels, Phase 3 protocols when available, randomized controlled trial publications, and select research papers, with modern large language models.

For each query, the system retrieves relevant passages from the Evidence Library and can also access structured regulatory and safety data through integrations such as the OpenFDA and PubMed APIs, using secure system connectors based on the Model Context Protocol (MCP).

The system then generates responses linked back to the underlying sources, ensuring insights are based on verifiable clinical and regulatory information rather than model knowledge alone. Each citation includes a relevance score showing how closely that document passage matches your question using semantic search, with a higher score indicating a stronger textual match and retrieval relevance, not clinical importance or correctness.
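
As a rough illustration of what a relevance score of this kind measures, the sketch below ranks two passages against a query by cosine similarity. The choice of cosine similarity, the three-dimensional vectors, and the file names are all assumptions made for the example; the text above does not specify which similarity metric or embedding model HonestBone uses.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; real embedding models use far more dimensions.
query = [0.9, 0.1, 0.0]
passages = {
    "label_section_5.pdf p.12": [0.8, 0.2, 0.1],
    "phase3_protocol.pdf p.40": [0.1, 0.9, 0.3],
}

# Score every passage against the query, then sort from best to worst match.
scores = {src: cosine_similarity(query, vec) for src, vec in passages.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

A score computed this way only says that the passage vector points in a similar direction to the query vector. It carries no information about whether the passage is clinically important or correct, which is the caveat stated above.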

Pipeline Overview

Source Documents → Parser → Structured Page Data (text, layout, metadata) → Semantic Chunking → Chunk Objects → Embeddings Model → Vectorized Knowledge Database

Question → Query Embedding (vectorized) → Similarity Search → Nearest Vectors Retrieved → Chunks and Metadata → Structured Context Package → LLM invoked → Synthesizes Answer
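
The two flows above can be sketched end to end in a few lines. This is a toy under stated assumptions, not the production pipeline: `embed` is a word-count stand-in for a real embeddings model, the index is a plain Python list rather than a vector database, and the `Chunk` fields, function names, file names, and passage texts are hypothetical placeholders.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    source: str  # file name of the ingested document (hypothetical)
    page: int    # page the passage came from


def embed(text: str) -> dict[str, float]:
    """Toy stand-in for an embeddings model: a unit-length word-count vector."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()} if norm else {}


def build_index(chunks: list[Chunk]) -> list[tuple[Chunk, dict[str, float]]]:
    """Ingestion side: embed every chunk once and keep the vector beside it."""
    return [(chunk, embed(chunk.text)) for chunk in chunks]


def search(index, question: str, k: int = 2) -> list[tuple[float, Chunk]]:
    """Query side: embed the question and return the k nearest chunks."""
    q = embed(question)
    # Vectors are unit length, so the dot product is the cosine similarity.
    scored = [
        (sum(w * vec.get(term, 0.0) for term, w in q.items()), chunk)
        for chunk, vec in index
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_context(hits: list[tuple[float, Chunk]]) -> str:
    """Structured context package: each passage tagged with source, page, score."""
    return "\n".join(
        f"[{c.source} p.{c.page} score={s:.2f}] {c.text}" for s, c in hits
    )


# Placeholder passages, not quotations from any real document.
chunks = [
    Chunk("Fracture incidence was the primary endpoint of the trial.", "trial_publication.pdf", 3),
    Chunk("The label warns about hypocalcemia.", "product_label.pdf", 7),
    Chunk("Bone mineral density was a secondary endpoint.", "fda_review.pdf", 21),
]
index = build_index(chunks)
hits = search(index, "What was the primary endpoint?", k=2)
context = build_context(hits)  # this string would be handed to the LLM
```

Normalizing vectors at index time is a design choice in this sketch: it moves work to ingestion so that each query needs only dot products. Because the context package carries the source file and page for every passage, the model's answer can cite them.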

View Technical Architecture

RAG (Retrieval-Augmented Generation): RAG is a framework that gives an AI a "searchable library" to reference before it answers. Rather than relying only on its training data, the AI retrieves relevant information from our Evidence Library, which has been processed into indexed sections within a vector database, and then generates a response based on those facts.
Docling with OCR (Optical Character Recognition): We use the Docling engine to "ingest" (read and process) complex medical files. It includes OCR, a technology that converts images of text (like scanned PDFs) into machine-readable data. This ensures that even "flat" historical documents are fully searchable and usable.
"Salami Slicing" & Structural Integrity: Traditional AI systems often struggle with complex tables. We use a method called salami slicing to break documents into ultra-specific, context-aware chunks. This preserves the spatial relationships of complex tables and nested hierarchies in documents that traditional AI parsers often scramble.
MCP (Model Context Protocol): MCP is the "universal connector" of our architecture. It allows HonestBone to pull data from multiple sources at once, such as the ingested files viewable in the Evidence Library, OpenFDA, and PubMed, and to present it through a single interface.
Dual-Model Routing for Precision: We use task-based model routing to balance performance, scalability, analytical depth, and computational resources. Routine queries are handled by an LLM optimized for document interpretation, synthesis, and evidence-grounded responses, while more complex queries automatically invoke a higher-context analysis model.
Source-First Integrity: Every answer aims to be anchored to a specific source file and page number that is easily accessible. This "Chain of Verification" provides complete regulatory transparency, making every insight audit-ready for clinical development and regulatory science.
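
One way to picture context-aware chunking of a table, as described in the "Salami Slicing" entry above, is to emit one chunk per row and repeat the caption and column headers in each, so that a retrieved row still reads correctly on its own. The sketch below is an assumption about how such slicing could work, not a description of HonestBone's actual method; `TableChunk`, `slice_table`, the file name, and the table values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TableChunk:
    text: str
    source: str
    page: int
    row_index: int


def slice_table(
    caption: str,
    headers: list[str],
    rows: list[list[str]],
    source: str,
    page: int,
) -> list[TableChunk]:
    """Emit one chunk per row, each carrying the caption and column headers."""
    chunks = []
    for i, row in enumerate(rows):
        if len(row) != len(headers):
            raise ValueError(f"row {i} has {len(row)} cells, expected {len(headers)}")
        # Pair every cell with its header so the value never loses its column.
        cells = "; ".join(f"{h}: {v}" for h, v in zip(headers, row))
        chunks.append(TableChunk(f"{caption} | {cells}", source, page, i))
    return chunks


# Placeholder table; the numbers are invented for illustration only.
table_chunks = slice_table(
    caption="Table 2 (hypothetical)",
    headers=["Arm", "N", "Events"],
    rows=[["Treatment", "100", "4"], ["Placebo", "100", "9"]],
    source="example_review.pdf",
    page=14,
)
```

Each chunk keeps its source file, page, and row index, which is what lets an answer point back to a specific location in the way the Source-First Integrity entry describes.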

Real Talk

Commentary on how medicine and regulatory science learn and evolve, and when they don't. Views published here are my own and do not reflect those of any current or prior employer.

Anand Narayanan, MD · May 2026

The Trial That Should Have Ended a Theory, and Didn’t

On ACCORD, eighteen years later, and what it tells us about how medicine learns

A 72-year-old woman with coronary disease, an eGFR of 52, and a hemoglobin A1c of 7.6 sits in an endocrinologist’s office. She is on metformin, a GLP-1 agonist, and basal insulin. She feels fine. Her last hypoglycemic episode was three weeks ago, in the parking lot of a Trader Joe’s, and her daughter had to give her juice.

Anand Narayanan, MD · May 2026

The Drug That Built the Regulatory Framework That Reversed Its Own Verdict

On rosiglitazone, the meta-analysis that changed everything, and the institutional appetite for forgetting

In 2006, an American clinician treating type 2 diabetes had a reasonable set of choices for second-line therapy after metformin. A sulfonylurea, with its half-century of evidence and its hypoglycemia risk. Insulin, with its weight gain and injection burden. Or one of the new thiazolidinediones, which sensitized tissues to insulin in a way that felt mechanistically elegant.

Anand Narayanan, MD · May 2026

The Endpoint That Wasn’t, the Question That Wasn’t Asked, and the Compounded Shadow Market That Filled the Gap

On STEP, SURMOUNT, and the strange history of how we approved the most successful drugs in the history of obesity

A 47-year-old marketing executive in Brentwood loses 38 pounds in fourteen months on semaglutide. Her blood pressure normalizes. Her HbA1c drops from 5.9 to 5.4. She is, by every measure her physician records, a clinical success.

Anand Narayanan, MD · May 2026

The Trial That Came Twenty-Five Years Late, and the Industry That Didn’t Wait

On TRAVERSE, the invention of Low T, and what it means when the trial you demanded finally arrives

A 54-year-old man with a BMI of 31, fatigue, low libido, and a morning testosterone of 280 ng/dL walks into a “men’s health” clinic in a strip mall outside Tampa. He has seen the ads. He knows about Low T. He has come, he says, because he doesn’t feel like himself anymore.

The Trial That Should Have Ended a Theory, and Didn’t

On ACCORD, eighteen years later, and what it tells us about how medicine learns

Anand Narayanan, MD · May 2026

The endocrinologist looks at the A1c and frowns. Seven-point-six. The guideline says less than seven. She adjusts the insulin up.

This scene plays out tens of thousands of times a day in the United States. It is treated as good medicine. It is, in fact, the precise clinical strategy that a 10,251-patient randomized trial showed kills people.

The trial was called ACCORD. It was published in the New England Journal of Medicine in June 2008. It stopped early. It stopped because the patients getting the aggressive treatment were dying.

We have known this for eighteen years.

We have not updated.

· · ·

Let me give you the trial in numbers, because the numbers matter and they are not difficult.

ACCORD enrolled patients with type 2 diabetes at high cardiovascular risk. Mean age 62. Mean diabetes duration 10 years. Mean baseline A1c 8.1%. Half the patients had established cardiovascular disease.

Half were assigned to intensive glycemic control, targeting an A1c below 6.0%. The other half to standard control, targeting 7.0 to 7.9%. Everything else was identical between arms: blood pressure management, lipid management, antiplatelet therapy. The only thing being tested was the A1c target.

The intensive arm achieved a median A1c of 6.4%. The standard arm achieved 7.5%. By any pharmacologic standard, the intervention worked. The biomarker moved.

After 3.5 years, the Data and Safety Monitoring Board stopped the glycemia arm. The reason was simple. In the intensive arm: 257 deaths. In the standard arm: 203 deaths. A hazard ratio of 1.22, with a confidence interval that excluded 1.0. Cardiovascular death was worse: hazard ratio 1.35.

The primary composite endpoint (nonfatal MI, nonfatal stroke, or CV death) was numerically lower in the intensive arm but not statistically significant. Intensive control reduced nonfatal heart attacks. It increased fatal ones. On the composite, it was a wash. On mortality, it was harm.

This was not a small trial. This was not a fluke. ADVANCE and VADT, the two other intensive-control trials of the era, did not replicate the mortality signal but also did not show benefit. The collective signal across three large trials of intensive A1c lowering in established type 2 diabetes was: no macrovascular benefit, possible harm, definite hypoglycemia, definite weight gain, definite cost, definite patient burden.

The theory that lowering A1c reduces cardiovascular events in established type 2 diabetes was the operating assumption of the field for thirty years. The theory was tested. The theory failed.

And yet.

And yet the FDA still approves diabetes drugs on the basis of A1c reduction. Open the label of any drug approved since 2008. The efficacy claims are A1c. Not mortality. Not MACE. A1c.

There is a regulatory architecture here that survived ACCORD without flinching. The 2008 FDA Guidance for Industry on diabetes drugs, issued in the immediate aftermath of ACCORD and the rosiglitazone meta-analysis, added a requirement for cardiovascular safety trials. It did not change the efficacy bar. A new diabetes drug, in 2026, still gets approved because it lowers a number that we have known for eighteen years does not reliably predict the outcomes patients care about.

I want to be precise about what I am claiming. A1c is a useful biomarker. It predicts microvascular complications quite well: retinopathy, nephropathy, neuropathy. The DCCT and UKPDS established this. I am not saying A1c is meaningless.

I am saying that the field’s response to ACCORD should have been to reconsider whether A1c was an adequate surrogate for the macrovascular outcomes that drive mortality and morbidity in older patients with established type 2 diabetes. Instead, the field’s response was to keep approving drugs on A1c and add a parallel requirement for CV safety trials.

This is what surrogate endpoint capture looks like in the wild. The biomarker is institutionalized. The trial that should have unseated it is reframed as a hypoglycemia story, a rosiglitazone story, an “individualized targets” story: anything except what it is, which is a story about a biomarker that doesn’t deliver on its promise.

· · ·

The hypoglycemia explanation deserves its own discussion because it is repeated everywhere, even in my fellowship training, and it does not survive contact with the data.

The standard reading of ACCORD is: intensive control caused more severe hypoglycemia, hypoglycemia caused arrhythmias, arrhythmias caused death. It is a clean mechanistic story. It allows the field to say “ACCORD was about hypoglycemia, not about A1c targets per se.” It is also, on the available evidence, probably wrong.

In 2010, the ACCORD investigators published a post-hoc analysis in BMJ. They looked at the relationship between severe hypoglycemia and mortality within each arm. Among patients who experienced severe hypoglycemia, mortality was actually higher in the standard arm than the intensive arm. The straightforward interpretation: severe hypoglycemia marks frailty and high risk. It does not, in any clean way, cause the excess mortality in the intensive arm.

This is the methodological move that should be taught in every fellowship: a post-hoc subgroup analysis that contradicts the convenient mechanistic story. The convenient story is repeated anyway, because it lets everyone keep doing what they were doing. The intensivists can say, “We just need to avoid hypoglycemia.” The guidelines can say, “Individualize, but the target remains seven.”

What actually killed people in ACCORD? We don’t know. Hypoglycemia probably contributes. Rapid glucose lowering may matter. Weight gain (the intensive arm gained 3.5 kg) may matter. Rosiglitazone, used heavily in the intensive arm and independently associated with cardiovascular harm in a contemporaneous meta-analysis, may matter. Polypharmacy and drug interactions in patients on four or five glucose-lowering agents may matter.

We do not know. Eighteen years on. The field has not run the trial that would tell us.

Absence of evidence is not evidence of absence.

· · ·

Here is the part where I name names, because that is what honest commentary requires.

The American Diabetes Association’s Standards of Medical Care in Diabetes is the most influential single document in American diabetes practice. It is updated annually. It has, since 2009, used some version of the phrase “individualize glycemic targets.”

This sounds reasonable. It is reasonable, on its face. Of course targets should be individualized. Of course a 45-year-old with new-onset diabetes is different from a 78-year-old with three prior MIs.

But “individualize” in practice has meant: keep the seven percent target as the default, and carve out exceptions for the frail. The default did not move. The trial that should have moved the default did not move it.

Compare this to oncology, where a single negative phase 3 trial routinely changes the standard of care within twelve months. Compare it to cardiology, where the COURAGE trial of stenting in stable CAD changed practice: slowly, incompletely, but it changed it. The diabetes field absorbed ACCORD and kept going.

Why? Yes, I will indeed speculate. The ADA Standards of Care writing committee includes members with extensive industry funding, disclosed on the Diabetes Journals website. The diabetes drug industry depends on A1c as the efficacy endpoint. A guideline that softened A1c targets would soften the commercial case for new agents. A guideline that aggressively reframed A1c as a microvascular surrogate without macrovascular validity would be commercially catastrophic for an industry that markets primarily to primary care physicians who have been trained for thirty years to treat the number.

I am not alleging conscious bias. I am alleging structural incentive. The structure produces the result whether or not anyone in the room intends it.

· · ·

There is an irony in the ACCORD story that needs to be told, because it is the one redeeming feature of the regulatory response.

The 2008 FDA guidance required cardiovascular outcome trials for new diabetes drugs. This was a burden. The industry hated it. Trials cost hundreds of millions of dollars. Many feared the requirement would kill diabetes drug development.

It did not kill it. It revealed it.

The CV outcome trials run between 2008 and 2020 included EMPA-REG OUTCOME, LEADER, SUSTAIN-6, CANVAS, DECLARE, REWIND, and others. They demonstrated that SGLT2 inhibitors and GLP-1 receptor agonists, drugs initially approved on A1c, actually reduced cardiovascular events, mortality, and in some cases renal endpoints. The CV benefits of these drugs were not predicted by their A1c effects. They were discovered because regulators forced trials that the industry would never have run on its own.

This is the deep lesson of ACCORD that almost no one tells. The trial’s harm signal forced a regulatory requirement that produced the evidence base for the current treatment paradigm. SGLT2 inhibitors are now first-line for type 2 diabetes with CV or renal disease: not because they lower A1c, but because they reduce MACE and heart failure hospitalizations and slow eGFR decline. We learned this because ACCORD scared the FDA into requiring outcome trials.

And then, in 2020, the FDA softened the CVOT requirement. The justification was that the SGLT2 and GLP-1 trials had established class effects and further trials were unnecessary. Maybe. Or maybe regulatory memory is short and industry pressure is constant, and the requirement that produced the most important advances in diabetes care of the last twenty years was sunset because it was inconvenient.

The next class of diabetes drugs will not have CVOTs. We will not know if they help or harm cardiovascularly. We will approve them on A1c. We will repeat the cycle.

· · ·

What should the field have done?

It should have stopped approving diabetes drugs on A1c alone in established high-risk patients. It should have required mortality and MACE endpoints. It should have rewritten the ADA Standards of Care to lead with cardiovascular and renal outcome data, with A1c as a secondary efficacy measure relevant primarily to microvascular complications. It should have stopped treating “treat to target” as a clinical philosophy and started treating it as a hypothesis that ACCORD failed to support.

It did none of these things.

It added a parallel safety requirement, hedged the guidelines, blamed hypoglycemia, and kept going.

The 72-year-old woman in the endocrinologist's office is still getting her insulin titrated up. The endocrinologist is following the guideline. The guideline is loosely tethered to a literature that includes ACCORD as one entry among many, methodologically equivalent to a single observational study in the synthesis. The trial that should have ended a theory has been absorbed into a footnote.

This is how medicine learns when it does not want to.

· · ·

I want to close with a discussion that generalizes beyond diabetes.

ACCORD did not fully reorganize the field around the possibility that biomarker/surrogate improvement and clinical benefit can meaningfully diverge.

That matters because modern medicine increasingly depends on surrogate endpoints, composite outcomes, accelerated approvals, and post-marketing inference. The question is not whether these tools are ever useful. Many clearly are. The question is how reliably the system recalibrates when a major trial challenges the assumptions underneath them.

My own view is that medicine updates unevenly. It incorporates evidence more readily when the evidence extends an existing framework than when it destabilizes one. That is not primarily a story about bad actors. Most people inside the system are trying, in good faith, to improve patient outcomes. But institutional structures, professional commitments, and clinical habits all create momentum that is difficult to reverse.

I do not think this is a counsel of despair. I think it is a reminder that negative trials deserve the same intellectual seriousness as positive ones.

The trials running today will test a new generation of assumptions. Some of those will survive. Some will not. The harder question is whether we will recognize the difference quickly enough to adapt our regulatory science and medicine when the evidence arrives.

The Drug That Built the Regulatory Framework That Reversed Its Own Verdict

On rosiglitazone, the meta-analysis that changed everything, and the institutional appetite for forgetting

Anand Narayanan, MD · May 2026

The most prescribed of these was rosiglitazone, marketed as Avandia by GlaxoSmithKline. In 2006, GSK sold $3.4 billion of it worldwide. Approximately 1.1 million Americans were taking it. It was, in the language of the pharmaceutical industry, a “franchise drug.”

In May 2007, Steven Nissen and Kathy Wolski of the Cleveland Clinic published a meta-analysis in the New England Journal of Medicine pooling 42 trials of rosiglitazone, comprising 27,847 patients. The pooled odds ratio for myocardial infarction was 1.43, with a 95% confidence interval of 1.03 to 1.98. The odds ratio for cardiovascular death was 1.64, with confidence intervals that crossed unity.

The drug that fixed insulin sensitivity at its root, the analysis suggested, was killing people from heart attacks at the rate of roughly one excess MI per several hundred patient-years of exposure.

What happened next is the story I want to tell.

· · ·

The immediate response was chaos. The FDA convened an advisory committee in July 2007. The committee voted 20 to 3 that rosiglitazone increased the risk of cardiac ischemic events but voted 22 to 1 to keep the drug on the market. This is its own teaching case in advisory committee logic (the committee believed the drug caused harm and believed it should remain available), but it is not my topic here.

A black box warning was added in November 2007. Rosiglitazone initiators decreased from 39.1% to 8.0% by the fourth quarter of that year. The prescribing decline was immediate and dramatic. Clinicians did not need the FDA to tell them what to do. The Nissen meta-analysis, combined with the existence of pioglitazone as a same-class alternative without the cardiovascular signal, did the work.

In 2008, ACCORD stopped early for mortality harm in the intensive glycemic control arm. The intensive arm had used rosiglitazone heavily. Whether rosiglitazone contributed to ACCORD’s mortality signal was never definitively established: the analyses were post hoc, the comparators were confounded, the question is methodologically very hard. But the temporal proximity of the Nissen meta-analysis and the ACCORD result created a regulatory moment.

The FDA issued its 2008 Guidance for Industry on Diabetes Mellitus: Evaluating Cardiovascular Risk in New Antidiabetic Therapies to Treat Type 2 Diabetes. Every new diabetes drug, henceforth, would require evidence of cardiovascular safety from outcome trials. The era of approving diabetes drugs purely on HbA1c was over.

This guidance, born directly from the rosiglitazone affair, is the most consequential piece of regulatory science in modern endocrinology. It forced the trials that produced EMPA-REG OUTCOME, LEADER, CANVAS, DECLARE, and the entire SGLT2 inhibitor and GLP-1 receptor agonist evidence base. It transformed diabetes care.

It happened because Steven Nissen pooled 42 trials in 2007 and found an odds ratio of 1.43.

· · ·

Now here is the part that has to be told carefully, because it is the part that almost no one tells.

In 2010, the FDA added a Risk Evaluation and Mitigation Strategy (REMS) to rosiglitazone. Special certification was required for healthcare providers who prescribed it. Only specially certified pharmacies could dispense it. Only patients already taking it or new patients who could not use any other glucose-lowering medications were eligible. The drug was, functionally, withdrawn from the market without being formally withdrawn. The number of patients using rosiglitazone in the United States dropped from 117,349 in 2009 to 3,405 during 2011 to 2013.

Meanwhile, GSK had been required to conduct a postmarketing cardiovascular outcome trial called RECORD (Rosiglitazone Evaluated for Cardiovascular Outcomes and Regulation of Glycemia in Diabetes). RECORD had actually started in 2001, before the Nissen meta-analysis, as a long-term safety study. Its interim results were published in 2007. Its final results, in 2009, did not show a statistically significant increase in MI or cardiovascular death with rosiglitazone compared to sulfonylurea or metformin.

RECORD was, on its face, a negative trial for the Nissen hypothesis. But RECORD was widely criticized at the time. It was open-label. It had high event rates that suggested poor adherence. Its endpoint adjudication had been challenged. Its statistical analysis plan had been modified during the trial. Nissen and others argued, in 2010 and 2011, that RECORD was methodologically too weak to overturn the meta-analytic signal.

Then, in 2013, something interesting happened. The FDA commissioned an independent readjudication of the RECORD data by the Duke Clinical Research Institute. The readjudication confirmed the original RECORD findings: no significant cardiovascular harm.

In November 2013, the FDA determined that data for rosiglitazone-containing drugs did not show an increased risk of heart attack compared to the standard type 2 diabetes medicines metformin and sulfonylurea, and required removal of the prescribing and dispensing restrictions that had been put in place in 2010. The black box warning was modified. In December 2015, the FDA removed the remaining REMS entirely.

Rosiglitazone was, by the FDA’s 2013 and 2015 actions, rehabilitated.

The 2007 meta-analysis was, in the FDA’s revised reading, an overcall.

The regulatory action that had built the foundation of modern diabetes drug evaluation had been, the agency now said, based on evidence that did not hold up.

· · ·

This is where you have to slow down and think carefully, because the temptation to make this a simple story in either direction is enormous and the story is not simple.

Simple version one: Nissen was right, GSK was wrong, the drug killed people, the rehabilitation was industry-pressure-driven revisionism. This is the version you will hear from people who anchored on the 2007 meta-analysis.

Simple version two: Nissen was wrong, the meta-analysis was based on small short-term trials not designed for cardiovascular outcomes, RECORD was the definitive study, the FDA’s rehabilitation was a correct scientific update. This is the version you will hear from people who anchored on RECORD.

Neither version is honest.

The honest version is something like this. The Nissen meta-analysis was a methodologically defensible analysis of the data available in 2007. It pooled studies that were not designed to detect cardiovascular events, but it was also the best available evidence at the time. The signal it detected was real in the data, even if its causal interpretation was uncertain. RECORD was a methodologically limited trial: open-label, underpowered for its intended use as a definitive arbiter, with adjudication concerns that were legitimate. The 2013 FDA decision rested on a readjudication that confirmed RECORD’s findings but did not, in any rigorous sense, prove that rosiglitazone was safe. It proved that RECORD did not show harm. These are different claims.

What we actually know about rosiglitazone, as of 2026, is roughly this. It probably does not cause large excess myocardial infarctions in the population that was studied in RECORD. It probably does increase heart failure, which is a class effect of TZDs and is not in dispute. It causes weight gain and fluid retention. It causes osteoporosis-related fractures in women, which is also a class effect and also not in dispute. Its glycemic efficacy is real and durable, more durable than sulfonylureas in the ADOPT trial. Whether the net clinical effect in a 2026 patient population is positive or negative remains genuinely uncertain.

The drug is, today, prescribed to almost no one. Rosiglitazone initiation remained below 1.0% even after regulatory restrictions were removed in November 2013. The prescribing decision was made by clinicians and was not reversed by the regulatory rehabilitation. When the REMS was lifted in 2015, Nissen told Medscape it was a nonevent: the drug was no longer used except in rare patients, and had been removed from the market by most other countries.

He was right. Clinically, the rehabilitation didn’t matter. The drug was already dead.

· · ·

What does this tell us?

Let me try to extract the methodological lessons, because the lessons are the point.

Lesson one: regulatory action based on meta-analysis of short-term trials is dangerous in both directions. The Nissen meta-analysis may have overcalled the harm. It also may have undercalled it. We will never know, because the regulatory response made the definitive trial impossible: once the REMS was in place, recruiting for a new rosiglitazone CV outcome trial was unthinkable. The signal was acted on before it could be confirmed, and the action precluded confirmation. This is a recurring pattern in pharmacovigilance and it has no good answer. If you wait for definitive evidence before acting, patients are harmed during the wait. If you act on suggestive evidence, you may act wrongly and the action may foreclose the trial that would have told you whether you were right.

Lesson two: a single trial, RECORD, should not be the basis for reversing a regulatory action grounded in meta-analytic evidence. The FDA’s 2013 reversal was based primarily on a readjudication of RECORD. RECORD had real methodological limitations. The fact that a flawed trial, after independent readjudication, continues to show no harm is meaningful, but it is not the same as a well-designed trial showing no harm. The asymmetry in the evidentiary standards between the 2010 restriction and the 2013 reversal is striking. The restriction was based on Nissen plus accumulating signal. The reversal was based on one flawed trial that, on closer inspection, was the same flawed trial it had been in 2009.

Lesson three: the regulatory framework treats reversibility as if it were costless. It is not. A million patients were on rosiglitazone in 2006. By 2013, almost none were. The clinicians who stopped prescribing it in 2007, based on the Nissen meta-analysis and the black box warning, were not going to restart in 2013 based on a readjudication of RECORD. The information had a half-life. Once the field had moved on, the regulatory rehabilitation was a piece of paper. This is not a problem unique to rosiglitazone. It is the structural feature of postmarketing safety regulation. Drugs that survive a safety scare with their reputations intact are rare. Drugs that don’t are effectively gone, regardless of what the agency later concludes.

Lesson four, and this is the one I find most uncomfortable: the regulatory framework that emerged from the rosiglitazone affair has been enormously productive, even if the affair itself may have been a partial false positive. The 2008 CV safety guidance forced the SGLT2 and GLP-1 outcome trials. Those trials transformed diabetes care. If the Nissen meta-analysis was an overcall, the consequences of that overcall were among the most beneficial regulatory actions in modern endocrinology. Bad science can produce good policy. Good policy can rest on bad science. This is not a comfortable conclusion. It is, I think, the correct one.

· · ·

There is a meta-question here that the rosiglitazone story raises and that almost no one wants to address.

If we now believe, as the FDA seems to, that rosiglitazone was not as dangerous as the 2007 meta-analysis suggested, what do we make of the 2008 guidance that the meta-analysis triggered? The guidance was justified on the grounds that diabetes drugs needed CV safety evaluation because rosiglitazone had taught us that approving on HbA1c alone was dangerous. If rosiglitazone, in retrospect, did not teach us that, does the guidance still rest on its original foundation?

The FDA’s answer, implicit in its 2020 softening of the CVOT requirement, appears to be: the foundation matters less than the consequences. The CVOTs of the 2010s revealed CV benefits we would not have discovered otherwise. The guidance was retrospectively justified by what it produced even if its initial justification has weakened. This is policy reasoning, not scientific reasoning, and the agency has not been forthright about the shift.

I find this honest position more interesting than either of the partisan versions. The regulatory framework was built on a finding that the agency itself later partially walked back, and it has been kept in place because of its consequences rather than because of its premise. This is fine, perhaps, but it is not the story we tell.

The story we tell is that rosiglitazone was a dangerous drug, the system caught it, and the system was strengthened to catch the next one. The actual story is that rosiglitazone might have been a moderately dangerous drug, the system reacted strongly to a signal that did not fully hold up, and the system was strengthened in ways that produced excellent science for reasons that turned out to be different from the reasons originally cited.

You can defend the outcome. You should be honest about the path.

· · ·

I want to close with the patient who is missing from this essay, because that patient is missing from most discussions of the rosiglitazone affair, and the omission is part of the problem.

In 2006, a 64-year-old man with type 2 diabetes, on metformin and a sulfonylurea, with rising HbA1c and progressive insulin resistance, would have been started on rosiglitazone. In 2008, he would have been switched off it, probably to pioglitazone, possibly to insulin. In 2013, the FDA told his physician that the switch had been based on evidence that had not held up. His physician did not call him to discuss this.

The man is now 84, if he is still alive. He has been on three or four different diabetes regimens since 2008. He has gained weight on some and lost weight on others. He has had a heart attack, which may or may not have anything to do with any of this. His diabetes is poorly controlled. He takes a GLP-1 receptor agonist now, because his current physician believes the SELECT trial.

He is not a participant in any of this. He is the object of it. The regulatory framework was built around him without consulting him, modified without consulting him, and softened without consulting him. The meta-analyses and the trials and the advisory committees and the REMS programs and the readjudications all happened above his head, and he changed pills when his doctor told him to change pills, and the outcomes of his individual life were determined by the cumulative interaction of all these institutional decisions in ways that no one can disaggregate, even in principle.

This is what regulatory medicine looks like from the inside of a body that is, eventually, going to be a statistic.

We should remember the body when we tell these stories.

We do not always remember.

The Endpoint That Wasn’t, the Question That Wasn’t Asked, and the Compounded Shadow Market That Filled the Gap

On STEP, SURMOUNT, and the strange history of how we accidentally got the most successful drugs in the history of obesity

Anand Narayanan, MD · May 2026

In month fifteen, her insurer decides Wegovy is no longer covered. She cannot afford $1,350 a month out of pocket. She tries to stretch her remaining pens to half doses. By month eighteen she has regained 31 of the 38 pounds. Her blood pressure is back up. Her sleep apnea returns. She is, by every measure her physician records, exactly where she started.

She finds a compounding pharmacy online. The semaglutide costs $200 a month. She does not know whether the API is from an FDA-registered facility, whether it has been tested for potency, whether the vial is labeled correctly, or whether the syringe she is being shipped delivers the dose she thinks it does. She uses it anyway.

This is the obesity drug story, properly told. Not the Time magazine cover story. The actual one.

· · ·

Let me start with what STEP 1 measured, because almost no one who quotes its results has read the methods section.

STEP 1, published in NEJM in 2021, enrolled 1,961 adults with a BMI of 30 or greater, or 27 with a weight-related comorbidity. None had diabetes. They were randomized 2:1 to once-weekly subcutaneous semaglutide 2.4 mg or placebo for 68 weeks. Everyone got lifestyle counseling.

The coprimary endpoints were two things: percentage change in body weight at week 68, and the proportion of participants achieving at least 5% weight loss.

Read that again. The coprimary endpoints were a percentage change on a scale, and the proportion crossing a threshold on that same scale. They were the same thing, reported two ways. There were no coprimary endpoints about cardiovascular events. No coprimary endpoints about diabetes incidence. No coprimary endpoints about mortality. No coprimary endpoints about durability beyond 68 weeks. No coprimary endpoints about anything other than weight on a scale at 68 weeks.

The results were extraordinary in their own terms. Semaglutide produced a mean weight reduction of 14.9% versus 2.4% with placebo. 86.4% of the semaglutide group achieved at least 5% weight loss compared with 31.5% on placebo. The drug worked. The biomarker moved. The biomarker, in this case, was the patient’s weight.

And on the basis of this trial, and SURMOUNT-1 for tirzepatide, which used the same coprimary endpoint construction, the FDA approved the two most commercially successful drugs in the history of obesity pharmacotherapy.

For a population intervention that will be taken, in some cases, for the rest of a patient’s life.

For an indication that exists, fundamentally, because obesity causes downstream cardiovascular, metabolic, and oncologic harm.

Without requiring, at the time of approval, evidence that the drugs reduced any of those downstream harms.

· · ·

The defense of this regulatory choice is that obesity itself is the disease, and weight loss itself is the benefit. I have heard this argument from people I respect. Let me give it its strongest form.

Obesity meets diagnostic criteria. It has ICD codes. It has prevalence data. It has morbidity data. Treating it pharmacologically is no different in principle from treating hypertension or hyperlipidemia, where we approve drugs on blood pressure or LDL reduction and trust the epidemiology to translate into outcomes.

The argument has merit. It is also, on inspection, weaker than it looks.

Hypertension and hyperlipidemia have outcome trials. Decades of them. We approve new antihypertensives on blood pressure reduction because we have ALLHAT and HOPE and ONTARGET and SPRINT establishing the causal chain between blood pressure reduction and CV events for the existing class. We approve new statins on LDL reduction because we have 4S and CARE and HPS and JUPITER and IMPROVE-IT establishing that LDL reduction by this mechanism reduces CV events.

What was the equivalent evidence base for weight-loss-as-outcome at the time STEP 1 was published? The track record was not reassuring. Sibutramine reduced weight and increased cardiovascular events; it was withdrawn from the market in 2010. Rimonabant reduced weight and increased psychiatric events and suicide; it was withdrawn. The fenfluramine-phentermine combination reduced weight and damaged heart valves; fenfluramine was withdrawn in 1997. Lorcaserin reduced weight and was eventually withdrawn in 2020 because of increased cancer risk.

The history of obesity pharmacotherapy is not a history of “weight loss reliably translates to clinical benefit.” It is a history of weight-loss drugs reducing weight and then turning out to harm people in ways their phase 3 trials did not measure, because their phase 3 trials measured weight.

Against this background, the regulatory decision to approve semaglutide and tirzepatide on weight endpoints alone is not the obvious, conservative, evidence-based choice it is sometimes portrayed as. It is a choice that bets, reasonably perhaps, but as a bet, that this time the mechanism is different.

That bet has, in fairness, partly paid off. SELECT, published in 2023, showed semaglutide 2.4 mg reduced MACE in patients with overweight or obesity and established cardiovascular disease. This is a real result. It justifies, retrospectively, the gamble.

But notice the structure. We approved the drug for a population in 2021. We learned in 2023 that one subset of that population, secondary prevention patients, derives cardiovascular benefit. We have not learned, as of this writing, whether the much larger primary prevention population, which is most of who takes these drugs, derives any outcome benefit at all. The marketing executive in Brentwood is not in SELECT. SELECT does not tell us anything specifically about her.

This is the regulatory architecture: approve on a surrogate, sort out the outcomes later, accept that “later” may never come for the population that actually gets the drug.

· · ·

Then there is the STEP 1 extension trial, which almost no one cites in the popular discourse around these drugs, and which I would argue is the single most important piece of data we have.

After 68 weeks of the main STEP 1 protocol, a subset of participants was followed off-drug. From week 0 to week 68, the semaglutide group lost 17.3% of body weight. After treatment withdrawal, they regained 11.6 percentage points by week 120. Cardiometabolic improvements reverted toward baseline.

Two thirds of the weight loss came back when the drug was stopped. The blood pressure improvement reverted. The lipid improvement reverted. The HbA1c improvement reverted.

This is not a minor footnote. This is the central pharmacologic fact about this drug class. Semaglutide and tirzepatide are not cures. They are not even durable interventions in the sense that statins are durable, where the LDL reduction persists modestly off-drug because of changed hepatic biology. These are pharmacologic exoskeletons. When you take them off, the underlying physiology returns.

This has enormous implications that the STEP and SURMOUNT primary endpoints could not capture, because the primary endpoints were measured on drug. The endpoints were silent about durability. The endpoints were silent about the lifetime exposure required to maintain benefit. The endpoints were silent about what happens to patients whose insurance changes, whose drug is reformulated, whose access is interrupted, whose adherence flags, whose pharmacy is out of stock.

If you are going to approve a drug for a chronic disease and the drug works only while it is being taken and the disease returns when it is not, the relevant endpoint is not 68-week weight change. The relevant endpoint is something like five-year sustained weight reduction in real-world use, or five-year cardiovascular outcomes, or quality of life adjusted for the burden of indefinite weekly injection.

We did not require any of those things.

We required percentage weight loss at 68 weeks.

· · ·

Now to the compounding shadow market, which is the part of this story that exposes what the regulatory framework actually optimizes for.

Wegovy and Mounjaro and Zepbound went into shortage almost immediately upon launch. Novo Nordisk and Eli Lilly could not manufacture peptide and pen injectors fast enough. The shortages were declared: semaglutide added to the FDA shortage list in March 2022, tirzepatide added shortly after.

Under sections 503A and 503B of the Federal Food, Drug, and Cosmetic Act, when a drug is on the FDA shortage list, compounding pharmacies are permitted to produce copies. This is a sensible provision in principle. It exists for emergencies: chemotherapy shortages, antibiotic shortages, hospital-grade injectables. It was not designed for chronic lifestyle medications produced by the largest pharmaceutical companies in the world.

But there it was. The shortage was real, the statute was clear, and compounded GLP-1 became a multi-billion-dollar industry. Compounded GLP-1s reached roughly 30% of the market by some estimates. Telehealth companies built entire businesses around online prescribing of compounded semaglutide and tirzepatide. The price was a quarter of the brand. The volume was enormous. The regulatory oversight was, to put it charitably, variable.

What did the FDA know about what compounded GLP-1 patients were getting? As of early 2025, the FDA had received more than 455 adverse event reports linked to compounded semaglutide and more than 320 reports associated with compounded tirzepatide, many involving dosing errors from patients self-administering incorrect doses from multidose vials, some of which required hospitalization.

That is the reported number. The actual number is, by every reasonable estimate of adverse event underreporting, much higher.

Then the shortage ended. The FDA resolved the tirzepatide shortage in December 2024 and the semaglutide shortage in February 2025. Compounding had to wind down. The Outsourcing Facilities Association filed lawsuits; courts denied preliminary injunctions. On April 30, 2026, the FDA proposed excluding semaglutide, tirzepatide, and liraglutide from the 503B Bulks List, citing no clinical need for outsourcing facilities to compound these drugs from bulk API.

Door closed, for now. With a public comment period open through June 2026 and litigation continuing.

· · ·

The compounding episode tells us three things about the regulatory framework that nobody quite wants to say out loud.

First, when access to a drug is gated by price rather than by safety or evidence, the framework will route around itself. The FDA approved Wegovy at a price point that excluded most of the population that meets the label. Insurers responded by not covering it. Patients responded by buying compounded versions of dubious provenance. The FDA responded by trying to enforce against compounders. The compounders responded by suing. The lawsuits will continue. None of this was inevitable. All of it follows from approving a drug at a price the system could not sustain for an indication the size of the obesity population.

Second, the shortage designation became a regulatory hot potato that everyone wanted to handle in a way that benefited their own constituency. Novo Nordisk and Eli Lilly wanted the shortage resolved as quickly as possible to shut down compounding competition. The compounding industry wanted it to continue indefinitely. Patients wanted continued access at the compounded price. The FDA wanted the legal cover of a shortage being either present or absent in a binary way that the statute requires but reality rarely provides.

Third, and this is the part that should worry anyone thinking carefully about endpoint selection: we approved a drug for tens of millions of Americans, watched a parallel unregulated supply chain emerge to serve the people the formal market priced out, generated hundreds of FDA-reported adverse events and presumably thousands of unreported ones, and the entire episode unfolded without anyone in the regulatory architecture being able to say “we should not have approved this drug at this price point with this evidence base for this size of population without a plan for what happens when demand outstrips supply.”

That sentence is never spoken because the regulatory framework does not have a place for it. The framework approves drugs on evidence that they work in the trial. It does not approve drugs on evidence that the access pathway, the pricing structure, and the supply chain are adequate to deliver the drug to the population that was studied. Those are someone else’s problem. They are nobody’s problem. They are the patient’s problem.

· · ·

What should we have done differently?

We should have required, as a condition of approval, a primary endpoint at least 24 months out, with on-drug and off-drug analyses, so the durability question was answered before marketing rather than years after. The trials would have been longer and more expensive. They would have been worth it.

We should have required a cardiovascular outcomes trial in the primary prevention population, not just the secondary prevention population captured in SELECT, before approving for the millions of patients with elevated BMI and no established CV disease. The signal might have been positive. We do not know, because we did not require the trial.

We should have made approval contingent on a price and access plan that did not predictably generate a shortage and a parallel unregulated market. This is unusual to say in American drug regulation, where price is treated as outside the FDA’s remit. But the disconnect between approval and access is now so severe that pretending it is someone else’s problem is no longer credible.

None of these things happened. Instead, we approved two extraordinary drugs on 68-week weight endpoints, deferred many of the most important long-term questions until after commercialization, and then watched a fragmented compounding market emerge to fill the gap between regulatory approval and real-world access.

· · ·

The marketing executive in Brentwood is still using compounded semaglutide. Her insurance still doesn’t cover Wegovy. The compounding pharmacy she used in 2024 has shut down. She found another one. The pricing has gone up. The supply is intermittent. She has gained back another four pounds. She is considering switching to tirzepatide through a telehealth platform that mailed her a flyer.

Her physician does not know any of this, because she has not mentioned it. There is nothing in the chart that would lead anyone to ask.

We approved these drugs to treat her disease.

She is, technically, on therapy.

The Trial That Came Twenty-Five Years Late, and the Industry That Didn’t Wait

On TRAVERSE, the invention of Low T, and what it means when the trial you demanded finally arrives

Anand Narayanan, MD · May 2026

The clinic does not ask whether his fatigue might be sleep apnea, though his BMI suggests it. It does not ask whether his low libido might be relationship distress or SSRI use. It does not ask whether his single morning testosterone level is below the threshold by 30 ng/dL on a day when, statistically, half of his repeat draws would be above the threshold. It does not draw a second confirmatory level, as every endocrine society guideline says it should.

It prescribes testosterone cypionate, 200 mg intramuscularly every two weeks. It schedules him for a follow-up in six weeks. It bills his insurance, or his credit card if his insurance won’t pay. It does not call him in three months when his hematocrit rises to 54. It does not call him in six months when his estradiol climbs and his nipples become tender. It does not call him in twelve months when his ankles swell.

In 2023, there were approximately 2.3 million American men on testosterone replacement therapy. In 2002, there were approximately 200,000. The growth was not driven by improved screening for hypogonadism. The growth was driven by Solvay and AbbVie and the constellation of compounding pharmacies and telehealth clinics that built an industry on direct-to-consumer marketing of a condition that, until they invented it, did not have a name.

The condition is Low T. The drug is testosterone. The trial that should have established whether the drug helped or harmed the men taking it was finally published in 2023.

It was called TRAVERSE.

It had been demanded by the FDA in 2015.

It enrolled the first patient in 2018.

The drug had been on the market in its modern formulations since the mid-1990s.

· · ·

I want to be careful here, because the testosterone story is one where the reflexive critique and the reflexive defense are both wrong, and the honest position is harder than either.

The reflexive critique: testosterone therapy is overprescribed to men who do not have hypogonadism, marketed as a fountain of youth, and probably harmful. The reflexive defense: testosterone therapy is a legitimate treatment for a real biochemical deficiency, well-established benefits exist for sexual function and body composition, and the concern about cardiovascular harm was always speculative.

Both sides have a point. Neither side has the trial.

TRAVERSE (Testosterone Replacement Therapy for Assessment of Long-term Vascular Events and Efficacy Response in Hypogonadal Men) was the trial. It enrolled 5,246 men aged 45 to 80 with pre-existing or high risk of cardiovascular disease and reported symptoms of hypogonadism, requiring two fasting testosterone levels below 10.4 nmol/L (300 ng/dL). Multicenter, randomized, double-blind, placebo-controlled, designed as a non-inferiority trial. The primary endpoint was time to first MACE: cardiovascular death, nonfatal MI, or nonfatal stroke. Mean treatment duration was 21.8 months, median follow-up 32.9 months.

The result, published in NEJM in 2023: testosterone was non-inferior to placebo for MACE. The confidence interval for the hazard ratio crossed 1.0, but its upper bound fell within the pre-specified non-inferiority margin. The trial that the FDA had demanded in response to the Vigen 2013 JAMA paper and the Finkle 2014 PLoS One paper, both retrospective observational studies suggesting CV harm, did not confirm the harm signal.

This was, on its face, a regulatory vindication. The drug class survived the test.

Except.

· · ·

Read the secondary endpoints. The most notable adverse events observed were an increased risk of non-fatal arrhythmias, venous thromboembolic events, and fractures in the testosterone group compared to the placebo group.

I want to pause on the fracture signal, because it is the most interesting and least discussed.

The conventional wisdom about testosterone and bone is that testosterone is anabolic for bone, both directly via androgen receptors and indirectly via aromatization to estradiol. Replacing testosterone in hypogonadal men should reduce fracture risk. Every textbook, every review article, every endocrine fellowship lecture has said this for thirty years.

TRAVERSE showed that testosterone-treated men had more fractures than placebo. The numbers were modest but the direction was the opposite of what every prior assumption predicted. The authors and accompanying editorials have largely waved this away as a chance finding, or speculated about increased activity leading to falls, or noted that bone density was not measured.

The honest reading is that we don’t know. The trial showed an unexpected fracture signal in the direction opposite to mechanistic prediction. The signal could be chance. It could be real and reflect something we don’t understand about exogenous testosterone in older men with comorbidities. It could be confounded by activity. We do not have the data to adjudicate.

The field’s response has been to ignore it. The European Expert Panel for Testosterone Research, in its 2025 position statement, concluded that TRAVERSE found no significant increase in MACE and that testosterone therapy does not elevate CV risk. The fracture finding is mentioned in passing. The atrial fibrillation finding is mentioned in passing. The pulmonary embolism finding is mentioned in passing. The headline result (non-inferior for MACE) is treated as the result, and the secondary signals are footnoted away.

This is a familiar pattern. A trial answers the question it was designed to answer and is treated as having answered every question that could be asked of it. The questions it raises but does not answer are absorbed into the literature as ambiguities, to be re-evaluated in the next trial that will not happen.

· · ·

Now let me tell you what TRAVERSE did not study, because the trial it was not is the trial most of the testosterone industry actually depends on.

TRAVERSE enrolled men with two confirmed fasting testosterone levels below 300 ng/dL who had symptoms of hypogonadism. This is the population the Endocrine Society guideline describes as appropriate for testosterone replacement. It is not the population that walks into a strip mall men’s health clinic.

The strip mall clinic patient typically has a single non-fasting testosterone level somewhere between 250 and 400 ng/dL, often drawn in the afternoon when testosterone is physiologically lower, often after a poor night’s sleep, often during a period of weight gain or psychological stress that suppresses the HPG axis. He has symptoms (fatigue, low libido, decreased energy) that are non-specific and have many other explanations. He is started on testosterone without a workup for secondary causes of hypogonadism: pituitary disease, hemochromatosis, opioid use, obesity-induced hypogonadism that responds to weight loss. He is started on doses that often produce supraphysiologic peak levels. He is monitored variably, sometimes not at all.

This patient was excluded from TRAVERSE. TRAVERSE required two fasting morning testosterones below 300, established CV disease or risk, and protocol-driven dose titration with monitoring. The trial studied a population that does not resemble the actual population receiving testosterone in American practice.

So when a clinician at a men’s health clinic cites TRAVERSE as evidence that “testosterone is safe,” the citation is not wrong exactly. It is just disconnected from the patient in front of him. TRAVERSE established that, in men with confirmed biochemical hypogonadism, with established CV risk, with protocolized treatment, testosterone is non-inferior to placebo for MACE. It established nothing about a 54-year-old with a single afternoon T of 290, no secondary cause workup, and injections every two weeks from a compounding pharmacy.

The trial cannot be made to do more work than it did. It is repeatedly asked to.

· · ·

There is also the question of why TRAVERSE happened at all, and what its happening tells us about the regulatory architecture.

The trial was a postmarketing requirement. In 2015, the FDA issued a drug safety communication cautioning about testosterone products for low testosterone due to aging, and required labeling changes regarding possible increased risk of heart attack and stroke. The labeling change was triggered by the Vigen 2013 and Finkle 2014 retrospective studies, which suggested CV harm in observational data. The FDA convened an advisory committee. The committee voted to restrict the indication and require a CV safety trial. TRAVERSE was the trial.

Note the timeline. Testosterone products in their modern formulations had been on the US market since the mid-1990s. Direct-to-consumer marketing of the Low T concept began in earnest in the mid-2000s. Prescription volume rose roughly tenfold between 2000 and 2013. The retrospective concerns emerged in 2013. The FDA acted in 2015. The trial enrolled its first patient in 2018. The results were published in 2023.

We let a drug class be promoted to two million American men for twenty years before requiring the trial that would tell us whether the promotion was harming them. When we finally required the trial, we waited eight more years for the answer. The patient in the Tampa strip mall in 2010 had no access to the evidence TRAVERSE eventually produced, because the trial would not exist for thirteen more years.

What is the right response to this? The defenders of the regulatory framework will say: the FDA cannot demand a CV outcome trial for every drug at the time of approval. The cost would be prohibitive. The delays would deny patients access. Postmarketing requirements are the appropriate mechanism for emerging safety signals. This is a reasonable position.

The critics will say: testosterone is a hormone with broad systemic effects, marketed for vague symptoms in an aging population, with biological plausibility for cardiovascular risk via hematocrit, fluid retention, and lipid effects. A CV outcome trial should have been a condition of the modern formulations’ approvals in the 1990s, or at minimum at the point when prescription volume began to grow exponentially in the mid-2000s. This is also a reasonable position.

What strikes me, reading the regulatory history, is that the FDA did not have a clear principle for when a drug’s prescription volume or marketing reach should trigger a previously unrequired outcome trial. The Vigen and Finkle papers were the proximate trigger. Without them, would TRAVERSE have happened? Probably not. The trial that retrospectively reassured us about a drug class taken by millions for two decades happened because of two observational studies of variable quality and contested methodology.

This is not a system. This is a series of accidents that produced a study, after which the system declared itself validated.

· · ·

The guideline response to TRAVERSE has been, in a word, muted.

The European Association of Urology updated its guideline in 2024 to incorporate TRAVERSE findings. No other major guideline has been substantively revised. The American Urological Association guideline from 2018 remains current. The Endocrine Society guideline from 2018 remains current. The standard of care, such as it exists, is essentially what it was before TRAVERSE was published.

This is, on inspection, defensible. TRAVERSE did not change what we thought we knew about testosterone in confirmed hypogonadism in the population the trial studied. It confirmed the existing guideline recommendations. If the guideline said “treat confirmed hypogonadism in men with appropriate workup and monitoring” before TRAVERSE, and TRAVERSE confirmed that doing so was reasonable, no update is needed.

The problem is that the practice gap is not addressed by this logic. The guidelines have always said: confirm with two morning fasting levels, evaluate for secondary causes, exclude reversible factors, monitor on therapy. The practice has, for two decades, said: single afternoon level if any, prescribe, don’t follow up. TRAVERSE did not bridge this gap. The guideline-confirming trial is irrelevant to the population receiving non-guideline-compliant care.

The 54-year-old in the strip mall is not on testosterone because the guideline recommended it. He is on testosterone because direct-to-consumer marketing convinced him he had Low T and an industry of telehealth and compounding pharmacy infrastructure made it easy to fill the prescription. TRAVERSE does not speak to his care. The guideline does not constrain his prescriber. The regulatory framework does not reach the clinic.

This is the limit of trial-based regulation. We can run trials. We can update guidelines. We can issue safety communications. We cannot, with the tools available, prevent a market from forming around a loosely defined symptom complex in an aging population willing to pay cash, supplied by an infrastructure designed to optimize for prescription volume rather than appropriate use.

The regulatory framework optimizes for what it can measure. It cannot measure the strip mall.

Privacy & Security

Defense in Depth Strategy, Zero-trust Architecture

HonestBone is engineered with a "Defense in Depth" strategy, utilizing multiple layers of physical, technical, and administrative controls to meet the rigorous safety standards required by clinical and regulatory professionals.

1. Information Protection & Encryption

We employ industry-standard protocols to ensure that all data, from your login credentials to sensitive clinical queries, is shielded from unauthorized access.

Encryption in Transit: All communication between your browser and our servers is protected by strict SSL/TLS encryption via Cloudflare. Data is encrypted on every leg of the journey, both between your browser and the web gateway and between the gateway and our secure servers, which protects it against interception in transit.
Encryption at Rest: All ingested documents and vector data are stored in encrypted databases. Access to these data stores is strictly gated and requires verified administrator credentials.
Secure Session Management: We use secure authentication tokens (JSON Web Tokens, or JWTs) to keep you signed in and protect your account. These tokens allow our system to verify your identity on each request without storing your password. Tokens expire automatically and are used only to maintain your session and enforce access controls.
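
As a rough illustration of the session-token idea above, the sketch below issues and verifies an HS256-signed JWT using only the Python standard library. The function names (issue_token, verify_token), the claim layout, and the shared-secret signing scheme are assumptions made for this example; they are not a description of HonestBone's actual implementation.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that was stripped when the token was built.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(user_id: str, secret: bytes, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Hypothetical helper: build a signed token that expires after ttl_seconds."""
    now = time.time() if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "exp": int(now) + ttl_seconds}
    signing_input = (_b64url(json.dumps(header).encode("utf-8")) + "."
                     + _b64url(json.dumps(payload).encode("utf-8")))
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(token: str, secret: bytes,
                 now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the signature is valid and the token is unexpired."""
    now = time.time() if now is None else now
    try:
        head_b64, payload_b64, sig_b64 = token.split(".")
        expected = hmac.new(secret, f"{head_b64}.{payload_b64}".encode("ascii"),
                            hashlib.sha256).digest()
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None  # malformed token, bad base64, or bad JSON
    if payload.get("exp", 0) <= now:
        return None  # token has expired
    return payload.get("sub")
```

The server only needs the signing secret to check each request, which is why no password has to be stored in or sent with the token.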

2. Artificial Intelligence Ethics & Data Isolation

A primary concern for clinical development and regulatory professionals is the risk of "data leakage" into public AI models. HonestBone is built to prevent this entirely.

No Training on Private Data: Your search queries and results are never used to train the public models provided by OpenAI or Google.
Query Anonymization: Before queries are sent to our LLM routing system, direct identifying metadata (such as name or email) is removed. We log queries to maintain system reliability, detect misuse, troubleshoot errors, and improve user experience.
Isolated Context: The Evidence Library (Vector Database) containing clinical and regulatory documents remains on our secure servers. The AI models only "see" relevant slices of text during the specific moment they are answering your question.
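
The anonymization and isolated-context points above can be pictured with a small sketch. It assumes a query arrives as a plain dictionary and that retrieval has already ranked the text slices; the field names, the IDENTIFYING_FIELDS set, and the build_model_request helper are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List

# Assumption: these are the direct identifiers stripped before routing.
IDENTIFYING_FIELDS = frozenset({"name", "email"})


def anonymize_query(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the query record without direct identifiers."""
    return {key: value for key, value in record.items()
            if key not in IDENTIFYING_FIELDS}


def build_model_request(record: Dict[str, Any], ranked_slices: List[str],
                        max_slices: int = 3) -> Dict[str, Any]:
    """Send the model only the question and the top-ranked text slices."""
    safe = anonymize_query(record)
    return {"question": safe["question"], "context": ranked_slices[:max_slices]}
```

In this sketch the full document store never leaves the caller; the outgoing request carries the question and a handful of excerpts, and nothing else.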

3. The "Chain of Verification"

Our system is designed for complete auditability and transparency.

Source-First Integrity: HonestBone does not "improvise." Every answer is anchored to a specific source file and page number.
Deep Links for Verification: Users are provided with direct links to the original high-fidelity PDF scans within the Evidence Library to cross-reference AI findings manually.
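
One way to picture the source-first rule and the deep links is the sketch below. The Citation type, the placeholder base URL, and the choice to reject any answer that arrives without citations are assumptions for illustration. The "#page=" fragment is the common convention for opening a PDF at a given page in browser viewers.

```python
from dataclasses import dataclass
from typing import Dict, List
from urllib.parse import quote


@dataclass(frozen=True)
class Citation:
    source_file: str  # file name of the ingested document
    page: int         # 1-based page number within that document


def deep_link(citation: Citation,
              base_url: str = "https://example.invalid/library") -> str:
    """Build a link to the cited page. The base URL is a placeholder."""
    return f"{base_url}/{quote(citation.source_file)}#page={citation.page}"


def attach_sources(answer_text: str, citations: List[Citation]) -> Dict[str, object]:
    """Refuse to return an answer that is not tied to at least one valid citation."""
    if not citations:
        raise ValueError("answer has no supporting citation")
    for citation in citations:
        if citation.page < 1:
            raise ValueError("page numbers start at 1")
    return {"answer": answer_text,
            "links": [deep_link(citation) for citation in citations]}
```

Rejecting uncited answers at this boundary means a reader can always follow a link back to the original page and check the claim by hand.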

4. Access Control & User Responsibility

To protect the platform's integrity, we maintain a human-in-the-loop security model.

Admin-Approval Workflow: No user can access Ask the Evidence or the Evidence Library without being manually verified and approved by an administrator.
Role-Based Access Control (RBAC): High-level functions, such as document ingestion and system configuration, are restricted to users with explicit Administrator roles.
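
The two access rules above can be expressed as a single deny-by-default check. The sketch below is a minimal illustration; the Role enum, the Account fields, and the action names are hypothetical, not the platform's real identifiers.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    USER = "user"
    ADMIN = "admin"


@dataclass(frozen=True)
class Account:
    email: str
    role: Role
    approved: bool  # set to True only after manual administrator review


# Assumed action names, for illustration only.
ADMIN_ONLY = frozenset({"ingest_document", "configure_system"})
APPROVED_ONLY = frozenset({"ask_the_evidence", "evidence_library"})


def is_allowed(account: Account, action: str) -> bool:
    """Deny by default: unapproved accounts and unknown actions are refused."""
    if not account.approved:
        return False
    if action in ADMIN_ONLY:
        return account.role is Role.ADMIN
    return action in APPROVED_ONLY
```

Checking approval before role keeps the rule simple: an account that has not been manually approved can do nothing, whatever role it holds.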

Enlighten & Empower Bone Health
