Every Health Technology Assessment produced in Europe, whether by NICE (UK, National Institute for Health and Care Excellence), G-BA (Germany, Gemeinsamer Bundesausschuss), HAS (France, Haute Autorité de Santé), AIFA (Italy, Agenzia Italiana del Farmaco), AEMPS (Spain, Agencia Española de Medicamentos y Productos Sanitarios), TLV (Sweden, Tandvårds- och läkemedelsförmånsverket) or ZIN (Netherlands, Zorginstituut Nederland), is, in effect, a public reasoning document. The agency states what evidence it accepted, what it did not, which comparator it considered relevant, what uncertainties drove the decision, and what restrictions or pricing consequences followed. Several thousand of these documents now exist across European agencies. They are the closest thing the industry has to a written record of payer reasoning. The industry treats them as a library to browse, when they should be treated as a knowledge base to query and learn from.

The imbalance is visible in the literature. The HEOR methodological corpus is enormous: network meta-analysis, indirect treatment comparison, surrogate endpoint validation, partitioned survival modeling. A search for “knowledge graph HTA” or “semantic web market access” returns near-zero relevant results. The community has built a sophisticated statistical apparatus on top of evidence and almost nothing on top of agency decisions. The elements of a decision (who, when, where, why, in which context), which are what determines reimbursement, remain buried in PDFs.

I used to think the methodological work was the bottleneck. It is not. The bottleneck is that every dossier team starts from documents and rebuilds by hand the patterns that should already be queryable across past procedures.

How it started

I came to HTA from the data side. While working on medicinal product master data and regulatory submissions using IDMP standards, I kept watching market access teams ask questions whose answers obviously required mapping many-to-many relationships between elements that different national and cultural traditions express in different terms. Many of these questions, ranging from identical to similar to merely correlated, were asked again on every dossier, in every indication, in every country. The work to answer them was redone each time.

Most teams I have worked with do this well. The point is not that they are inefficient; the point is that the infrastructure they need does not exist yet, so individual expertise has to compensate for it.

The shape of an HTA question

The questions an access team needs to answer look like this:

  • How did G-BA assess immuno-oncology products when overall survival was immature at submission?
  • Which endpoints did HAS accept for paediatric or adolescent extensions of an adult indication?
  • Across the last five years, what was the typical price differential between rare-disease products granted full reimbursement and those granted conditional reimbursement?
  • Which comparator did NICE select in biosimilar assessments where the reference biologic had moved to a managed access agreement?
  • For an orphan indication assessed in three countries, where did the decisions diverge, and on which evidence point?

None of these is a full-text search. Each one is a graph traversal pattern: clinical trial → endpoint → population → comparator → agency → decision → restriction → price. The data, as it currently exists, does not allow traversable queries.
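
The difference between a full-text search and a traversal can be sketched in a few lines. The example below is a minimal illustration, not a proposed implementation: the triples, the identifiers and the relation names (has_endpoint, accepted_by and so on) are all hypothetical placeholders chosen to mirror the chain named above.

```python
from collections import defaultdict

# Hypothetical triples (subject, relation, object). Every identifier and
# relation name here is a placeholder, not data from a real procedure.
TRIPLES = [
    ("trial:T1", "has_endpoint", "endpoint:E1"),
    ("endpoint:E1", "measured_in", "population:P1"),
    ("population:P1", "compared_with", "comparator:C1"),
    ("comparator:C1", "accepted_by", "agency:A1"),
    ("comparator:C1", "accepted_by", "agency:A2"),  # A2 has no recorded decision
    ("agency:A1", "issued", "decision:D1"),
    ("decision:D1", "imposed", "restriction:R1"),
    ("restriction:R1", "resulted_in", "price:PR1"),
]

# The relation sequence mirrors the chain named in the text.
CHAIN = ["has_endpoint", "measured_in", "compared_with", "accepted_by",
         "issued", "imposed", "resulted_in"]


def build_index(triples):
    """Index objects by (subject, relation) so each hop is a dictionary lookup."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(subject, relation)].append(obj)
    return index


def traverse(index, start, relations):
    """Follow the relations in order and return every complete path from start."""
    paths = [[start]]
    for relation in relations:
        paths = [path + [nxt]
                 for path in paths
                 for nxt in index.get((path[-1], relation), [])]
    return paths


INDEX = build_index(TRIPLES)
PATHS = traverse(INDEX, "trial:T1", CHAIN)
for path in PATHS:
    print(" -> ".join(path))
```

With this toy data the traversal returns a single path ending in price:PR1; the branch through agency:A2 is dropped because no decision is recorded for it. A production system would more likely use a graph database or an RDF store, but the shape of the question is the same.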

The gap that makes everything else hard

The current state of the field is roughly this: document archives exist, some of them excellent; several vendors maintain curated repositories of European HTA reports, with metadata covering drug, indication, agency, date, and reimbursement status. Searching this metadata works well, and once a relevant procedure is found, the analyst opens the PDF and reads it.

This is where the chain breaks: the substance of the decision, the “why” (for instance the accepted comparator, the contested endpoint, or the pricing implication of a restriction), is handled entirely by humans. It is read, summarized, and, if stored at all, captured in slide decks or internal precedent files that often do not survive personnel changes. For learning and reuse by both humans and AI, this reasoning should be digitally extracted and recorded as contextual metadata that explains why the decision was made, links it to its consequences, and makes causality part of the data itself.

A specific example. A team I worked with was preparing the EU launch of an oncology product and needed to anticipate which comparator G-BA would impose. They had eight relevant past procedures in the same therapeutic area. To answer the question, two senior consultants spent three weeks reading the PDFs, building a comparison table in Excel, and writing a memo. The memo was good. None of the structured information they extracted entered any system. The next team facing the same question, six months later for a different molecule, did the work again.

The pattern above is not just a personal observation; it is consistent with what HEOR practitioners themselves report. A recent post by Betsy J. Lahue, 10 lessons learned from 10 years working in HEOR, makes the point clearly:

  • “The most common HEOR failure is solving the wrong problem precisely” (lesson 4) is what happens when teams optimize a model against the wrong comparator because they could not see what the agency actually accepted in adjacent procedures.
  • “Models are translation tools, not persuasion engines” (lesson 6) is hard to satisfy when the target language, the agency’s own reasoning vocabulary, is not extracted in the first place.
  • “Strong HEOR strategies start by mapping the value landscape” (lesson 10) is exactly the work a queryable HTA precedent layer enables.

The gap is not methodological but structural: the input HEOR needs to do its job well is locked in PDFs. This is the cost the industry actually pays: not the cost of running HEOR analyses but the cost of rebuilding payer precedent from scratch each time. There is a clear need for better knowledge management in this field.

A concrete proposal

A curated multi-country HTA archive is a useful starting point. The hard logistical work is already understood: country-specific curation by local experts, monthly updates, coverage of national and joint EU HTA procedures (including the EU Joint Clinical Assessments produced under Regulation (EU) 2021/2282), and full-text search across the document corpus. Several archives, public and commercial, already deliver this. What none of them yet offers is a way to compute across the documents.

A semantic knowledge graph and AI layer on top of such an archive could deliver four capabilities that document search cannot:

  1. Structured outcomes and pricing implications. Each HTA report would yield a structured, semantically annotated record: accepted and rejected comparators, accepted and contested endpoints, key uncertainties named by the agency, decision type, conditions imposed, and where available the resulting price level or differential against the comparator. The same controlled vocabulary across procedures makes “find me analogous cases” a query rather than a research project. Pricing implications, in particular, are almost never extracted today and they are what payers and access teams care about most.
  2. Visualisation of related procedures. A product assessed for adults in 2022 and for adolescents in 2024 is one assessment family, not two unrelated documents. The same is true of line-of-therapy extensions, biosimilar follow-ups, indication broadenings, and managed access reassessments. A graph view that exposes these relationships directly lets an analyst see the full lineage of a product’s reimbursement history at a glance, and lets payer-precedent reasoning operate on connected sets of procedures rather than isolated ones.
  3. AI integration for language and extraction. HTA documents are written in at least seven primary languages. Translation, named-entity extraction, and structured-field extraction are now tractable with modern language models, provided the schema is fixed. A French CT (Commission de la Transparence) report can be made queryable in English without losing the semantic anchors. Free-text questions (“which agencies accepted progression-free survival as a primary endpoint in second-line NSCLC last year?”) can be routed against the structured semantic layer rather than against raw text. Translation alone is not the point; translation into a stable schema is.
  4. Geographic and temporal visualisation. A map view of European procedures for a given indication, colour-coded by decision type and overlaid with timing, makes the European launch sequence legible. It exposes outliers, clusters of agreement, and the agencies whose precedents tend to set the tone for others. For an asset in early planning, this is the difference between a launch sequence designed by intuition and one designed against the actual European decision graph.
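
The structured record described in point 1 above, and the “find me analogous cases” query it enables, can be sketched as follows. This is only an illustration under stated assumptions: the field names, the vocabulary terms and the procedure identifiers are hypothetical, and Jaccard overlap is just one simple similarity measure chosen for the example, not a method the archive vendors or agencies use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HtaRecord:
    """One structured record per HTA report (field names are assumptions)."""
    procedure_id: str
    agency: str
    indication: str
    decision_type: str
    accepted_comparators: frozenset = frozenset()
    rejected_comparators: frozenset = frozenset()
    accepted_endpoints: frozenset = frozenset()
    contested_endpoints: frozenset = frozenset()
    uncertainties: frozenset = frozenset()


SET_FIELDS = ("accepted_comparators", "rejected_comparators",
              "accepted_endpoints", "contested_endpoints", "uncertainties")


def terms(record):
    """Flatten a record into (field, value) pairs from the controlled vocabulary."""
    pairs = {("indication", record.indication)}
    for name in SET_FIELDS:
        pairs |= {(name, value) for value in getattr(record, name)}
    return pairs


def similarity(a, b):
    """Jaccard overlap between the term sets of two records."""
    ta, tb = terms(a), terms(b)
    return len(ta & tb) / len(ta | tb)


def analogous(target, archive, top_n=5):
    """Rank archived records by similarity to the target, dropping zero overlap."""
    scored = [(similarity(target, rec), rec) for rec in archive
              if rec.procedure_id != target.procedure_id]
    scored = [(score, rec) for score, rec in scored if score > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(rec.procedure_id, round(score, 2)) for score, rec in scored[:top_n]]


# Hypothetical records; none of this is extracted from a real report.
TARGET = HtaRecord("P-NEW", "agency:A1", "ind:X", "pending",
                   accepted_endpoints=frozenset({"endpoint:PFS"}),
                   contested_endpoints=frozenset({"endpoint:OS"}),
                   uncertainties=frozenset({"unc:immature-os"}))
ARCHIVE = [
    HtaRecord("P-A", "agency:A1", "ind:X", "restricted",
              accepted_endpoints=frozenset({"endpoint:PFS"}),
              contested_endpoints=frozenset({"endpoint:OS"}),
              uncertainties=frozenset({"unc:immature-os", "unc:small-sample"})),
    HtaRecord("P-B", "agency:A2", "ind:Y", "full",
              accepted_endpoints=frozenset({"endpoint:ORR"})),
    HtaRecord("P-C", "agency:A3", "ind:X", "full",
              accepted_endpoints=frozenset({"endpoint:PFS"})),
]
RANKED = analogous(TARGET, ARCHIVE)
print(RANKED)
```

The ranking only works because every record uses the same vocabulary terms for the same concepts, which is the point of the controlled vocabulary: without it, "endpoint:PFS" in one record and a free-text variant in another would never overlap.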

None of these capabilities requires a research breakthrough. They require the discipline to extract meaning from documents that already exist, into a rich semantic schema that is stable enough to be useful and flexible enough to be revised as agency methodologies evolve.

Two important points to consider

Context is key

HTA reasoning is partly what knowledge management professionals call “tacit knowledge”. Some of what an agency decided is more or less explicitly explained in the report; some of it lives in oral exchanges with the assessor, in country-specific budget context, or in a political moment that no graph will capture. A precedent-finding tool that ignores this will create false confidence. Judgment still belongs to a specific team, working on a specific question, in a specific context. The tool’s role is not to replace reasoning. It is to bring the most relevant precedents to the team faster, with enough context to support better judgment.

Maintenance is tricky

Agency methodologies change: for instance, a 2018 G-BA assessment cannot be compared to a 2025 G-BA assessment without acknowledging the shifts in evidence requirements between them. The semantic layer has to be time-aware, and someone has to maintain it. The companies best placed to do this are the ones already operating a network of local experts. Without that ground truth, an extraction pipeline will drift.
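
One way to make the semantic layer time-aware is to store, per agency, the date from which each methodology version applied, and to refuse direct comparison across versions. The sketch below assumes such a table exists; the agency identifier, the version labels and the effective dates are invented placeholders and do not describe any real agency's methodology history.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical methodology history: (effective_from, version label),
# sorted by date. Real dates would have to come from local experts.
METHODOLOGY_VERSIONS = {
    "agency:A1": [(date(2015, 1, 1), "v1"), (date(2020, 1, 1), "v2")],
}


def version_at(agency, on):
    """Return the methodology version in force for an agency on a given date."""
    versions = METHODOLOGY_VERSIONS[agency]
    position = bisect_right([start for start, _ in versions], on)
    if position == 0:
        raise ValueError(f"no methodology version recorded for {agency} on {on}")
    return versions[position - 1][1]


def directly_comparable(agency, first, second):
    """Two assessments are directly comparable only under the same version."""
    return version_at(agency, first) == version_at(agency, second)


OLD, NEW = date(2018, 6, 1), date(2025, 6, 1)
print(version_at("agency:A1", OLD), version_at("agency:A1", NEW))
print(directly_comparable("agency:A1", OLD, NEW))
```

With these placeholder dates the 2018 and 2025 assessments fall under different versions, so the check returns False and the comparison would need to be flagged rather than silently computed.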

Why it is worth testing

The launch-delay literature is well established. The EFPIA Patients W.A.I.T. Indicator 2025 (IQVIA’s annual European access survey) reports an EU average of 578 days from EMA approval to patient availability across 173 medicines authorised between 2020 and 2023, with country-level wait times ranging from 128 days in Germany to 840 days in Portugal. For oncology products specifically, the average has lengthened by 66 days since 2022. On a specialty asset with $500M peak sales (a common floor in oncology, rare disease and immunology launch business cases) each year of delay translates into roughly $100M–$1B in foregone peak-year revenue depending on indication, list price and patent runway. Twelve to twenty-four months of dossier preparation and HEOR work add another estimated $5–15M per product across HEOR modeling, systematic literature reviews, medical writing and country-by-country adaptation; published agency processing fees alone (e.g. NICE Technology Appraisal charges of £88,000–£251,000) capture only a small fraction of that all-in cost.

A semantic precedent layer plausibly affects three lines of this P&L, in ways consistent with Betsy’s lessons learned:

  • Direct cost savings on dossier preparation. Assuming a conservative 10% reduction in time spent rebuilding precedent that already exists in past procedures: ~$500K–$1.5M per product on a $5–15M dossier-prep base. This is the cost of “the most common HEOR failure is solving the wrong problem precisely” (lesson 4). Reducing precision applied to non-decisive evidence is exactly what a queryable archive of accepted comparators and contested endpoints enables.
  • Strategic value from improved HTA outcomes. A 5-percentage-point improvement in HTA success probability is far from speculative. The published ISPOR analysis of 944 AMNOG benefit assessments through November 2023 shows that 42% of G-BA assessments and 59% of subpopulations are awarded “no added benefit”, and the AMNOG-Monitor puts the share even higher when weighted by patient numbers (73.5%). The price consequence is direct and measurable: drugs without added benefit absorb mean rebates of ~33% versus 24% for drugs with added benefit. On a $500M peak-sales asset, a 5-percentage-point shift in this outcome distribution is worth ~$25M in NPV per product. Betsy’s lessons 1 (“if the evidence doesn’t influence a real decision, it’s not useful”) and 9 (“delayed RWE investment leaves HEOR answering access questions too late”) frame why the lever is decision-relevance, not analytic depth.
  • Industry-wide aggregate value. The HTACG Annual Work Programme 2026 schedules approximately 50 EU Joint Clinical Assessments of new oncology medicines and advanced therapies plus the first 5 high-risk medical devices, and the regime expands to all new orphan medicines in 2028 and to all new medicinal products by 2030. National agency throughput sits on top of that envelope: NICE, G-BA, HAS, AIFA, AEMPS, TLV and ZIN together publish several hundred procedures per year (new active substances, indication extensions, reassessments). A defensible floor across both layers is 200+ major European HTA submissions per year, scaling toward 1,000+ once the 2030 rules apply. Even at the floor, the aggregate value at stake exceeds $5B annually. “Strong HEOR strategies start by mapping the value landscape” (lesson 10) is the work this aggregate represents, currently done one PDF at a time.
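
The three estimates above can be reproduced with simple arithmetic. The inputs below are the figures quoted in the text; the formulas are an assumption about how those figures relate, since the text does not spell them out. In particular, the per-product outcome value is computed as a straight percentage of peak sales, with no discounting, even though the text describes it as NPV.

```python
# Inputs are figures quoted in the text; the formulas are assumptions.
PEAK_SALES = 500_000_000                 # example asset peak sales, USD
DOSSIER_PREP = (5_000_000, 15_000_000)   # dossier-prep cost range, USD
TIME_SAVED_PCT = 10                      # assumed cut in precedent rebuilding
OUTCOME_SHIFT_PCT = 5                    # shift in HTA outcome distribution
SUBMISSIONS_FLOOR = 200                  # major European submissions per year


def pct(amount, percent):
    """Integer percentage of an amount, to avoid floating-point noise."""
    return amount * percent // 100


# Direct savings: 10% of the dossier-preparation base.
SAVINGS = tuple(pct(cost, TIME_SAVED_PCT) for cost in DOSSIER_PREP)

# Strategic value: 5 percentage points applied to peak sales (undiscounted).
OUTCOME_VALUE = pct(PEAK_SALES, OUTCOME_SHIFT_PCT)

# Aggregate at the floor: per-product value times submissions per year,
# using the low end of the savings range.
AGGREGATE = SUBMISSIONS_FLOOR * (OUTCOME_VALUE + SAVINGS[0])

print(f"savings per product: ${SAVINGS[0]:,} to ${SAVINGS[1]:,}")
print(f"outcome value per product: ${OUTCOME_VALUE:,}")
print(f"aggregate at the floor: ${AGGREGATE:,}")
```

Under these assumptions the savings come to $500,000 to $1,500,000 per product, the outcome value to $25,000,000 per product, and the aggregate to $5,100,000,000 per year, which matches the orders of magnitude stated above. Changing any input, for instance the submission count, shows how sensitive the aggregate is to the floor chosen.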

These savings do not come from better documents. They come from making payer precedent operational, which, in the language of lesson 6, finally gives HEOR “a translation target” it can model against.

What we are doing differently

At Athagoras and MIGx, with the support of BioMedima, we produce fit-for-purpose semantic layers in series from corpora that are already documented but not yet computable. The pattern is deliberate: trim a domain ontology (like the IDMP ontology, as described in my recent post on the IDMP ontology) to the entities a specific Competency Question (CQ) actually requires, enrich it with executable validation rules and controlled vocabularies that bind agency-specific terminology to a stable target schema, then pair it with LLM-based extraction so that every fact written into the graph either validates against the rules or is rejected at write time. IDMP product data was one corpus we treated this way. HTA precedent is the next one.
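
The "validate or reject at write time" idea can be illustrated with a minimal sketch. This is not the actual pipeline: the binding table, the rule set, the fact layout and the document identifiers are all hypothetical, and the two agency terms are included only as examples of agency-specific wording that would be bound to one target field.

```python
# Hypothetical bindings from agency-specific terms to a stable schema field.
TERM_BINDINGS = {
    ("G-BA", "zweckmäßige Vergleichstherapie"): "comparator",
    ("HAS", "comparateur cliniquement pertinent"): "comparator",
}


class RejectedFact(ValueError):
    """Raised when an extracted fact fails a validation rule."""


def normalise(fact):
    """Validate one extracted fact and map it onto the target schema."""
    key = (fact.get("agency"), fact.get("source_term"))
    if key not in TERM_BINDINGS:
        raise RejectedFact("term is not bound to the target schema")
    if not fact.get("value"):
        raise RejectedFact("empty value")
    if not fact.get("provenance"):
        raise RejectedFact("missing provenance")
    return {
        "agency": fact["agency"],
        "field": TERM_BINDINGS[key],
        "value": fact["value"],
        "provenance": fact["provenance"],
    }


def write_facts(graph, facts):
    """Append valid facts to the graph; return rejected facts with reasons."""
    rejected = []
    for fact in facts:
        try:
            graph.append(normalise(fact))
        except RejectedFact as err:
            rejected.append((fact, str(err)))
    return rejected


# Placeholder output of an extraction step (for example an LLM call).
EXTRACTED = [
    {"agency": "G-BA", "source_term": "zweckmäßige Vergleichstherapie",
     "value": "comparator:C1", "provenance": "doc:D1#p12"},
    {"agency": "HAS", "source_term": "comparateur cliniquement pertinent",
     "value": "comparator:C1", "provenance": ""},
    {"agency": "HAS", "source_term": "unmapped phrase",
     "value": "comparator:C2", "provenance": "doc:D2#p3"},
]
GRAPH = []
REJECTED = write_facts(GRAPH, EXTRACTED)
print(len(GRAPH), "written;", [reason for _, reason in REJECTED])
```

In this toy run one fact is written and two are rejected, one for missing provenance and one because its term is not bound. The design choice worth noting is that rejection happens before anything reaches the graph, so extraction errors surface as a review queue rather than as silent bad data.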

Applied to HTA, this combination unlocks operations that document search and general-purpose LLMs cannot deliver on their own.

Cross-agency traversal (comparator → endpoint → decision → restriction → price) becomes a single graph query rather than a three-week memo.

A French “Service Médical Rendu” rating and a German “Zusatznutzen” extent stay separately authored but become semantically comparable through a shared anchor, in either language. A 2018 G-BA assessment can be set against a 2025 one with the methodology version that applied at each date held alongside the data, so apples-to-oranges comparisons are caught by the schema rather than by a senior reviewer. Lineage runs from a single clinical-trial endpoint through to the rebate it produced, and stays attached. Each of the questions that opened this article, such as “How did G-BA assess immuno-oncology when overall survival was immature at submission?”, becomes a parameterised query, not a research project.

The differentiator is not the ontology itself; it is treating the ontology as an execution artifact (versioned, time-aware, AI-ready by design) with provenance attached to every extracted fact and causality kept as a first-class citizen of the data. That is what gives HEOR the translation target that lesson 6 (“Models are translation tools, not persuasion engines”) calls for, and what lets dossier teams compose payer-precedent answers in hours rather than weeks.

The original HTA-ontology that anchors this work is published openly on BioMedima: https://www.biomedima.org/project/hta-ontology/