Methodology
The Observatory is built entirely from open bibliographic data. This page explains where that data comes from, how each figure on the site is computed, and how to read it well, including what each method can and cannot tell you.
The data
The corpus draws on two open sources. OpenAlex is the backbone, providing works, author and institution records, abstracts, and the pre-assigned concepts used for topic classification. PubMed (via the NCBI E-utilities) adds biomedical recall and records not indexed in OpenAlex. Coverage runs from 2015 to 2025.
We fetch narrow and classify broad. Articles are retrieved using anchor terms only: the condition's names and abbreviations (polycystic ovary syndrome, PCOS, Stein-Leventhal, and the 2026 successors Polyendocrine Metabolic Ovarian Syndrome and PMOS). Phenotype terms such as "insulin resistance" are deliberately not used as fetch queries, since they would pull in thousands of papers that are not about PMOS.
Records are merged into one canonical paper using a key priority of DOI, then PMID or PMCID, then a title, first-author and year fuzzy match. Around 8,300 duplicates and 2,900 false positives (for example pCO2 or PCO matching "PCOS") were removed, leaving 20,749 articles.
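The key priority above can be sketched as a pairwise comparison. This is a minimal illustration rather than the pipeline's actual code: the record field names (`doi`, `pmid`, `pmcid`, `title`, `first_author`, `year`), the 0.95 similarity threshold, and the use of `difflib` for the fuzzy title match are all assumptions.

```python
import re
from difflib import SequenceMatcher


def normalise_title(title: str) -> str:
    """Lower-case and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def same_paper(a: dict, b: dict, threshold: float = 0.95) -> bool:
    """Compare two records using DOI, then PMID or PMCID, then a fuzzy fallback.

    Field names and the threshold are hypothetical, chosen for illustration.
    """
    # Priority 1: DOI. Assumed decisive when both records carry one.
    if a.get("doi") and b.get("doi"):
        return a["doi"].lower() == b["doi"].lower()

    # Priority 2: PMID or PMCID, whichever both records share.
    for key in ("pmid", "pmcid"):
        if a.get(key) and b.get(key):
            return a[key] == b[key]

    # Priority 3: same year, same first author, and a near-identical title.
    if a.get("year") != b.get("year"):
        return False
    if (a.get("first_author") or "").lower() != (b.get("first_author") or "").lower():
        return False
    ratio = SequenceMatcher(
        None, normalise_title(a.get("title", "")), normalise_title(b.get("title", ""))
    ).ratio()
    return ratio >= threshold
```

Checking the strongest identifier first keeps the fuzzy match, which is the step most likely to produce a wrong merge, as a last resort.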
Because both sources index English-language journals most fully and the search terms are English, the corpus represents the indexed, English-language literature. This is worth keeping in mind when reading the geographic figures, which lean towards English-publishing countries. The sources are also primarily biomedical, so research on PMOS that lives in health services, nursing, social science and qualitative literatures is under-represented. That matters most for questions about care delivery, equity and lived experience, where the absence of papers here reflects our sources as much as the field.
Topic classification
Each article is tagged into a 15-category taxonomy derived from topic modelling (NMF) of the corpus and validated against the natural clusters in the literature. The categories span metabolic health, fertility, genetics, endocrinology, mental health, inflammation, epidemiology, nutrition, pharmacology, lifestyle, gut microbiome, dermatology, diagnostics, animal models, and molecular mechanisms.
Classification runs in three tiers at near-zero cost. Tier 1 maps each article's pre-assigned OpenAlex concepts to the taxonomy (about 88% of articles). Tier 2 is a keyword fallback for records without concepts (about 8%). Tier 3 uses Claude Haiku for the low-confidence remainder (about 4%). Coverage reaches 99.99%.
Classification pipeline
Classification is automated rather than hand-coded by experts, so an individual tag is best read as a strong signal rather than a final verdict. At the aggregate level the taxonomy holds up well against the natural structure of the literature.
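The three tiers can be pictured as a fall-through, where each tier runs only if the one before it produced no tag. The sketch below is an assumption-laden illustration: the two lookup tables are tiny made-up samples, the article field names are hypothetical, "low confidence" is simplified to "no tag from Tiers 1 or 2", and `llm_fallback` is a placeholder callable standing in for the Claude Haiku step rather than a real API call.

```python
from typing import Callable

# Hypothetical sample mappings; the real tables are much larger.
CONCEPT_TO_TOPIC = {"Insulin resistance": "Metabolic Health", "Infertility": "Fertility"}
KEYWORD_TO_TOPIC = {"depression": "Mental Health", "hirsutism": "Dermatology"}
MAX_TAGS = 3  # an article can belong to up to three topics


def classify(article: dict, llm_fallback: Callable[[dict], list]) -> tuple:
    """Return (topics, tier) for one article, trying the cheapest tier first."""
    # Tier 1: map pre-assigned concepts to the taxonomy.
    concepts = article.get("concepts") or []
    topics = [CONCEPT_TO_TOPIC[c] for c in concepts if c in CONCEPT_TO_TOPIC]
    if topics:
        return list(dict.fromkeys(topics))[:MAX_TAGS], 1

    # Tier 2: keyword fallback over title and abstract.
    text = f"{article.get('title', '')} {article.get('abstract', '')}".lower()
    topics = [topic for keyword, topic in KEYWORD_TO_TOPIC.items() if keyword in text]
    if topics:
        return list(dict.fromkeys(topics))[:MAX_TAGS], 2

    # Tier 3: hand the remainder to the language-model classifier (placeholder).
    return llm_fallback(article)[:MAX_TAGS], 3
```

Ordering the tiers from cheapest to most expensive is what keeps the overall cost low: in this scheme the model is only consulted for the small remainder.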
How topics are counted
An article can belong to up to three topics. Counting it once per tag would inflate multi-topic papers, so instead each article is split equally across its tags: a paper tagged under two topics contributes 0.5 to each. Topic totals therefore sum to the number of articles, not the number of tags.
This supports three views of the same data: raw topic mentions, weighted research share (the default on the site), and primary focus (each article's single highest-confidence tag).
Worked example: 3 papers
| Paper | Tags assigned | Weight each |
|---|---|---|
| Paper A | Metabolic Health, Endocrinology | 0.50 |
| Paper B | Mental Health | 1.00 |
| Paper C | Genetics, Molecular Mechanisms, Inflammation | 0.33 |
Overview bar (weighted): sums to 3, the number of articles.
Explorer bar (raw mentions): sums to 6, the number of tag assignments.
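The worked example can be reproduced in a few lines. This is a sketch using the three example papers only; the dictionary layout is an assumption, and the primary-focus view assumes each tag list is ordered by confidence, highest first.

```python
from collections import Counter

# The three papers from the worked example, tags listed highest confidence first.
papers = {
    "Paper A": ["Metabolic Health", "Endocrinology"],
    "Paper B": ["Mental Health"],
    "Paper C": ["Genetics", "Molecular Mechanisms", "Inflammation"],
}

weighted = Counter()  # weighted research share (Overview)
raw = Counter()       # raw topic mentions (Explorer)
for tags in papers.values():
    for tag in tags:
        weighted[tag] += 1 / len(tags)  # each article is split equally across its tags
        raw[tag] += 1                   # each tag counts the article in full

# Primary focus: each article's single highest-confidence tag.
primary = Counter(tags[0] for tags in papers.values())

print(round(sum(weighted.values()), 6))  # total equals the number of articles
print(sum(raw.values()))                 # total equals the number of tag assignments
```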
Overview vs Explorer counts differ by design. The Overview uses weighted research share throughout, because it is making a claim about where the field's effort goes. Fractions summing to one are the right unit for that question. The Explorer uses raw article mentions, because it is a browsing tool: the numbers in the topic bars correspond directly to the article count shown in the results list below them. If a country filter returns 47 articles tagged under Mental Health, the bar shows 47. Switching to fractional counts would break that correspondence and make the bars harder to interpret.
The practical effect is that categories whose papers often carry additional tags, such as Metabolic Health and Endocrinology, appear somewhat larger in the Explorer than in the Overview, because each multi-topic paper is counted in full rather than split.
How the Explorer counts collaborators, journals, and institutions
Topics use raw article mentions (see above). A paper tagged under three topics counts once in each.
Countries are raw article counts by institution country code. When a country filter is active, the panel switches to showing co-authoring countries: all countries that share at least one paper with the selected country. A paper with authors from five countries counts as one article toward each bilateral pair. Country is based on author affiliation at the time of publication, not on the nationality of researchers or the location of patients.
Top journals shows a raw article count per journal title for the filtered set.
Top institutions shows distinct articles per institution for the filtered set. When a country is selected, only institutions with that country code are included.
None of these Explorer counts use fractional weighting. Each panel is meant to be read on its own, as a count of articles in the filtered set, rather than compared with the other panels as a share, so the lack of a shared unit across panels is less misleading than it would be otherwise.
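The co-authoring view in the Countries panel can be sketched as follows. The `countries` field on each paper is a hypothetical name for the list of institution country codes, and the function is an illustration of the counting rule rather than the site's implementation.

```python
from collections import Counter


def coauthor_countries(papers: list, selected: str) -> Counter:
    """Count, for each other country, the papers it shares with `selected`."""
    counts = Counter()
    for paper in papers:
        countries = set(paper["countries"])  # one entry per country, however many authors
        if selected in countries:
            for other in countries - {selected}:
                counts[other] += 1  # one article toward each pair with the selected country
    return counts


papers = [
    {"countries": ["US", "GB", "CN", "IN", "AU"]},
    {"countries": ["US", "GB"]},
    {"countries": ["CN"]},
]
print(coauthor_countries(papers, "US"))
```

With the United States selected, the five-country paper adds one article to each of its four partner countries, and no fractional split is applied.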
Research approach: root cause, symptom, bridging
The split between root-cause, symptom-specific, and bridging research is a deliberate, patient-centred framing we designed for this platform. Rather than borrow a standard bibliometric scheme, it asks the question patients and clinicians actually care about: is the research investigating why PMOS happens (root cause), treating one symptom in a specialist silo (symptom-specific), or bridging the two through diagnostics, treatment, epidemiology and models?
It is applied transparently, by mapping each of the 15 topic categories to one bucket. That is a deliberate simplification: a single topic contains work of all three kinds, so the split describes the field's centre of gravity rather than classifying every individual paper. We publish the mapping so the judgement behind it stays open to scrutiny.
The standard alternative is the NIH translational continuum (basic, translational, clinical, implementation, sometimes labelled T0 to T4), which captures the phase of research. We chose the patient-centred framing because it answers the question patients, clinicians and funders actually ask, namely whether the field is investigating the cause or managing symptoms in silos, rather than which research phase a paper sits in. A translational-continuum view is planned as an optional toggle.
How the 15 topics are assigned to each bucket
Root cause: investigates why PMOS happens
Symptom-specific: treats one symptom in a specialist silo
Bridging: connects cause to clinical care
Each topic is assigned to exactly one bucket. The mapping is a deliberate editorial judgement: a given topic contains work of all three kinds, so placement reflects the centre of gravity.
Geographic concentration
Concentration is summarised three ways: a Gini index over per-country output (0 means every country contributes equally, 1 means a single country produces everything), the combined share of the top five countries, and the collaboration rate (the share of papers with authors from two or more countries).
A country here is the author's affiliation, so these figures describe where research is produced rather than where patients are or where the need is greatest. That is the right lens for a question about the research community, and a limit to bear in mind for questions about populations.
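For readers who want to reproduce the three summaries, a minimal sketch follows. It assumes per-country output is available as a list of article counts and that each paper carries a list of country codes. The Gini formula used is the standard discrete one over sorted values, so with a finite number of countries n the maximum is (n - 1) / n rather than exactly 1.

```python
def gini(values: list) -> float:
    """Gini index over per-country output, using the sorted-values formula."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    rank_weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * rank_weighted / (n * total) - (n + 1) / n


def top_five_share(values: list) -> float:
    """Combined share of the five most productive countries."""
    total = sum(values)
    return sum(sorted(values, reverse=True)[:5]) / total if total else 0.0


def collaboration_rate(paper_countries: list) -> float:
    """Share of papers with authors from two or more distinct countries."""
    if not paper_countries:
        return 0.0
    multi = sum(1 for countries in paper_countries if len(set(countries)) >= 2)
    return multi / len(paper_countries)
```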
Topic momentum
Momentum compares the share of the field each topic held in 2015 to 2017 against 2023 to 2025, expressed as a percentage change. It shows where the field's attention is shifting, not which topics matter more.
A large percentage change on a small base is still small in absolute terms, so each bar shows the topic's current share of the field alongside its change, letting a real shift be read in context.
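The calculation is a relative change between two shares. The sketch below uses made-up shares purely to show why the current share is displayed next to the change; returning `None` for a topic with no early-window share is an assumption about how that edge case might be handled.

```python
from typing import Optional


def momentum(early_share: float, late_share: float) -> Optional[float]:
    """Percentage change in a topic's share between the early and late windows."""
    if early_share == 0:
        return None  # undefined when the topic had no share in the early window
    return (late_share - early_share) / early_share * 100


# Illustrative numbers only: a small topic and a large topic.
small = momentum(0.02, 0.03)  # 2% of the field growing to 3%
large = momentum(0.20, 0.25)  # 20% of the field growing to 25%
print(round(small), round(large))
```

The small topic shows the larger percentage change even though it gained one percentage point of the field against five for the large topic.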
Research versus disease burden
The burden chart plots research intensity (publications per million reproductive-age women, 15 to 49, over 2019 to 2025, for countries with at least 20 papers) against disease burden (the GBD 2021 age-standardised DALY rate per 100,000 women). A gap score is the burden percentile minus the research percentile. Sources are GBD 2021 (IHME), via Yao et al. (PLOS ONE 2025), and World Bank reproductive-age female population.
We frame this deliberately as where recorded burden outpaces research, rather than as a clean equity ranking, for two reasons. GBD models PMOS burden mainly through infertility and menstrual symptoms, omitting metabolic, cardiovascular and mental-health consequences, so true burden is higher than the DALY figure shows. And recorded burden reflects diagnostic capacity, since higher-income countries detect and code more PMOS, so a low recorded rate in a lower-income country usually signals under-detection rather than low prevalence. The chart is built and labelled to keep both of these in view.
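The gap score can be sketched as below. The country records and their numbers are invented for illustration, the field names are hypothetical, and the percentile definition (share of other eligible countries with a strictly lower value, scaled to 0 to 100) is one reasonable assumption among several; the site may rank differently.

```python
MIN_PAPERS = 20  # countries below this threshold are excluded


def percentile_ranks(values: dict) -> dict:
    """Percentile of each value among its peers, from 0 (lowest) to 100 (highest)."""
    n = len(values)
    if n < 2:
        return {key: 0.0 for key in values}
    return {
        key: 100 * sum(other < value for other in values.values()) / (n - 1)
        for key, value in values.items()
    }


def gap_scores(countries: dict) -> dict:
    """Burden percentile minus research percentile for each eligible country."""
    eligible = {k: c for k, c in countries.items() if c["papers"] >= MIN_PAPERS}
    # Research intensity: publications per million reproductive-age women.
    intensity = {k: c["papers"] / (c["women_15_49"] / 1_000_000) for k, c in eligible.items()}
    burden = {k: c["daly_rate"] for k, c in eligible.items()}
    research_pct = percentile_ranks(intensity)
    burden_pct = percentile_ranks(burden)
    return {k: burden_pct[k] - research_pct[k] for k in eligible}


# Invented example data, not real country figures.
example = {
    "A": {"papers": 200, "women_15_49": 10_000_000, "daly_rate": 30.0},
    "B": {"papers": 50, "women_15_49": 50_000_000, "daly_rate": 60.0},
    "C": {"papers": 30, "women_15_49": 5_000_000, "daly_rate": 45.0},
    "D": {"papers": 10, "women_15_49": 2_000_000, "daly_rate": 80.0},
}
print(gap_scores(example))
```

A positive score marks a country where recorded burden ranks higher than research intensity; country D is dropped because it falls under the 20-paper threshold.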
Funding
Funding is enriched from three sources: OpenAlex funders and awards, PubMed grant lists, and Crossref funder records by DOI. About a quarter of papers carry an acknowledged funder.
We read this as a floor rather than a true funded rate, since acknowledgement varies by journal, country and sector. No source provides reliable monetary amounts, so the figure tracks whether funding is acknowledged, not how much money the field receives.
Comorbidities
Comorbidity figures come from a word-boundary regex pass over each article's title and abstract, covering 18 PMOS-adjacent conditions across metabolic, psychiatric, reproductive, oncological and other groups.
This records co-mention: whether a condition appears alongside PMOS in an article's title or abstract. It measures how often the field connects the two conditions, not how much dedicated research exists on that pairing.
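A word-boundary pass of this kind might look like the sketch below. The three conditions and their patterns are a hypothetical subset chosen for illustration, not the site's actual list of 18.

```python
import re

# Hypothetical subset of conditions; patterns are illustrative, not the real list.
PATTERNS = {
    "type 2 diabetes": re.compile(r"\b(type 2 diabetes|T2DM)\b", re.IGNORECASE),
    "depression": re.compile(r"\b(depression|depressive)\b", re.IGNORECASE),
    "endometrial cancer": re.compile(r"\bendometrial (cancer|carcinoma)\b", re.IGNORECASE),
}


def co_mentions(title: str, abstract: str) -> set:
    """Return the conditions mentioned anywhere in the title or abstract."""
    text = f"{title} {abstract}"
    return {name for name, pattern in PATTERNS.items() if pattern.search(text)}


print(co_mentions("PCOS and T2DM risk", "Depressive symptoms were common."))
```

The `\b` anchors are what prevent a short abbreviation from matching inside a longer token, the same class of problem as pCO2 matching "PCOS" during retrieval.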
Updates and scope
This release is a one-time extract covering 2015 to 2025, English language. Continuous updates, broader language coverage, and refinements to classification, funding and comorbidity enrichment are planned for later versions.