License & Data Reuse
Raras is dual-licensed, following the Wikidata / Open Targets / OBO Foundry convention. The application code is copyleft. The data uses a per-partition model: our own contributions are public domain (CC0) and the combined dataset is open under CC BY 4.0, so it can be reused, redistributed and re-ingested by downstream ontologies and registries, with attribution wherever an upstream source requires it.
Summary
The platform source code is copyleft (AGPL-3.0-or-later), which protects the commons against closed SaaS forks. Licensing of the data follows the Wikidata / Open Targets per-partition model:
- Raras-originated content (our identifiers, crosswalks, PT translations and SUS integration) is dedicated to the public domain under CC0 1.0 — this is what enables round-trip contribution to Wikidata.
- The combined dataset (the downloadable dump and API responses, which embed upstream sources) is distributed under CC BY 4.0, because it contains Orphanet, HPO and MONDO data that require attribution. You cannot relicense those parts as CC0.
- Each upstream source keeps its own license (see Third-party data attribution).
Code — AGPL-3.0-or-later
The platform source code is licensed under the GNU Affero General Public License, version 3 or later. Network use requires sharing the corresponding source, which protects the commons against closed SaaS forks.
Data & Knowledge Graph
Raras-originated triples — our identifiers, ontological crosswalks, Portuguese translations, cultural adaptations, Brazilian prevalence data and SUS integration — are dedicated to the public domain under Creative Commons Zero 1.0 Universal (CC0 1.0). CC0 is required for round-trip compatibility with Wikidata (P1550 Orphanet cross-references) and for downstream ingestion by MONDO, HPO and Bioregistry.
The combined dataset — the bulk dump, SPARQL/RDF/GraphQL responses and crosswalks taken as a whole — embeds upstream sources licensed under CC BY 4.0, so the aggregate is distributed under CC BY 4.0: you may reuse and redistribute it, including commercially, provided you attribute the sources listed below. The per-partition terms are emitted machine-readably in the VOID descriptor via void:subset + dcterms:license.
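As a rough illustration of how a consumer might read those per-partition terms, the sketch below walks a VOID descriptor and maps each `void:subset` to its `dcterms:license`. It assumes the descriptor is available as compact JSON-LD with `void:` and `dcterms:` prefixes. That shape, the sample document and the partition IRIs are hypothetical, and the real output of /.well-known/void may differ.

```python
import json

# Hypothetical JSON-LD sample. The partition IRIs and the overall shape are
# assumptions for illustration, not the actual descriptor served by Raras.
SAMPLE_VOID = """
{
  "@id": "https://raras.org/dataset",
  "dcterms:license": {"@id": "https://creativecommons.org/licenses/by/4.0/"},
  "void:subset": [
    {"@id": "https://raras.org/dataset/raras-originated",
     "dcterms:license": {"@id": "https://creativecommons.org/publicdomain/zero/1.0/"}},
    {"@id": "https://raras.org/dataset/orphanet",
     "dcterms:license": {"@id": "https://creativecommons.org/licenses/by/4.0/"}}
  ]
}
"""


def partition_licenses(void_doc: dict) -> dict[str, str]:
    """Map each void:subset IRI to the IRI of its dcterms:license."""
    subsets = void_doc.get("void:subset", [])
    if isinstance(subsets, dict):
        # JSON-LD may serialize a single subset as an object, not a list.
        subsets = [subsets]

    result = {}
    for subset in subsets:
        license_node = subset.get("dcterms:license")
        # The license may be a node object ({"@id": ...}) or a plain string.
        if isinstance(license_node, dict):
            license_iri = license_node.get("@id")
        else:
            license_iri = license_node
        if license_iri:
            result[subset["@id"]] = license_iri
    return result


licenses = partition_licenses(json.loads(SAMPLE_VOID))
for partition, license_iri in licenses.items():
    print(partition, "->", license_iri)
```

Checking the license per partition, rather than once for the whole dataset, is what lets a downstream project take only the CC0 partition when it cannot carry attribution requirements.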
Open access endpoints
The knowledge graph is openly accessible — no authentication required — through:
- SPARQL 1.1 — /api/sparql
- RDF / Linked Data — /api/rdf (Turtle, JSON-LD, N-Triples)
- GraphQL — /api/graphql
- Bulk data dumps — /api/downloads
- Dataset metadata (VOID / DCAT) — /.well-known/void
- GA4GH Beacon v2 / Phenopackets v2 — /api/beacon, /api/phenopackets
Full developer documentation and the data model live at /docs.
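For orientation, here is a minimal sketch of an unauthenticated query against the SPARQL endpoint, using the standard SPARQL 1.1 Protocol (a GET request with a `query` parameter). It assumes the API is served from https://raras.org and that the endpoint honors the `application/sparql-results+json` media type. Neither is confirmed on this page, and the generic query is only a placeholder, so check /docs for the actual data model.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the API paths listed above are served from the main site host.
BASE_URL = "https://raras.org"

# Placeholder query: returns any five triples, independent of the data model.
QUERY = """
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 5
"""


def build_sparql_request(query: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol GET request; no credentials are attached."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        f"{base_url}/api/sparql?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


def run_query(query: str) -> list[dict]:
    """Send the query and return the result bindings (performs a network call)."""
    with urllib.request.urlopen(build_sparql_request(query), timeout=30) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]


request = build_sparql_request(QUERY)
print(request.full_url)
# To execute against the live endpoint: rows = run_query(QUERY)
```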
Open standards
Raras is built on open biomedical and health-data standards: Orphanet (ORPHA codes), MONDO, OMIM, the Human Phenotype Ontology (HPO), ICD-10, RDF/OWL, SPARQL 1.1, GA4GH Beacon v2 and Phenopackets v2, and W3C VOID/DCAT dataset descriptions for FAIR discovery.
Third-party data attribution
The CC0 dedication applies to Raras' own curation, integration and structuring work. Underlying third-party datasets retain their upstream licenses, which you must respect when reusing the corresponding records:
| Source | License |
|---|---|
| Raras-originated content | CC0 1.0 |
| Orphanet | CC BY 4.0 |
| Human Phenotype Ontology (HPO) | CC BY 4.0 |
| MONDO | CC BY 4.0 |
| Open Targets | CC0 1.0 |
| ClinVar | Public domain |
| PubMed (metadata) | Public domain |
| OMIM | Restricted — identifiers only, no descriptive text redistributed |
Brazilian public health data (SUS / DataSUS, CONITEC, ANVISA, PCDT) is sourced from official government publications. The full list of official sources is documented in the repository.
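One way to act on the license table in this section is to keep it as a lookup and derive an attribution line for whichever sources a given reuse actually touches. The sketch below does that. The function names and the wording of the notice are illustrative assumptions, not a format prescribed by Raras or by the upstream sources.

```python
# Licenses as listed in the attribution table in this section.
SOURCE_LICENSES = {
    "Raras-originated content": "CC0 1.0",
    "Orphanet": "CC BY 4.0",
    "Human Phenotype Ontology (HPO)": "CC BY 4.0",
    "MONDO": "CC BY 4.0",
    "Open Targets": "CC0 1.0",
    "ClinVar": "Public domain",
    "PubMed (metadata)": "Public domain",
    "OMIM": "Restricted",
}


def sources_requiring_attribution(used_sources):
    """Return, sorted by name, the used sources whose license is CC BY."""
    return sorted(
        name for name in used_sources
        if SOURCE_LICENSES[name].startswith("CC BY")
    )


def attribution_notice(used_sources):
    """Build a one-line notice (hypothetical wording) for the CC BY sources."""
    credited = sources_requiring_attribution(used_sources)
    if not credited:
        return ""
    parts = ", ".join(f"{name} ({SOURCE_LICENSES[name]})" for name in credited)
    return f"Contains data from {parts}."


print(attribution_notice(["Orphanet", "ClinVar", "MONDO"]))
```

An unknown source name raises `KeyError` here on purpose: a source missing from the table should be reviewed by hand rather than silently treated as attribution-free.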
How to cite
Version information (dcterms:hasVersion) and provenance (dcterms:publisher, prov:wasAttributedTo) are stamped into the data, so copies remain traceable to their source.
Please cite: Raras — Brazilian Rare Disease Knowledge Graph (RarasNet), https://raras.org. Raras-originated triples are CC0 1.0; the combined dataset is CC BY 4.0.
Contact
Questions about licensing or data reuse: [email protected]
Raras Health Ltda