🌐 Public MediaWiki Collection Dataset - nyuuzyou/wikis
Collection of 1.66M+ articles from 930 public MediaWiki instances featuring:
- Full article content from diverse public wikis across the internet
- Complete metadata including templates, categories, and section structure
- Rich structural information preserving wiki organization and links
- Multilingual content across 35+ languages including English, Chinese, Spanish, and more
- Regional language variants including US/UK English, Brazilian Portuguese, and Traditional/Simplified Chinese
Key contents:
- 1,662,448 wiki articles with full text
- Extensive metadata including templates, categories, sections
- Internal wikilinks and external reference information
- Cross-domain knowledge spanning multiple topics and fields
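
To make the listed metadata more concrete, here is a minimal sketch of how per-article records of this kind might be summarized. The post does not document the schema, so every field name below (`title`, `language`, `text`, `categories`, `templates`, `sections`, `wikilinks`) and both helper functions are hypothetical, and the sample records are invented placeholders rather than real dataset rows.

```python
from collections import Counter

# Hypothetical records: the field names are assumptions for illustration,
# not the dataset's confirmed schema, and the values are made-up placeholders.
SAMPLE_ARTICLES = [
    {
        "title": "Example A",
        "language": "en",
        "text": "Full article text.",
        "categories": ["Guides"],
        "templates": ["Infobox"],
        "sections": ["Introduction", "Usage"],
        "wikilinks": ["Example B"],
    },
    {
        "title": "Example B",
        "language": "en",
        "text": "Another article.",
        "categories": ["Reference"],
        "templates": [],
        "sections": ["Overview"],
        "wikilinks": ["Example A"],
    },
    {
        "title": "Ejemplo C",
        "language": "es",
        "text": "Texto del artículo.",
        "categories": ["Guides"],
        "templates": ["Infobox"],
        "sections": ["Introducción"],
        "wikilinks": [],
    },
]


def count_by_language(articles):
    """Count how many articles carry each (assumed) language code."""
    return Counter(article["language"] for article in articles)


def titles_in_category(articles, category):
    """Return titles of articles whose (assumed) category list contains `category`."""
    return [a["title"] for a in articles if category in a["categories"]]


if __name__ == "__main__":
    print(count_by_language(SAMPLE_ARTICLES))
    print(titles_in_category(SAMPLE_ARTICLES, "Guides"))
```

The same two helpers should work on any iterable of dictionaries, so they could be pointed at the real records once the actual field names are checked against the dataset card.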