Archiving a MediaWiki installation
Base
- https://www.pokorra.de/de/2020/12/export-mediawiki-to-html/
- https://www.bachmann-lan.de/mediawiki-nach-html-exportieren/
- https://github.com/SolidCharity/exportMediaWiki2HTML
git clone https://github.com/SolidCharity/exportMediaWiki2HTML.git
cd exportMediaWiki2HTML
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
Problems
- Fix for w/ vs wiki/ problem
- Redirects are rendered as the page they redirect to, leading to duplicate content -> Adding
&redirect=no
to the URL solves that, but creates annoying pages, e.g. https://wiki.xinchejian.com/w/index.php?title=Staff_members&redirect=no&action=render
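To make the redirect workaround concrete, here is a minimal Python sketch that builds such a render URL. The function name `render_url` and its `follow_redirects` parameter are made up for illustration and are not part of exportMediaWiki2HTML; only the `title`, `redirect` and `action` query parameters come from the example URL above.

```python
from urllib.parse import urlencode


def render_url(base_url: str, title: str, follow_redirects: bool = False) -> str:
    """Build an index.php URL that asks MediaWiki for the rendered page body.

    Hypothetical helper: the parameter names are taken from the example URL
    in these notes, not from the exporter's source code.
    """
    params = {"title": title}
    if not follow_redirects:
        # redirect=no shows the redirect page itself instead of its target
        params["redirect"] = "no"
    params["action"] = "render"
    return f"{base_url.rstrip('/')}/index.php?{urlencode(params)}"


print(render_url("https://wiki.xinchejian.com/w", "Staff_members"))
```

With `follow_redirects=True` the `redirect` parameter is simply left out, which corresponds to the default behaviour that produces the duplicate pages.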
Namespaces
Namespaces are exported separately, identified by their namespace IDs:
- 0: default
- 2: User
- 10: Template
- 14: Category
python3 exportMediaWiki2Html.py --url=https://wiki.xinchejian.com/w
python3 exportMediaWiki2Html.py --url=https://wiki.xinchejian.com/w --namespace=2
python3 exportMediaWiki2Html.py --url=https://wiki.xinchejian.com/w --namespace=10
python3 exportMediaWiki2Html.py --url=https://wiki.xinchejian.com/w --namespace=14
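Instead of typing one command per namespace, the calls can be generated in a loop. The following is a rough sketch that only assembles the command lines from the namespace IDs listed above; the `NAMESPACES` dictionary and the `export_commands` helper are illustrative names, and only the `--url` and `--namespace` flags shown in these notes are used.

```python
import shlex

# Namespace IDs as listed in these notes (ID -> name)
NAMESPACES = {0: "default", 2: "User", 10: "Template", 14: "Category"}


def export_commands(url, namespaces=NAMESPACES):
    """Return one argv list per namespace for exportMediaWiki2Html.py."""
    commands = []
    for ns_id in sorted(namespaces):
        argv = ["python3", "exportMediaWiki2Html.py", f"--url={url}"]
        if ns_id != 0:
            # The default namespace is exported without the flag, as above
            argv.append(f"--namespace={ns_id}")
        commands.append(argv)
    return commands


for argv in export_commands("https://wiki.xinchejian.com/w"):
    print(shlex.join(argv))
```

To actually run the export, each `argv` list could be passed to `subprocess.run(argv, check=True)` from inside the cloned repository.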
Postprocessing
# Create new Jekyll site
jekyll new output
cd output
# copy over content
cp -r ../export/* .
# Add front matter
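for a in *.html; do
  # skip files that already start with front matter
  if [ "$(head -1 "$a")" = "---" ]; then continue; fi
  # printf (unlike echo -e) does not interpret backslashes inside the page content
  printf -- '---\nlayout: page\n---\n%s\n' "$(cat "$a")" > "$a"
done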
https://jekyllrb.com/docs/front-matter/
# Adjust width of minima
mkdir assets
vi assets/main.scss
# serve
jekyll serve
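The front-matter step above can also be written in Python, which avoids shell quoting issues with unusual file names. This is a minimal sketch under the same assumption as the shell loop, namely that all exported pages are `*.html` files in a single directory; the function names are illustrative.

```python
from pathlib import Path

FRONT_MATTER = "---\nlayout: page\n---\n"


def add_front_matter(path: Path) -> bool:
    """Prepend Jekyll front matter unless the first line is already '---'."""
    text = path.read_text(encoding="utf-8")
    if text.split("\n", 1)[0] == "---":
        return False  # already processed, leave the file untouched
    path.write_text(FRONT_MATTER + text, encoding="utf-8")
    return True


def process_directory(directory: Path) -> int:
    """Add front matter to every *.html file; return how many files changed."""
    return sum(add_front_matter(p) for p in sorted(directory.glob("*.html")))
```

Calling `process_directory(Path("."))` inside the `output` directory would mirror the shell loop, and running it a second time changes nothing.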
Dealing with links
- https://jekyllrb.com/docs/liquid/tags/#links -> fails if the page is not found
- https://mademistakes.com/mastering-jekyll/how-to-link/#linking-pages -> only works on Markdown, not HTML :-/
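Since neither approach handles missing pages gracefully in plain HTML, one option is to at least find the dangling links before building the site. The sketch below is not part of Jekyll or the exporter; it assumes the exported pages sit flat in one directory and link to each other with relative `href` values, which may not hold for every export.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def broken_local_links(directory: Path):
    """Return (page, href) pairs whose local target file does not exist."""
    broken = []
    for page in sorted(directory.glob("*.html")):
        collector = LinkCollector()
        collector.feed(page.read_text(encoding="utf-8"))
        for href in collector.hrefs:
            parts = urlsplit(href)
            if parts.scheme or parts.netloc or not parts.path:
                continue  # external link, mailto:, or a pure #fragment
            # Assumption: a leading "/" means "relative to the export root"
            target = directory / unquote(parts.path).lstrip("/")
            if not target.exists():
                broken.append((page.name, href))
    return broken
```

The resulting list can be used to decide which links to drop or to point at a placeholder page.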
Alternative approaches:
- https://stackoverflow.com/questions/9340341/converting-a-website-from-mediawiki-to-plain-html#comment88311596_9340341
- http://camwebb.info/blog/2012-12-20/
- http://reluk.ca/project/Votorola/s/wiki-copy/
- https://stackoverflow.com/questions/10713330/obtaining-static-html-files-from-wikipedia-xml-dump
Original: https://tech.tiefpunkt.com/2022/06/converting-a-mediawiki-wiki-to-jekyll/