Archiving a MediaWiki installation

Base

git clone https://github.com/SolidCharity/exportMediaWiki2HTML.git
cd exportMediaWiki2HTML
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Problems

  • Fix for w/ vs wiki/ problem
  • Redirects are rendered as the page they redirect to, which leads to duplicate content. Adding &redirect=no to the URL solves that, but creates annoying pages, e.g. https://wiki.xinchejian.com/w/index.php?title=Staff_members&redirect=no&action=render
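
The redirect workaround above amounts to appending two query parameters to the page URL. A minimal sketch of building such a URL in the shell follows; `render_url` is a hypothetical helper name (not part of exportMediaWiki2HTML), and the title is assumed to be URL-safe already (underscores instead of spaces).

```shell
# Hypothetical helper (not part of exportMediaWiki2HTML): build the URL that
# renders a page without following redirects.
# $1: wiki base URL (e.g. https://wiki.xinchejian.com/w), $2: page title
render_url() {
  printf '%s/index.php?title=%s&redirect=no&action=render\n' "$1" "$2"
}
```

Something like `curl -s "$(render_url https://wiki.xinchejian.com/w Staff_members)"` should then fetch the rendered HTML of the redirect page itself for inspection.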

Namespaces

Namespaces are exported separately, identified by their namespace IDs:

  • 0: default
  • 2: User
  • 10: Template
  • 14: Category

python3 exportMediaWiki2Html.py --url=https://wiki.xinchejian.com/w
python3 exportMediaWiki2Html.py --url=https://wiki.xinchejian.com/w --namespace=2
python3 exportMediaWiki2Html.py --url=https://wiki.xinchejian.com/w --namespace=10
python3 exportMediaWiki2Html.py --url=https://wiki.xinchejian.com/w --namespace=14
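
The four calls above differ only in the namespace ID, so they could be generated from a list. The following is a sketch, not part of the exporter: `export_commands` is a hypothetical helper, and it assumes that namespace 0 is what the plain call without `--namespace` exports.

```shell
# Hypothetical helper: print one exporter command per namespace ID.
# $1: wiki base URL; remaining arguments: namespace IDs
export_commands() {
  base_url=$1
  shift
  for ns in "$@"; do
    if [ "$ns" = "0" ]; then
      # Assumption: the default namespace is exported by omitting --namespace.
      printf 'python3 exportMediaWiki2Html.py --url=%s\n' "$base_url"
    else
      printf 'python3 exportMediaWiki2Html.py --url=%s --namespace=%s\n' "$base_url" "$ns"
    fi
  done
}
```

Printing the commands first makes it easy to review them; piping the output of `export_commands https://wiki.xinchejian.com/w 0 2 10 14` into `sh` would then run them one after another.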

Postprocessing

# Create a new Jekyll site
jekyll new output
cd output

# Copy over the exported content
cp -r ../export/* .

# Add front matter
for f in *.html; do
  [ "$(head -n 1 "$f")" = "---" ] && continue   # already has front matter
  { printf -- '---\nlayout: page\n---\n'; cat "$f"; } > "$f.tmp" && mv "$f.tmp" "$f"
done

Front matter reference: https://jekyllrb.com/docs/front-matter/
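
Since Jekyll only processes files that start with front matter, it can help to check that no exported page was missed. A small sketch of such a check follows; `missing_front_matter` is a hypothetical helper name, and it only looks at the first line of each file.

```shell
# Hypothetical helper: list the HTML files in a directory whose first line is
# not the front matter delimiter "---".
# $1: directory to check (defaults to the current directory)
missing_front_matter() {
  for f in "${1:-.}"/*.html; do
    [ -e "$f" ] || continue          # the glob matched nothing
    [ "$(head -n 1 "$f")" = "---" ] || printf '%s\n' "$f"
  done
}
```

Running `missing_front_matter .` inside the site directory should print nothing once every page has been processed.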

# Adjust the content width of the minima theme
mkdir assets
vi assets/main.scss
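
The notes do not say what goes into assets/main.scss. One possible content is sketched below, assuming minima 2.x (where `$content-width` is declared with `!default`, so a value set before the import takes precedence). The helper name `write_minima_override` and the width of 1200px are assumptions for illustration.

```shell
# Hypothetical helper: write a main.scss that widens the minima layout.
# $1: site directory (defaults to the current directory)
write_minima_override() {
  site_dir=${1:-.}
  mkdir -p "$site_dir/assets"
  cat > "$site_dir/assets/main.scss" <<'EOF'
---
# Front matter is required so that Jekyll processes this file.
---

// Assumed value; set before the import so it overrides the theme default.
$content-width: 1200px;

@import "minima";
EOF
}
```

Calling `write_minima_override .` in the site directory replaces the manual `mkdir` and `vi` steps; other minima versions may need a different override.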

# Serve the site locally
jekyll serve

Alternative approaches:

Original: https://tech.tiefpunkt.com/2022/06/converting-a-mediawiki-wiki-to-jekyll/