
Archive vBulletin to HTML

While migrating my old websites to Azure, I decided to retire my old vBulletin forums. Instead of maintaining an unpatched application with PHP and MySQL dependencies, I preferred to migrate to static HTML files.

To export the site, I relied on the handy GNU utility wget. It is pre-installed on most Linux distros and is also available for Windows.

Here’s the command (the URL at the end is a placeholder; substitute your own forum’s address):

wget -r --convert-links=on --html-extension=on --page-requisites --accept '*.html,showthread.php*,forumdisplay.php*,member.php,images*,*.gif,*.png,*.jpg,*.jpeg,*.css,*.js,clientscript*' --reject "*?p=*,*goto=*,*sort=*,*order=*,*daysprune=*" http://example.com/forum/
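For readers who want to see what each option contributes, here is a sketch of the same invocation built up one argument at a time. FORUM_URL and DRY_RUN are names I made up for illustration, and http://example.com/forum/ is only a placeholder address. By default the script just prints the command instead of starting a crawl.

```shell
#!/bin/sh
# FORUM_URL is a placeholder (assumption); point it at your own forum's root.
FORUM_URL=${FORUM_URL:-http://example.com/forum/}
# DRY_RUN is a hypothetical switch: non-empty prints the command, empty runs it.
DRY_RUN=${DRY_RUN-1}

# Build the argument list one option at a time so each can carry a comment.
set -- -r                        # follow links recursively from the start URL
set -- "$@" --convert-links=on   # rewrite links so the saved copy works offline
set -- "$@" --html-extension=on  # save text/html pages with an .html suffix
                                 # (newer wget releases call this --adjust-extension)
set -- "$@" --page-requisites    # also fetch the images, CSS, and scripts a page needs
set -- "$@" --accept '*.html,showthread.php*,forumdisplay.php*,member.php,images*,*.gif,*.png,*.jpg,*.jpeg,*.css,*.js,clientscript*'
set -- "$@" --reject '*?p=*,*goto=*,*sort=*,*order=*,*daysprune=*'
set -- "$@" "$FORUM_URL"

# With DRY_RUN non-empty this expands to `echo wget ...`; otherwise wget runs.
${DRY_RUN:+echo} wget "$@"
```

Running it with DRY_RUN set to an empty value (for example `DRY_RUN= sh archive.sh`, where archive.sh is whatever you named the file) starts the actual download. Note that the printed form loses the shell quoting, so it is meant for review rather than copy and paste.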

vBulletin normally exposes many redundant URLs that lead to the same content. These wget parameters grab only the relevant content (forum indexes, threads, images, stylesheets, and scripts) while ignoring problematic query strings.
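To get a feel for how those accept and reject lists sort vBulletin URLs, the glob patterns can be tried out in plain shell, since `case` uses the same wildcard style. This is only an approximation: would_keep is a hypothetical helper, the sample names are invented, and I am assuming the patterns are compared against the file name together with its query string, which may not match every wget version.

```shell
#!/bin/sh
# Illustrative only: approximates the accept/reject lists with `case` globs.
# would_keep is a made-up helper, not part of wget.
would_keep() {
  # Reject list first. Note that `?` is a wildcard for any single character,
  # so *?p=* matches "?p=" but also, for example, "&p=".
  case "$1" in
    *?p=*|*goto=*|*sort=*|*order=*|*daysprune=*) return 1 ;;
  esac
  # Then the accept list: anything not listed here is skipped.
  case "$1" in
    *.html|showthread.php*|forumdisplay.php*|member.php|images*|*.gif|*.png|*.jpg|*.jpeg|*.css|*.js|clientscript*) return 0 ;;
  esac
  return 1
}

# Invented sample names in the style of vBulletin URLs.
for name in \
  'showthread.php?t=42' \
  'showthread.php?p=1234' \
  'forumdisplay.php?f=3&sort=title' \
  'newreply.php?do=newreply'
do
  if would_keep "$name"; then
    echo "keep  $name"
  else
    echo "skip  $name"
  fi
done
# Prints:
# keep  showthread.php?t=42
# skip  showthread.php?p=1234
# skip  forumdisplay.php?f=3&sort=title
# skip  newreply.php?do=newreply
```

The thread view by thread ID is kept, while the per-post and sorted variants of the same content are skipped, along with pages such as the reply form that have no place in a static archive.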

The only downside of archiving vBulletin to HTML is that it takes more storage. Instead of relying on efficient MySQL storage and dynamic page generation, every file repeats the same page markup. For me, that tradeoff is well worth it.