
"Transforming wikitext into HTML is surprisingly IO-bound, since most Wikipedia pages transclude many other pages for templating and logic."

Sounds like a case for in-memory caching. Or is it that any random Wikipedia page requires logic from a random subset of a large pool of other pages? I doubt that: Zipf's law comes to mind, so a small set of popular templates probably accounts for most of the transclusions.

Also remember that an HTML5 front-end would be able to cache a substantial amount of this (a few MB at least), and Lua can be compiled to JS quite effectively.
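A rough way to sanity-check the Zipf intuition is to simulate it. The sketch below is only that: it assumes template requests are independent draws from a Zipf-like popularity distribution, and the pool size, cache size, and exponent are made-up parameters, not measurements from Wikipedia. It compares the hit rate of a small LRU cache under skewed versus uniform popularity.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def zipf_cum_weights(n, s=1.0):
    """Cumulative weights for ranks 1..n, weight proportional to 1 / rank**s."""
    return list(accumulate(1.0 / (rank ** s) for rank in range(1, n + 1)))


def lru_hit_rate(n_templates, cache_size, n_requests, s=1.0, seed=0):
    """Fraction of requests served from an LRU cache of `cache_size` entries.

    Assumption: each request picks a template independently, with
    popularity following a Zipf-like law with exponent `s`
    (s=0.0 degenerates to uniform popularity).
    """
    rng = random.Random(seed)
    cum = zipf_cum_weights(n_templates, s)
    requests = rng.choices(range(n_templates), cum_weights=cum, k=n_requests)

    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for template_id in requests:
        if template_id in cache:
            hits += 1
            cache.move_to_end(template_id)  # mark as most recently used
        else:
            cache[template_id] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / n_requests


if __name__ == "__main__":
    # Hypothetical numbers: 10,000 templates, cache holds 1% of them.
    print("zipf   :", lru_hit_rate(10_000, 100, 50_000, s=1.0))
    print("uniform:", lru_hit_rate(10_000, 100, 50_000, s=0.0))
```

If real transclusion traffic is anywhere near as skewed as the first case, a cache holding a small fraction of the template pool should absorb a large share of lookups. If it looks more like the second case, caching buys very little.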



MediaWiki uses lots of caching on the server, but the question is how hard it would be to move rendering to the client.

For a sense of how hard this would be, try using the Special:Export page on Wikipedia.

http://en.wikipedia.org/wiki/Special:Export

If you download transcluded templates, the article for Barack Obama is 781K.

My experience of Special:Export is that it has some flaws that cause it to miss some things it needs to export, so the real total may be much larger.

And that's just the data. One would also have to download a lot of related code, which might balloon the total to a megabyte or more.
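For anyone who wants to repeat the Special:Export measurement, here is a minimal sketch that only builds the request URL. The query parameters (`pages`, `templates`, `curonly`, `action=submit`) are the ones I recall from the MediaWiki manual for Special:Export, and the `index.php` endpoint form is an assumption on my part, so check both against the current documentation before relying on them. The function name `export_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Assumed endpoint form; the thread links to /wiki/Special:Export instead.
EXPORT_ENDPOINT = "https://en.wikipedia.org/w/index.php"


def export_url(titles, include_templates=True, current_only=True):
    """Build a Special:Export URL for the given page titles.

    Parameter names are taken from memory of the MediaWiki manual and
    should be verified before use.
    """
    params = {
        "title": "Special:Export",
        "pages": "\n".join(titles),  # one title per line
        "action": "submit",
    }
    if include_templates:
        params["templates"] = "1"  # also export transcluded templates
    if current_only:
        params["curonly"] = "1"  # latest revision only, no history
    return EXPORT_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(export_url(["Barack Obama"]))
```

Fetching that URL and checking the length of the response with and without `include_templates` gives a rough idea of how much of the payload is template data rather than the article itself.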

Wikitext is particularly ornery: because it is defined by grinding regular expressions against each line, it is not easy to describe with a formal grammar. So you'd have to download a very large parser, plus the various plugins that each MediaWiki install uses to warp how Wikitext is processed. And this assumes an optimistic scenario where MediaWiki's rendering, and all related plugins, are entirely ported to JavaScript compatible with all desired browsers.

I'm not denying that, if you wanted to create a new Wikipedia from scratch today, based on JavaScript, you could probably move a lot of rendering to the browser. You would choose more browser-friendly formats, like JSON or XML, rather than making up some random text-based format, just because it was easy to type into a textarea. You would make transformational operations work in JavaScript, or be exportable to JavaScript. You could definitely get it to the point where it would be practical for quick previews while editing.

For the Wikipedia that we have today, it's really hard.



