As promised, here is a long and grammatically wrong rant about what I think of the state of PHP and of web applications in general. It starts from the tools I used for my thesis (PHP5, PostgreSQL), goes through a brief description of some of today’s PHP CMSs and wikis, and ends with what I think is still missing.
PHP5 rocks
Since that beastie will run on a dedicated server, there was no need to support the crappy hosting services that only offer PHP4/MySQL4. With PHP5 it is possible to do true object-oriented programming without being deadly slow, using all those must-have things like public/private, inheritance and polymorphism, so you can shut up all those annoying Java programmers when they laugh at you :-P. Unfortunately, I’m absolutely sure that web hosting services won’t offer PHP5 until 2050 or so, partly because a big slice of PHP programs are written so badly that they pitifully fail to run under PHP5.
PostgreSQL rocks too
The system had to support complex hierarchies (simple restriction-free ontologies, like RDF; I leave OWL-like restrictions to my heirs :)). Implementing hierarchies with that crappy db engine that begins with “M”, which does not support triggers and whose default engine doesn’t even have foreign keys, would have been a nightmare. Au contraire, PostgreSQL is almost perfect, so, like all the best things, nobody cares about it and everybody goes with the cool and trendy MySQL, ending up forced to move all of the integrity checks into the PHP code. Maybe I will give MySQL a second chance with the promising version 5, but I still haven’t tried it.
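To give an idea of what “integrity checks in the database” means for a hierarchy, here is a minimal sketch. It uses Python’s built-in sqlite3 module purely as a stand-in so that it runs anywhere; it is not the PostgreSQL schema from my thesis, and the `node` table, its columns and the trigger name are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite leaves foreign key enforcement off unless asked; PostgreSQL always enforces it.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: one table holding the whole hierarchy.
conn.executescript("""
CREATE TABLE node (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES node(id)   -- NULL marks a root node
);

-- Hypothetical trigger: a node may not be its own parent.
CREATE TRIGGER node_no_self_parent
BEFORE INSERT ON node
WHEN NEW.parent_id = NEW.id
BEGIN
    SELECT RAISE(ABORT, 'a node cannot be its own parent');
END;
""")


def try_insert(node_id, name, parent_id):
    """Return True if the database accepted the row, False if it refused it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO node (id, name, parent_id) VALUES (?, ?, ?)",
                (node_id, name, parent_id),
            )
        return True
    except sqlite3.Error:
        return False


root_ok = try_insert(1, "animals", None)
child_ok = try_insert(2, "mammals", 1)
orphan_ok = try_insert(3, "orphan", 99)     # parent 99 does not exist
loop_ok = try_insert(4, "ouroboros", 4)     # refused by the trigger
```

The point is that the application code never has to check anything: the bad rows are refused by the database itself, whichever program tries to insert them.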
PhpWet sucks
My poor CMS that runs this site sucks badly. I realized that when I had to migrate the site from www.fosk.it to www.notmart.org. Changing some pathnames and adding some sections was very painful, and anyway: what was I smoking when I decided to write my own broken template engine when such cool template engines already exist?
But anyway, it is still the only CMS I can use without going mad, because 1) every system has its very own idiosyncrasies and I know mine, and 2) more objectively, it is the only one that supports all of the things I consider indispensable. In particular:
- all articles must go into a well-ordered hierarchy (sorry, but I think about everything too much in terms of hierarchies :), I will explain why)
- multilingual support (translating the articles too, not only the interface)
- and last, it must be easy to add more complex objects as article types: not only articles with title and content, but things with more fields that are required by other applications (for example photo, description, price and quantity available for an item in an e-shop site).
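The last point can be sketched in a few lines. This is only an illustration in Python, not how PhpWet actually does it: the class names, the field names and the `form_fields` helper are invented for the example.

```python
from dataclasses import dataclass, fields


@dataclass
class Article:
    """The plain article: every content type has at least these fields."""
    title: str
    content: str
    language: str = "en"      # one stored copy per translation
    parent_path: str = "/"    # position in the hierarchy


@dataclass
class ShopItem(Article):
    """A richer article type for an e-shop: same base, more fields."""
    photo_url: str = ""
    description: str = ""
    price_cents: int = 0
    quantity_available: int = 0


def form_fields(content_type):
    """List the fields an edit form would need for a given content type."""
    return [f.name for f in fields(content_type)]


item = ShopItem(title="Teapot", content="A nice teapot.", price_cents=1500,
                quantity_available=3)
```

The engine only needs to know about `Article`; a new type like `ShopItem` brings its extra fields along, and a generic editor can discover them with `form_fields(ShopItem)`.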
And what about other CMSs, wikis and journals?
Every CMS I’ve tried fails at something and excels at something else (or wiki, or journal, it doesn’t matter: that distinction should be made for the final website, not for the engine, otherwise it only justifies a lack of flexibility). In particular:
- WordPress
- Only a journal. The world is not a blog, and even if it can run some other kinds of websites and it is getting better, it is still not ready. It will probably be able to handle more generic content in the near future.
- Mambo
- After trying it for a job that never got done, I realized I hate it: it’s complex, it produces crappy HTML, it supports translations only with a third-party add-on, its administration panel uses an ugly JavaScript menubar, and it is difficult to get out of the “list of latest news” scheme and build a more ordered hierarchy (oh no! again with that obsession :))
- Drupal
- This big beast is rather interesting and I will definitely learn more about it. Roughly summing up the things I like about it:
- The HTML it produces is quite a bit better than Mambo’s.
- It supports a nice taxonomy, but I think nodes should always be forced to have comprehensible names, instead of numerical ids that can be given a readable name later (i.e. nobody will ever do it :)).
- It uses clean URIs, making good use of Apache mod_rewrite: www.mysite.org/category/subcategory is much nicer than www.mysite.org/index.php?id=261765&action=edit&foo=bar&bah=antani :-). The first one is easier to read for both humans and search engines.
There surely are also things I don’t like. In particular, translation support is, as in Mambo, provided only by a third-party add-on, which I would truly like to see included in the official version. Moreover, I believe the database structure could have been quite a bit simpler: I still haven’t studied the internals well, but that 55-table madness sounds strange to me.
- CMS Made Simple
- This one, as its name suggests, is much simpler and younger than the previous two, but I think it’s one of the most promising. It’s a no-nonsense CMS that supports articles ordered into a simple hierarchy (yeah :)), news (with RSS, like all the others) and a simple search function. There are some things I would like to see in a future version, and then I would absolutely love it:
- Maybe the news should also be articles in the hierarchy.
- Translation of the content, of course.
- The ability to add richer content (like the e-commerce example before, but it could be something else)
- MediaWiki
- And now His Majesty MediaWiki, which runs Wikipedia, my daily drug :). Being a wiki, it is a very different kind of beast compared to the previous ones. Even if I don’t know it very well (and I will surely remedy that), let’s see some of the peculiarities that made me think a lot:
- Change tracking: due to its open nature (in the sense that everybody can modify the content), every change can easily be rolled back and a history of all changes is always accessible.
- Heavily based on search: this is mandatory if you have millions of entries like Wikipedia, but the latest version also has a comfortable way to edit the sidebar, through the pseudo-page MediaWiki:Sidebar
- The articles are stored and edited in its own simple language rather than in HTML. This way they are easier and faster to edit than HTML, and it’s theoretically simpler to convert them into formats other than HTML, or into future versions of HTML (the day it will be possible to download auto-generated PDFs of Wikipedia articles I will be the happiest child in the world :)
- The relations/hierarchies between articles (think about the taxonomy of plants or animals in Wikipedia) are computed from the article content rather than from explicitly specified table fields. This leads to a more flexible and faster way to categorize articles. In order to avoid performance problems, the article structure must be parsed at publishing time and the relations must be put in the database anyway.
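The last point, relations computed from the content but parsed only once at publishing time, can be sketched like this. It is not MediaWiki’s real parser, just a toy in Python: it only understands the `[[Category:Name]]` and `[[Category:Name|sort key]]` forms, and the `relations` set is a made-up stand-in for a database table.

```python
import re

# Matches [[Category:Name]] and [[Category:Name|sort key]].
CATEGORY_LINK = re.compile(r"\[\[Category:([^\]|]+)(?:\|[^\]]*)?\]\]")


def extract_categories(wikitext):
    """Return the category names mentioned in an article's source text."""
    return [name.strip() for name in CATEGORY_LINK.findall(wikitext)]


def publish(title, wikitext, relations):
    """Parse the article once and store its (article, category) pairs.

    `relations` is a set standing in for a database table, so that
    listing a category later needs no parsing at all.
    """
    stale = {pair for pair in relations if pair[0] == title}
    relations.difference_update(stale)  # forget the previous version
    relations.update((title, category) for category in extract_categories(wikitext))


relations = set()
publish("Oak", "The '''oak''' is a tree.\n[[Category:Trees]] [[Category:Plants|Oak]]", relations)
publish("Pine", "Another tree.\n[[Category:Trees]]", relations)

trees = sorted(article for article, category in relations if category == "Trees")
```

Editing an article and publishing it again simply replaces its rows, so the stored relations always follow the text.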
Web 2.0? Did a Web 1.0 ever exist?
Today there isn’t a buzzword as buzzy as “Web 2.0”, but it’s still a very vaporous concept. To me not even a “web 1.0” exists yet: we are still in a 0.something era, the immediate aftermath of “this site is optimized for Internet Explorer at 800×600 resolution”. At the moment there is no clear agreement yet on what the web should look like in the future. Somebody has even tried to make a validator that checks whether your site is Web 2.0 compliant, but IMO it has some serious problems. First, it ties itself too much to the concept of the blog (e.g. “Has a Blogline blogroll?”), but although blogs have become an important part of the internet, they are not and will not be the internet. It also tries to tie itself to a specific technology (“Appears to be built using Ruby on Rails?”), while the strength of the internet since its creation has always been its platform independence (even with the various hijack attempts by Microsoft).
The leading group is the Semantic Web activity of the W3C, which works on some very interesting things, with the constant risk of being too academic (with academic <=> useless and complex). The work that started it all was RDF, a simple and very generic model, usually written as XML, for representing subject-predicate-object constructs. It’s not meant to be used alone but together with RDF vocabularies, and that is how it becomes RSS, OWL and many others. Being so generic has a price, though: RSS, for example, exists in several incompatible variants, and only some of them (like RSS 1.0) are actually built on RDF.
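The subject-predicate-object idea is easier to see than to explain, so here is a tiny sketch in Python. It is not a real RDF library: the `Triple` type, the `match` helper and every `ex:` name are invented for the example, while `rdf:type` and `dc:creator` are only borrowed as familiar-looking predicates.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Made-up statements about this site, written with prefixed names.
TRIPLES = [
    Triple("ex:phpwet", "rdf:type", "ex:CMS"),
    Triple("ex:drupal", "rdf:type", "ex:CMS"),
    Triple("ex:mediawiki", "rdf:type", "ex:Wiki"),
    Triple("ex:phpwet", "dc:creator", "ex:me"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None means 'anything'."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


all_cms = [t.subject for t in match(TRIPLES, predicate="rdf:type", obj="ex:CMS")]
```

Everything is a three-part statement, and the vocabulary (which predicates and types exist) is what turns this generic shape into RSS, OWL or anything else.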
So what does the perfect website look like?
After all, why is managing hierarchies well so important?
Yeah, I know, hierarchies are not an intuitive thing at all and they will probably help only a few users navigate the site (have you ever used the sitemap to navigate a site? I haven’t), because it is search that is becoming more and more important, since it’s easier and more natural. I think a clean hierarchy helps whoever creates the site to produce a more organic work, but that’s not the most important thing. If a hierarchy (or better: an ontology, which is, in very poor words, a collection of objects and relations between objects) is exported in a clear and formal language like OWL or some other (hopefully simpler) RDF vocabulary, it would enable some cool automated tasks, like helping search engines understand how pages relate to each other (maybe there is some kind of relation between two pages even if there are no hyperlinks between them), or syndicating not only the news page (RSS) but also the rest of the content.
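As a rough idea of what “exporting the hierarchy” could look like, here is a sketch in Python that writes a page tree as N-Triples lines (one `<subject> <predicate> <object> .` statement per line) and finds pages that are related without linking to each other. The base address and the `parent` predicate are placeholders I made up, not an existing vocabulary.

```python
BASE = "http://www.example.org"                    # placeholder site address
PARENT = "http://www.example.org/terms#parent"     # made-up predicate


def to_ntriples(parent_of):
    """Serialize a {page: parent page} mapping, one statement per line."""
    lines = []
    for page, parent in sorted(parent_of.items()):
        lines.append(f"<{BASE}{page}> <{PARENT}> <{BASE}{parent}> .")
    return "\n".join(lines)


def siblings(parent_of, page):
    """Pages sharing a parent are related even with no hyperlink between them."""
    parent = parent_of.get(page)
    return sorted(p for p, par in parent_of.items() if par == parent and p != page)


hierarchy = {
    "/plants": "/",
    "/plants/oak": "/plants",
    "/plants/pine": "/plants",
}

export = to_ntriples(hierarchy)
related_to_oak = siblings(hierarchy, "/plants/oak")
```

A search engine reading the export could work out `related_to_oak` by itself, which is exactly the kind of automated task I have in mind.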
A CMS? A wiki?
I think one engine should be able to manage “content” in whatever way the administrator wants to display it. So an engine should be all of these things together; the final look and behavior are details.
How the content should be edited/stored on the server
And most important: what format should be used? HTML would be the most obvious answer, but I think the ability to download a page/article in richer formats like PDF or OpenDocument is very important. It is also important to remember that HTML is not a fixed entity and will change, even radically, in the next few years (when somebody will actually support it is another story 🙂 ).
So, in order to create a system that is a little more future-proof, it is necessary to make the storage format on the server pluggable, and some candidates pop into my mind:
- MediaWiki format: very simple, even to write by hand, and I think it contains enough information to be converted into a nicely printable format. It would probably also be easy to write a WYSIWYG editor for it.
- DocBook (to be translated into HTML and other final formats by an XSLT stylesheet): very complex and expressive, but the editors for it are too ugly.
- OpenDocument: very complex and expressive, and publishing a new article would be as easy as uploading an OpenOffice file. But producing decent HTML would probably be hard and computationally expensive (it would be necessary to cache everything).
And of course, since HTML is not the alpha and the omega of representing information, it should be possible to download the page in as many different formats as possible. You want an HTML page? www.notmart.org/foo/bar.html. You want a PDF file? www.notmart.org/foo/bar.pdf and so on.
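The “one article, many formats” idea could be sketched like this in Python. It is an assumption-laden toy: the `RENDERERS` registry, the `renderer` decorator and the dictionary used as article storage are all invented here, and PDF is left out because it would need an external library, though it would plug in the same way.

```python
import html
import posixpath

RENDERERS = {}  # file extension -> function producing that format


def renderer(extension):
    """Register a function as the renderer for one output format."""
    def register(func):
        RENDERERS[extension] = func
        return func
    return register


@renderer(".html")
def render_html(article):
    title = html.escape(article["title"])
    body = html.escape(article["body"])
    return f"<h1>{title}</h1>\n<p>{body}</p>"


@renderer(".txt")
def render_text(article):
    return f"{article['title']}\n\n{article['body']}"


def resolve(url_path, articles):
    """Split /foo/bar.html into the article /foo/bar and the format .html."""
    stem, extension = posixpath.splitext(url_path)
    if extension not in RENDERERS or stem not in articles:
        return None  # unknown format or unknown article
    return RENDERERS[extension](articles[stem])


# Stand-in for whatever storage format the server really uses.
ARTICLES = {"/foo/bar": {"title": "Foo & Bar", "body": "Hello, web."}}

as_html = resolve("/foo/bar.html", ARTICLES)
as_text = resolve("/foo/bar.txt", ARTICLES)
```

Adding a format means adding one decorated function; the hierarchy, the URLs and the stored articles stay untouched.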
How the user should publish content
This should be an easy procedure for everybody, even the most computer-illiterate, and of course it should be totally platform independent.
It would be nice to be able to edit the site content both with a normal web browser and with a dedicated client: think about XML-RPC and the Flock blogging interface, but of course not limited to blogging. Obviously the ad-hoc client must not be the only way to edit content: in order to maintain total platform independence, a web-based interface should always be available (and of course somebody will hate the graphical client and somebody else will hate the web interface, so the choice must always be there).
Whether I would like AJAX for the web-based interface, I am not sure: some things like Google Maps are cool, but try to use them with a 56k modem or with an old web browser :-). So if advanced JavaScript, AJAX, XForms (when in the year 2050 some browser will support it 🙂 ) and other buzzwords are used, there should always be a plain old HTML version too, as Gmail has.
That’s it 🙂
These are my random thoughts about the web. Maybe one day, if I have some time, I will try to implement some of these ideas, but probably I won’t have the time, and probably by tomorrow I will have totally changed my mind on the subject. In the meantime I will keep searching for the perfect system, which I’m sure is somewhere out there 🙂