
Tavern Change Log Volume 2


Date: 3.31.’03.

The Travellers’ Guestbook is finally back. Ha ha!

It has been 2 years since it was shut down. The previous programs were written in the middle of 1998, almost 5 years ago, and they were very buggy. Although I kept fixing them, I couldn’t get past several major issues. To stop patching here and there all the time, I decided to turn the guestbook off completely and write a new one. But those issues were too big to solve. Several months later I still had no idea at all, and the work stayed suspended for 2 years.

Recently my other project, Monica, a web site content management system written in PHP, made some progress and solved several major issues. Then last week Quity tried to ask me some Linux questions but couldn’t reach me on ICQ. She left a message on ICQ asking me to deal with the guestbook as soon as possible, so that she could leave a message even when I’m not on-line. That got me wondering: Why not use my experience with Monica here? What if I migrated the PHP Monica to CGI/Perl or mod_perl? I also wanted to see how much time I would need to write a new guestbook now, with my current skills. That became the Selima project, which is the new guestbook you see now. I spent about a week on the migration. PHP is quite different from CGI/Perl or mod_perl, and there are still some issues, but it seems to be successful so far.

Monica/Selima has solved the major problems of the old guestbook. The following have been improved:

  1. Messages are no longer shown inside <pre>…</pre>. A subroutine, a2html(), now turns each message into HTML, which solves the line-breaking problem while preserving the original text layout.
  2. Language switching has been added. The original mixed Chinese/English pages are split into pages in 3 languages: Traditional Chinese, Simplified Chinese and English.
  3. Messages are saved in UTF-8, and Unicode processing has been added. This solves the problem of messages from foreign friends becoming garbled.
  4. Forms are suspended and followed by a redirect, so that content is always displayed in response to GET requests and never in response to POST requests. This solves the problem of pressing reload after a POST and producing duplicate messages.
  5. E-mail addresses in messages are no longer displayed directly. This prevents spammers from harvesting the addresses and spamming our friends.
  6. Pages are now split by message size rather than by number of messages. This avoids having some pages extremely long and others very short.
  7. It is optimized for mod_perl and compatible with CGI/Perl and even the command line, so it is more efficient and flexible.
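
As a rough illustration of item 1, here is a minimal sketch of turning a plain-text message into HTML while keeping its line breaks and indentation. It is written in Python rather than Perl, it is not the actual a2html() (whose code is not shown here), and the function name text_to_html is made up for this example.

```python
import html


def text_to_html(message: str) -> str:
    """Convert plain text to HTML, keeping line breaks and leading spaces.

    Hypothetical stand-in for the idea behind a2html(); not the real code.
    """
    out = []
    for line in message.splitlines():
        # Escape <, >, & and quotes so the message cannot inject markup.
        escaped = html.escape(line, quote=True)
        # Browsers collapse runs of spaces, so keep the indentation
        # by turning leading spaces into non-breaking spaces.
        stripped = escaped.lstrip(" ")
        indent = "&nbsp;" * (len(escaped) - len(stripped))
        out.append(indent + stripped)
    # One <br> per original line break, so long lines can still wrap.
    return "<br>\n".join(out)
```

Unlike <pre>, this lets the browser wrap long lines while the breaks the writer typed stay where they were.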

There are many more improvements not listed here. It’s a pity that, since Selima uses a complex set of its own libraries, it’s very difficult to release them in a form other people could understand, so it cannot be an open-source project.

The guestbook data file format has changed, so the old messages are not shown at the moment. I’ll put them back after I have sorted and converted them.

Selima is an Arabic female name meaning peace. The US-Iraq war is going on. The US government disregarded the objections of the UN, invaded Iraq and caused numerous innocent deaths. I named the project Selima in memory of those innocent lives, and in the hope that the war ceases and peace comes to the Arabic world again.


Date: 11.5.’00.

Tavern IMACAT’s has gone through many changes. It’s time to sum them up here.

First of all: In the middle of July, my fast fixed ADSL connection was finally installed, and Tavern and WOV finally moved from STS to my own host. The task was much simpler than I had expected, because both are Windows 2000 IIS 5.0 hosts, which I’m quite familiar with. Nothing changed. The only problem was that my computer was an old Pentium Ⅱ 233, and serving as the main Win32 server was too heavy a load for it.

After Tavern and WOV had moved, I had nothing left to fear. I resigned from STS in the middle of August. Then I bought a new Pentium Ⅲ 800 as a new server with my severance pay. It runs Red Hat Linux 6.2. (It has been upgraded to 7.0 now.) It was really hard in the beginning. It’s a completely different world from the Win32 I was familiar with, with totally different concepts. But I got a nice new job, with a whole library of Linux books, and I borrowed them to take home every day. Only after I had become familiar with the Apache server did I move Tavern and WOV onto the new server. According to the records in the WOV editors’ notes, that was Sep. 16th, ’00, about one month after I bought the machine.

The major change in moving to Linux Apache was the HTTP headers. I had not been running CGI, but ISAPI PerlIS under IIS. PerlIS cannot parse HTTP headers, so I had to set the HTTP headers up manually. But under Apache CGI, raw HTTP headers cause errors, so I had to rewrite all the HTTP headers.

After that, things happened in quick succession.

On Sep. 26th I wrote the guestbook instructions, added a help message to the WOV subscription pages, and made an artificial HTTP 403 Forbidden message. These were documents I had planned to write long ago. The main point of the guestbook instructions is the line-breaking problem: The guestbook does not break lines automatically, and I’m not planning to add automatic line-breaking. It’s the style of prose poems, a habit from my BBS days. But line-breaking in guestbooks is still a big problem. I’ve looked at the line-breaking policies of the Yam and Kimo discussion boards, but found no better solution. This is still a problem to work on.

The WOV editors’ e-mail address was added to the subscription page, to help when the subscription program runs into an error. The HTTP 403 Forbidden message protects the site administration pages and the subscribers’ data.

On Sep. 28th the robots.txt files were done. They tell search engine robots where they may go and where they should not. This keeps search engines from indexing private information, such as the site administration pages.
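
To show what such rules do, here is a small Python sketch that feeds a sample robots.txt to the standard library’s urllib.robotparser and asks whether a path may be crawled. The paths /admin/ and /subscribers/ and the robot name ExampleBot are assumptions for this example, not the real layout of the site.

```python
from urllib.robotparser import RobotFileParser

# Sample rules; the directory names are hypothetical.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /subscribers/
"""


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a well-behaved robot may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)
```

Note that robots.txt only asks robots to stay away. It is not access control, which is why the administration pages are also protected with HTTP 403.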

On Oct. 11th I started using mod_perl. This is another milestone. Because my company does not use Perl, I had no way to learn how to use the legendary mod_perl to enhance the websites. At last I found mod_perl and its installation documents, and recompiled and reinstalled the Apache server. But the HTTP header processing of Apache mod_perl is different from Apache CGI again! To avoid the same problems in the future, I abandoned manual HTTP headers and let CGI::header() handle them. I took away my own time2str(), too, replacing it with HTTP::Date::time2str().
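
The idea of leaving header and date formatting to a library can be sketched as follows. This is a Python analogue, not the Perl code of the site: email.utils.formatdate plays roughly the role that HTTP::Date::time2str() plays in Perl, and build_headers is a hypothetical helper invented for this example.

```python
from email.utils import formatdate


def build_headers(status: str, content_type: str, last_modified: float) -> str:
    """Build a CGI-style header block from a status, a type and a Unix time.

    Hypothetical helper; the date string comes from the standard library
    instead of hand-written formatting code.
    """
    lines = [
        f"Status: {status}",
        f"Content-Type: {content_type}",
        # usegmt=True yields the HTTP date form, e.g. "... 00:00:00 GMT".
        f"Last-Modified: {formatdate(last_modified, usegmt=True)}",
    ]
    # A blank line ends the header block.
    return "\r\n".join(lines) + "\r\n\r\n"
```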

This solved another problem: mod_perl was unable to keep connections alive. CGI::header() calls the Apache::Registry header parser, which solves the connection problem.

I have started to change my old practices. Before, to save system resources and avoid loading unnecessary external modules, I tried to write the modules I needed myself as much as possible. Now, to keep compatibility, I rely on external modules by others as much as possible. This is an important conceptual change for me, especially in Perl and UNIX programming: cooperation. A Perl program should be the result of cooperation among different people, not a lone program fighting to handle every problem itself.

The same goes for UNIX. Different programs and processes call each other to cooperate and pass messages to produce the result. I can do this in Win32, too, but with too few tools and at too high a cost in system resources. It is not what Win32 was designed for.

Only now have I really entered the world of UNIX, a completely different world from DOS/Win32.

Also, mod_perl loads the necessary modules and retains them in the Apache process, which causes confusion among module variables. I’ve moved the variable settings from the common modules into the separate scripts. To keep different common modules with the same name from being confused with one another, I’ve changed the names of the common modules, too.

I had written a complicated formula to calculate the last-modified time of each volume of the guestbooks. This formula has since been removed, to avoid confusing the browser cache.

I’ve downloaded and installed the W3C HTML validator website, and I run it on a virtual host.

That same night, I wrote a more effective internet domain check with gethostbyname(). I modified the previous formula to fix the bug where the last modified time could not be calculated when no guestbook messages were present. Also, the artificial HTTP 303 See Other redirect after saving a new guestbook message caused a problem in the browser, because mod_perl adds its own HTTP 303 message body, which makes the declared length of the message body wrong, so the browser cannot determine the real content length. I’ve temporarily disabled the Content-Length HTTP header to solve it.
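
The save-then-redirect flow can be sketched as a tiny WSGI application in Python. This is only an illustration under assumptions: the URL /guestbook and the in-memory MESSAGES list are made up, the raw POST body is stored without form decoding, and the real site is Perl under mod_perl. The point it shows is that Content-Length is always computed from the bytes actually sent, which is exactly what went wrong when another layer added its own body.

```python
import html

MESSAGES: list[str] = []        # in-memory stand-in for the guestbook data file
GUESTBOOK_URL = "/guestbook"    # hypothetical URL of the listing page


def guestbook_app(environ, start_response):
    """POST saves a message and answers 303; GET shows the messages."""
    if environ["REQUEST_METHOD"] == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        # Simplification: keep the raw body instead of decoding a form.
        MESSAGES.append(environ["wsgi.input"].read(length).decode("utf-8"))
        # 303 See Other: the browser follows up with a GET, so pressing
        # reload repeats the GET and never re-sends the message.
        start_response("303 See Other",
                       [("Location", GUESTBOOK_URL), ("Content-Length", "0")])
        return [b""]
    body = "".join(f"<p>{html.escape(m)}</p>\n" for m in MESSAGES).encode("utf-8")
    # Content-Length must match the bytes really sent.
    start_response("200 OK",
                   [("Content-Type", "text/html; charset=utf-8"),
                    ("Content-Length", str(len(body)))])
    return [body]
```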

On Oct. 19th I added flock() file locking wherever a file is opened, to avoid corruption when reads and writes happen at the same time. It should have been done long ago, but I was not clear on how it works, so I left it alone. Now that I’ve carefully read the details of flock() and know how it works, I’ve finished it. Besides, for the new flock() method, I’ve rewritten the program flow of the WOV subscription program. This program flow needs another rewrite to solve another problem: The determination of the last modified time.
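
For readers who have not used flock(), here is a minimal sketch of the usual pattern in Python, using fcntl.flock, which wraps the same system call on UNIX-like systems (it is not available on Win32). The function names and the one-message-per-line file format are assumptions for this example, not the guestbook’s real format.

```python
import fcntl


def append_message(path: str, text: str) -> None:
    """Append one line under an exclusive lock, so writers do not interleave."""
    with open(path, "a", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # waits until no one else holds a lock
        try:
            f.write(text + "\n")
            f.flush()                      # write out before releasing the lock
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)


def read_messages(path: str) -> list[str]:
    """Read all lines under a shared lock, so no writer is halfway through."""
    with open(path, "r", encoding="utf-8") as f:
        fcntl.flock(f, fcntl.LOCK_SH)      # many readers may hold this together
        try:
            return f.read().splitlines()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

These locks are advisory: they only protect against other programs that also call flock() on the same file.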

Also, I’ve written my own artificial HTTP 500 Internal Server Error message, and trapped file opening errors to produce HTTP 500 errors. This solves the old problem that the site administration program did not warn when rebuilding pages failed.

Oct. 27th is another major milestone. Over the weekend I solved all the problems found so far. They are described below:

  1. I’ve trapped all errors on file I/O and other system calls and turned them into HTTP 500 error messages, to guard program processing more strictly. The artificial HTTP 500 error originally quit with exit(). Now it quits with die, leaving a message in the server error log.
  2. I’ve rewritten the process flow of the counter to reduce unnecessary file locking and avoid getting stuck. Besides, the counter now identifies internal IPs with grep().
  3. I’ve worked out the details of how cookies work. So I’ve simplified the counter cookies, replacing the CGI::Cookie module with the simpler CGI::header().
  4. I’ve worked out the details of the GD module and simplified the graphical processing of the counter, the last update and the subscribers counter.
  5. I’ve found a way to recognize the network operating system, rewritten the process flow of security control under Win32, and employed getgrnam() and getpwnam() to do the security control under Linux.

Sat., Oct. 28th. I’ve finally finished the keywords system!! I’ve rewritten all the databases and the site administration interface, adding a keywords field to them. I had been planning to do this for a long time. At the same time I’ve rewritten the interface for editing the related links, providing greater capability and flexibility. I’ve also written the authority announcement and the privacy policy, linking the WOV in Tavern to the privacy policy on WOV.

I had been planning to write a privacy policy for a long time. Since I had worked out the details of cookie transfer, I wrote it at the same time. I hope it reminds you not to leave personal information around carelessly. As for the Authority Announcement, well… Basically, I hate this authority stuff. But some commercial sites just went too far. Besides, some people have started to get interested in me and my writings. (What have I done recently?) To avoid trouble, it’s better to state it clearly.

Sun., Oct. 29th. I’ve improved the artificial error messages, trapping file I/O errors inside the artificial error messages themselves, and sending a mail to the webmaster when an artificial HTTP 500 occurs.

I’ve solved another earlier bug. The programs could not recognize the If-Modified-Since HTTP header, and, as I’ve described above, the calculated last modified time could go back to the time of the previous message. Together these confused the browsers about the time and content of their cache, and made them get stuck when re-reading a page. I’ve decided to drop the previous last modified formula and replace it with a new one based on the last modified dates of the program and the data themselves. I’ve also added support for responding to the If-Modified-Since HTTP header, with an artificial HTTP 304 Not Modified message, to support the browser cache.
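
A minimal sketch of this kind of check, in Python: the last modified time is taken as the newest modification time among the program and its data files, and a request whose If-Modified-Since date is not older than that gets 304 Not Modified. The function name conditional_status and its arguments are assumptions for this example, and the actual formula used on the site may differ in detail.

```python
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime


def conditional_status(if_modified_since, mtimes):
    """Return (status line, Last-Modified header value).

    if_modified_since: the request header value, or None if absent.
    mtimes: Unix modification times of the script and its data files.
    """
    last_modified = int(max(mtimes))        # HTTP dates have 1-second precision
    header = formatdate(last_modified, usegmt=True)
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None                    # unparsable date: ignore the header
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since.timestamp():
                return "304 Not Modified", header
    return "200 OK", header
```

With a 304 answer no message body is sent, and the browser reuses its cached copy.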

I’ve made an artificial HTTP 405 Method Not Allowed error, to reject requests with PUT / DELETE / TRACE … Although this should never happen normally, it would be rather stupid if it happened without being handled.
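
The check itself is short. Here is a Python sketch, assuming that GET, HEAD and POST are the only methods the programs accept; that list is a guess for illustration, and check_method is a hypothetical name. HTTP requires a 405 response to carry an Allow header listing the accepted methods.

```python
ALLOWED_METHODS = ("GET", "HEAD", "POST")   # assumed set of accepted methods


def check_method(method: str):
    """Return None if the method is accepted, else a 405 status and headers."""
    if method in ALLOWED_METHODS:           # method names are case-sensitive
        return None
    # The Allow header tells the client which methods would have worked.
    return "405 Method Not Allowed", [("Allow", ", ".join(ALLOWED_METHODS))]
```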

At this point, the site programming has come to an end. I’ve solved all the problems I could think of, and taken a deep breath.

Nov. 2nd. There are nearly 600 messages in the Travellers’ Guestbook now. The old database engine was overloaded, causing serious lag. After experimenting with different methods, I’ve rewritten the database engine. The old engine parsed the data one by one, while the new one uses batch processing. It’s a little tricky, but it is much faster. The new database engine keeps a good speed even with over 6,000 messages. What about over 10,000 messages? That’s too far off for me to worry about now. Basically, I’m not planning to use an SQL database. It’s a matter of principle: portability. I want the whole website to be able to move in its entirety onto any server at any time, without depending on the specifics of the environment: MySQL? mSQL? PostgreSQL? MS-SQL? Access ODBC? Or IBM DB2?

Because the site engineering has come to an end, I plan to release the source of Tavern IMACAT’s. This is a rather large project. Even though I coded with good conventions, I did not write many comments. The code still needs a lot of comments for others to understand it.

This weekend I tried to move Tavern IMACAT’s back to the Win32 IIS environment. It’s more complicated than I had expected. PerlIS cannot parse HTTP headers, IIS CGI cannot set the current working directory, and mod_perl Apache::Registry deals with CGI::header() differently from the others. I spent 3 days looking through all the Perl and mod_perl documentation and discussions. I finally wrote code that uses absolute paths for maximum compatibility and adjusts the HTTP headers for each special situation. This solved the problem I mentioned above: CGI::header() automatically adds a message body for artificial HTTP errors under mod_perl Apache::Registry, causing a wrong message body length which confused the browsers. Now the programs do not send message bodies under mod_perl, leaving this task to mod_perl itself.

I’ve rewritten the process flow of the graphical programs, including the counter, the last update and the subscribers counter, so that their process flow follows the same convention as the other programs.

There are still things to do: The process flow of the WOV subscription program needs to be rewritten to process the subscription first and then determine the last modified time. I’ll have to add more comments, too, before releasing the source code. Also, I’ll have to add the version history of this website.

Releasing the source code is not a big deal. Free source code for guestbooks and counters is available everywhere now. But it’s still an important milestone for me.

