Sunday, February 27, 2022

Fine-tuning Timestamps, and The Great Aquarium

Ah. Night time. Not quite stupid o'clock, but I feel like writing something here.

I spent some time working on a little bit of code between yesterday and today to fine-tune my publishing process for my personal domain. For the uninitiated, I have a customised Python3 script that does minification of the hand-written static HTML/CSS pages, and generates a GZip version to be served faster. I had added a sitemap XML file that is autogenerated as part of the build+deploy process. The final step is to make use of rsync to update the actual files on the server, but without incurring the complete cost of copying all 38M+ bytes.

The bit that required fine-tuning had to do with timestamps. Practically speaking, there are three different timestamps associated with the HTML/CSS files. They are:
  1. Logical timestamp as seen by the ``Updated at 2022-02-26T23:44:31+0800'' lines in the pages;
  2. The physical file's timestamp as reported by the operating system through stat(), more practically observed through the output of ls -ltra or whatever the modified date is under Windows Explorer;
  3. And the timestamp of last commit as stored in the Subversion database.
The logical timestamp is useful for the human viewing the page after it is served, but the HTTP server's various caching mechanisms (as well as the rsync process) use the physical file's timestamp. There is a fourth (the timestamp stored within the GZip file), but that has been adjusted to ensure that it matches the source physical file's timestamp.
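As a rough illustration of that fourth timestamp, here is a minimal sketch of compressing a page so that both the GZip header and the resulting .gz file carry the source file's modification time. The function name gzip_with_source_mtime and the ".gz" suffix are made up for this example; my actual build script is not shown here.

```python
import gzip
import os
import shutil


def gzip_with_source_mtime(src_path, gz_path=None):
    """Compress src_path and stamp the source's mtime into the gzip
    header and onto the .gz file itself (hypothetical helper)."""
    gz_path = gz_path or src_path + ".gz"
    st = os.stat(src_path)
    with open(src_path, "rb") as src, open(gz_path, "wb") as raw:
        # mtime= sets the MTIME field of the gzip header; filename=""
        # keeps the original file name out of the header.
        with gzip.GzipFile(filename="", fileobj=raw, mode="wb",
                           mtime=int(st.st_mtime)) as gz:
            shutil.copyfileobj(src, gz)
    # Make the physical timestamp of the .gz match the source as well.
    os.utime(gz_path, ns=(st.st_atime_ns, st.st_mtime_ns))
    return gz_path
```

With the header timestamp pinned like this, re-compressing an unchanged page should produce the same bytes each time, so the .gz does not look different just because the build was re-run.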

The timestamp business affects how many bytes get sent via rsync to the HTTP server. This is because rsync, by default, treats a file as changed when its size or modification time differs from the copy on the other side, and I was frankly getting annoyed with seeing a long list of files that were supposedly changed even though I did not change them. Another side problem occurs when I check out the Subversion repository onto a new machine---the physical file times on that new machine will be different from those of the same files on my other machines. This means that rsync-ing from one machine to the HTTP server and then rsync-ing from the other machine will guarantee more unnecessary bytes being sent due to the different timestamps.
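To make the annoyance concrete, here is a toy model of the default check that rsync applies when modification times are being preserved (as with -a or -t): a file is skipped only if both size and modification time match. This is not rsync's code, just a sketch of the rule, and quick_check_differs is a name invented for the example.

```python
import os


def quick_check_differs(local_path, remote_size, remote_mtime):
    """Toy model of rsync's default size-plus-mtime comparison.

    Returns True when the file would be treated as changed, even if
    its contents are byte-for-byte identical to the remote copy.
    """
    st = os.stat(local_path)
    return st.st_size != remote_size or int(st.st_mtime) != int(remote_mtime)
```

A fresh checkout gives every file a new modification time, so under this rule every file looks changed.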

Thus, I decided to just force-set each local file's timestamp to match its commit timestamp in the Subversion repository. With the help of this Python3 svn library, I could access the Subversion information for any given file (assuming it's in the repository in the first place). Using that, I managed yesterday to update all the timestamps of my files to match those in the Subversion repository exactly.
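The force-setting step itself is small. Here is a minimal sketch, assuming the commit time has already been fetched as a timezone-aware datetime; how it comes out of Subversion depends on the library, so that part is left out. The name set_mtime_to_commit_time is made up for this example.

```python
import os
from datetime import datetime, timezone


def set_mtime_to_commit_time(path, commit_time):
    """Set a file's access and modification times to commit_time.

    commit_time must be a timezone-aware datetime; a naive one would be
    silently interpreted as local time, so it is rejected here.
    """
    if commit_time.tzinfo is None:
        raise ValueError("commit_time must be timezone-aware")
    ts = commit_time.timestamp()
    os.utime(path, (ts, ts))
```

Something like set_mtime_to_commit_time("index.html", datetime(2022, 2, 26, 15, 44, 31, tzinfo=timezone.utc)) would then pin that file to that instant regardless of which machine the checkout lives on.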

But that was insufficient in the long run, because the moment I edit any file, save and then commit, the timestamps [of the physical files] are no longer in sync. And using the simple script that I put together yesterday (till late) meant waiting for 2+ min just to update all the timestamps of the files, even though I only touched like... 2 files. Thus, I spent this morning working on a smaller delta-version that, instead of crawling through the Subversion repository (slow), crawls through the local file system (fast-er), and only triggers the Subversion repository lookup if the file in question is ``sufficiently recent''. This strategy only adds about 1.0+ s to the overall run-time, but keeps the timestamps in sync.
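The delta idea can be sketched roughly as below. This is not my actual script: the one-day cut-off for ``sufficiently recent'' is an assumption picked for the example, and lookup_commit_time is a hypothetical callable standing in for whatever asks Subversion for a file's commit time (returning seconds since the epoch, or None for unversioned files).

```python
import os
import time


def sync_recent_mtimes(root, lookup_commit_time, max_age_seconds=86400, now=None):
    """Walk the local tree and fix mtimes only for recently modified files.

    lookup_commit_time(path) is the slow step, so it is only called for
    files whose mtime falls within max_age_seconds of now.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    updated = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into Subversion's own metadata directory.
        if ".svn" in dirnames:
            dirnames.remove(".svn")
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime < cutoff:
                continue  # old enough to be assumed already in sync
            commit_ts = lookup_commit_time(path)
            if commit_ts is None:
                continue  # not under version control
            if int(os.stat(path).st_mtime) != int(commit_ts):
                os.utime(path, (commit_ts, commit_ts))
                updated.append(path)
    return updated
```

The cheap os.stat() call filters out nearly everything, so the expensive lookup only runs for the handful of files touched since the last sync.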

That was a fun little exercise to update my process.

-----

I spent much of my afternoon clearing the other half of the hill in Minecraft. I had this thought of creating a large aquarium using the space beneath my large ``industrial complex'' platform. It was much easier to clear out the space because I didn't have to be as careful as when clearing the mountain beneath my brick apartment---I didn't have to worry about accidentally puncturing the dedicated ``floor space''. Being near to my large beacons with the Haste II and Jump Boost II effects made my mining speed superlatively high. So by the time I took a break and started writing this, I had already cleared something like three quarters of the hill.

After clearing out the space, I will lay out some sand to form the base. There's a ravine cutting through though, and I need to decide how I want to deal with it. One way could be to set up a glass ceiling in the ravine and lay the sand on top---the other is to set up a network of signs and lay the sand on top of that. The second is much trickier to do, but it consumes wood rather than glass.

Sand aside, the walls of the space above the sea-level will be made of glass blocks that I have. The hard part after that is how to fill the walled-up space with water---I have an idea, but I need to try it out to see if it really works.

Anyway, it's late. I'm tired. I think I'll go take a shower, and then turn in for the night. Till the next update.
