Friday, February 05, 2021

Web Comics Crawling

Man, it has been a looooooooong while since I wrote an entry tipsy.

My choice of poison for the evening was the Absolut Oak. I am not usually one who goes about buying ``limited edition'' drinks, but this was bought from a friend a long while back, while said friend was trying to off-load some of the things that were obtained when with a spouse-to-be that didn't. It was not something that I would normally get either (not really a vodka person), and the first time that I tried it, it tasted bad compared to the whiskeys that I was trying at the same time.

Funnily enough, over time, the flavour profile became more palatable.

Anyway, the purpose of this entry isn't to talk about the spirits that I have been quaffing, but about what I had done for the day.

I updated some of the off-line copies of web comics that I have been following through careful crawling with my own scripts. These scripts had been written in Python2, but since that [version of the Python interpreter] is no longer going to be updated, I have taken the opportunity to update them to Python3 as well. The crawling principle is simple: from a given starting HTML page, find the comic image, download it, wait some time, then follow the ``next'' link to the next comic page and continue the process until we reach a loop or have no way of proceeding.
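
For the curious, the crawling principle can be sketched roughly as below. This is not my actual script, just a minimal Python3 illustration using only the standard library. It assumes, purely for the sake of example, that the comic is marked up as an img tag with id="comic" and that the ``next'' link is an anchor with rel="next"; every real site needs its own rules. The names ComicPageParser, fetch_bytes and crawl are made up for this sketch.

```python
import time
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin


class ComicPageParser(HTMLParser):
    """Collects the comic image URL and the 'next' link from one page.

    Assumption for illustration only: the comic is <img id="comic"> and
    the next link is <a rel="next">. Real sites differ.
    """

    def __init__(self):
        super().__init__()
        self.image_src = None
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("id") == "comic":
            self.image_src = attrs.get("src")
        elif tag == "a" and attrs.get("rel") == "next":
            self.next_href = attrs.get("href")


def fetch_bytes(url):
    """Download a URL and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def crawl(start_url, fetch=fetch_bytes, delay=5.0, max_pages=10000):
    """Yield (image_url, image_bytes) while following 'next' links.

    Stops when a page was already visited (a loop), when there is no
    'next' link (no way of proceeding), or after max_pages pages.
    """
    seen = set()
    url = start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        parser = ComicPageParser()
        parser.feed(fetch(url).decode("utf-8", errors="replace"))
        if parser.image_src:
            image_url = urljoin(url, parser.image_src)
            yield image_url, fetch(image_url)
        # Resolve the next link relative to the current page, if any.
        url = urljoin(url, parser.next_href) if parser.next_href else None
        if url and url not in seen:
            time.sleep(delay)  # wait a while before the next request
```

Since crawl is a generator, the caller decides what to do with each image, for example writing it to a zero-padded file name so that the pages sort in reading order. The fetch parameter is there so that the download step can be swapped out or faked without touching the loop.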

I like to create off-line copies of my web comics for the simple reason of making it easier to go through the material at my own leisure. Creating cbr files makes the collection far more portable than relying purely on online access. I can just view them with any compatible reader, like CDisplayEx on the PC or Perfect Viewer for Android.
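
As a rough illustration of the packaging step: a cbr file is a RAR archive, and Python's standard library cannot write RAR, so the sketch below swaps in the ZIP-based cbz sibling format instead, which comic readers generally open just as well. The function name pack_cbz is made up here, and it assumes that the downloaded images sit in one directory with zero-padded file names, since readers typically show pages in file name order.

```python
import zipfile
from pathlib import Path


def pack_cbz(image_dir, archive_path):
    """Pack every file in image_dir into a cbz (ZIP) archive.

    Returns the list of page names in the order they were written.
    """
    images = sorted(p for p in Path(image_dir).iterdir() if p.is_file())
    # Images are already compressed, so store them as-is.
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_STORED) as archive:
        for image in images:
            archive.write(image, arcname=image.name)
    return [image.name for image in images]
```

Something like pack_cbz("some_comic", "some_comic.cbz") would then give a single file to copy over to the phone or tablet.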

I'm not linking to CDisplayEx because it allegedly has CandyOpen malware, which I honestly have not seen the effects of. I use MVPS hosts as the last line of defense at the O/S level to prevent connections to sketchy servers via DNS, so that may be a reason why I am not seeing any issues.

Or it could be that they have removed said malware. Who knows, really?

At the same time, I have been reading more of Adi Parva of The Mahabharata of Krishna-Dwaipayana Vyasa, clocking in at another 98 pages today. The current page count is 472/768 completed, and I seem to be on track to complete this first book by the end of next week. Then I can start on something else, while slowly advancing my progress on Harrison’s Principles of Internal Medicine (20th Edition).

That's all I have for today. Till the next update.
