16 May 2020

Gradually importing my Wordpress blog

I've made a bit of progress in importing my Wordpress blog, though I must admit it is quite a struggle. Although I'm using Emacs for the blog, I'm very new to Emacs. As someone has said, it isn't so much an editor as an engine for the LISP programming language, with many arcane functionalities. You basically program Emacs to work in whatever way you want and do the things that you want. But I'm not a programmer and don't know LISP, so that means trying to figure out what other people have programmed for it and incorporating the useful parts.

Someone had suggested using a piece of Pelican - another static-blog generator - as a middleman for importing the blog from Wordpress. I tried that, unsuccessfully. One problem was that in installing the necessary files I filled my Linux root partition to 100% and basically had to abandon that whole Linux installation. This meant taking some of the generous space I had assigned to a swap partition, creating a new Linux root partition there, downloading and installing a fresh copy of my Linux distribution (MX Linux) on it, and reusing the home partition. It wasn't so bad - the actual re-installation took about an hour. Having learned the hard way that 12 gigabytes isn't enough for a modern Linux system (though it used to be), I assigned 20 gigabytes this time.

But the Pelican importer wouldn't work for me. It depended on various Python components that seemed to be incompatible with one another. I gave up on that and started looking for different options. One was to try to edit the XML download file containing all my blog posts from Wordpress. I opened it in Libreoffice as a 1,000-page file and started searching for and replacing all the extraneous parts. But that was ridiculous, and eventually the program crashed. I gave up on that too.
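For anyone who wants to try the XML route without a word processor, here is a minimal sketch in Python of reading the posts out of the export file with the standard library's XML parser. It is only an illustration under assumptions, not something I used: the names `read_posts`, `child_text` and `SAMPLE` are made up, and it assumes the export marks each entry with `post_type`, `post_date` and a `content:encoded` body, as Wordpress export files normally do.

```python
import xml.etree.ElementTree as ET

# Namespace RSS uses for full post bodies (the content:encoded element).
CONTENT_NS = "{http://purl.org/rss/1.0/modules/content/}"

# A tiny stand-in for the real export file (hypothetical data).
SAMPLE = """<rss version="2.0"
  xmlns:content="http://purl.org/rss/1.0/modules/content/"
  xmlns:wp="http://wordpress.org/export/1.2/">
<channel>
  <item>
    <title>Hello</title>
    <wp:post_date>2020-05-16 10:00:00</wp:post_date>
    <wp:post_type>post</wp:post_type>
    <content:encoded><![CDATA[<p>First post</p>]]></content:encoded>
  </item>
  <item>
    <title>photo.jpg</title>
    <wp:post_type>attachment</wp:post_type>
  </item>
</channel>
</rss>"""


def child_text(item, local_name):
    """Return the text of the first child with this tag name, ignoring namespaces."""
    for child in item:
        if child.tag.split("}")[-1] == local_name:
            return child.text or ""
    return ""


def read_posts(xml_text):
    """Extract the blog posts from export XML as a list of dicts."""
    root = ET.fromstring(xml_text)
    posts = []
    for item in root.iter("item"):
        # The export also holds pages and attachments; keep only real posts.
        if child_text(item, "post_type") != "post":
            continue
        posts.append({
            "title": child_text(item, "title"),
            "date": child_text(item, "post_date"),
            "html": item.findtext(CONTENT_NS + "encoded", default=""),
        })
    return posts


for post in read_posts(SAMPLE):
    print(post["date"], post["title"])
```

For a real export one would read the file from disk first and pass its contents to `read_posts`. The post bodies come out as HTML, so they would still need tidying by hand.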

I looked into various feed readers that I might use instead. If I could download the posts as an RSS news feed, that would halfway serve my purpose. So I discovered that Emacs itself has the potential to work as a feed reader. It relies upon a component called Elfeed. That works very well, in fact. Again, it was a bit of a struggle, during which I learned various new things about Emacs. For example, I learned that the configuration for the whole Emacs system is properly done through a command known as Customize (Alt-x, then “customize”, then searching for the correct component to change). I also learned how to install new functionality in Emacs without relying only on what the Debian package manager offers.

Eventually I ended up with a nice RSS news feed that looked very similar to the flat plain text files that I would need to generate in the Emacs static blog program in order to produce the blog. It would be easy to transfer them over into the blog. The only problem was that the feed reader was importing only a few blog posts, rather than the approximately 700 that I desired. Was this a result of some limitation in the feed reader, or in the RSS file produced by Wordpress.com? It turned out to be the latter more than the former. By default, Wordpress adds only about 10 posts to the feed. But you can change that easily in its Settings, under Reading. It is true that the maximum is still only 150. But that's a start. I can, I think, import 150, then turn those posts into WP Private posts or drafts, and then continue with the next batch - if I ever get that far.

Whether I get that far is a real question, because the process is still rather time-consuming. Smarter people than me would easily write a program to complete the whole process quickly. But I'm still somehow doubtful that it would go well. In my experience, importers are rarely perfect. The import process involves changing various things by hand, and a program wouldn't be able to do that so well. So, in my own way, I'm proceeding slowly, and, in the meantime, learning various things about Emacs and the static blog engine under Org-mode.
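As a rough idea of what such a program might look like, here is a small Python sketch that turns one post into a plain text Org file. It assumes the title, date and HTML body of the post have already been pulled out of the export or the feed. The function names `slugify` and `to_org` are invented for the illustration, and the header lines (`#+title:`, `#+date:`, `#+filetags:`) and the date-plus-title file name are my assumption about what the static blog engine expects, so check them against a post that already works. The HTML is simply wrapped in an Org export block rather than converted, which is exactly the kind of thing that would still need hand editing afterwards.

```python
import re
from datetime import datetime


def slugify(title):
    """Lower-case the title and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def to_org(title, date_text, html, tags=""):
    """Build (filename, file_text) for one post.

    date_text is expected in the form "2020-05-16 10:00:00".
    """
    when = datetime.strptime(date_text, "%Y-%m-%d %H:%M:%S")
    lines = [
        "#+title: " + title,
        "#+date: <" + when.strftime("%Y-%m-%d %H:%M") + ">",
        "#+filetags: " + tags,
        "",
        # Pass the original HTML through untouched (assumed approach).
        "#+begin_export html",
        html,
        "#+end_export",
        "",
    ]
    filename = when.strftime("%Y-%m-%d") + "-" + slugify(title) + ".org"
    return filename, "\n".join(lines)


filename, text = to_org(
    "Gradually importing my Wordpress blog",
    "2020-05-16 10:00:00",
    "<p>I've made a bit of progress.</p>",
    tags="technology",
)
print(filename)
print(text)
```

Writing `text` out to `filename` in the posts folder would be the last step, and I would want to look over each file before publishing it.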

Writing a static blog, even with the semi-automation, is never going to be quite as comfortable as writing in a content management system such as Wordpress offers. You make a simple mistake in a date or in incorporating a link or an image, and something goes wrong. But I have worked with static blogs before, and this system is actually easier than the ones that I've used. It's a huge plus to end up with perfectly readable plain text files that are not held in some database. I understood a while back that plain text files that sit in folders, without any dependency on some program to make them accessible, are the best way to work. Linux has various diary systems like Redbook but I don't like these. Contradicting myself again, I'm actually writing this in Cherrytree notes. I do like Cherrytree. I mostly use it for keeping a knowledge base for all the things I need to do on the computer. There's no way I could remember them, and Cherrytree is an easy way of rediscovering something I learned 5 years ago about Libreoffice or Wordpress. I write everything down. With my work on Emacs, for example, instead of having to remember the syntax for including an image, I can copy that line of code into Cherrytree and go back to it.

Emacs Org-mode offers the possibility to entirely replace Cherrytree - in fact note-keeping and ToDo lists are the most common use for it. But normally, in that process, one ends up with huge files. It would be possible to split those into separate files later, but Cherrytree has much easier ways of importing and exporting to and from just about everything, in addition to rich text formatting and various other niceties. In my experience, I like to go with the simplest option, rather than the smartest, as long as autonomy isn't compromised. Autonomy is crucial. Digitalization constantly offers us so many shortcuts. Using Wordpress is one such shortcut. Using something like Facebook would be another, much, much worse, option. But the shortcuts end up making you work harder to recover what you want, and you end up, years later, doing what you should have done from the beginning: taking charge of your own material. And this is also a matter of proficiency. A programmer has a higher level of ability with regard to software use than a person without programming skills. As someone with only middling capabilities, I'm obviously going to seek options that I can wrap my head around. If the solution appears to be too complicated, I'm going to opt for something simpler.

Because I'm not a computer genius, and lack many skills, I look for simple ways of doing stuff. Simplicity and autonomy are always my aim, even if the ways there are often convoluted. But it's a worthwhile effort. In a century, I imagine, it will still be possible to work with plain text files that don't rely on any proprietary system or complex databases. This Emacs Org Static Blog outputs plain text files that are kept in sync with nice-looking web pages. Using such a system is worth the little extra effort. There are probably various other ways to achieve something similar with less energy. There were Posterous and Scriptogram, for example, which allowed one to post from email or from Dropbox. But services like these eventually run out of money and are discontinued. The same may happen to Medium, Tumblr (currently owned by Automattic, the company behind Wordpress.com), Facebook and all the rest. Plain text files are the way to go. Cheap and best; future-proof, as much as anything can be.

Tags: technology
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.