A few Linux utilities that are useful for manipulating XMLTV schedule files

In my previous article, Some hints for getting free-to-air satellite channels into the Electronic Program Guide in Kodi or XBMC (or another frontend), I mentioned that schedule “grabber” programs save their files in XMLTV File format. So let’s say you have one or more XMLTV type files, but you want to do some additional manipulation on them before feeding them to your backend software.  Here are a few tools that run under Linux that I have found that may be useful, under certain circumstances.  These are in addition to zap2xml, which I mentioned in my previous article.

NOTE:  To find out if you have a particular program installed on your system, try entering the word which followed by a space and the program name at a Linux command prompt.  If the program is installed on your system, it should show you the path to the file.  Note that you will probably need to use the full path and filename if you are attempting to run the program from a shell script or a cron job!

  • tv_cat – Concatenate XMLTV listings files. The man page description says, “Read one or more XMLTV files and write a file to standard ouput whose programmes are the concatenation of the programmes in the input files, and whose channels are the union of the channels in the input files.”  Or in simple terms, it merges XMLTV format files together.  This program may already be on your backend system but if it’s not, you can typically install it on a Ubuntu/Debian-based system (and possibly in some other Linux distros) by installing the xmltv package.  tv_cat is a bit picky about the format of the files it will combine, so check the output carefully to make sure it is including all the channels.  I had some issues using tv_cat with TVHeadEnd, and wound up using a small quick-and-dirty Perl script to combine XMLTV listings files instead.
  • sed – Stream EDitor. This is a utility built into just about EVERY Unix/Linux system out there, and it’s probably available for Windows in some form also. sed is more or less a one-trick pony – it searches for text and replaces it with something else.  You can use it to resolve duplicate channel ID’s in two different XMLTV files by changing them in one of the files, using a command of the form sed -i ‘s/original text/replacement text/’ filename but note that there are some potential “gotchas”, so read the documentation first.  For example, if either the search or replace string contains a / character, it mush be “escaped” with a backward slash, so as not to be confused with the / delimiter character.  So if, for example, your search or replace string included the closing tag </display-name> you’d use <\/display-name> instead. (EDIT: You can also change the delimiter character to avoid this issue – see the first comment below).

    I will note that there are some “purists” out there that will say that you should never use sed to manipulate an XML file, even though it’s easy and (if you are careful to use unique strings that don’t appear anyplace that you don’t want to change) fairly foolproof, so I suppose I had better mention a tool that is specifically intended for manipulating XML files…

  • xmlstarlet – command line XML toolkit. According to the description, “XMLStarlet is a set of command line utilities (tools) which can be used to transform, query, validate, and edit XML documents and files using simple set of shell commands in similar way it is done for plain text files using UNIX grep, sed, awk, diff, patch, join, etc commands.” This is another one that you will likely find in your Linux distribution’s repository, at least if you are running a version of Ubuntu or Debian.  This one offers you a lot more flexibility in manipulating XML files, but at the expense of being somewhat more complicated to use.

Since xmlstarlet is a bit difficult for some users to wrap their heads around, I will give here some actual examples of how it could be used on an XMLTV format file, but please note that I am no expert with this so if you have a different proposed usage, please try to figure it out for yourself using the handy documentation, available as a web page or in PDF format.  No offense, but better you should spend a couple hours trying to figure out the correct syntax to achieve whatever results you want than me! 🙂

1. Remove all “Local Programming” entries from an XMLTV file named xmltv.xml and save to newfile.xml:
xmlstarlet ed --delete "//programme[title='Local Programming']" xmltv.xml >newfile.xml
2. Same as above but only for one specific channel:
xmlstarlet ed --delete "//programme[title='Local Programming'][@channel='someid.someaddress.com']" xmltv.xml >newfile.xml
3. To change the value of @channel wherever it appears in the file:
xmlstarlet ed -u "//programme[@channel='someid.someaddress.com']/@channel" -v 'newid.newaddress.com' xmltv.xml
4. To extract all entries for a specific channel to a separate file (non-destructive – does not change the original file):
xmlstarlet sel -t -m "//programme[@channel='someid.someaddress.com']" -c . -n oldfile.xml >newfile.xml

Note that I am not saying that any of the above are the best example of how to do something.  As you can see, especially from the last example given, this program has some rather non-intuitive syntax for its command line arguments (to put it mildly).  If you have any additional – or better – examples of using xmlstarlet to manipulate XMLTV files, please leave them in a comment and I will consider adding them here.

That said, if you need to do an operation on an XMLTV file and don’t want to write a program or script to do it yourself, xmlstarlet could be your salvation – IF you can figure out how to use it!