A few Linux utilities that are useful for manipulating XMLTV schedule files

In my previous article, Some hints for getting free-to-air satellite channels into the Electronic Program Guide in Kodi or XBMC (or another frontend), I mentioned that schedule “grabber” programs save their files in XMLTV File format. So let’s say you have one or more XMLTV type files, but you want to do some additional manipulation on them before feeding them to your backend software.  Here are a few tools that run under Linux that I have found that may be useful, under certain circumstances.  These are in addition to zap2xml, which I mentioned in my previous article.

NOTE:  To find out if you have a particular program installed on your system, try entering the word which followed by a space and the program name at a Linux command prompt.  If the program is installed on your system, it should show you the path to the file.  Note that you will probably need to use the full path and filename if you are attempting to run the program from a shell script or a cron job!
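As a rough sketch of the check described in the note above, a small shell function can report the full path of each tool in one go. It uses `command -v`, the POSIX counterpart of `which`; the function name `find_program` is made up for this illustration.

```shell
#!/bin/sh
# find_program NAME...
# Print the full path of each named program, or a warning if it is missing.
# (find_program is a hypothetical helper name, not a standard command.)
find_program() {
    for prog in "$@"; do
        if path=$(command -v "$prog"); then
            printf '%s -> %s\n' "$prog" "$path"
        else
            printf '%s is not installed (or not on the PATH)\n' "$prog" >&2
        fi
    done
}

# Example: look for the tools discussed in this article.
find_program sed tv_cat xmlstarlet
```

The path printed for each tool is the one you would paste into a shell script or cron job.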

  • tv_cat – Concatenate XMLTV listings files. The man page description says, “Read one or more XMLTV files and write a file to standard output whose programmes are the concatenation of the programmes in the input files, and whose channels are the union of the channels in the input files.”  Or in simple terms, it merges XMLTV format files together.  This program may already be on your backend system but if it’s not, you can typically install it on an Ubuntu/Debian-based system (and possibly in some other Linux distros) by installing the xmltv package.  tv_cat is a bit picky about the format of the files it will combine, so check the output carefully to make sure it is including all the channels.  I had some issues using tv_cat with TVHeadEnd, and wound up using a small quick-and-dirty Perl script to combine XMLTV listings files instead.
  • sed – Stream EDitor. This is a utility built into just about EVERY Unix/Linux system out there, and it’s probably available for Windows in some form also. sed is more or less a one-trick pony – it searches for text and replaces it with something else.  You can use it to resolve duplicate channel IDs in two different XMLTV files by changing them in one of the files, using a command of the form sed -i 's/original text/replacement text/' filename (type plain straight quotes, not the curly ones some web pages substitute) but note that there are some potential “gotchas”, so read the documentation first.  For example, if either the search or replace string contains a / character, it must be “escaped” with a backward slash, so as not to be confused with the / delimiter character.  So if, for example, your search or replace string included the closing tag </display-name> you’d use <\/display-name> instead. (EDIT: You can also change the delimiter character to avoid this issue – see the first comment below).

    I will note that there are some “purists” out there that will say that you should never use sed to manipulate an XML file, even though it’s easy and (if you are careful to use unique strings that don’t appear anyplace that you don’t want to change) fairly foolproof, so I suppose I had better mention a tool that is specifically intended for manipulating XML files…

  • xmlstarlet – command line XML toolkit. According to the description, “XMLStarlet is a set of command line utilities (tools) which can be used to transform, query, validate, and edit XML documents and files using simple set of shell commands in similar way it is done for plain text files using UNIX grep, sed, awk, diff, patch, join, etc commands.” This is another one that you will likely find in your Linux distribution’s repository, at least if you are running a version of Ubuntu or Debian.  This one offers you a lot more flexibility in manipulating XML files, but at the expense of being somewhat more complicated to use.
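To make the sed delimiter point from the list above concrete, here is a minimal sketch that uses | instead of / as the delimiter, so the closing tag needs no escaping. The channel ID and display names are invented sample data, and the result is written to a second file rather than edited in place.

```shell
#!/bin/sh
# Build a tiny sample file so the example is self-contained.
# (The channel ID and names below are made-up sample data.)
sample=$(mktemp)
cat > "$sample" <<'EOF'
<channel id="I2.1.example.com">
  <display-name>Old Name</display-name>
</channel>
EOF

# With | as the delimiter, the / in </display-name> needs no backslash.
sed 's|<display-name>Old Name</display-name>|<display-name>New Name</display-name>|' \
    "$sample" > "$sample.new"

cat "$sample.new"
```

Any character that does not appear in the search or replacement text can serve as the delimiter.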

Since xmlstarlet is a bit difficult for some users to wrap their heads around, I will give here some actual examples of how it could be used on an XMLTV format file, but please note that I am no expert with this so if you have a different proposed usage, please try to figure it out for yourself using the handy documentation, available as a web page or in PDF format.  No offense, but better you should spend a couple hours trying to figure out the correct syntax to achieve whatever results you want than me! 🙂

1. Remove all “Local Programming” entries from an XMLTV file named xmltv.xml and save to newfile.xml:
xmlstarlet ed --delete "//programme[title='Local Programming']" xmltv.xml >newfile.xml
2. Same as above but only for one specific channel:
xmlstarlet ed --delete "//programme[title='Local Programming'][@channel='someid.someaddress.com']" xmltv.xml >newfile.xml
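3. To change the value of @channel wherever it appears in a programme entry, saving the result to newfile.xml (without the redirect, the result is only printed to the screen):
xmlstarlet ed -u "//programme[@channel='someid.someaddress.com']/@channel" -v 'newid.newaddress.com' xmltv.xml >newfile.xml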
4. To extract all entries for a specific channel to a separate file (non-destructive – does not change the original file):
xmlstarlet sel -t -m "//programme[@channel='someid.someaddress.com']" -c . -n oldfile.xml >newfile.xml

Note that I am not saying that any of the above are the best example of how to do something.  As you can see, especially from the last example given, this program has some rather non-intuitive syntax for its command line arguments (to put it mildly).  If you have any additional – or better – examples of using xmlstarlet to manipulate XMLTV files, please leave them in a comment and I will consider adding them here.

That said, if you need to do an operation on an XMLTV file and don’t want to write a program or script to do it yourself, xmlstarlet could be your salvation – IF you can figure out how to use it!
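One more sketch, offered in the same spirit as the examples above and not as the definitive way to do it: renaming a channel ID consistently, in both the channel element and every programme entry that refers to it, by chaining two -u actions in a single xmlstarlet ed call. The sample file is generated on the fly with invented data, and the example is skipped if xmlstarlet is not installed.

```shell
#!/bin/sh
old='someid.someaddress.com'
new='newid.newaddress.com'

# Create a small sample XMLTV file (made-up data) to work on.
work=$(mktemp -d)
cat > "$work/xmltv.xml" <<EOF
<tv>
  <channel id="$old"><display-name>Some Channel</display-name></channel>
  <programme start="20240101000000 +0000" stop="20240101010000 +0000" channel="$old">
    <title>Some Show</title>
  </programme>
</tv>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
    # Two update actions in one pass: the channel's id attribute, then
    # the channel attribute of every programme that pointed at it.
    xmlstarlet ed \
        -u "//channel[@id='$old']/@id" -v "$new" \
        -u "//programme[@channel='$old']/@channel" -v "$new" \
        "$work/xmltv.xml" > "$work/newfile.xml"
    cat "$work/newfile.xml"
else
    echo "xmlstarlet is not installed; skipping the example" >&2
fi
```

Updating both places at once keeps the programme entries attached to their channel, which matters when the goal is to resolve duplicate IDs between two files.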

Some hints for getting free-to-air satellite channels into the Electronic Program Guide in Kodi (or another frontend)

If you are running a satellite backend system such as TVHeadend or MediaPortal (or MythTV, if you are one of the lucky few that can actually get it to work), and you use Kodi or the MythTV frontend, then it is possible to populate the schedule grid with listings from many sources. Note I did not say that it is easy, just that it is possible. The key is to use an external program such as zap2xml (Zap2it TV listings to XMLTV or XTVD .xml). These are commonly referred to as “schedule grabbers”, or just “grabber” programs.

The real trick is figuring out how to use one of those programs. Typically they are used to grab listings for a single over-the-air market, not a hodgepodge of stations and services from various locations. Such programs will create an XML file that contains schedule listings (in TVHeadend it will typically be at /home/hts/.xmltv/tv_grab_file.xmltv, assuming that “hts” is the TVHeadend username on your system), and that file will have all the over-the-air channels in your area, or all the cable or satellite channels from your provider. If you want to use zap2xml and you’ve never set it up before, I’ll give you some setup hints later in this article. But for now, let’s assume that you have it all set up and you know how to create an XML file (named /home/hts/.xmltv/tv_grab_file.xmltv) containing your local listings.

Once you have created the xml file, it can be imported into the TVHeadend, MediaPortal, or MythTV database. To import it into TVHeadend, you need to use a file called tv-grab-file, which must be downloaded and moved into the /usr/bin directory on the system running TVHeadend (be sure to make it executable, since it is a bash script). Also, you may need to edit the line in the script that starts with “cat” and contains the path to the tv_grab_file.xmltv file, to specify the correct path and file name of the file produced by your selected grabber program, if it isn’t being saved as /home/hts/.xmltv/tv_grab_file.xmltv. Note that if the tv_grab_file.xmltv file is not owned by the TVHeadend user, it must at least be made readable by TVHeadend – incorrect permissions and/or ownership on this file will make it inaccessible to TVHeadend.
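The permission requirements in the paragraph above can be checked with a few lines of shell. This is only a sketch: `check_grabber` is a made-up helper name, and the script path in the usage comment assumes the script ended up as /usr/bin/tv_grab_file on your system. Run it as the TVHeadend user, since root can read almost anything and would hide a permissions problem.

```shell
#!/bin/sh
# check_grabber SCRIPT LISTINGS
# Report whether the import script is executable and the listings file
# is readable by the user running this check.
# (check_grabber is a hypothetical helper, not part of TVHeadend.)
check_grabber() {
    script=$1
    listings=$2
    status=0
    if [ ! -x "$script" ]; then
        echo "PROBLEM: $script is missing or not executable (try: chmod +x $script)"
        status=1
    fi
    if [ ! -r "$listings" ]; then
        echo "PROBLEM: $listings is missing or not readable by $(id -un)"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "OK: grabber script and listings file look usable"
    fi
    return "$status"
}

# Typical invocation (paths as used in this article; adjust to your system):
# check_grabber /usr/bin/tv_grab_file /home/hts/.xmltv/tv_grab_file.xmltv
```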

Once you have done that, you tell TVHeadend to use the tv_grab_file script by going to Configuration | Channel / EPG | EPG Grabber page.  First, if any grabbers are enabled in the “Over-the-air Grabbers” section, disable them by unchecking the boxes next to them.  Then, in the “Internal Grabber” section, select “XMLTV: tv_grab_file …”  in the dropdown (you may need to restart TVHeadend or reboot the system before it will appear – if it still doesn’t appear, check the ownership and permissions of the file, it should be the same as the other tv_grab_* files in the /usr/bin directory – owned by root, and executable by all users). This is how it looked in previous versions of TVHeadend:
[Image: TVHeadend Internal Grabber Settings]

In the newest versions of TVHeadend you can actually schedule the Internal Grabber to import the listings at a specific time each day. Here I have set it to run at 2:33 AM every day (after commenting out the default):

[Image: NEW TVHeadend Internal Grabber settings]

After you get the schedule grabber working, all you need to do is set up a cron job or scheduled task to run your listings grabber (the zap2xml program, or whatever you use) once a day. Then if you have the newer version of TVHeadend as shown above, set it up to import the listings ten minutes after you have run the listings grabber (you could probably get by with a shorter interval, but why rush it – you want to make sure the listings grabber has completed its task before TVHeadend grabs the resulting file).

NOTE: There are some people that find that for whatever reason, they cannot run the tv_grab_file script. This most often happens on systems or devices where bash is not installed (to determine whether that is the case, enter which bash at a Linux command prompt and if bash is installed it will display the path, typically /bin/bash). In such a case it may be possible to use a modified tv_grab_file script. For example, in my Review of the TBS MOI+, I showed a variation of that script that runs under ash (as provided in BusyBox), since the original bash script won’t work in that environment. But if you can’t do it that way, there’s another method that involves using xmltv.sock, which is discussed in this thread on the Kodi forum. The advantage of doing it that way is that your new listings can be imported into TVHeadend immediately after you’ve obtained them, but the disadvantage is that most people find it harder to get that method to work, and the use of tv_grab_file is definitely easier to explain. If you do use xmltv.sock then you must go to the External Interfaces section of the page shown above, and check the box for the XMLTV module (not shown in the screenshots).

Note that after TVHeadend imports NEW channels, you MUST refresh the browser window before the new channels will appear in the EPG Source dropdowns under the Configuration | Channel / EPG | Channels tab. You may even need to close and re-open your browser. Failure to do this is probably the #1 reason people think TVHeadend has not imported the newly-added channels.  You must select an EPG source for each channel before schedule data for that channel will be read into TVHeadend, which will happen on the next scheduled import of the TV schedule data.

NOTE: You can force an unscheduled read of TV schedule data by temporarily setting the Internal Grabber Module to “No grabber”, clicking the “Save Configuration” link, then changing the Internal Grabber Module back to “XMLTV: tv_grab_file …” and clicking the “Save Configuration” link again.

When you are setting up your listings sources (channels) on whatever TV listings service you plan to use, if you don’t know which providers or stations carry a specific channel, you can look it up on Wikipedia, which will often tell you which providers and/or local stations carry that channel.

There are some national services that I will not name here, but that aren’t listed in the channel listings because they aren’t intended for viewing by home viewers. You need to get creative with those. For example, if you just happen to find a feed of the QXZ network, and you are smart enough to not blab about it all over creation so that the signal gets scrambled, you may be able to get listing data for at least prime time by grabbing the listings for the network owned “flagship” station in your time zone, which might be WQXZ on the east coast or KQXZ on the west coast. Of course the QXZ network is totally fictitious, but hopefully you get the idea.

If you spend a little time studying the XMLTV File format, you can even write your own scripts or programs to create “fake” XMLTV data for certain stations – I suppose you could even do it manually if you are a very patient and precise person.
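As a minimal sketch of such a script, the function below writes a bare-bones XMLTV document with one channel and a run of one-hour placeholder programmes starting at the top of the current hour (UTC). The function name, channel ID, and titles are all invented for illustration, it assumes GNU date (for the -d "@seconds" form), and it does not escape XML special characters such as & in the names you pass in.

```shell
#!/bin/sh
# make_fake_xmltv CHANNEL_ID DISPLAY_NAME TITLE [HOURS]
# Write a minimal XMLTV document to standard output: one channel plus
# HOURS (default 24) one-hour programmes that all carry the same title.
# Assumes GNU date; the caller must escape &, < and > in the arguments.
make_fake_xmltv() {
    id=$1 name=$2 title=$3 hours=${4:-24}
    now=$(date -u +%s)
    start=$(( now - now % 3600 ))      # round down to the top of the hour

    printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
    printf '%s\n' '<tv generator-info-name="fake-listings-sketch">'
    printf '  <channel id="%s">\n' "$id"
    printf '    <display-name>%s</display-name>\n' "$name"
    printf '  </channel>\n'
    i=0
    while [ "$i" -lt "$hours" ]; do
        s=$(date -u -d "@$(( start + i * 3600 ))" '+%Y%m%d%H%M%S +0000')
        e=$(date -u -d "@$(( start + (i + 1) * 3600 ))" '+%Y%m%d%H%M%S +0000')
        printf '  <programme start="%s" stop="%s" channel="%s">\n' "$s" "$e" "$id"
        printf '    <title lang="en">%s</title>\n' "$title"
        printf '  </programme>\n'
        i=$(( i + 1 ))
    done
    printf '%s\n' '</tv>'
}

# Example: a day of placeholder listings for a made-up channel.
fake=$(mktemp)
make_fake_xmltv 'wild.feed.example' 'Wild Feed' 'Unscheduled Programming' 24 > "$fake"
head -n 8 "$fake"
```

A file produced this way could then be merged with your real listings before the backend imports them.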

There is one pitfall to all this, which was actually more of a problem with a grabber program that’s no longer useful due to changes in the underlying service, but I’ll mention it anyway just in case you ever run across it. The various schedule sources sometimes change channel ID numbers without any advance warning, particularly when you are grabbing terrestrial (over-the-air) channels. If you find that the schedule data for a particular channel “runs out” after a certain day, it’s probably because the ID numbers have changed. In TVHeadend, go to the Configuration | Channel / EPG | Channels tab, find the affected channel(s), and change the EPG Source using the dropdown – you will probably see both the old and new ID’s, usually one underneath the other. Uncheck the old one and check the new one, and the next time TVHeadend updates its EPG information, it should be okay (don’t forget to click the “Save” button before you leave the page!). By the way, in case you hadn’t figured it out from the previous paragraphs, that’s the place where you select the channel data to associate with a channel in the first place, but I will again note that if the EPG sources aren’t appearing in the dropdowns after TVHeadend has imported your xml file, then you may need to refresh the page, or close and re-open your browser.
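If you keep yesterday's listings file around, a quick comparison of channel IDs shows at a glance whether any have changed. The sketch below is a quick-and-dirty text extraction, not a real XML parse: it assumes each channel element is written as `<channel id="...">` with double quotes, and both function names are made up for this example.

```shell
#!/bin/sh
# list_channel_ids FILE
# Print the sorted, de-duplicated channel IDs found in an XMLTV file.
# Quick-and-dirty: relies on the literal text <channel id="...">.
list_channel_ids() {
    grep -o '<channel id="[^"]*"' "$1" | sed 's/^<channel id="//; s/"$//' | sort -u
}

# compare_channel_ids OLD_FILE NEW_FILE
# Show IDs that disappeared from, or were added to, the newer file.
compare_channel_ids() {
    old_ids=$(mktemp) new_ids=$(mktemp)
    list_channel_ids "$1" > "$old_ids"
    list_channel_ids "$2" > "$new_ids"
    echo "IDs only in $1 (disappeared):"
    comm -23 "$old_ids" "$new_ids"
    echo "IDs only in $2 (new):"
    comm -13 "$old_ids" "$new_ids"
    rm -f "$old_ids" "$new_ids"
}

# Example (yesterday.xmltv is a hypothetical saved copy of the old file):
# compare_channel_ids /home/hts/.xmltv/yesterday.xmltv /home/hts/.xmltv/tv_grab_file.xmltv
```

An ID that shows up under "disappeared" with a similar-looking one under "new" is a strong hint that the EPG Source for that channel needs to be reselected.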

EDIT: For those that have never set up zap2xml before, here is the general procedure. These are basically their instructions, but with some added comments to help clarify what needs to be done.

1. Register your free Zap2it.com TV Listings account (input zip/postal code and select lineup) – COMMENT: I suggest you only select an over-the-air lineup, not a cable provider and particularly not a satellite provider, and here is why. When you are finished adding all your “favorite channels”, change your location to a zip or postal code in your time zone, but in a place where it’s impossible to receive any of the favorites you have selected. What you will then notice is that in the Zap2it grid, no channel numbers are displayed for each channel. That is a good thing, because as I mentioned above, channel numbers can change without warning (particularly on cable or commercial satellite lineups) and this will stop the channel numbers from being imported into TVHeadend. When channel numbers aren’t shown, zap2xml cannot include them in the channel ID strings it creates, therefore you no longer need to be concerned that an unexpected channel number change will leave you without EPG data for one or more stations.

(I will just note here that recent releases of zap2xml allow you to select TV Guide as your listings source instead of Zap2it. While TV Guide arguably provides better program descriptions, it will not allow you to display listings or favorites from more than one TV market area at a time, and they do use channel numbers in their identifiers and there is no way to disable that. Since satellite users often watch channels from several different TV market areas, I suggest sticking with Zap2it unless you have such mad programming skills that you can figure out how to combine separate listings from the different areas where your channels originate, and you don’t mind creating a separate TV Guide account for each such market area).

2. Click Set Preferences:

Manage Favorite Channels

COMMENT: It has come to my attention that some people don’t seem to understand why you should set favorite channels. The reason for using favorites, and then setting Zap2it to display only those favorites, is to reduce bandwidth usage. This has three benefits, two for you and one for Zap2it. First, if your ISP enforces usage caps, you’ll be using less data if you’re not downloading schedules for stations that you cannot receive or that you never watch. Second, it takes time to download schedule information, so when you are testing (as you might after adding a new channel) you won’t have to sit and twiddle your thumbs while unnecessary schedule data is being downloaded. And finally, if everyone downloaded a monster amount of schedule data from Zap2it every night, it could increase their costs, and then they might change things to make this data completely unavailable to us. This is also the reason I tell you to use a six hour grid below; it reduces the amount of data that you receive (and that Zap2it has to send) while obtaining schedule data, without omitting necessary information. Please follow the instructions in this section exactly as shown, even if some other well-intentioned person tells you (incorrectly) that these settings don’t do anything.

  1. Select your favorite channels from the “Available Channels” and put them in the “My Favorite Channels” list – COMMENT: Once you select a location and select your favorite channels, you can then change your location (using a different zip or postal code) and add more favorites from the new location, without losing the ones you have already added. So, you are not limited to adding only channels that are available to a particular zip or postal code to your favorite channels list.

Additional Settings – COMMENT:  These are VERY IMPORTANT, don’t skip any of these steps! And, please don’t listen to anyone who says these are not important – see the explanation above.

  1. Checkmark [✔] “Show six hour grid”
  2. Checkmark [✔] “Show only my favorite channels in the grid”
  3. Click “Save”

3. You may need to install the required supporting perl libraries (not needed with the Par-Packed Windows file) – COMMENT: I’d try step 4 first before you go adding any perl libraries, as they may already be present.

4. Run zap2xml with the userEmail and password parameters of your account – COMMENT: Watch the output and see if it complains about missing Perl libraries. If so, see step 3. Here is how you might invoke it, assuming that you are running TVHeadend and that your TVHeadend user is “hts”:

./zap2xml.pl -u user@email.com -p yourpassword -o /home/hts/.xmltv/tv_grab_file.xmltv -c cache -F -O -T -q

(Leave off the -q while testing, because it may suppress error messages that you’ll want to see). This assumes that you have created a directory named “cache” (as a subdirectory off the directory where the zap2xml.pl script is saved) and that you have either already created a file /home/hts/.xmltv/tv_grab_file.xmltv (can be a zero byte dummy file to start) and made it world writeable, or else that you are always going to run the zap2xml.pl script as the TVHeadend user, in order to avoid file permissions issues. Of course, you must create the /home/hts/.xmltv directory if it doesn’t already exist. The point is that when you run the script and it tries to create the /home/hts/.xmltv/tv_grab_file.xmltv file, you don’t want it to fail due to file permissions or ownership issues.  You can read about the options shown above, and others you may wish to use, on the zap2xml home page (in particular, you may want to use something like -d 12 to get 12 days of listings instead of the default 7).  NOTE: Should you happen to be running OpenElec or LibreElec, which I most emphatically DO NOT RECOMMEND, zap2xml.pl will not work for you, in part because you may find it difficult or impossible to get to a command line, but also because Perl is not available.  There is an alternative for OpenElec/LibreElec that is written in Python, but I have absolutely no experience with it.

5. Optionally set up a cron job/task scheduler task to run it every day – COMMENT: Keep in mind that when setting up a cron job, you must use full paths, and not any shortcuts such as “.” or “~” in the path. The cron job or task should be scheduled to run a few minutes BEFORE TVHeadend’s Internal Grabber is scheduled to run. So if TVHeadend’s grabber is set to run at 2:33 A.M., you may want to run your cron job that invokes zap2xml at 2:23 A.M., giving it ten minutes to finish (which is, generally speaking, more than ample time for it to run to completion).  Please set it to run at some odd random time (in other words, not right on an hour, half hour, or quarter hour mark) so that everyone isn’t clogging up the servers at once.
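Steps 4 and 5 can be tied together in a small wrapper script that cron runs, sketched below under several assumptions: the install location of zap2xml.pl, the cache directory, and the wrapper's own name and path are all hypothetical, and only the zap2xml options already shown above are used. The wrapper has zap2xml write to a temporary file and moves it into place only on success, so TVHeadend never imports a half-written or empty file.

```shell
#!/bin/sh
# Hypothetical locations; override them or edit to match your system.
ZAP2XML=${ZAP2XML:-/home/hts/zap2xml/zap2xml.pl}
CACHE_DIR=${CACHE_DIR:-/home/hts/zap2xml/cache}
OUTPUT=${OUTPUT:-/home/hts/.xmltv/tv_grab_file.xmltv}

# update_listings EMAIL PASSWORD
# Run the grabber into a temporary file, then replace the real listings
# file only if the grabber succeeded and produced a non-empty file.
update_listings() {
    mkdir -p "$CACHE_DIR" "$(dirname "$OUTPUT")" || return 1
    tmp="$OUTPUT.tmp.$$"
    # Same options as the example in step 4 (-q because cron is unattended).
    if "$ZAP2XML" -u "$1" -p "$2" -o "$tmp" -c "$CACHE_DIR" -F -O -T -q \
            && [ -s "$tmp" ]; then
        chmod a+r "$tmp"          # make sure TVHeadend can read the result
        mv "$tmp" "$OUTPUT"
    else
        echo "grabber failed; keeping the previous $OUTPUT" >&2
        rm -f "$tmp"
        return 1
    fi
}

# When called with two arguments (as from cron), do the update.
if [ "$#" -eq 2 ]; then
    update_listings "$1" "$2"
fi

# Example crontab line (full paths, 2:23 AM, ten minutes before the import):
# 23 2 * * * /home/hts/bin/update_listings.sh user@email.com yourpassword
```

Running the wrapper as the TVHeadend user sidesteps the ownership issues described in step 4, because the output directory and file then belong to the same user that reads them.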

(End of edit.)

I am painfully aware that there is nothing at all that is easy about this process, and it probably makes you wish that we had European-style free-to-air services, where EIT guide data is embedded right in the program stream (lucky Europeans! [1]). But since we don’t, I just wanted you to be aware that you don’t need to have a blank EPG in your Kodi Live TV section. There is definitely a learning curve to getting it all working, but the more you work with it the more you will understand how all the pieces fit together. You may never be able to get schedule data for every channel you can receive, and you’ll obviously never find it for “wild feeds” that come and go, but I’ve been able to populate the EPG grid for quite a few of the stations I’m able to receive on my dishes.

(By the way, if you want icons for the channels, at least in Kodi you have to load those to the frontend system – as far as I can tell, there’s no good way to put them in the TVHeadend backend server and have Kodi get them from there. So just create a directory, dump your channel icons there, rename them to EXACTLY match the channel names except for the extension – for example, if you have a channel named “Big Fart Channel” [2] then the icon file name should be “Big Fart Channel.png”, or whatever the correct extension is – then in Kodi go to System | Live TV | Menu/OSD and modify the “Folder with channel icons” setting to point to the directory containing the icons.  And yes, I’m aware that you can supposedly enter a path to a “User Icon” for each channel in TVheadend’s Configuration | Channel / EPG | Channels tab, but those are really intended to be the paths to “picons” supplied by TV channels in some other parts of the world.  I’ve found that if you attempt to use those, more than likely you are going to slow down Kodi or cause some other undesirable effect, such as Kodi hanging when you attempt to quit Kodi.  You are better off to store your channel icons on each of your devices that run Kodi.)

If you know of any software tools that would make this easier, or pick up any hints or tips that I have not mentioned, please feel free to post a comment (but be aware that I will not approve spam comments that promote commercial services. Also, I don’t need any praise – if you like this article, don’t tell me, tell your friends that have satellite dishes and that could possibly benefit from this information!).

Here are some possibly useful links:

zap2xml software and documentation.
tv-grab-file, a file used with TVHeadend to import xmltv format data
Zap2xml for ATSC in OpenELEC (Kodi forum thread)
Installing zap2it grabber in OpenELEC (YouTube Video)
Setup instructions and files for Synology NAS users (TVHeadend forum thread)
NextPVR – EPG Setup – XML/XMLTV EPG – Zap2it & Zap2xml (NextPVR forum thread)
Home DVR Tvheadend OTA EPG Setup (Part 2) (YouTube Video)

Note regarding the video in the previous link: It shows the basic setup, but using an older version of TVHeadend, and using the free version of the mc2xml listings grabber which stopped working in July, 2015. So, don’t use mc2xml, use zap2xml instead. Carefully read and follow the instructions at the top of that page on how to set up your Zap2it account, and also consider utilizing the tricks I mentioned earlier in this article.

If you found this article useful, you may like my followup article, A few Linux utilities that are useful for manipulating XMLTV schedule files.

NOTES:
[1] Sure, the lucky Europeans get EIT guide data with their free-to-air channels, but at least we don’t have to pay a “telly tax” on each TV set we own, so there’s that!
[2] Someone REALLY should start the “Big Fart Channel” – that would be a real gas! And with that, this article has really bottomed out. What do you mean, my puns stink?

Do you know how to disable overscan on your TV, and why you should?

More and more Free To Air satellite viewers are starting to use backend computers that have DVB-S/S2 tuner cards installed, and use backend software such as TVHeadEnd. This allows playback in all rooms of the home, but at each traditional HDTV set you need a computer of some type to act as a frontend. The frontend is then connected to the TV set via an HDMI cable.

Now what you may not realize is that when you do this, or when you use a more conventional digital satellite receiver with HDMI output, chances are that your TV is not showing you the full picture that the frontend or receiver is sending. In fact, it almost certainly isn’t if you haven’t disabled overscan. What is overscan, you ask? Well, I could now write several paragraphs attempting to explain it to you and why it matters, or I can send you to a very good article at Engadget that explains it much better than I ever could. Since I am a bit on the lazy side, you get one guess which choice I am going to make. Here’s the article:

HD 101: Overscan and why all TVs do it

Here’s another one on the subject from Heron Fidelity:

A Good HDTV Shows You Everything

And one more from HD Guru:

Is Your HDTV Under Performing? Here’s a Fix

Note that this affects more than just satellite reception – it degrades the picture from nearly every digital source that your TV can receive signals from, including non-TV sources such as game machines. Manufacturers leave it enabled because they would rather that some of the picture be lost, and the remaining picture be a bit less sharp, than have consumers return TVs as “defective” because they happen to see a line at the top or side of the screen on one or two channels.

The big problem is that not all sets have a way to turn it off, and some that do hide the setting very well. I’ve seen the feature referred to as Overscan, 1:1, Dot by Dot, Just Scan, Exact Fit, Screen Fit, Native, and a few other terms I can’t remember offhand. It might be grouped with the picture size and aspect settings, or it might be in an advanced settings menu, or in some other place entirely. Some manufacturers, such as Vizio, either omit it entirely or include it under a rather non-obvious setting. And I have even come across TVs where you have to change the way the frontend computer sends the signal to the TV before the option is exposed, such as with certain older Sharp models.

Quite a few of the signals on the satellites nowadays are in glorious high definition, so why would you want your TV to degrade that signal? Find the setting to disable overscan, if your TV has one, and activate it. The only exception to that advice might be if you are primarily watching old TV shows made in pre-digital times, and the uplinker is transmitting the video with the old analog-style closed caption and other control data visible as flashing white and black scan lines at the top of the picture area. There is really no reason any digital uplinker should be doing that, but I cannot assure you that none of them do it.