Chaining Squid URL rewriters – custom URL rewriter chained with SquidGuard

This article describes how to combine URL filtering based on SquidGuard with a URL rewrite for quality optimization, video caching, etc. It covers the basics of setting up a URL rewrite first, and then how to chain multiple URL rewriters.

Basic URL rewrite

The basic URL rewrite is covered here: https://blob.mypn.eu/get-the-resolution-right-squid-basic-url-rewrite-script/

SquidGuard URL filtering

SquidGuard URL filtering, including how to set it up and keep it alive, is covered here: https://blob.mypn.eu/squidguard-url-filtering/

Chain multiple URL rewriters

Squid, at least as of v3.5, allows only a single url_rewrite_program to be defined, which has a number of implications. The main disadvantage is that, without an external program, it is impossible to chain multiple URL rewriters, which is what we need in our case.

The aim is to chain the bitrate URL rewrite described above with SquidGuard, which filters unwanted content such as advertisements (SquidGuard category adv / ads). The ideal solution would be to use ACLs to, for example, direct video-related domains to the URL rewriter and send everything else to SquidGuard for filtering.
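The routing idea above could also live inside a wrapper script rather than in Squid ACLs. The following is only a minimal sketch of that decision, not something Squid provides: the domain list and the names VIDEO_DOMAINS and pick_rewriter are hypothetical and exist purely for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical list of video-related domains, for illustration only.
VIDEO_DOMAINS = {"video.example.com", "cdn.example.net"}


def pick_rewriter(url):
    """Return "bitrate" for video-related hosts, "squidguard" for the rest."""
    # hostname is None for inputs that are not full URLs; treat that as "".
    host = (urlsplit(url).hostname or "").lower()
    for domain in VIDEO_DOMAINS:
        # Match the domain itself and any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return "bitrate"
    return "squidguard"
```

Matching on a leading dot keeps a host such as notvideo.example.com.evil.test from being mistaken for a video domain.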

Given today's limitations, the simplest (read: lazy) way is to use a chaining script that does the work for us. A search online turns up http://adzapper.sourceforge.net/#download, which provides two scripts: wrapzap and zapchain. These scripts were created by Cameron Simpson back in 2000/2001, so quite some time ago, and are often referenced, with multiple examples of how to get them working.

wrapzap & zapchain

Wrapzap sets a number of environment variables; however, I can't find any of them being required for our example. At the bottom of the script, it calls the real zapchain with the selected URL filters.

Regardless of the number of tries, I could not get this working, despite online reports suggesting that it should just work. The other option I tested was to run zapchain directly from squid.conf with the selected filters.

The main difference was probably due to the way URL rewriters were supposed to work in the past versus nowadays. In the past, it seems, a rewriter would output just the new URL, whilst the modern implementation expects a structured reply: a result code such as OK or ERR, followed by key=value pairs.
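To make the difference concrete, here is a minimal sketch of translating a legacy reply into the newer form. It assumes the reply syntax documented for Squid 3.4 and later (OK rewrite-url="...", OK status=30N url="...", ERR, BH) and a helper running without concurrency, so there is no channel ID at the start of the line. The function name to_modern_reply is made up for this example and is not taken from zapchain.

```python
def to_modern_reply(legacy_line):
    """Translate a legacy rewriter reply into the key=value reply syntax."""
    fields = legacy_line.split()
    if not fields:
        # Legacy helpers print an empty line for "leave the URL alone".
        return "ERR"
    first = fields[0]
    if first in ("OK", "ERR", "BH"):
        # Already in the newer syntax; pass it through untouched.
        return legacy_line.strip()
    # Legacy redirect form: "302:http://..." (status code, colon, URL).
    status, sep, url = first.partition(":")
    if sep and status.isdigit():
        return f'OK status={status} url="{url}"'
    # Legacy rewrite form: the bare URL, possibly followed by extra fields.
    return f'OK rewrite-url="{first}"'
```

For example, a legacy reply of `http://example.com/a 10.0.0.1/- - GET` becomes `OK rewrite-url="http://example.com/a"`, while an empty line becomes `ERR`.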

This required additional parsing and a rewrite of the original zapchain script to deal with the modified output.

This produced the expected results, with the output of one URL rewriter submitted to the next. My use case would rarely result in double modification of the output. The order in which the rewriters are called reflects the hit rate of each: the ad blocker implemented with SquidGuard gets much more traffic and will cut short all calls to ads, effectively putting less load on the second rewriter.
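The chaining itself can be sketched in a few lines. This is not the modified zapchain script, only an illustration of the idea under some assumptions: helpers run without concurrency (no channel ID), every change is reported back to Squid as a rewrite-url even if a helper asked for a redirect, and the names Helper, new_url, chain and serve are invented for this example. The stop_on_first_change flag is also an assumption, added to show the variant in which later helpers are skipped once an earlier one has changed the URL.

```python
import subprocess


class Helper:
    """One long-running rewriter process, spoken to line by line."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered, so each request is sent promptly
        )

    def ask(self, line):
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline()


def new_url(reply):
    """Return the URL a helper asked for, or None if it changed nothing."""
    fields = reply.split()
    if not fields or fields[0] in ("ERR", "BH"):
        return None
    if fields[0] == "OK":
        for field in fields[1:]:
            key, _, value = field.partition("=")
            if key in ("rewrite-url", "url"):
                return value.strip('"')
        return None
    # Legacy reply: a bare URL, optionally prefixed with "30N:".
    status, sep, rest = fields[0].partition(":")
    return rest if sep and status.isdigit() else fields[0]


def chain(request, askers, stop_on_first_change=False):
    """Feed one Squid request line through each helper in order."""
    original, _, extras = request.strip().partition(" ")
    current = original
    for ask in askers:
        # Each helper sees the URL as left by the previous one.
        answer = new_url(ask(f"{current} {extras}".strip()))
        if answer and answer != current:
            current = answer
            if stop_on_first_change:
                break
    if current == original:
        return "ERR"  # nothing changed: Squid keeps the original URL
    return f'OK rewrite-url="{current}"'


def serve(helpers, stdin, stdout):
    """Main loop: one reply line per request line, flushed immediately."""
    for line in stdin:
        stdout.write(chain(line, [h.ask for h in helpers]) + "\n")
        stdout.flush()
```

A wrapper script would build one Helper per rewriter command, in the desired order, and call serve(helpers, sys.stdin, sys.stdout). Placing the rewriter with the higher hit rate first means the later ones mostly see requests that were left unchanged.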
