urllist - Manage URL lists
[1] urllist [raz | (add | del) <urllist-name> | update | report]
[2] urllist [(load | vload) [(create | update | auto) <urllist-name> (ftp | sftp | tftp) <file-server> <base-file-name> [(domains | urls | expressions)*]]]
[3] urllist [clear <urllist-name> [(domains | urls | expressions)*]]
[4] urllist auto [<urllist-name> [off | (on (load | vload) (create | update) (daily | weekly) (push | (ftp | sftp | tftp) <file-server> <base-file-name>))]]
This system allows you to filter access to URLs and domain names according to policies and rules (see the guard command). The urllist command manages the lists of URLs and domain names used by those rules. Each URL list is identified by a unique name in the system. A URL list must first be created empty. The first [1] usage form allows you to create, delete or update URL lists. To create an empty URL list, use the keyword add followed by a URL list name. A URL list name must begin with an alphabetic character and may contain alphanumeric characters as well as the characters "_" and "-". To delete a URL list, use the keyword del followed by the name of the URL list to delete. The keyword raz deletes all URL lists.
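The naming rule above can be checked before a name is passed to urllist add. The following is a minimal sketch, not part of the system itself; it assumes that "alpha" and "alphanumeric" mean ASCII letters and digits, and the function name is hypothetical.

```python
import re

# Assumption: "alpha" and "alphanumeric" refer to ASCII letters and digits.
_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_valid_urllist_name(name: str) -> bool:
    """Return True if name starts with a letter and otherwise contains
    only letters, digits, "_" and "-"."""
    return _NAME_RE.fullmatch(name) is not None


print(is_valid_urllist_name("ads_2024"))   # starts with a letter
print(is_valid_urllist_name("2024-ads"))   # starts with a digit
```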
The system can be configured to periodically update URL list contents (see later). In the first usage form, the keyword update allows you to explicitly update URL list contents without having to wait for the periodic automatic updates. This can be useful, for instance, with a virtual machine that was suspended and whose system time is out of date (in this case do not forget to update the system time before updating URL list contents). The keyword report displays the latest report of the automatic URL list content update. During such an update, if a URL list can’t be loaded, an error is reported. For error explanations see the section DOWNLOAD ERROR CODES below.
The second [2] usage form allows you to load URL lists from files located on a file server. Only trusted file servers are allowed. Trusted file servers are defined with the command access. To load and build a URL list from scratch, use the keyword load followed by the keyword create. The content of a URL list to load should be defined in three compressed (gzip format) text files having the same base name and three different extensions. The first file has the .domains.gz extension and contains a list of domain base names (one domain base name per line). Only domain base names should be specified (for instance "example.com" is a valid domain base name while "www.example.com" is not). The second file has the extension .urls.gz and contains a list of URLs (one URL per line). A URL is in the form <domain-name>/<location>. Please note that the protocol part (for instance "http://") of the URL should not be specified in that file. Finally, the third file has the extension .expressions.gz and contains a list of regular expressions (one regular expression per line). See the section REGULAR EXPRESSIONS below.
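The three-file layout described above can be produced with a short script before the files are placed on a file server. This is only a sketch: the function name and the sample entries are hypothetical, and UTF-8 text with one entry per line is an assumption.

```python
import gzip
import tempfile
from pathlib import Path


def write_urllist_files(directory, base_name, domains, urls, expressions):
    """Write <base_name>.domains.gz, .urls.gz and .expressions.gz,
    one entry per line, and return the created paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    parts = {"domains": domains, "urls": urls, "expressions": expressions}
    paths = []
    for ext, entries in parts.items():
        path = directory / f"{base_name}.{ext}.gz"
        # Assumption: the list files are UTF-8 text with "\n" line endings.
        with gzip.open(path, "wt", encoding="utf-8", newline="\n") as fh:
            for entry in entries:
                fh.write(entry + "\n")
        paths.append(path)
    return paths


# Demo: write a tiny list into a temporary directory and read one file back.
with tempfile.TemporaryDirectory() as tmp:
    created = write_urllist_files(
        tmp, "blocked", ["example.com"], ["example.com/ads"], ["(foo|bar)$"]
    )
    names = [p.name for p in created]
    with gzip.open(created[1], "rt", encoding="utf-8") as fh:
        urls_text = fh.read()
print(names)
```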
When defining a domain name, do not specify the prefixes www, web and ftp. For a URL definition, protocol parts (http://, ftp://...), prefixes (www, web, ftp) and ports (:80, :443...) should not be specified. An expression is a regular expression as described in regex (see the regex (5) manual on a UNIX system). Note that complex regular expressions require large CPU resources.
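Raw URLs collected from logs or other sources usually carry the parts that must be left out. The sketch below strips the protocol part, a leading www, web or ftp prefix, and a port number. The function name is hypothetical, and it assumes a prefix is removed only when it is the first label of the host name.

```python
import re

_SCHEME_RE = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)
# Assumption: a prefix is stripped only when it is the first host label.
_PREFIX_RE = re.compile(r"^(www|web|ftp)\.", re.IGNORECASE)
_PORT_RE = re.compile(r":\d+$")


def normalize_url_entry(raw: str) -> str:
    """Return raw without protocol part, www/web/ftp prefix and port."""
    entry = _SCHEME_RE.sub("", raw.strip())
    host, sep, location = entry.partition("/")
    host = _PORT_RE.sub("", host)      # drop ":80", ":443", ...
    host = _PREFIX_RE.sub("", host)    # drop "www.", "web." or "ftp."
    return host + sep + location


print(normalize_url_entry("http://www.example.com:80/ads/banner"))
```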
A domain or URL diff file contains lines beginning with "+" (plus) or "-" (minus). To add a domain or a URL, precede it with the character "+". To delete a domain or a URL, precede it with the character "-". The <base-file-name> parameter is the base file name (without the extensions .domains.gz, .urls.gz or .expressions.gz) of a defined URL list. The file(s) must exist and be accessible on the file server. The optional last arguments specify which URL list types to load. URL list types are: domains, urls and expressions. If no URL list types are specified, the domains and urls URL list types are loaded.
To update an existing URL list, use the keyword update. In this case downloaded files are diff files. Only domains and urls files can be updated. If more than one update file is loaded for the same URL list, the updates will be applied successively to that URL list. In this case the order in which update files are loaded is important.
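The effect of diff files, and why their order matters, can be illustrated with a small model. This sketch assumes the usual diff convention in which "+" marks an entry to add and "-" an entry to delete. The function name is hypothetical and the code does not reflect how the system itself stores its lists.

```python
def apply_diff(entries, diff_lines):
    """Return a new set with diff_lines applied in order.
    Assumption: "+" adds the entry and "-" deletes it."""
    result = set(entries)
    for line in diff_lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        marker, value = line[0], line[1:].strip()
        if marker == "+":
            result.add(value)
        elif marker == "-":
            result.discard(value)
        else:
            raise ValueError(f"unexpected diff line: {line!r}")
    return result


current = {"a.example", "b.example"}
first = ["+c.example"]
second = ["-c.example", "-a.example"]

# Applying first then second removes c.example again...
in_order = apply_diff(apply_diff(current, first), second)
# ...while the reverse order leaves c.example in the list.
reversed_order = apply_diff(apply_diff(current, second), first)
print(sorted(in_order), sorted(reversed_order))
```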
To automatically download all updates since the last create or update operation, use the keyword auto. In this case downloaded files are diff files and should be named as follows: <base-file-name>.<yyyymmdd>.(domains | urls).gz where <yyyymmdd> is the date (yyyy is the year, mm is the month and dd is the day). In the case where the URL list has never been loaded before and the update mode is used, the create mode is used (the URL list is entirely loaded from scratch).
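The dated naming scheme makes it possible to work out which diff files are expected on the file server. The sketch below lists one file per day and per list type. It assumes that a file is published for every day after the last update up to and including the current day, which the text above does not state. The function name is hypothetical.

```python
from datetime import date, timedelta


def pending_update_files(base_name, last_update, today, kinds=("domains", "urls")):
    """List diff file names for each day after last_update up to today.
    Assumption: one diff file per day and per list type."""
    names = []
    day = last_update + timedelta(days=1)
    while day <= today:
        stamp = day.strftime("%Y%m%d")  # <yyyymmdd>
        names.extend(f"{base_name}.{stamp}.{kind}.gz" for kind in kinds)
        day += timedelta(days=1)
    return names


for name in pending_update_files("blocked", date(2024, 1, 30), date(2024, 2, 1)):
    print(name)
```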
The vload (verify load) option allows you to secure downloads. This is useful when you download URL lists from a file server managed by CacheGuard Technologies Ltd or one of its referenced partners. When using the vload method, a signature file is downloaded alongside the URL list file, and the URL list file is verified against that signature file to ensure that it has not been altered during its transfer. The signature file has the same name as the downloaded URL list file followed by the extension .sig. You can print a list of all downloaded files by using the keyword load (or vload) without any other options.
Please note that a URL list file in gzip compressed format can’t be larger than 128 MB. When domains, urls and expressions files are all loaded (separately or at the same time) and one of them exceeds the allowed limit, all of them are deleted and the current load operation is immediately cancelled. To load files larger than the allowed limit, you can split them into smaller pieces.
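One way to split an oversized list is to halve it until every compressed piece fits. The sketch below does this in memory. It assumes that "128 MB" means 128 * 1024 * 1024 bytes, and the function name is hypothetical. How the resulting pieces are named and loaded is left to the administrator.

```python
import gzip

MAX_GZ_BYTES = 128 * 1024 * 1024  # Assumption: "128 MB" means 128 MiB.


def split_to_fit(lines, max_bytes=MAX_GZ_BYTES):
    """Return gzip-compressed chunks of lines, each at most max_bytes long."""
    text = "".join(line + "\n" for line in lines)
    data = gzip.compress(text.encode("utf-8"))
    if len(data) <= max_bytes:
        return [data]
    if len(lines) <= 1:
        raise ValueError("a single entry does not fit within the limit")
    middle = len(lines) // 2
    # Split in halves and retry each half on its own.
    return split_to_fit(lines[:middle], max_bytes) + split_to_fit(lines[middle:], max_bytes)


# Demo with a deliberately tiny limit so that splitting takes place.
sample = [f"site{i}.example.com" for i in range(2000)]
chunks = split_to_fit(sample, max_bytes=500)
print(len(chunks), max(len(c) for c in chunks))
```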
The third [3] usage form allows you to clear the content of a URL list without completely deleting that list. You can select which parts of the URL list should be cleared by specifying the domains, urls and expressions keywords. If no part name is specified, all parts are cleared.
The fourth [4] usage form allows you to program the system to automatically download and apply URL list contents. This can be done daily or weekly. To activate automatic updating for a URL list, use the auto on keywords followed by the download specifications. To download the complete URL list, use the create keyword. When a complete URL list is downloaded, the files to download should be named as follows: <base-file-name>.(domains | urls).gz. Please note that both the domains and urls files should exist on the remote file server. To download only updates (diff files), use the update keyword. Because downloading and rebuilding a complete URL list can consume considerable time and bandwidth, prefer the update mode to the create mode. The update mode periodically downloads all updates since the last create or update operation. This way, if a file server is temporarily unavailable, no update files are lost. To allow the system to automatically download all update files since the last automatic update, files should be named as follows: <base-file-name>.<yyyymmdd>.(domains | urls).gz where <yyyymmdd> is the date (yyyy is the year, mm is the month and dd is the day).
Automatic downloads can be done using the ftp, sftp or tftp protocols. When the system is managed by a remote manager system, the manager can be configured to download URL lists and push them to all the gateways that it manages. In this case, you must specify push instead of a protocol. The AUTO UPDATES ON A MANAGER SYSTEM section below gives more information regarding automatic URL list updates on a manager system.
To deactivate automatic updating for a URL list, use the keyword auto followed by the keyword off.
DOWNLOAD ERROR CODES
• Transfer error 1: Unsupported protocol. This build of the program has no support for this protocol.
• Transfer error 2: Failed to initialize.
• Transfer error 3: URL malformed. The syntax was not correct.
• Transfer error 5: Couldn’t resolve proxy. The given proxy host could not be resolved.
• Transfer error 6: Couldn’t resolve host. The given remote host was not resolved.
• Transfer error 7: Failed to connect to host.
• Transfer error 8: FTP weird server reply. The server sent data the program couldn’t parse.
• Transfer error 9: FTP/SFTP access denied. The server denied login or denied access to the particular resource or directory you wanted to reach. Most often you tried to change to a directory that doesn’t exist on the server. To save on an SFTP server always specify the full target path.
• Transfer error 11: FTP weird PASS reply. The program couldn’t parse the reply sent to the PASS request.
• Transfer error 13: FTP weird PASV reply. The program couldn’t parse the reply sent to the PASV request.
• Transfer error 14: FTP weird 227 format. The program couldn’t parse the 227-line the server sent.
• Transfer error 15: FTP can’t get host. Couldn’t resolve the host IP we got in the 227-line.
• Transfer error 17: FTP couldn’t set binary. Couldn’t change transfer method to binary.
• Transfer error 18: Partial file. Only a part of the file was transferred.
• Transfer error 19: FTP couldn’t download/access the given file. The RETR (or similar) command failed.
• Transfer error 21: FTP quote error. A quote command returned error from the server.
• Transfer error 22: HTTP page not retrieved. The requested url was not found or returned another error with the HTTP error code being 400 or above. This return code only appears if -f/--fail is used.
• Transfer error 23: Write error. The program couldn’t write data to a local filesystem or similar.
• Transfer error 25: FTP couldn’t STOR file. The server denied the STOR operation, used for FTP uploading.
• Transfer error 26: Read error. Various reading problems.
• Transfer error 27: Out of memory. A memory allocation request failed.
• Transfer error 28: Operation timeout. The specified timeout period was reached according to the conditions.
• Transfer error 30: FTP PORT failed. The PORT command failed. Not all FTP servers support the PORT command, try doing a transfer using PASV instead!
• Transfer error 31: FTP couldn’t use REST. The REST command failed. This command is used for resumed FTP transfers.
• Transfer error 33: HTTP range error. The range "command" didn’t work.
• Transfer error 34: HTTP post error. Internal post-request generation error.
• Transfer error 35: SSL connect error. The SSL handshaking failed.
• Transfer error 36: FTP bad download resume. Couldn’t continue an earlier aborted download.
• Transfer error 37: FILE couldn’t read file. Failed to open the file. Permissions?
• Transfer error 38: LDAP cannot bind. LDAP bind operation failed.
• Transfer error 39: LDAP search failed.
• Transfer error 41: Function not found. A required LDAP function was not found.
• Transfer error 42: Aborted by callback. An application told the program to abort the operation.
• Transfer error 43: Internal error. A function was called with a bad parameter.
• Transfer error 45: Interface error. A specified outgoing interface could not be used.
• Transfer error 47: Too many redirects. When following redirects, the program hit the maximum amount.
• Transfer error 48: Unknown TELNET option specified.
• Transfer error 49: Malformed telnet option.
• Transfer error 51: The peer’s SSL certificate or SSH MD5 fingerprint was not ok.
• Transfer error 52: The server didn’t reply anything, which here is considered an error.
• Transfer error 53: SSL cryptographic engine not found.
• Transfer error 54: Cannot set SSL cryptographic engine as default.
• Transfer error 55: Failed sending network data.
• Transfer error 56: Failure in receiving network data.
• Transfer error 58: Problem with the local certificate.
• Transfer error 59: Couldn’t use specified SSL cipher.
• Transfer error 60: Peer certificate cannot be authenticated with known CA certificates.
• Transfer error 61: Unrecognised transfer encoding.
• Transfer error 62: Invalid LDAP URL.
• Transfer error 63: Maximum file size exceeded.
• Transfer error 64: Requested FTP SSL level failed.
• Transfer error 65: Sending the data requires a rewind that failed.
• Transfer error 66: Failed to initialise SSL Engine.
• Transfer error 67: The user name, password, or similar was not accepted and the program failed to log in.
• Transfer error 68: File not found on TFTP server.
• Transfer error 69: Permission problem on TFTP server.
• Transfer error 70: Out of disk space on TFTP server.
• Transfer error 71: Illegal TFTP operation.
• Transfer error 72: Unknown TFTP transfer ID.
• Transfer error 73: File already exists (TFTP).
• Transfer error 74: No such user (TFTP).
• Transfer error 75: Character conversion failed.
• Transfer error 76: Character conversion functions required.
• Transfer error 77: Problem with reading the SSL CA certificate (path? access rights?).
• Transfer error 78: The resource referenced in the URL does not exist.
• Transfer error 79: An unspecified error occurred during the SSH session.
• Transfer error 80: Failed to shut down the SSL connection.
• Transfer error 82: Could not load CRL file, missing or wrong format.
• Transfer error 83: Issuer check failed.
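When reports are processed by a script, the numeric code can be extracted and mapped back to its explanation. The sketch below copies a handful of codes from the list above. The exact layout of a report line is an assumption (only the "Transfer error <n>" wording is taken from this page), and the function name is hypothetical.

```python
import re

# A few codes copied from the list above; extend as needed.
TRANSFER_ERRORS = {
    6: "Couldn't resolve host",
    7: "Failed to connect to host",
    9: "FTP/SFTP access denied",
    28: "Operation timeout",
    68: "File not found on TFTP server",
    78: "The resource referenced in the URL does not exist",
}

_CODE_RE = re.compile(r"Transfer error (\d+)")


def explain_transfer_error(report_line):
    """Return (code, explanation) for a line mentioning a transfer error,
    or None if the line contains no such error."""
    match = _CODE_RE.search(report_line)
    if match is None:
        return None
    code = int(match.group(1))
    return code, TRANSFER_ERRORS.get(code, "see DOWNLOAD ERROR CODES")


# Assumption: the report mentions the error in this form.
print(explain_transfer_error("blocked: Transfer error 28"))
```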
REGULAR EXPRESSIONS
An expression list file contains regular expressions with one regular expression per line. The command guard uses POSIX regular expressions. The POSIX regular expression language is a notation for describing textual patterns. Of most interest are:
.              Matches any single character (use "\." to match a ".").
[abc]          Matches one of the characters ("[abc]" matches a single "a" or "b" or "c").
[c-g]          Matches one of the characters in the range ("[c-g]" matches a single "c" or "d" or "e" or "f" or "g". "[a-z0-9]" matches any single letter or digit. "[-/.:?]" matches any single "-" or "/" or "." or ":" or "?".).
?              None or one of the preceding ("words?" will match "word" and "words". "[abc]?" matches a single "a" or "b" or "c" or nothing (i.e. "")).
*              None or more of the preceding ("words*" will match "words" and "wordsssssss". ".*" will match anything including nothing).
+              One or more of the preceding ("xxx+" will match a sequence of 3 or more "x").
(expr1|expr2)  One of the expressions, which in turn may contain a similar construction ("(foo|bar)" will match "foo" or "bar". "(foo|bar)?" will match "foo" or "bar" or nothing (i.e. "")).
$              The end of the line ("(foo|bar)$" will match "foo" or "bar" only at the end of a line).
\x             Disable the special meaning of x where x is one of the special regex characters ".?*+()^$[]{}\" ("\." will match a single ".", "\\" a single "\" etc.)
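Expressions can be tried out before they are added to an .expressions.gz file. The sketch below uses Python's re module as a stand-in for the POSIX engine. This is an assumption that holds for the simple constructs listed above but not necessarily for every POSIX feature. The helper name and the sample URLs are hypothetical.

```python
import re


def matches(pattern: str, url: str) -> bool:
    """Return True if pattern matches anywhere in url.
    Assumption: Python's re stands in for the POSIX regex engine."""
    return re.search(pattern, url) is not None


samples = [
    (r"(foo|bar)$", "example.com/foo"),        # ends with "foo"
    (r"(foo|bar)$", "example.com/foo/index"),  # "foo" is not at the end
    (r"xxx+", "example.com/xx"),               # only two "x"
    (r"xxx+", "example.com/xxxx"),             # three or more "x"
    (r"example\.com", "exampleXcom"),          # escaped dot is literal
    (r"example.com", "exampleXcom"),           # bare dot matches any character
]
for pattern, url in samples:
    print(pattern, url, matches(pattern, url))
```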
AUTO UPDATES ON A MANAGER SYSTEM
On a manager system, automatic URL list updates can be activated depending on the installed license level. Activation is normally possible for non-free installations only. This feature allows a manager to periodically download URL lists from one or more remote file servers and then automatically push them to all remote gateways. Please refer to the manager command for further information.
If, on a manager, an automatic URL list download is configured to fetch updates only (diff files rather than the complete content), and the download from the remote file server succeeds but the push to a remote gateway fails (because that gateway is unavailable at the time of the push operation), then at the next scheduled run (the following day or week) the system downloads and pushes the complete content of that URL list.
access (1) apply (1) file (1) guard (1) manager (1) sslmediate (1)
CacheGuard Technologies Ltd <www.cacheguard.com>
Send bug reports or comments to the above author.
Copyright (C) 2009-2024 CacheGuard - All rights reserved