Unit CastleURIUtils

Description

URI utilities. These extend the standard FPC URIParser unit.

Uses

Overview

Functions and Procedures

procedure URIExtractAnchor(var URI: string; out Anchor: string; const RecognizeEvenEscapedHash: boolean = false);
function URIDeleteAnchor(const URI: string; const RecognizeEvenEscapedHash: boolean = false): string;
function RawURIDecode(const S: string): string;
function URIProtocol(const URI: string): string;
function URIProtocolIs(const S: string; const Protocol: string; out Colon: Integer): boolean;
function URIDeleteProtocol(const S: string): string;
function CombineURI(const Base, Relative: string): string;
function AbsoluteURI(const URI: string): string;
function AbsoluteFileURI(const URI: string): boolean;
function URIToFilenameSafe(const URI: string): string;
function FilenameToURISafe(FileName: string): string;
function URIMimeType(const URI: string): string;
function URIMimeType(const URI: string; out Gzipped: boolean): string;
function URIMimeExtensions: TStringStringMap;
function URIDisplay(const URI: string; const Short: boolean = false): string;
function URICaption(const URI: string): string;
function ChangeURIExt(const URL, Extension: string): string;
function DeleteURIExt(const URL: string): string;
function ExtractURIName(const URL: string): string;
function ExtractURIPath(const URL: string): string;
function URIIncludeSlash(const URL: String): String;
function URIExcludeSlash(const URL: String): String;
function URIFileExists(const URL: string): Boolean;
function URIExists(URL: string): TURIExists;
function URICurrentPath: string;
function ResolveCastleDataURL(const URL: String): String;

Types

TURIExists = (...);

Description

Functions and Procedures

procedure URIExtractAnchor(var URI: string; out Anchor: string; const RecognizeEvenEscapedHash: boolean = false);

Extracts #anchor from URI. On input, URI contains the full URI. On output, the anchor is removed from URI and saved in Anchor. If no #anchor existed, Anchor is set to ''.

When RecognizeEvenEscapedHash is true, an escaped hash (%23) is also recognized as the anchor delimiter. This is a hack and should be avoided: it prevents using an actual filename that contains a hash, which makes the escaping process useless. Use it only when there is no other sensible way, e.g. to specify a Spine skin name when opening a Spine JSON file.
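A minimal usage sketch of the call described above. The URI and anchor name are made-up examples, and the program assumes the engine's CastleURIUtils unit is on the compiler's unit path.

```pascal
program AnchorDemo;
{$mode objfpc}{$H+}
uses CastleURIUtils;
var
  URI, Anchor: string;
begin
  { Hypothetical URI with an anchor part. }
  URI := 'file:///models/scene.x3d#Viewpoint1';
  URIExtractAnchor(URI, Anchor);
  Writeln('URI without anchor: ', URI);
  Writeln('Anchor: ', Anchor);

  { Without any #anchor, Anchor is documented to be set to ''. }
  URI := 'file:///models/scene.x3d';
  URIExtractAnchor(URI, Anchor);
  Writeln('Anchor is empty: ', Anchor = '');
end.
```

Note that URI is a var parameter, so pass a copy if the original string is still needed afterwards, or use URIDeleteAnchor.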

function URIDeleteAnchor(const URI: string; const RecognizeEvenEscapedHash: boolean = false): string;

Return URI with the anchor (if there was any) stripped.

function RawURIDecode(const S: string): string;

Replace all sequences like %xx with their actual 8-bit characters.

The intention is that this is similar to the PHP function rawurldecode.

To account for badly encoded strings, invalid encoded URIs do not raise an error. They are only reported using WritelnWarning, so you can simply ignore them or show the warning to the user. This is done because you will often use this with URIs provided by the user or read from some file, so you cannot be sure they are correctly encoded, and raising an error unconditionally is not acceptable (consider the number of badly written HTML pages on the WWW).

The cases of badly encoded strings are:

  • "%xx" sequence ends unexpectedly at the end of the string. That is, string ends with "%" or "%x". In this case we simply keep "%" or "%x" in resulting string.

  • "xx" in "%xx" sequence is not a valid hexadecimal number. In this case we also simply keep "%xx" in resulting string.
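The two lenient cases above can be illustrated with a small standalone sketch. This is not the engine's implementation: LenientPercentDecode and HexDigitValue are hypothetical names, and the sketch omits the WritelnWarning reporting. It only mirrors the documented rule that malformed sequences are kept unchanged.

```pascal
program LenientDecodeDemo;
{$mode objfpc}{$H+}

{ Hypothetical helper: parse one hexadecimal digit. }
function HexDigitValue(const C: Char; out Value: Integer): Boolean;
begin
  Result := True;
  case C of
    '0'..'9': Value := Ord(C) - Ord('0');
    'a'..'f': Value := Ord(C) - Ord('a') + 10;
    'A'..'F': Value := Ord(C) - Ord('A') + 10;
    else
      begin
        Value := 0;
        Result := False;
      end;
  end;
end;

{ Hypothetical sketch of lenient %xx decoding. }
function LenientPercentDecode(const S: string): string;
var
  I, HighNibble, LowNibble: Integer;
begin
  Result := '';
  I := 1;
  while I <= Length(S) do
  begin
    if (S[I] = '%') and (I + 2 <= Length(S)) and
       HexDigitValue(S[I + 1], HighNibble) and
       HexDigitValue(S[I + 2], LowNibble) then
    begin
      { Well-formed %xx: emit the 8-bit character. }
      Result := Result + Chr(HighNibble * 16 + LowNibble);
      Inc(I, 3);
    end else
    begin
      { Ordinary character, or a truncated / non-hex sequence: keep as-is. }
      Result := Result + S[I];
      Inc(I);
    end;
  end;
end;

begin
  Writeln(LenientPercentDecode('my%20file%4d.txt')); // my fileM.txt
  Writeln(LenientPercentDecode('100%'));             // 100%
  Writeln(LenientPercentDecode('50%2'));             // 50%2
  Writeln(LenientPercentDecode('a%zzb'));            // a%zzb
end.
```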

function URIProtocol(const URI: string): string;

Get protocol from given URI.

This is very similar to how URIParser.ParseURI function detects the protocol, although not 100% compatible:

  • We allow whitespace (including newline) before protocol name.

    This is useful, because some VRML/X3D files have the ECMAScript code inlined and there is sometimes whitespace before "ecmascript:" protocol.

  • We never detect a single-letter protocol name.

    This is useful, because we do not use any single-letter protocol name, and it allows detecting Windows absolute filenames like c:\blah.txt as filenames. Otherwise, Windows absolute filenames could not be accepted by any of our routines that work with URLs (like the Download function), since they would be detected as URLs with unknown protocol "c".

    Our URIProtocol will answer that protocol is empty for c:\blah.txt. Which means no protocol, so our engine will treat it as a filename. (In contrast with URIParser.ParseURI that would detect protocol called "c".) See doc/uri_filename.txt in sources for more comments about differentiating URI and filenames in our engine.

  • We always return a lowercase protocol. This is convenient, since you almost always calculate the protocol in order to compare it, protocol names are not case-sensitive, and you should always produce URLs with lowercase protocol names (see http://tools.ietf.org/html/rfc3986#section-3.1).
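The three differences listed above can be sketched as a standalone function. This is only an illustration under stated assumptions, not the engine's code: SketchProtocol is a hypothetical name, and the set of characters accepted in a protocol name is taken from the RFC 3986 scheme syntax rather than from the engine sources.

```pascal
program ProtocolSketch;
{$mode objfpc}{$H+}
uses SysUtils;

{ Hypothetical sketch of the documented protocol detection rules. }
function SketchProtocol(const URI: string): string;
var
  Start, I: Integer;
begin
  Result := '';
  { Rule 1: skip whitespace (including newlines) before the protocol. }
  Start := 1;
  while (Start <= Length(URI)) and (URI[Start] in [' ', #9, #10, #13]) do
    Inc(Start);
  { Assumption: a protocol name starts with a letter (RFC 3986). }
  if (Start > Length(URI)) or not (URI[Start] in ['a'..'z', 'A'..'Z']) then
    Exit;
  I := Start + 1;
  while (I <= Length(URI)) and
        (URI[I] in ['a'..'z', 'A'..'Z', '0'..'9', '+', '-', '.']) do
    Inc(I);
  { Rule 2: require a colon and reject single-letter names like "c:". }
  if (I <= Length(URI)) and (URI[I] = ':') and (I - Start >= 2) then
    { Rule 3: always answer in lowercase. }
    Result := LowerCase(Copy(URI, Start, I - Start));
end;

begin
  Writeln('[', SketchProtocol('HTTP://example.com/'), ']');             // [http]
  Writeln('[', SketchProtocol(#10'  ecmascript:function f() {}'), ']'); // [ecmascript]
  Writeln('[', SketchProtocol('c:\blah.txt'), ']');                     // []
  Writeln('[', SketchProtocol('models/scene.x3d'), ']');                // []
end.
```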

function URIProtocolIs(const S: string; const Protocol: string; out Colon: Integer): boolean;

Check whether the URI in S uses the given Protocol. This is equivalent to checking URIProtocol(S) = Protocol, ignoring case, although it may be a little faster. The given Protocol string cannot contain the ":" character.

function URIDeleteProtocol(const S: string): string;
 
function CombineURI(const Base, Relative: string): string;

Return absolute URI, given base and relative URI.

Base URI must be either an absolute (with protocol) URI, or only an absolute filename (in which case we'll convert it to file:// URI under the hood, if necessary). This is usually the URI of the containing file, for example an HTML file referencing the image, processed by AbsoluteURI.

Relative URI may be a relative URI or an absolute URI. In the former case it is merged with Base. In the latter case it is simply returned.

If you want to support relative URIs, use this routine. It always treats Relative as a URI (so it should be percent-escaped, use slashes and so on). Other routines in our engine, like AbsoluteURI and Download, treat strings without a protocol as filenames (so they are not percent-escaped and use the PathDelim specific to the OS, slash or backslash). This routine, on the other hand, always treats the Relative string as a URI (when it does not include a protocol, it just means it is relative to Base).
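A short usage sketch of the pattern described above: make the containing file's location absolute with AbsoluteURI, then resolve references against it. The file names and the example.com URL are made up, and the program assumes CastleURIUtils is on the unit path.

```pascal
program CombineDemo;
{$mode objfpc}{$H+}
uses CastleURIUtils;
var
  BaseURI: string;
begin
  { A plain filename is treated as relative to the current directory
    and gets a protocol. Hypothetical file name. }
  BaseURI := AbsoluteURI('models/scene.x3d');

  { Relative is always a URI: percent-escaped, with forward slashes. }
  Writeln(CombineURI(BaseURI, 'textures/wood%20dark.png'));

  { An absolute Relative is documented to be returned as-is. }
  Writeln(CombineURI(BaseURI, 'http://example.com/sky.png'));
end.
```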

function AbsoluteURI(const URI: string): string;

Make sure that the URI is absolute (always has a protocol). This function treats a URI without a protocol as a simple filename (absolute or relative to the current directory). This includes treating an empty string as equivalent to the current directory.

function AbsoluteFileURI(const URI: string): boolean;

Does the URI contain only an absolute filename. Useful to detect unwanted paths in data files: you usually do not want such paths there, as they make it impossible to transfer the data (move/copy files) to another system/location.

function URIToFilenameSafe(const URI: string): string;

Convert URI (or filename) to a filename.

This is an improved URIToFilename from URIParser. When URI is already a filename, this does a better job than URIToFilename, as it also handles Windows absolute filenames (see URIProtocol). Returns an empty string in case of problems, for example when this is not a file URI.

Just like URIParser.URIToFilename, this percent-decodes the parameter. For example, %4d in URI will turn into letter M in result.

It also handles our castle-data: protocol.

function FilenameToURISafe(FileName: string): string;

Convert filename to URI.

This is a fixed version of URIParser.FilenameToURI, that correctly percent-encodes the parameter, making it truly a reverse of URIToFilenameSafe. In FPC > 2.6.2 URIParser.FilenameToURI will also do this (after Michalis' patch, see http://svn.freepascal.org/cgi-bin/viewvc.cgi?view=revision&revision=24321 ).

It also makes sure the filename is absolute (it uses ExpandFileName, so a relative FileName is expanded, treating it as relative to the current directory).

function URIMimeType(const URI: string): string;

Get the MIME type for the content of the URI without downloading the file. For local and remote files (file, http, and similar protocols) it guesses the MIME type based on the file extension. (Although we may add here detection of local file types by opening them and reading a header, in the future.) Only for the data: URI scheme does it actually read the MIME type.

Using this function is not advised if you want to properly support MIME types returned by the http server for network resources. For this, you have to download the file and look at what MIME type the http server reports. The Download function returns such a proper MimeType. This function only guesses without downloading.

Returns empty string if MIME type is unknown.

The overloaded version also returns Gzipped, to detect whether the file contents are gzipped.

The recognition mechanism can be enhanced by adding your own mappings to URIMimeExtensions.

function URIMimeType(const URI: string; out Gzipped: boolean): string;
 
function URIMimeExtensions: TStringStringMap;

Map from an extension to a MIME type, used by URIMimeType. The extension should be lowercase, and includes a leading dot, like .png.
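A hedged sketch of registering a custom mapping. The extension and MIME type are invented for illustration. The call to AddOrSetValue assumes that TStringStringMap descends from the Generics.Collections TDictionary, which is not stated in this documentation, so check the declaration in your engine version.

```pascal
program MimeDemo;
{$mode objfpc}{$H+}
uses CastleURIUtils;
begin
  { Hypothetical mapping: lowercase extension with a leading dot.
    AddOrSetValue is an assumption about the map's base class. }
  URIMimeExtensions.AddOrSetValue('.mymodel', 'application/x-my-model');

  { The guess is based on the extension only, nothing is downloaded. }
  Writeln(URIMimeType('file:///data/level1.mymodel'));
end.
```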

function URIDisplay(const URI: string; const Short: boolean = false): string;

Convert URI to a nice form for display (to show in messages and such). It makes sure to nicely trim URLs that would be too long/unreadable otherwise (like "data:" URI, or multi-line URLs with inlined ECMAScript/CastleScript/shader code).

When Short = False (default), then for most "file:" and "http:" URLs, it just returns them untouched.

When Short = True, it will try to extract the last path component from URLs like "file:" and "http:", if this last component is not empty. This is similar to what ExtractFileName does for filenames. It will also decode the URI (convert %xx to normal characters). Because of the percent-decoding, it is not advised to use this on filenames with Short = True. Usually, you want to call URICaption, which makes sure that the argument is a URL (using AbsoluteURI) and then returns URIDisplay with Short = True.

It is safe to use this on both absolute and relative URLs. It does not resolve relative URLs in any way. It also means that it returns an empty string for an empty URI (contrary to most other routines, which convert an empty string to the current directory when resolving relative URLs).

function URICaption(const URI: string): string;

Convert URI to a nice form for a short caption.

Returns an empty string for an empty URI (contrary to most other routines, which treat an empty string like the current directory).

See URIDisplay documentation for details.

function ChangeURIExt(const URL, Extension: string): string;

Change extension of the URL.

function DeleteURIExt(const URL: string): string;

Delete extension of the URL.

function ExtractURIName(const URL: string): string;

Extract filename (last part after slash) from URL.

function ExtractURIPath(const URL: string): string;

Extract path (everything before last part), including final slash, from URL.

function URIIncludeSlash(const URL: String): String;

Ensure URL ends with slash.

For an empty URL, returns an empty string (so it does not turn "" into "/"). For a URL ending with a backslash (which usually means you passed a Windows path name), it removes the backslash before adding the slash.

This should be used instead of InclPathDelim or IncludeTrailingPathDelimiter, when you use URLs instead of filenames.
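The documented rules for the trailing slash can be sketched as a standalone function. SketchIncludeSlash is a hypothetical name and this is only an illustration of the rules above, not the engine's implementation.

```pascal
program SlashSketch;
{$mode objfpc}{$H+}

{ Hypothetical sketch of the documented URIIncludeSlash rules. }
function SketchIncludeSlash(const URL: string): string;
var
  L: Integer;
begin
  L := Length(URL);
  if L = 0 then
    Exit(''); { empty stays empty, "" is not turned into "/" }
  case URL[L] of
    '/': Result := URL;                       { already ends with slash }
    '\': Result := Copy(URL, 1, L - 1) + '/'; { replace trailing backslash }
    else Result := URL + '/';
  end;
end;

begin
  Writeln('[', SketchIncludeSlash('http://example.com/data'), ']');  // [http://example.com/data/]
  Writeln('[', SketchIncludeSlash('http://example.com/data/'), ']'); // [http://example.com/data/]
  Writeln('[', SketchIncludeSlash('C:\data\'), ']');                 // [C:\data/]
  Writeln('[', SketchIncludeSlash(''), ']');                         // []
end.
```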

function URIExcludeSlash(const URL: String): String;

Ensure URL does not end with a slash. In case you passed a Windows path name, it also removes the trailing backslash.

This should be used instead of ExclPathDelim or ExcludeTrailingPathDelimiter, when you use URLs instead of filenames.

function URIFileExists(const URL: string): Boolean;

Does a file exist, that is: whether it makes sense to load it with the Download function.

Returns True for URLs where we cannot determine whether the file exists (like http / https).

This is simply a shortcut for URIExists(URL) in [ueFile, ueUnknown].
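A brief usage sketch of the shortcut above, written out with URIExists so that the set test is visible. The URLs are made-up examples, Report is a hypothetical helper, and the program assumes CastleURIUtils is on the unit path.

```pascal
program ExistsDemo;
{$mode objfpc}{$H+}
uses CastleURIUtils;

{ Hypothetical helper: report whether loading the URL is worth trying. }
procedure Report(const URL: string);
begin
  { Same condition that URIFileExists is documented to check. }
  if URIExists(URL) in [ueFile, ueUnknown] then
    Writeln(URL, ': worth trying to load with Download')
  else
    Writeln(URL, ': missing, or a directory');
end;

begin
  Report('file:///tmp/level.x3d');         { hypothetical local file }
  Report('https://example.com/level.x3d'); { hypothetical remote file }
end.
```

Treating ueUnknown as "exists" is a deliberate choice here: for http / https the only real check is to attempt the download.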

function URIExists(URL: string): TURIExists;

Does a file or directory exist under this URL. See TURIExists for possible return values.

function URICurrentPath: string;

Current working directory of the application, expressed as a URL, always including a final slash.

function ResolveCastleDataURL(const URL: String): String;

If this is a castle-data:... URL, resolve it using ApplicationData.

Types

TURIExists = (...);

Result of the URIExists query.

Values
  • ueNotExists: Given path indicates neither a file nor a directory.
  • ueFile: Given path is a regular file. In particular, this means it can be read with the Download function.

    Note that there is no guarantee that opening it will work. On a multi-process system the file can always be deleted between the call to URIExists and Download. And the file permissions may not allow reading. We merely say that "right now this file exists".

  • ueDirectory: Given path is a directory. E.g. it can be used as path for the FindFiles function.
  • ueUnknown: Detecting the existence of the given path is tricky and could be time-consuming.

    This applies e.g. to URLs using http / https protocols. The only way to detect their existence would be to actually open them. But this involves a network request, so it may take some time, and you may consider doing it asynchronously using (coming in the future) TDownload class (see CastleDownload comments for an API plan of TDownload).

    If you really want to check the file existence, you can always try to open it by Download:

    try
      Stream := Download(URL);
      FreeAndNil(Stream);
      ItExists := true;
    except
      on E: Exception do
      begin
        WritelnLog('Opening URL %s failed with exception %s', [
          URICaption(URL),
          ExceptMessage(E)
        ]);
        ItExists := false;
      end;
    end;

    Depending on the circumstances, "ueUnknown" can sometimes be interpreted as "it exists" and sometimes as "it doesn't exist". Opening it with Download may either fail or succeed, and we cannot tell in advance.


Generated by PasDoc 0.15.0.