SSD3 - "Specification of the URI-Catalogue format"
[http://www.shadyindustries.com/specs/ssd/ssd3.txt]

Contents:

1 .... Introduction
1.1 ......... Abstract
1.2 ......... Terminology
2 .... Syntax
2.1 ......... Format and syntactic rules
2.1.1 .............. Charset and Characters
2.1.2 .............. Records and Fields
2.1.3 .............. Names and Values
2.2 ......... Extensibility
3 .... Fields
3.1 ......... URI
3.2 ......... NAME
3.3 ......... DATE
3.4 ......... CATEGORY
3.5 ......... DESCRIPTION
3.6 ......... RATING
3.7 ......... LANGUAGE
3.8 ......... TYPE
3.9 ......... ID
4 .... Examples
4.1 ......... Full URI-Catalogue example (with two records)
4.2 ......... Minimum URI-Catalogue example
5 .... Appendices
5.1 ......... Security Considerations
5.2 ......... Integrity Considerations
5.3 ......... Interoperability Considerations
5.4 ......... Encoding Considerations
5.5 ......... URI-Catalogue Media Type
5.6 ......... External Sources
5.7 ......... Authors

1.1 Abstract

This document specifies the standard used for the "URI-Catalogue"
format. URI-Catalogue is intended to be a human-readable format for
storing databases of URIs and metadata about those URIs. The format is
also suitable for the exchange of URI-Catalogue databases between
differing or peer programs or users.

1.2 Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC2119 [1] and indicate
requirement levels for implementations compliant with this
specification.

2.1.1 Charset and Characters

Only the ASCII character set may be used for URI-Catalogues. The ASCII
characters 0-9, 11-12, 14-31 and 127 (inclusive) MUST NOT appear
anywhere in a valid URI-Catalogue document. This limitation may be
lifted in future versions of the URI-Catalogue specification.

2.1.2 Records and Fields

The format of the document consists of "records" and "fields".
A record is a subsection of the entire document, separated from other
records by blank lines (two CRLF line-breaks in a row), and contains
fields. Fields are separated within a record by single CRLF line-breaks
(a new line), and themselves consist of two parts: the "name" and the
"value". The Name of a field and the Value of a field are separated by
a colon (ASCII 58) and a single space (ASCII 32). Note that the space
MUST be present and MUST NOT be considered part of the value.

2.1.3 Names and Values

The Name is the part of the field that occurs before the separator. It
MUST be processed case-sensitively, and MUST consist only of the ASCII
characters 45, 48-57, 65-90, 95, and 97-122 (inclusive). The Name is
used to identify the meaning of a field within a record.

The Value is the part of the field that occurs after the separator, and
may contain any character string except the characters forbidden in
section 2.1.1, unless the specific field definition (see section 3)
places further restrictions on the characters or syntax of the value.

2.2 Extensibility

Fields with the same Name MUST NOT occur in the same record more than
once.

Fields not specified by this or subsequent versions of this
specification MAY be used, provided the field's Name is prefixed with
"X-" (ASCII 88 and 45).

If a URI-Catalogue parser encounters an unrecognised or invalid field,
it SHOULD discard it and SHOULD continue processing the URI-Catalogue.

Field values MUST NOT be empty.

If a required field is not present in a record, or cannot be processed
because it is invalid, the entire record SHOULD be discarded and
processing SHOULD continue.

Fields can appear in any order within a record. Records do not need to
be ordered in any particular way, though ordering them MAY assist human
users or implementations.

3 Fields

This is an exhaustive list of all Standard Fields defined by this
specification. The title of each of the sections below is the Name of
the field to which the section refers.
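
As an illustration of the syntax rules in section 2, the following
Python sketch splits a catalogue into records and fields. It is not
part of this specification: the function name parse_catalogue, the
choice to keep only the first of two duplicate fields, and the decision
to silently discard invalid input are assumptions made for the example.
The set of required fields (URI, NAME, DATE) is taken from section 3,
and no per-field value checks (such as the DATE format) are attempted.

```python
import re

# Allowed Name characters per section 2.1.3: "-", 0-9, A-Z, "_", a-z.
NAME_RE = re.compile(r"[-0-9A-Z_a-z]+")

# Standard Fields from section 3; any other Name needs an "X-" prefix.
STANDARD_FIELDS = {"URI", "NAME", "DATE", "CATEGORY", "DESCRIPTION",
                   "RATING", "LANGUAGE", "TYPE", "ID"}
REQUIRED_FIELDS = {"URI", "NAME", "DATE"}


def parse_catalogue(text):
    """Return a list of dicts, one per record that passes the checks."""
    records = []
    # Records are separated by a blank line (two CRLFs in a row).
    for block in text.split("\r\n\r\n"):
        record = {}
        for line in block.split("\r\n"):
            # Split at the first colon + space; the space is not part
            # of the value.
            name, sep, value = line.partition(": ")
            if not sep or not value or not NAME_RE.fullmatch(name):
                continue  # invalid field (or empty line): discard it
            if name not in STANDARD_FIELDS and not name.startswith("X-"):
                continue  # unrecognised field: discard it
            if name in record:
                continue  # duplicate Name: later copies dropped (assumption)
            record[name] = value
        # A record missing a required field is discarded as a whole.
        if REQUIRED_FIELDS.issubset(record):
            records.append(record)
    return records
```

Because the split is made at the first colon followed by a space, a
value such as "http://google.co.uk/" is left intact, and values may
themselves contain ": ".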

3.1 URI

The most important field, it is REQUIRED in all records. This field
contains the URI [2] that all the other fields are describing.
Implementations MUST NOT group or order records by this field, and
searching by the URI is NOT RECOMMENDED.

3.2 NAME

The NAME field contains the name that has been assigned to the URI, and
is REQUIRED. Implementations MAY opt to make this field unique
throughout the URI-Catalogue. Implementations MAY order, group or
search records by this field.

3.3 DATE

The DATE field is REQUIRED in all records, and MUST be a timestamp of
when the record was created or last amended. The format used for the
date MUST be the following:

    DD/MM/YYYY hh:mm:ss

Where DD is a two digit day, MM is a two digit month, YYYY is a four
digit year, hh is a 24-hour two digit hour, mm is a two digit minute,
and ss is a two digit second. For example:

    31/10/2007 19:03:34

is the equivalent of 3 minutes and 34 seconds past 7pm on the 31st of
October, 2007.

All timestamps are assumed to be UTC. Implementations MAY order, group
or search records by the DATE field.

3.4 CATEGORY

This field is OPTIONAL, and contains a string used to group certain
records together within the database. For example, a user could search
for any record with the string "Internet" in its category.

3.5 DESCRIPTION

This field is RECOMMENDED, and contains an arbitrary string. It is used
to store a long description of the URI, containing more information
than the NAME field can carry, and is generally intended for human
users. Implementations MUST NOT group or order records by the
DESCRIPTION field, though the DESCRIPTION field MAY be used to search
for specific records.

3.6 RATING

This field is NOT REQUIRED. It MUST contain an integer between 1 and 5
(inclusive). This field can be used by an implementation to order or
group the records. The value 5 MUST be considered the highest, and the
value 1 the lowest.

3.7 LANGUAGE

This field is NOT REQUIRED, but MUST contain a valid language code [4].
This language code is used to indicate the main language of the
document the URI points to. Implementations MAY allow the data to be
grouped or searched by language.

3.8 TYPE

This field is NOT REQUIRED, but MUST contain a valid Media Type [3].
This informs the URI-Catalogue processor and/or user of the Media Type
of the resource the URI points to. Implementations MAY allow the data
to be grouped or searched by Media Type.

3.9 ID

This field is NOT REQUIRED, but its value MUST be a positive decimal
integer. This field is used to assign a unique number to any given
record, and as such, the value of the ID field MUST NOT be the value of
an ID field in any other record. There is no maximum limit on the value
of the ID field.

4 Examples

Note that the indentation in the examples below only preserves the
layout of this document and is not part of the URI-Catalogue data.

4.1 Full URI-Catalogue example (with two records)

    URI: http://google.co.uk/
    NAME: Google
    ID: 1
    DATE: 30/10/2007 08:31:32
    CATEGORY: Search Engines
    DESCRIPTION: A popular search engine run by Google, Inc. of the USA
    RATING: 1
    LANGUAGE: en-us
    TYPE: text/html

    URI: http://shadyindustries.biz/ssd/ssd3.txt
    NAME: SSD3 - "Specification of URI-Catalogue format"
    ID: 2
    DATE: 31/10/2007 19:28:45
    CATEGORY: Official Documents
    DESCRIPTION: The Specification for the URI-Catalogue format
    RATING: 2
    LANGUAGE: en-gb
    TYPE: text/plain

4.2 Minimum URI-Catalogue example

    URI: http://shadyindustries.biz/ssd/ssd3.txt
    NAME: SSD3 - URI-Catalogue specification
    DATE: 31/10/2007 19:29:23

5.1 Security Considerations

As a structured ASCII-text document, there is little to no potential
for URI-Catalogue to be able to cause harm directly or indirectly to a
system. The only currently known possibility is that an implementation
or user may try to follow a malicious URI. This eventuality cannot be
prevented by any mechanism in this specification, and it is entirely up
to the implementation or user whether to follow suspicious URIs.

It is conceivable that some kind of protection, such as encryption,
might be required for a file in the URI-Catalogue format. However, this
must be done via some external mechanism, as this specification makes
no attempt at providing a solution.

[This section is subject to constant review and update.]

5.2 Integrity Considerations

URI-Catalogue makes no attempt to ensure that its own data is
transferred correctly; this is up to the application (or transport
layer protocols) to achieve. As a general rule, however, if forbidden
characters (see section 2.1.1) are present in a URI-Catalogue, it can
be assumed that corruption occurred during transit.

[This section is subject to constant review and update.]

5.3 Interoperability Considerations

There is not believed to be any reason why this standard cannot be
properly implemented on any given system.

[This section is subject to constant review and update.]

5.4 Encoding Considerations

There are no encoding considerations required, since URI-Catalogue
forbids the use of any charset besides 7bit US-ASCII, and also forbids
the use of invisible or control characters.

[This section is subject to constant review and update.]

5.5 URI-Catalogue Media Type

The URI-Catalogue standard can be denoted using the
"text/vnd.si.uricatalogue" Media Type [3]. Its IANA registration is
located at [5], where further information regarding the specifics of
URI-Catalogue and its Media Type can be found.
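
The corruption check suggested in section 5.2 can be sketched in Python
as below. This is an illustration only: the function name
forbidden_positions is hypothetical, and treating every byte above 127
as forbidden is an assumption drawn from the ASCII-only rule in section
2.1.1.

```python
def forbidden_positions(data: bytes) -> list[int]:
    """Return the offsets of bytes that may not appear in a URI-Catalogue.

    Section 2.1.1 forbids ASCII 0-9, 11-12, 14-31 and 127; bytes above
    127 are rejected here as well because they are not ASCII (assumption).
    """
    return [
        offset
        for offset, byte in enumerate(data)
        if byte <= 9 or byte in (11, 12) or 14 <= byte <= 31 or byte >= 127
    ]
```

An empty result means no forbidden characters were found; it does not
prove that the data arrived intact.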

5.6 External Sources

[1] - "Key words for use in RFCs to Indicate Requirement Levels"
      http://www.rfc-editor.org/rfc/rfc2119.txt
[2] - "Uniform Resource Identifier (URI): Generic Syntax"
      http://www.rfc-editor.org/rfc/rfc3986.txt
[3] - "Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types"
      http://www.rfc-editor.org/rfc/rfc2046.txt
[4] - "Tags for the Identification of Languages"
      http://www.rfc-editor.org/rfc/rfc3066.txt
[5] - "'text/vnd.si.uricatalogue' Media Type Registration"
      http://www.iana.org/assignments/media-types/text/vnd.si.uricatalogue

5.7 Authors

Nicholas Parks Young
Jordan Geear

Please send any comments to admin@shadyindustries.com, prefixing the
email subject with #SSD3