THE META-HTML LANGUAGE REFERENCE MANUAL

String Operators

Synopsis:

    A single function, match, performs pattern matching, substring extraction, and substring deletion. For convenience, a blind substring extraction function, substring, is supplied as well. Three functions (capitalize, downcase, and upcase) perform the three most common case changes. Finally, the pad function allows alignment of fixed-width text.

Commands:

<base64decode STRING &optional [VARNAME]>
Simple

    Performs the translation operation commonly known as Base64 Decoding on STRING, and returns the results of that decoding.

    If the optional VARNAME is supplied, the results of decoding are placed into the binary variable named VARNAME instead of being returned in the page. This allows arbitrary binary data to be decoded, and perhaps written to the output stream using stream-put-contents.

    Base64 encoding is a common transfer encoding for binary data and for Basic Authorization values -- this function can be used to turn such strings into their original, pre-encoded state.

    <set-var the-data = "YmZveDpmcm9ibml0eg==">
    <base64decode <get-var the-data>>
    produces:
    bfox:frobnitz

<base64encode VARNAME &key [SHORTLINES=[TRUE|VALUE]]>
Simple

    Performs the translation operation commonly known as Base64 Encoding on the contents of the binary variable referenced by VARNAME, and returns the results of that encoding.

    Base64 encoding is a common transfer encoding for binary data and for Basic Authorization values -- this function can be used to turn data from its original pre-encoded state to an encoded one.

    If the keyword argument SHORTLINES=VALUE is supplied, then the result of encoding is broken up into lines containing VALUE characters each, with VALUE rounded up to the nearest multiple of 4. This format is produced by many Base64 encoding programs. A value of true uses the Meta-HTML default of 64. If the argument is not supplied, the data is returned as a single line.

    The following code reads a GIF image into a binary variable, and then displays the base64 encoded version:

    <dir::read-file /tmp/image.gif gif>  ==> true
    <base64encode gif>                   ==> [base64 encoded data]
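
    As a further sketch, the two calls below illustrate SHORTLINES, reusing the binary variable gif from the example above. Going by the description, shortlines=true should break the output into 64-character lines, and shortlines=30 should give 32-character lines, since 30 is rounded up to the next multiple of 4. The output is not shown because it depends on the file:

    <base64encode gif shortlines=true>
    <base64encode gif shortlines=30>

    Decoding into a binary variable and then encoding that variable again should reproduce the original Base64 text. The variable name decoded is only an illustration, and the result was worked out by hand:

    <base64decode "YmZveDpmcm9ibml0eg==" decoded>
    <base64encode decoded>               ==> YmZveDpmcm9ibml0eg==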

<capitalize STRING>
Simple

    Changes the case of each character in STRING to uppercase or lowercase depending on the surrounding characters, so that the first letter of each word is capitalized.

    <capitalize "This is a list">
    produces:
    This Is A List

    Also see downcase and upcase.

<char-offsets STRING CH &key [CASELESS]>
Simple

    Returns an array of numbers, each one the offset from the start of STRING of an occurrence of the character CH. This function is useful for finding candidate locations for word-wrapping, for example. Here is a complete example:

    <char-offsets "This is a list" " ">
    produces:
    4
    7
    9
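
    The CASELESS keyword is not described above. Assuming it makes the comparison with CH ignore case, as it does for the comparison functions in this section, a call such as the following would be expected to report both the "T" at offset 0 and the "t" at offset 13:

    <char-offsets "This is a list" "t" caseless=true>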
    

<downcase STRING>
Simple

    Converts all of the uppercase characters in STRING to lowercase.

    <downcase "This is Written in Meta-HTML">
    produces:
    this is written in meta-html

<encrypted-password PASS SALT>
Simple

    Provided for backwards compatibility only. See unix::crypt.

<match STRING REGEX &key [ACTION=[DELETE|EXTRACT|REPORT|STARTPOS|ENDPOS|LENGTH]] [CASELESS=TRUE]>
Simple

    Matches REGEX against STRING, and then performs the indicated ACTION. The default for ACTION is "report".

    When action is "report" (the default), returns "true" if REGEX matched.
    When action is "extract", returns the substring of STRING matching REGEX.
    When action is "delete", returns STRING with the matched substring removed.
    When action is "startpos", returns the numeric offset of the start of the matched substring.
    When action is "endpos", returns the numeric offset of the end of the matched substring, that is, the offset just past its last character.
    When action is "length", returns the length of the matched substring.

    REGEX is an extended Unix regular expression, the complete syntax of which is beyond the scope of this document. However, the essential basics are:

    • A period (.) matches any one character.
    • An asterisk (*) matches any number of occurrences of the preceding expression, including none.
    • A plus-sign (+) matches one or more occurrences of the preceding expression.
    • Square brackets are used to enclose lists of characters which may match. For example, "[a-zA-Z]+" matches one or more occurrences of alphabetic characters.
    • The vertical bar (|) is used to separate alternate expressions to match against. For example, "foo|bar" says to match either "foo" or "bar".
    • A dollar-sign ($) matches the end of STRING.
    • Parentheses are used to group subexpressions.

    Here are a few examples:

      <match "foobar" ".*">                  ==> "true"
      <match "foobar" "foo">                 ==> "true"
      <match "foobar" "foo" action=extract>  ==> "foo"
      <match "foobar" "oob" action=delete>   ==> "far"
      <match "foobar" "oob" action=startpos> ==> "1"
      <match "foobar" "oob" action=endpos>   ==> "4"
      <match "foobar" "oob" action=length>   ==> "3"
      <match "foobar" "[0-9]+">              ==> ""
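
    A few more sketches, with results worked out by hand from the rules above rather than taken from a running server:

      <match "FooBar" "foo" caseless=true>           ==> "true"
      <match "foobar" "(foo|baz)bar" action=extract> ==> "foobar"
      <match "foobar" "bar$" action=startpos>        ==> "3"
      <match "foobar" "o+" action=length>            ==> "2"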

<pad STRING WIDTH &key [ALIGN=[LEFT|RIGHT|MIDDLE]] [TRUNCATE=TRUE] [PAD-CHAR=X]>
Simple

    Pads STRING to a length of WIDTH. ALIGN can be one of LEFT, MIDDLE, or RIGHT (the default).

    PAD inserts the correct number of copies of PAD-CHAR to make STRING occupy WIDTH character positions (presumably for use in a <pre> ... </pre> block). The default value for PAD-CHAR is a space character.

    If the keyword argument TRUNCATE=TRUE is given, STRING is forced to be exactly WIDTH characters long, so a longer string is cut off.

    Before any padding is done, leading and trailing whitespace is removed from STRING.

    Examples:

      <pad "Hello" 10>              ==> "     Hello"
      <pad "Hello" 10 align=left>   ==> "Hello     "
      <pad "Hello" 10 align=middle> ==> "  Hello   "
      <pad "  Heckle  " 4 truncate> ==> "Heck"
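
    PAD-CHAR is not exercised in the examples above. Going by the description, calls like these should fill with the given character instead of spaces. The results were worked out by hand, not taken from a running server:

      <pad "42" 6 pad-char="0">                ==> "000042"
      <pad "Total" 10 align=left pad-char="."> ==> "Total....."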

<plain-text &key [FIRST-CHAR=EXPR] [NOBR=TRUE]
  body
</plain-text>
Complex

    Performs the following steps:

    1. Replaces each pair of consecutive newline characters with a single <P> tag.

    2. Applies EXPR to the first character of every paragraph, inserting the matching closing tag after that character.

    The output will start with a <P> tag, unless the keyword argument NOBR=TRUE is given.

    <plain-text first-char=<font size="+1"> nobr=true>
    This is line 1.
    
    This is line 2.
    </plain-text>
    produces:
    This is line 1.
    

    This is line 2.

<string-compare STRING1 STRING2 &key [CASELESS]>
Simple

    Compares the two strings STRING1 and STRING2, and returns a string which specifies the relationship between them. The comparison is normally case-sensitive, unless the keyword argument CASELESS=TRUE is given.

    The possible return values are:

    1. equal
      The two strings are exactly alike.
    2. greater
      STRING1 is lexically greater than STRING2.
    3. less
      STRING1 is lexically less than STRING2.

    Examples:

    <string-compare "aaa" "aab">               ==> less
    <string-compare "zzz" "aab">               ==> greater
    <string-compare "zzz" "ZZZ">               ==> greater
    <string-compare "zzz" "ZZZ" caseless=true> ==> equal
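
    Because the result is an ordinary string, it can be tested with string-eq. A minimal sketch, in which the variable names a and b are placeholders:

    <set-var a="apple">
    <set-var b="banana">
    <when <string-eq <string-compare <get-var a> <get-var b>> "less">>
      <get-var a> sorts before <get-var b>.
    </when>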

<string-eq STRING-1 STRING-2 &key [CASELESS=TRUE]>
Simple

    Compares STRING-1 to STRING-2 and returns the string "true" if they are character-wise identical.

    The optional keyword argument CASELESS=TRUE indicates that no consideration should be given to the case of the characters during comparison.

    <string-eq "foo" "FOO">               ==>
    <string-eq "foo" "foo">               ==> true
    <string-eq <upcase "foo"> "FOO">      ==> true
    <string-eq "foo" "FOO" caseless=true> ==> true

<string-length STRING>
Simple

    Returns the number of characters present in STRING.

    <string-length "This is an interesting string">
    produces:
    29

<string-neq STRING-1 STRING-2 &key [CASELESS=TRUE]>
Simple

    Compares STRING-1 to STRING-2 and returns the string "true" if they are NOT character-wise identical.

    The optional keyword argument CASELESS=TRUE indicates that no consideration should be given to the case of the characters during comparison.

    <string-neq "foo" "FOO">               ==> true
    <string-neq "foo" "foo">               ==>
    <string-neq <upcase "foo"> "FOO">      ==>
    <string-neq "foo" "FOO" caseless=true> ==>

<string-to-array STRING ARRAYVAR>
Simple

    Creates an array in ARRAYVAR which is made up of the individual characters of STRING. Given the following:

     <set-var s="This is a string.">
     <string-to-array <get-var-once s> chars>
     
    Then, <get-var chars[3]> returns s.

<strings::collapse VARNAME &optional [COLLAPSIBLE-CHARS]>
Simple

    Collapses multiple occurrences of any of the characters specified in COLLAPSIBLE-CHARS into a single occurrence of the first character in COLLAPSIBLE-CHARS, throughout the string stored in VARNAME. If COLLAPSIBLE-CHARS is not specified, it defaults to the set of whitespace characters, with a Space character as the first element, i.e., Space, Tab, CR, and Newline.

    <set-var foo=" string with    whitespace     in   various Spots ">
    String: [<get-var-once foo>]
    <strings::collapse foo>
    String: [<get-var-once foo>]
    produces:
    String: [ string with    whitespace     in   various Spots ]
    
    String: [ string with whitespace in various Spots ]
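
    COLLAPSIBLE-CHARS can also name other characters. In this sketch, which assumes the behavior described above, runs of slashes are collapsed into one. The variable name path is a placeholder, and the expected result was worked out by hand:

    <set-var path="usr//local///bin">
    <strings::collapse path "/">
    <get-var-once path>
    should produce:
    usr/local/bin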

<strings::left-trim VARNAME &optional [TRIM-CHARS]>
Simple

    Trims the characters specified in TRIM-CHARS from the "left-hand" side of the string stored in VARNAME, replacing the contents of that variable with the trimmed string. If TRIM-CHARS is not specified, it defaults to the set of whitespace characters, i.e., Space, Tab, CR, and Newline.

    <set-var foo="    string with whitespace on the left">
    String: [<get-var-once foo>]
    <strings::left-trim foo>
    String: [<get-var-once foo>]
    produces:
    String: [    string with whitespace on the left]
    
    String: [string with whitespace on the left]
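
    With TRIM-CHARS supplied, those characters are trimmed instead of whitespace. A sketch that strips leading zeros; the variable name amount is a placeholder, and the result was worked out by hand:

    <set-var amount="000123">
    <strings::left-trim amount "0">
    <get-var-once amount>
    should produce:
    123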

<strings::right-trim VARNAME &optional [TRIM-CHARS]>
Simple

    Trims the characters specified in TRIM-CHARS from the "right-hand" side of the string stored in VARNAME, replacing the contents of that variable with the trimmed string. If TRIM-CHARS is not specified, it defaults to the set of whitespace characters, i.e., Space, Tab, CR, and Newline.

    <set-var foo="string with whitespace on the right     ">
    String: [<get-var-once foo>]
    <strings::right-trim foo>
    String: [<get-var-once foo>]
    produces:
    String: [string with whitespace on the right     ]
    
    String: [string with whitespace on the right]
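
    TRIM-CHARS works the same way on the right-hand side. A sketch that strips trailing slashes; the variable name dir is a placeholder, and the result was worked out by hand:

    <set-var dir="/tmp/cache///">
    <strings::right-trim dir "/">
    <get-var-once dir>
    should produce:
    /tmp/cache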

<strings::trim VARNAME &key [COLLAPSE]>
Simple

    Trims whitespace from both the left-hand and right-hand sides of the string stored in VARNAME, replacing the contents of that variable with the trimmed string. With the optional keyword argument COLLAPSE=TRUE, also collapses multiple occurrences of whitespace within the string into a single space.

     <set-var foo="    string with    whitespace     on left and right     ">
     String: [<get-var-once foo>]
     <strings::trim foo>
     String: [<get-var-once foo>]
     <strings::trim foo collapse=true>
     String: [<get-var-once foo>]
     
    produces:
    String: [    string with    whitespace     on left and right     ]
     
     String: [string with    whitespace     on left and right]
     
     String: [string with whitespace on left and right]
     

<subst-in-string STRING &rest REGEXP REPLACEMENT>
Simple

    Replaces all occurrences of REGEXP with REPLACEMENT in STRING.

    REGEXP can be any regular expression allowed by POSIX extended regular expression matching.

    In the replacement string, a backslash followed by a number N is replaced with the contents of the Nth subexpression from REGEXP.

    <set-var foo="This is a list">
    <subst-in-string <get-var foo> "is" "HELLO">
         ==> "ThHELLO HELLO a lHELLOt"
    
    <subst-in-string "abc" "([a-z])" "\\1 "> ==> "a b c "

<subst-in-var VARNAME &optional [THIS-STRING] [WITH-THAT]>
Simple

    Replaces all occurrences of THIS-STRING with WITH-THAT in the contents of the variable named VARNAME. Both THIS-STRING and WITH-THAT are evaluated before the replacement is done. THIS-STRING can be any regular expression allowed by POSIX extended regular expression matching. This command can be useful when parsing the output of cgi-exec.
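
    A sketch modeled on the subst-in-string example. The result shown is what the description implies, worked out by hand rather than taken from a running server:

    <set-var foo="This is a list">
    <subst-in-var foo "is" "HELLO">
    <get-var foo>
    should produce:
    ThHELLO HELLO a lHELLOt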

<substring STRING &optional [START] [END]>
Simple

    Extracts the substring of STRING whose first character is at offset START, and whose last character is the one just before offset END. The indexing is zero-based and END is exclusive, so that:

      <substring "Hello" 1 2> ==> "e"

    This function is useful when you know in advance which part of the string you would like to extract, and do not need the pattern matching facilities of match.
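
    Two more cases, worked out by hand from the example above, in which END is the offset just past the last character extracted:

      <substring "Hello" 0 2> ==> "He"
      <substring "Hello" 1 4> ==> "ell"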

    If you wish to index through each character of a string, the most direct way is to convert it to an array first using string-to-array, and then use the foreach function to iterate over the members.

    <set-var s="This is a string.">
    <string-to-array <get-var-once s> chars>
    <foreach character chars><get-var character>-</foreach>
    produces:
    T-h-i-s- -i-s- -a- -s-t-r-i-n-g-.-

<unix::crypt STRING &optional [SALT]>
Simple

    Return STRING encrypted using the local system's crypt() function with the salt SALT.

    SALT is a two-character string -- if you wish to compare the cleartext value that a user has entered against a Unix-style encrypted password, such as one from /etc/passwd, or from a .htaccess file, use the first two characters of the existing encrypted password as the salt, and then compare the resulting strings.

    For example, if the variable EXISTING-PASS contains a previously encrypted password, and the variable ENTERED-PASS contains the cleartext that the user has just entered, you may encrypt the user's password and compare it with the existing one with the following code:

    <set-var encrypted-pass =
       <unix::crypt <get-var entered-pass>
                    <substring <get-var existing-pass> 0 2>>>
    
    <when <string-eq <get-var encrypted-pass>
                     <get-var existing-pass>>>
       <set-session-var logged-in=true>
       <redirect members-only.mhtml>
    </when>
    
    <h3>Please enter your password again.  It didn't match.</h3>

<upcase STRING>
Simple

    Converts all of the lowercase characters in STRING to uppercase.

    <upcase "This is Written in Meta-HTML">
    produces:
    THIS IS WRITTEN IN META-HTML

<word-wrap STRING &key [WIDTH=CHARWIDTH] [INDENT=INDENTATION] [SKIP-FIRST=TRUE]>
Simple

    Produce paragraphs of text from the string STRING with the text filled to a width of CHARWIDTH.

    This is provided for convenience only, and is of use when you need to present some free form text in pre-formatted fashion, such as when sending an E-mail message, or the like.

    If the keyword INDENT=INDENTATION is supplied, each line of the output is indented by INDENTATION space characters.

    If the keyword SKIP-FIRST=TRUE is given, the first line of the output is not indented, only each successive line. For example:

    <set-var text =
       <concat "This is provided for convenience only, and is of use "
               "when you need to present some free form text in "
               "pre-formatted fashion, as in this example.">>
    <pre>
    Title Topic: <word-wrap <get-var text> 40 indent=13 skip-first=true>
    </pre>
    produces:
    Title Topic: This is provided for convenience only, and is of use when
                 you need to present some free form text in pre-formatted
                 fashion, as in this example. 
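
    Since WIDTH is declared as a keyword argument, it can presumably also be passed by name. A sketch reusing the variable text from the example above; the output is not shown because the code was not run:

    <pre>
    <word-wrap <get-var text> width=40 indent=4>
    </pre>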
    



The META-HTML Reference Manual V2.0 Copyright © 1995, 1998, Brian J. Fox
Found a bug? Send mail to bug-manual@metahtml.org