NAME

Validator 1.2 - PHP common data validation routines.


SYNOPSIS

  include("class.Validator.php3");

  $check = new Validator ();

  $email = 'cdi@thewebmasters.net';
  $state = 'WA';
  $cc = 'US';
  $phone = '206-123-4567';
  $zip = '9999-1111';
  $url = 'http://buzz.builder.com/';
  $misc = 'Now$ is the\.*Time of "our" (discontent)!';
  $ip    = '206.79.157.144';
  $host = 'buzz.builder.com';
  $file = "/home/httpd/html/index.html";

  if (!$check->is_email($email))  { echo "Invalid email format\n";}
  if (!$check->is_state($state))  { echo "Invalid state code\n";  }
  if (!$check->is_country($cc))   { echo "Invalid country code\n";}
  if (!$check->is_phone($phone))  { echo "Invalid phone format\n";}
  if (!$check->is_zip($zip))      { echo "Invalid zip code\n";    }
  if (!$check->is_url($url))      { echo "Invalid URL format\n";  }
  if (!$check->url_responds($url)){ echo "Url not responding\n";  }
  if (!$check->is_ipaddress($ip)) { echo "Invalid IP format\n";   }
  if (!$check->ip_resolves($ip))  { echo "No RDNS for IP\n";      }
  if (!$check->is_host($host))    { echo "Invalid host name\n";   }
  if (!$check->is_sane($file))    { echo "Tainted file\n";        }

  $clean = $check->strip_metas($misc);
  $myJunk = array('!',"'",'=');
  $clean = $check->custom_strip($myJunk,$clean);
  $browser = $check->browser_gen();
  echo "You are using a $browser generation browser";

  if ($check->ERROR)              { echo "$check->ERROR\n";       }


DESCRIPTION

Validator is a class containing common data validation routines. Using this class, your programs can validate a wide variety of data types common to forms and the internet in general. Included with the data validation routines are several ``common'' string and array manipulation routines. In short, this is a class designed to conduct grunt work on submitted data. This version of Validator contains 19 data tests, 9 data manipulation methods, and one array debugging tool.


METHODS


Validator()

The new Validator() method simply creates a new object. No arguments needed.


clear_error()

When any of the following tests fails, Validator sets the ERROR variable to a text description of why the test failed. ERROR is not reset between tests, so it always contains the last error encountered. To clear the error, call this method. To clear errors automatically, set the CLEAR variable to true (it is false by default). When CLEAR is true, Validator clears the ERROR variable at the beginning of every new test. Example:

  $check->CLEAR = true;
  $email = 'cdi @ example.com';
  $host = 'example.com';

  if(!$check->is_email($email)) { echo "$check->ERROR\n"; }

  // ERROR now contains "is_email: Whitespace in [cdi ]@[ example.com]"

  if(!$check->is_hostname($host)) { echo "$check->ERROR\n"; }

  // ERROR now contains "", because is_hostname succeeded.

  // Since CLEAR is true, and example.com is a valid hostname, ERROR has
  // been cleared.  If CLEAR were false, ERROR would still contain the
  // whitespace error from the is_email check
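
The ERROR/CLEAR behaviour described above can be sketched with a small stand-in class. This is only an illustration: ErrorSketch and its no_space() test are hypothetical names, not part of Validator, and the code is written for current PHP rather than PHP3.

```php
<?php
// Hypothetical stand-in class; this is NOT the real Validator source.
class ErrorSketch
{
    public $ERROR = '';
    public $CLEAR = false;

    public function clear_error()
    {
        $this->ERROR = '';
    }

    // A toy test: fails when $text contains a space or a tab.
    public function no_space($text)
    {
        // With CLEAR set, every new test starts with an empty ERROR.
        if ($this->CLEAR) {
            $this->clear_error();
        }
        if (preg_match('/[ \t]/', $text)) {
            $this->ERROR = "no_space: Whitespace in [$text]";
            return false;
        }
        return true;
    }
}

$sketch = new ErrorSketch();
$sketch->no_space('a b');   // fails and sets ERROR
$sketch->no_space('ab');    // passes, but ERROR keeps the old message
$stale = $sketch->ERROR;

$sketch->CLEAR = true;
$sketch->no_space('ab');    // passes, and ERROR was reset first
$fresh = $sketch->ERROR;
```

With CLEAR left at false, the caller has to call clear_error() by hand before trusting an empty ERROR.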


has_space($text)

Test. Returns true if the submitted text has spaces or tabs in it. False otherwise.


chconvert($fragment)

Internal Manipulation. Converts an octal digit into its visual representation as a file permission segment (e.g., 7 == 'rwx').
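
As a rough illustration of that conversion, here is a minimal sketch. The function name sketch_chconvert() is hypothetical, the ``unk'' fallback for out-of-range input is an assumption, and the class's own chconvert() may well be written differently.

```php
<?php
// Hypothetical helper, written for current PHP; not the class's own code.
function sketch_chconvert(int $digit): string
{
    // Anything outside 0-7 is not a single permission digit (assumed fallback).
    if ($digit < 0 || $digit > 7) {
        return 'unk';
    }
    // Bit 4 is read, bit 2 is write, bit 1 is execute.
    return (($digit & 4) ? 'r' : '-')
         . (($digit & 2) ? 'w' : '-')
         . (($digit & 1) ? 'x' : '-');
}

$all      = sketch_chconvert(7);
$readOnly = sketch_chconvert(4);
```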


get_perms($fileName)

Returns a 4-element array describing the permissions of the specified file or directory. The array has the following structure:

    octal permission value, string user, string group, string other

The keys for the array are ``octal'', ``user'', ``group'', and ``other''.

The ``user'', ``group'', and ``other'' keys have values matching those shown by an ls -l command, so if the file's octal value were 744, user, group, and other would be as follows:

    "user"  == "rwx"
    "group" == "r--"
    "other" == "r--"

Example:

    $perms = $check->get_perms($fileName);

    $octal = $perms["octal"];
    $user  = $perms["user"];
    if(ereg("r",$user)) { echo "File is readable by user"; }

Returns null on failure to stat() the specified file. If any segment of the file's permissions cannot be interpreted, that segment is set to ``unk''.
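
An array of that shape could be built roughly as follows. This is a sketch under assumptions: sketch_get_perms() is a hypothetical name, it relies on PHP's fileperms() function, and it returns the octal value as a three-digit string, which may not match what the class itself returns.

```php
<?php
// Hypothetical sketch for current PHP; not the class's own get_perms().
function sketch_get_perms(string $fileName): ?array
{
    $mode = @fileperms($fileName);
    if ($mode === false) {
        return null;                 // could not stat the file
    }
    $bits = $mode & 0777;            // keep only the permission bits

    // Turn one octal digit (0-7) into an "rwx"-style segment.
    $segment = function (int $digit): string {
        return (($digit & 4) ? 'r' : '-')
             . (($digit & 2) ? 'w' : '-')
             . (($digit & 1) ? 'x' : '-');
    };

    return [
        'octal' => sprintf('%03o', $bits),
        'user'  => $segment(($bits >> 6) & 7),
        'group' => $segment(($bits >> 3) & 7),
        'other' => $segment($bits & 7),
    ];
}
```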


is_sane($filename)

Test. Returns true if the file is sane, false otherwise. The checks are: is_readable, is_writeable, is_dir, and is_link. If either of the first two fails or either of the last two succeeds, returns false.
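
Those four checks combine as shown in this minimal sketch. The name sketch_is_sane() is hypothetical, and unlike the real method it does not set ERROR.

```php
<?php
// Hypothetical sketch for current PHP; mirrors the four documented checks.
function sketch_is_sane(string $filename): bool
{
    return is_readable($filename)     // must be readable
        && is_writable($filename)     // must be writable
        && !is_dir($filename)         // must not be a directory
        && !is_link($filename);       // must not be a symbolic link
}
```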


strip_space($text)

Manipulation. Strips all spaces and tabs from the submitted text and returns the altered text.


is_allnumbers($text)

Test. Returns true if the submitted text contains only digits, or if the gettype() test returns ``integer''. False otherwise.


strip_numbers($text)

Manipulation. Strips all numbers from the submitted text and returns the altered content.


is_allletters($text)

Test. Returns true if the text contains only the letters A-Z. False otherwise. Case insensitive.


strip_letters($text)

Manipulation. Strips the letters A-Z from the submitted text and returns the result. Case insensitive.


has_html($text)

Test. Returns true if the submitted text contains HTML entities. False otherwise. Tests for <, >, ", and &.


strip_html($text)

Manipulation. Strips all HTML entities, tags, and attributes from the submitted text and returns the result (plain text). It's not a very efficient method, but it is accurate. Don't use this method to remove HTML from a file; use PHP's much faster and more efficient fgetss() function instead.


has_metas($text)

Test. Returns true if the submitted text contains shell meta characters. The test is built around the quotemeta() function; see the PHP documentation for quotemeta().


strip_metas($text)

Manipulation. Removes all shell meta characters from the submitted string and returns the result. See the PHP quotemeta() documentation for the characters it removes.
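
One way to build both routines on top of quotemeta() is sketched below. The function names are hypothetical and the approach is an assumption about how such a check could work, not a copy of the class. The character list is the set that PHP's quotemeta() escapes.

```php
<?php
// Hypothetical sketches for current PHP; not the class's own code.

// quotemeta() backslash-escapes . \ + * ? [ ^ ] $ ( )
// so the text has meta characters exactly when escaping changes it.
function sketch_has_metas(string $text): bool
{
    return quotemeta($text) !== $text;
}

// Remove the same characters instead of escaping them.
function sketch_strip_metas(string $text): string
{
    $metas = ['.', '\\', '+', '*', '?', '[', '^', ']', '$', '(', ')'];
    return str_replace($metas, '', $text);
}

$misc  = 'Now$ is the\.*Time of "our" (discontent)!';
$clean = sketch_strip_metas($misc);
```

Note that quotes and the exclamation mark survive, because quotemeta() does not treat them as meta characters.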


custom_strip($array, $text)

Manipulation. Strips a user-defined array of characters from the submitted text and returns the modified string. This method uses the str_replace() function, and is meta-character safe.
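
Since str_replace() accepts an array of search strings, a method like this can be very small. The following is a sketch only, with sketch_custom_strip() as a hypothetical name.

```php
<?php
// Hypothetical sketch for current PHP; not the class's own code.
function sketch_custom_strip(array $chars, string $text): string
{
    // str_replace() does plain string matching, so characters such as
    // "." or "*" need no escaping here.
    return str_replace($chars, '', $text);
}

$myJunk = array('!', "'", '=');
$clean  = sketch_custom_strip($myJunk, "it's a=b!");
```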


array_echo($MyArray, $Name = "Array")

Debug tool. Array_echo() walks through an array and echoes the key=value pairs as an HTML table. Multi-dimensional arrays are handled recursively. The $Name is used to identify the name of the parent array. It can be set to whatever you want. Example:

  $config = new Config($ConfigFile);
  $check->array_echo($config->param, "Param");

Array_echo is only useful for debugging. I can't think of any legitimate reason to use this method in a real program. It definitely comes in handy when debugging large multi-dimensional arrays.


is_email($Address = "")

Test. Validates an email address. Looks for the standard [user]@[hostname].[tld] format. This is NOT an RFC 822 compliant check. is_email() goes one step further than syntax and verifies that the hostname portion of the email address has a valid DNS record. If the hostname portion of the submitted email does NOT have a DNS record, it's an invalid email address regardless of its format. So 'cdi@example.com' would return false if example.com had no DNS record. The DNS check requires internet connectivity to succeed.


is_url($Url = "")

Test. Checks for valid http and ftp URL format. Looks for a leading 'http://' or 'ftp://' followed by a valid hostname, optionally followed by arbitrary text. This test does not check the connectivity of the submitted URL; it only checks the format. The hostname check is conducted by the is_hostname() method. As of version 1.1, it uses the parse_url() function to split up the URL before validating.
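
A format-only check along those lines might look like the sketch below. The name sketch_is_url() is hypothetical, and the hostname pattern is a simplified stand-in for is_hostname(), so results can differ from the real method.

```php
<?php
// Hypothetical sketch for current PHP; not the class's own code.
function sketch_is_url(string $url): bool
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return false;                       // no scheme or no host
    }
    if (!in_array(strtolower($parts['scheme']), ['http', 'ftp'], true)) {
        return false;                       // only http and ftp are accepted
    }
    // Simplified hostname check: dot-separated labels of A-Z, 0-9 and "-".
    return preg_match('/^[a-z0-9-]+(\.[a-z0-9-]+)+\z/i', $parts['host']) === 1;
}
```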


url_responds($Url = "")

Test. Does an is_url(), and if that returns true, will then check that the URL is responding to requests. Obviously this routine requires a working internet connection in order to succeed. Returns true if the URL responded to a request, false otherwise.


is_phone($Phone = "")

Test. Checks the format of a phone number. Can handle international phone numbers. Checks that the submitted string contains only 0-9, (, ), -, and + characters, and that the number, after removing any non-numeric characters, is at least 7 and no more than 13 digits long.
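
The two rules above (allowed characters, then digit count) can be sketched as follows. sketch_is_phone() is a hypothetical name and the code only mirrors the description, not the class's source.

```php
<?php
// Hypothetical sketch for current PHP; not the class's own code.
function sketch_is_phone(string $phone): bool
{
    // Only digits, parentheses, "-" and "+" are allowed.
    if (preg_match('/^[0-9()+-]+\z/', $phone) !== 1) {
        return false;
    }
    // Count what is left after dropping everything that is not a digit.
    $digits = strlen(preg_replace('/[^0-9]/', '', $phone));
    return $digits >= 7 && $digits <= 13;
}
```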


is_hostname($hostname = "")

Test. Checks the format of a hostname. Does not check whether the hostname is responding. Looks for [sometext].[tld] at the minimum. Only A-Z, 0-9, -, and . are valid characters in this hostname check. Looks for common typos like double dots (example..com), leading dots (.example.com), trailing dots (example.com.), or no dots (examplecom). As of version 1.1, it properly validates the TLD and squashes a potential bug.
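
The typo checks listed above translate into a few string tests. This sketch uses the hypothetical name sketch_is_hostname() and deliberately leaves out the TLD validation, which depends on the class's own TLD list.

```php
<?php
// Hypothetical sketch for current PHP; TLD validation is omitted.
function sketch_is_hostname(string $hostname): bool
{
    // Only A-Z, 0-9, "-" and "." are valid characters.
    if (preg_match('/^[A-Za-z0-9.-]+\z/', $hostname) !== 1) {
        return false;
    }
    // Leading or trailing dot.
    if ($hostname[0] === '.' || substr($hostname, -1) === '.') {
        return false;
    }
    // Double dots.
    if (strpos($hostname, '..') !== false) {
        return false;
    }
    // At least one dot: [sometext].[tld]
    return strpos($hostname, '.') !== false;
}
```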


is_bigfour($tld)

Test. Submit a top-level domain for verification. The big four TLDs are COM, EDU, NET, and ORG. Returns true or false.


is_host($hostname = "", $type = "ANY")

Test. Does an is_hostname() check and, if that returns true, tests that the host has a DNS record. By default it checks for ``ANY'' DNS record. This can be overridden by specifying the type of record to look for (e.g., MX, NS, A). If the is_hostname() and DNS checks both return true, this check returns true. False otherwise.
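
PHP's checkdnsrr() function takes a hostname and a record type, so a check of this kind could be sketched as below. Whether the class actually uses checkdnsrr() is an assumption, sketch_is_host() is a hypothetical name, and the format pattern is a simplified stand-in for is_hostname(). The DNS lookup needs a working resolver.

```php
<?php
// Hypothetical sketch for current PHP; not the class's own code.
function sketch_is_host(string $hostname, string $type = 'ANY'): bool
{
    // Simplified format check: dot-separated labels of A-Z, 0-9 and "-".
    $formatOk = preg_match('/^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+\z/', $hostname) === 1;

    // The DNS lookup only runs when the format check passes.
    return $formatOk && checkdnsrr($hostname, $type);
}
```

For example, sketch_is_host($host, 'MX') would ask only whether the host has a mail exchanger record.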


is_ipaddress($IP = "")

Test. Checks for valid dotted quad IP address notation. Does not check the connectivity of the IP, only its format. It does check for common typos like leading zeros or quad notations greater than 255.
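
A dotted quad check with those two typo rules can be sketched like this. The name sketch_is_ipaddress() is hypothetical and the code follows the description only.

```php
<?php
// Hypothetical sketch for current PHP; not the class's own code.
function sketch_is_ipaddress(string $ip): bool
{
    $quads = explode('.', $ip);
    if (count($quads) !== 4) {
        return false;
    }
    foreach ($quads as $quad) {
        // One to three digits, with no leading zero unless the quad is "0".
        if (preg_match('/^(0|[1-9][0-9]{0,2})\z/', $quad) !== 1) {
            return false;
        }
        if ((int) $quad > 255) {
            return false;
        }
    }
    return true;
}
```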


ip_resolves($IP = "")

Test. Does an is_ipaddress() and, if that returns true, checks to see if the IP has a reverse DNS entry. Valid IP addresses often do not resolve to a hostname. Do not assume that an IP is invalid because it has no reverse DNS entry. This test simply looks for that reverse DNS entry and returns true or false appropriately.
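
PHP's gethostbyaddr() returns the IP address unchanged when no reverse entry is found, which makes a reverse DNS test easy to sketch. Whether the class uses gethostbyaddr() is an assumption, sketch_ip_resolves() is a hypothetical name, and filter_var() stands in here for is_ipaddress(). The lookup needs a working resolver.

```php
<?php
// Hypothetical sketch for current PHP; not the class's own code.
function sketch_ip_resolves(string $ip): bool
{
    // Stand-in format check; the class uses its own is_ipaddress().
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return false;
    }
    $name = gethostbyaddr($ip);
    // false means malformed input; the same IP back means no reverse entry.
    return $name !== false && $name !== $ip;
}
```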


browser_gen()

Test. Grabs the HTTP_USER_AGENT environment variable and returns the associated GENERATION of that agent. The generation returned can be any of the following: FIRST, SECOND, THIRD, FOURTH, FIFTH, SPIDER, SPAMMER, ANONYMIZER or UNKNOWN. The number of SPIDER and SPAMMER agents known to the class is limited, but it should accurately peg just about any other legitimate user agent. If the HTTP_USER_AGENT variable is undefined or unknown, returns UNKNOWN.


is_state($State = "")

Test. Checks to see if the value submitted is a valid US postal State code. Checks all 50 states, Washington DC, Puerto Rico and the US Virgin Islands. Case insensitive check.


is_zip($zipcode = "")

Test. Checks to see if the value submitted is in a valid US postal zip code format. Does not check to see if it's an actual zip code (i.e., 11111 will return true). The zip code check looks for 10 digit zips (11111-11111) as well as 5 digit zips. The zip code (with non-digit chars removed) may not exceed 10 digits or be fewer than 5. The only valid characters are 0-9 and the - character.
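
Taken literally, those rules amount to a character check plus a digit count, as in the sketch below. sketch_is_zip() is a hypothetical name, and the real method may apply stricter rules than the description states.

```php
<?php
// Hypothetical sketch for current PHP; follows the documented rules only.
function sketch_is_zip(string $zipcode): bool
{
    // The only valid characters are 0-9 and "-".
    if (preg_match('/^[0-9-]+\z/', $zipcode) !== 1) {
        return false;
    }
    // Between 5 and 10 digits once the "-" characters are removed.
    $digits = strlen(str_replace('-', '', $zipcode));
    return $digits >= 5 && $digits <= 10;
}
```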


is_country($countrycode = "")

Test. And a big test at that. Checks that the 2 character country code submitted is a valid country code. There are 243 country codes in the CC database (up from 232 in 1.1). If it finds a valid country code, it returns the full name of the country. Returns null otherwise. For example, is_country(``TT'') would return ``Trinidad, Tobago''. If you find a country code that I don't have listed here, please send it to me, cdi@thewebmasters.net.
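
Because this test returns a name rather than true, it behaves like a table lookup. The sketch below shows the idea with a three-entry table; sketch_is_country() is a hypothetical name, the table is not the class's CC database, and the upper-casing of the input is an assumption.

```php
<?php
// Hypothetical sketch for current PHP; the real table has 243 entries.
function sketch_is_country(string $code): ?string
{
    $table = [
        'CA' => 'Canada',
        'TT' => 'Trinidad, Tobago',
        'US' => 'United States',
    ];
    // Assumption: codes are matched without regard to case.
    $code = strtoupper($code);
    return $table[$code] ?? null;
}

$name = sketch_is_country('TT');
if ($name === null) {
    echo "Invalid country code\n";
}
```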


VARIABLES


ERROR (string)

This variable contains the text of the last test ERROR encountered. If a test fails, the text description of why the test failed can be retrieved by grabbing this variable.


CLEAR (boolean, default == false)

CLEAR, if set to true, automatically clears the ERROR variable at the beginning of each new test.


EXAMPLES

see SYNOPSIS


DOCUMENTATION

This is it.


INSTALLATION

Copy the class file to the include location specified in your PHP3.INI file. Failing that, use the full path to the class file in your include() directive.


BUGS

Fixed several bugs in 1.1: see CHANGELOG for details

    Fixed: is_email() DNS bug: Valid hosts were returning false.
    Fixed: is_hostname() Improperly written regex. (Not POSIX)
    Fixed: is_country() added 14 missing countries
    Fixed: browser_gen() added 24 more spiders, fixed one bad entry


VERSION

Version 1.2 1999/03/05 CDI, cdi@thewebmasters.net


AUTHOR

Copyright (c) 1999, CDI - cdi@thewebmasters.net. All Rights Reserved.


LICENSE

This program is free software; you can redistribute it and/or modify it under the GNU General Artistic License, with the following stipulations;

Changes or modifications must retain these Copyright statements. Changes or modifications must be submitted to the AUTHOR, cdi@thewebmasters.net.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the Artistic License for more details. This software is distributed AS-IS.


AVAILABILITY

http://www.thewebmasters.net/php/


HISTORY

see CHANGELOG