This document was written for version 0.9 of openWolf.

Install instructions

Copy all the files into a directory of your choice.  You'll need to include the html directory if you want the report output.

Usage

Here is an example of how you'd use the default installation.  This is called from index.php as a form submission.
parse.php?uri=http://www.example.com&priority1=t&priority2=t&priority3=t
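
If you would rather build that request in code than type it by hand, PHP's built-in http_build_query() can assemble the query string.  This is only a sketch: the parameter names are copied from the example above and the variable names are placeholders.  Note that it percent-encodes the uri value, which the hand-written example does not.

```php
<?php
// Parameters copied from the example above; 't' switches a priority level on.
$params = array(
    'uri'       => 'http://www.example.com',
    'priority1' => 't',
    'priority2' => 't',
    'priority3' => 't',
);

// http_build_query() percent-encodes each value; the separator is passed
// explicitly so the result does not depend on the local php.ini settings.
$request = 'parse.php?' . http_build_query($params, '', '&');

echo $request, "\n";
// prints: parse.php?uri=http%3A%2F%2Fwww.example.com&priority1=t&priority2=t&priority3=t
```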

How to customise this program

It is very easy to customise this, either by removing bits or by extending it for another purpose.
In general, if you do not need CSS information, you can remove these lines:
include_once('css_functions.php'); (line 49)
$parse->parse_stylesheet(Array('all')); (line 154)

You can also remove the following lines if you do not need colour information:
$colours=new colours(); (line 121)
$colours->load_colours(); (line 122)

You can use your own Snoopy instance if you'd prefer, but this one has been slightly changed to suppress warning messages in a few places.

You can use the results in any way you'd like.  The results are all contained in an array called $results.

The following is an example of how to test a page for certain requirements.  It is taken from WAI checkpoint 9.1, which states that authors should "provide client-side image maps instead of server-side image maps except where the regions cannot be defined with an available geometric shape."
I have provided additional comments here to describe what is going on...

function WAI_1_9_1($parse){
	//Provide client-side image maps instead of server-side image maps except where the regions cannot be defined with an available geometric shape.
	
	//Priority:		Must
	
	//Techniques:	Find any <img> elements with ismap=true
	
	//Note:			We cannot actually check for server-side image map links, so we just raise a warning...
	
	//We have a timeout check here to prevent loops from taking forever
	$timeout=get_timeout();
	
	//This is the basic results entry for this rule.
	$results=Array();
	//REQUIRED.  Should be one of three options.
	$results['priority']=PRIORITY_MUST;
	//This is the description of the rule
	$results['rule_text']='Provide client-side image maps instead of server-side image maps except where the regions cannot be defined with an available geometric shape.';
	//This is the header that this rule is a subset of
	$results['header']='g9';
	
	//The default value.  Sometimes it is 'true', but mostly it is false.
	$hasFailed=false;
	
	//Here we are getting all the image elements.
	$temp=$parse->getElementsByTagname('img');
	if(count($temp)>0){
		//Go through each element in the array.  Each result is an elementIndex.
		foreach($temp as $this_element){
			
			//A quick timecheck...
			if(time()>$timeout)
				break;
				
			//Get the ismap attribute value
			if($parse->getAttribute($this_element, 'ismap')==true){
				//So this image has an ismap attribute... it must be a server-side image map.
				$result_key='Use client-side image maps instead of server-side maps.';
				//Store the elementIndex for display purposes later on
				$results['instances'][$result_key]['indices'][]=$this_element;
				//This is the result type
				$results['instances'][$result_key]['type'][]=RESULT_WARNING;
				//This is the type of result we will be displaying
				$results['instances'][$result_key]['highlight'][]=HIGHLIGHT_ELEMENT;
				//We aren't highlighting the attribute, but we need to create an entry to keep everything in sync.
				$results['instances'][$result_key]['highlight_attribute'][]='';
				//Mark this as having failed.
				$hasFailed=true;
			}
		} 
		//So if we got through everything and didn't find an ismap attribute, then everything is fine.
		if(!$hasFailed){
			$result_key='No server-side image maps were found.';
			//-1 means no elementIndex
			$results['instances'][$result_key]['indices'][]=-1;
			$results['instances'][$result_key]['type'][]=RESULT_PASS;
			$results['instances'][$result_key]['highlight'][]=HIGHLIGHT_NONE;
			$results['instances'][$result_key]['highlight_attribute'][]='';
		}
	} else {
		//We didn't find any images, so there won't be any image maps either...
		$result_key='No server-side image maps were found.';
		$results['instances'][$result_key]['indices'][]=-1;
		$results['instances'][$result_key]['type'][]=RESULT_PASS;
		$results['instances'][$result_key]['highlight'][]=HIGHLIGHT_NONE;
		$results['instances'][$result_key]['highlight_attribute'][]='';
	}
	
	//Return the results.  If you don't do this, then you won't be able to do anything with them...
	return $results;
}

It is very important to create array entries for all four result parts (indices, type, highlight and highlight_attribute).  If you do not, then each result won't sync up correctly and you will get incorrect messages for each element.  I am trying to think of a better way of doing this, one which doesn't slow things down...
Just to complicate things a bit further, some rules have additional parts, such as 'info'.  This is fine, as long as every entry into this rule's result array contains the same parts, for synchronisation purposes.
Feel free to change this for your own purposes; it's just the way I've put it together for the WAI validation.
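
One way to keep the four parts in step is to add them through a single helper, and to check the counts afterwards.  The sketch below is not part of openWolf: add_instance() and instances_in_sync() are hypothetical names, and the constant values are placeholders that are only defined when openWolf has not already defined the real ones.

```php
<?php
// Placeholder values so the sketch runs on its own; inside openWolf the real
// constants already exist and these define() calls are skipped.
if (!defined('RESULT_WARNING'))    define('RESULT_WARNING', 'warning');
if (!defined('HIGHLIGHT_ELEMENT')) define('HIGHLIGHT_ELEMENT', 'element');

// Hypothetical helper: appends one entry to all four parallel arrays at once,
// so they always stay the same length.
function add_instance(&$results, $result_key, $elementIndex, $type, $highlight, $highlight_attribute = '')
{
    $results['instances'][$result_key]['indices'][]             = $elementIndex;
    $results['instances'][$result_key]['type'][]                = $type;
    $results['instances'][$result_key]['highlight'][]           = $highlight;
    $results['instances'][$result_key]['highlight_attribute'][] = $highlight_attribute;
}

// Hypothetical check: true when, for every result key, all of its parts
// (including extras such as 'info') hold the same number of entries.
function instances_in_sync($results)
{
    if (empty($results['instances'])) {
        return true;
    }
    foreach ($results['instances'] as $parts) {
        $counts = array_unique(array_map('count', $parts));
        if (count($counts) > 1) {
            return false;
        }
    }
    return true;
}

$results = array();
$key = 'Use client-side image maps instead of server-side maps.';
add_instance($results, $key, 12, RESULT_WARNING, HIGHLIGHT_ELEMENT);
add_instance($results, $key, 40, RESULT_WARNING, HIGHLIGHT_ELEMENT);
var_dump(instances_in_sync($results)); // bool(true)
```

The element indexes 12 and 40 are made up for the demonstration.  Whether the extra function call is fast enough for large pages has not been measured here.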

*********************************************************************
List of functions contained within the parse class (parse.class.php):
The following are some of the most useful functions.  All the others
are listed after this group.
*********************************************************************

parse_HTML($contents, $get_parents=true, $get_attributes=true)
This is the main function.  It takes the HTML as a string and turns it into an array of elements, which can then be accessed by the other functions listed here.  This MUST be run for this class to work.
If you know you do not need to use anything involving the parent elements or closing elements, then you can specify $get_parents=false.  $get_attributes should always be specified as true.

parse_stylesheet($default_media, $file_location='default')
This is the second main function.  It takes all the stylesheets in the current page and extracts the rules and attributes.
You need to pass an array containing the default media types ($default_media).  If you do not include a file location, it uses the current page location.  This function WILL slow things down a bit, so if you're not interested in the styles of this page, do not run it.

getAllElements()
This returns the array allElements which contains every bit of information about the current page.

getElementsByTagname($tagname)
This returns an array of elements matching this tag name.

getElementById($id)
This returns an array of elements which match this id attribute.  It's traditional for this type of function to return a single element, but every other function like this returns an array, so for consistency's sake this one does too.  The array usually contains a single element, although it could contain more...

getAttribute($elementIndex, $attribute, $lowercase=false)
This returns the value of an attribute for a specific element.

tagname($elementIndex)
This returns the lowercased tag name for the supplied $elementIndex.

getComputedStyle($elementIndex, $property, $css_only=false, $media='screen')
This returns the value of an attribute for an element, given its parent elements and any applied styles.  Note that the value may not actually be specified by this element; it may be inherited from a parent element (e.g. background colours).
$css_only will restrict the checks to just HTML attributes, while $media is the default media value if none is supplied.

src($elementIndex)
This returns the fully resolved path for an image's src attribute.  The baseUrl property needs to be set for this to work.

href($elementIndex, $is_string=false)
This will return the fully resolved path of an anchor.  The baseUrl value needs to be set for this to work.
If $is_string is true, then $elementIndex can be a string.

parentElement($elementIndex)
This returns the elementIndex of the current element's parent.
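
As a rough illustration of how the functions above fit together, here is a sketch that lists every anchor on a page.  list_links() is a made-up name, and $parse is assumed to be a parse object on which parse_HTML() has already been run and setBaseUrl() has been called.  What getAttribute() returns for a missing title, and what href() returns for an anchor without an href, is not covered in this document, so check those values before relying on them.

```php
<?php
// Hypothetical helper, not part of openWolf.  It only calls methods that are
// described in the list above.
function list_links($parse)
{
    $links = array();

    // Each entry returned by getElementsByTagname() is an elementIndex.
    foreach ($parse->getElementsByTagname('a') as $elementIndex) {
        $links[] = array(
            // Fully resolved address; needs the baseUrl to be set.
            'href'   => $parse->href($elementIndex),
            // Raw attribute value, in its original case.
            'title'  => $parse->getAttribute($elementIndex, 'title'),
            // Lowercased tag name of the element that contains this anchor.
            'parent' => $parse->tagname($parse->parentElement($elementIndex)),
        );
    }

    return $links;
}
```

Inside a rule function you could call list_links($parse) and then test each entry, in the same way that the WAI example loops over the img elements.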

**********************************************
All remaining functions in alphabetical order:
**********************************************

all_descendants($elementIndex, $tag_name='', $avoid='')
This will return an array of all the elements contained within an element ($elementIndex).
You can specifically return a particular type of element by specifying $tag_name, and you can ignore a particular element by specifying $avoid.

bodyInnerText()
This will return the complete text contained within the first body element.
There is the potential for formatting to be lost since all the tags will be stripped.

childElements($elementIndex)
This returns just the immediate children of an element ($elementIndex).
If you want the complete tree, use all_descendants instead.

className($elementIndex, $assoc=false)
Get the class attribute from an element. This ALWAYS returns a lowercase array.  There is an option ($assoc) to return an associative array instead of a numeric array.
If you want an original-case result, use getAttribute instead.

defaultPage($value=-1)
This returns the default page (e.g. index.html) that the site is using.
This is mainly used by internal functions.
If $value=-1 then the current value is returned, otherwise the value becomes the defaultPage value.

docType()
This retrieves the document type on the understanding that it is somewhere ahead of the <html> element.

encoding()
This returns the charset encoding of the 'Content-Type' meta tag (if any is found).
If multiple values are found, the most recent (last) one is returned.

fullElement($elementIndex)
This returns the complete element in question, but not the contained text, or closing element.

get_html()
This returns a string with every element concatenated into one value.

getAttributes($elementIndex)
This returns the complete array containing all the attributes (if any) for a particular element ($elementIndex).

getBaseUrl()
This returns the baseUrl value which should be set in order for the href and src functions to work.

getCode($instance)
This returns a specially formatted string of HTML, highlighting a particular aspect.  $instance is one of the $instances array entries that should be returned by a rule test.

getCurrentPage()
This returns the page that is currently in memory.

getDomain()
This returns the domain of the current page in memory.

getElementsByAttribute($attribute)
This returns a list of elementIndexes which contain this attribute.  Matches include attributes with empty values.

getNextElement($elementIndex, $direction='forward', $include_closing_tags=false)
This returns the next elementIndex in the array, in either a 'forward' or 'backward' $direction.  By default it ignores closing elements, but this can be changed in $include_closing_tags.

getRow($elementIndex)
This returns the source code row that this element starts in.

getStyles($elementIndex, $media='screen')
This returns an array of the complete list of CSS styles that directly apply to this element, but not inherited styles.  A media type must also be supplied; the default is 'screen'.

getStylesByProperty($property, $media='screen')
This returns a list of styles that contain a particular property for the supplied media type. 
If a particular type of selector needs to be excluded, specify it in the $ignore_selectors array.
Pseudo classes can be filtered out by $ignore_pseudo=true (the default value).
This does not relate it to HTML elements.

getStyleSheetLocation($stylesheet)
This returns the physical location of the provided stylesheet index ($stylesheet).

getStyleValue($elementIndex, $property, $media='screen')
This returns the value of a particular style for an element, given a media type.
This will only return the actual applied value, not all of the styles that are relevant to this element.

getText($text)
This returns the first 80 characters of the text ($text).

index_styles($styles, $selectors, $stylesheet_index, $type)
The point of this function is to create a lookup of all the properties being used, and the selectors that call them.
This way, we can quickly identify the styles that use any particular property, leaving us with just the issue of whether or not they apply to any particular element. $type is either 'normal' or 'inline', depending on the nature of the stylesheet being parsed.

href_filename($elementIndex)
This returns the filename of any href value previously calculated through the href function.

innerHTML($elementIndex)
This returns all the HTML contained within this element and its closing tag.

innerText($elementIndex)
This returns all the text contained within this element and its closing tag.

isClosing($elementIndex)
This identifies if this element is a closing element or not.

isOrphan($elementIndex)
This identifies if this element is an orphan or not.  This usually applies to closing elements that haven't been nested correctly.

outerHTML($elementIndex, $element_limit=-1)
This returns all the HTML contained within this element and its closing tag, including the original element itself.
If the $element_limit value is otherwise specified, then only text up to that number will be returned.

outerText($elementIndex)
I really can't think of a use for this...

parse_doctype()
This returns an array containing all the information about this page's document type (if any).

rebuild_element($thisElement)
This is an experimental function that needs to be called if you use setAttribute to change something.  We need to recalculate the position of everything.  I do not recommend that you use setAttribute because I haven't really tested it widely.

setAttribute($thisElement, $attribute='', $value='')
This is an experimental function which lets you change the value of an attribute.  The rebuild_element function is called during this so that the start and stop positions of the element are updated.

setBaseUrl($base)
This sets the base url of the current page, which is used for constructing the resolved URLs of href and src attribute values.  If this is not set, then you may not be able to retrieve the href or src attribute values.

setCurrentPage($current)
This stores the current page location, which is used in a number of functions.

setDomain($domain)
This stores the domain of the current page.

setText($thisElement, $text)
Another experimental function, which lets you change the text of an element.  How this affects other elements has not been investigated.

specified($elementIndex, $attribute)
This returns true or false as to whether an attribute is specified for an element.  Empty values return true.
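
Because empty values count as specified, this function may be a better fit than getAttribute for attributes that are often written without a value, such as ismap.  The sketch below is an illustration only: find_ismap_images() is a made-up name, and $parse is assumed to be a parse object on which parse_HTML() has already been run.

```php
<?php
// Hypothetical helper, not part of openWolf.  Returns the elementIndexes of
// all <img> elements that carry an ismap attribute, with or without a value.
function find_ismap_images($parse)
{
    $found = array();

    foreach ($parse->getElementsByTagname('img') as $elementIndex) {
        // specified() only asks whether the attribute is present, so it does
        // not matter what value (if any) the page author gave it.
        if ($parse->specified($elementIndex, 'ismap')) {
            $found[] = $elementIndex;
        }
    }

    return $found;
}
```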

text($elementIndex, $strip_spaces=false, $include_outer_element=false)
This returns all the text fragments directly associated with this element, but not its descendants.  You can also choose to clean up the spaces found within it ($strip_spaces=true), and include the actual element as part of the result ($include_outer_element=true).