r3 - 13 Jun 2006 - 15:10:49 - TimPeterson

Function Implementation Details

Introduction

This document describes the actions of and interactions between the different objects in the Function system. This information can be useful both when working with the internals of the system and when writing more complex Functions, in order to understand what's going on.

Objects in the System

There are four types of objects in the system: Functions, DataInputs, DataOutputs, and FunctionGroups.

The Function performs all of the computation in the system. Whenever the data for one of its outputs is requested, it retrieves the data provided by its inputs, creates new data of some sort, and sends the new data to the output. Inputs and outputs are represented by DataInput and DataOutput objects, respectively. Upon request, a DataInput collects and aggregates the data provided by its connected DataOutput(s). These DataOutputs cache their data so that their Function need only calculate it once. When an input requests the data, the output requests the function to recompute the data if necessary and then returns the cached data.

Having a large number of functions in the system may prove hard to manage, so FunctionGroups are provided as an organizational tool. They allow the user to separate functions of different purposes into groups while still allowing arbitrary connections between them. Taking this idea one step further is the UserFunction, a function plugin that contains a FunctionGroup inside itself. Its inputs and outputs can be configured arbitrarily, and connections can be made between the internal functions and those inputs and outputs. This allows complex functions to be constructed out of smaller parts and then treated as a normal function.

Object Interaction

Nothing in the system is ever computed unnecessarily. Changing the inputs to a function has no effect until something requests output from the function. However, if the system consisted solely of such dormant functions, nothing of interest would ever happen. Thus, something must occur to trigger the production of data. If, for example, a camera is prepared to render an image, and that image is to be displayed on-screen by another function, the display will not automatically be updated when the image changes. The display must be told explicitly to update, which will cause it to request the image from the camera. The camera, in turn, does not re-render its image for every change of its inputs, so it then renders the image in response to the display's request. This wave of updating propagates until all necessary data has been computed and the new image is displayed on the screen.

When the image display queries the camera for a new image, it causes the getList() method to be called on the DataOutput object of the camera. The argument to this method is the Class of data being requested. (This argument is only useful, however, when the output is capable of producing data in multiple types.) The method returns a list of all data of the given type currently being provided by the output. The output stores its current data for each of the supported data types in data, a Map from Classes to the associated data of that type. Upon receiving this call, the output checks its dataValid Map to see whether the stored data for this type is valid, which would indicate that it doesn't need to be recomputed. If the data is valid, it can be returned immediately. If not, the Function containing this output must be called to update the data, using the updateData() method. This method is passed a reference to the output needing the update, as well as the data type needed.
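The caching behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual DataOutput class: the class name CachedOutputSketch, the Owner interface, the argument order of setData(), and the choice to store a List per type are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a DataOutput that caches its data per type. */
final class CachedOutputSketch {
    /** Hypothetical stand-in for the Function that owns this output. */
    interface Owner {
        void updateData(CachedOutputSketch output, Class<?> type);
    }

    private final Map<Class<?>, List<Object>> data = new HashMap<>();
    private final Map<Class<?>, Boolean> dataValid = new HashMap<>();
    private final Owner owner;

    CachedOutputSketch(Owner owner) {
        this.owner = owner;
    }

    /** Stores the data for one type and marks that type as valid. */
    void setData(Class<?> type, Object value) {
        List<Object> list = new ArrayList<>();
        if (value != null) {
            list.add(value);
        }
        data.put(type, list);
        dataValid.put(type, true);
    }

    /** Marks every cached type as stale. */
    void invalidate() {
        dataValid.replaceAll((type, valid) -> false);
    }

    /** Returns the cached data, asking the owner to recompute it first if it is stale. */
    List<Object> getList(Class<?> type) {
        if (!dataValid.getOrDefault(type, false)) {
            owner.updateData(this, type);
        }
        return data.getOrDefault(type, new ArrayList<>());
    }
}
```

In this sketch a second call to getList() for the same type does not reach the owner again until invalidate() has been called.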

Before updateData() can actually update the data, it must first determine whether the conditions are right for an update to be possible. This is done by calling the canUpdate() method. Normally, to make things easier for the plugin writer, an update cannot occur unless other functions are connected to each of the function's inputs. This is because a typical function needs all of its input to be available before it can compute a complete output datum. (This behavior can, however, be overridden.) If canUpdate() returns true, then the update() method is called to perform the update. Otherwise, the data is set to null by calling setData() on the output with null and the requested data type as arguments. Either way, the output's data has been updated, so it can return it to the caller.
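The control flow of updateData() might look something like the sketch below. The nested Input, Output, and Function classes are simplified stand-ins invented for this example, and the placeholder value "computed" is only there to make the two branches visible.

```java
import java.util.ArrayList;
import java.util.List;

final class UpdateFlowSketch {
    /** Simplified input that only knows whether something is connected to it. */
    static final class Input {
        boolean connected;

        boolean isConnected() {
            return connected;
        }
    }

    /** Simplified output holding a single datum. */
    static final class Output {
        Object value;

        void setData(Class<?> type, Object newValue) {
            value = newValue;
        }
    }

    static class Function {
        final List<Input> inputs = new ArrayList<>();

        /** Default rule from the text: every input must be connected. */
        boolean canUpdate() {
            for (Input in : inputs) {
                if (!in.isConnected()) {
                    return false;
                }
            }
            return true;
        }

        /** A plugin would override this with its real computation. */
        void update(Output output, Class<?> type) {
            output.setData(type, "computed");
        }

        /** Either runs the update or sets the output's data to null. */
        final void updateData(Output output, Class<?> type) {
            if (canUpdate()) {
                update(output, type);
            } else {
                output.setData(type, null);
            }
        }
    }
}
```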

The Function's update() method is usually overridden by the specific plugin in order to perform the plugin's calculations. It takes two arguments, both of which are optional in case the plugin doesn't need them. The arguments are the DataOutput that needs updating and the data type within that output. The plugin will typically collect the data from its inputs, compute the output data, and then set the data on the output. The latter is done by calling either the set() or setData() methods. As seen before, setData() requires as arguments the data type as well as the data to be set. However, if the plugin didn't override the version of update() that is given a data type, or if the output only has one data type, it can instead use the set() method, which is passed only the data. The data type will be determined automatically, by assigning the data to any of the available data types that are compatible with the data.
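One plausible reading of the automatic type selection in set() is that the value is stored under every supported type it is an instance of. The sketch below illustrates that reading only; the class AutoTypeOutputSketch, the int return value, and the get() accessor are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class AutoTypeOutputSketch {
    private final Map<Class<?>, Object> data = new LinkedHashMap<>();

    AutoTypeOutputSketch(Class<?>... supportedTypes) {
        for (Class<?> type : supportedTypes) {
            data.put(type, null);
        }
    }

    /** Explicit form: the caller names the data type. */
    void setData(Class<?> type, Object value) {
        if (!data.containsKey(type)) {
            throw new IllegalArgumentException("unsupported type: " + type);
        }
        data.put(type, value);
    }

    /** Automatic form: stores value under every compatible type; returns how many matched. */
    int set(Object value) {
        int matched = 0;
        for (Map.Entry<Class<?>, Object> entry : data.entrySet()) {
            if (entry.getKey().isInstance(value)) {
                entry.setValue(value);
                matched++;
            }
        }
        return matched;
    }

    Object get(Class<?> type) {
        return data.get(type);
    }
}
```

With supported types Float, Double, and Number, calling set() with a Double would fill the Double and Number slots and leave the Float slot untouched.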

The function retrieves data from its inputs using the getList() method of DataInput. This returns a List of all of the data the input is currently receiving from its connected outputs. If it is a single input, of course, there will be only one item in the list. The input itself doesn't store any data, so it must in turn call its connected outputs to get the data. It calls getList() on all of them, concatenates the resulting lists, and returns the new list. Each of these outputs is connected to a function, so the process repeats until all required functions have been updated.
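The concatenation step is small enough to show directly. In this sketch the Output interface and the connect() method are hypothetical; only the getList() behavior follows the description above.

```java
import java.util.ArrayList;
import java.util.List;

final class InputAggregationSketch {
    /** Hypothetical minimal view of an output: something that can supply a list. */
    interface Output {
        List<Object> getList(Class<?> type);
    }

    static final class Input {
        private final List<Output> connected = new ArrayList<>();

        void connect(Output output) {
            connected.add(output);
        }

        /** Stores nothing itself: asks each connected output and concatenates the results. */
        List<Object> getList(Class<?> type) {
            List<Object> all = new ArrayList<>();
            for (Output output : connected) {
                all.addAll(output.getList(type));
            }
            return all;
        }
    }
}
```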

As soon as the data has been calculated once, all of the outputs will have their dataValid flags set. From that point on, nothing will have to be calculated whenever the value of an output is requested. However, the input to the system can change. When it does, this information must be propagated to everything that depends on that input, so that the dependents know they need to be updated. This is accomplished through the invalidate() method. Whenever an input's data changes, it signals this to its Function by calling invalidate(). The function in turn calls invalidate() on all of the outputs that depend on that input, and the outputs then invalidate() their connected inputs, propagating the invalid state to all dependents, direct or indirect. The next request for data from one of the invalidated outputs will result in the affected data being recalculated.

By default, all outputs on a given function are assumed to depend on all inputs on that function, meaning that a change in value of any input results in the invalidation and subsequent updating of all outputs. However, this universal dependency may not be accurate. For example, one output may depend only on the value of a subset of the inputs. In this case, the function should override the isDependent() method. The parameters to this method are the output and input being considered, and this method will be called for every pair of inputs and outputs in the function. When an input is invalidated, the invalidation is propagated only to those outputs for which isDependent() returned true for that pair of input and output.
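Invalidation and the isDependent() filter can be sketched together. Everything below is a simplified stand-in: the nested classes, the name fields, and the public valid flag are invented for the example, and the early return for an already-invalid output is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

final class InvalidationSketch {
    static final class Output {
        final String name;
        boolean valid = true;
        final List<Input> connectedInputs = new ArrayList<>();

        Output(String name) {
            this.name = name;
        }

        void invalidate() {
            if (!valid) {
                return; // assumption: dependents of an invalid output were already notified
            }
            valid = false;
            for (Input in : connectedInputs) {
                in.invalidate();
            }
        }
    }

    static final class Input {
        final String name;
        final Function owner;

        Input(String name, Function owner) {
            this.name = name;
            this.owner = owner;
            owner.inputs.add(this);
        }

        /** Signals the owning function that this input's data has changed. */
        void invalidate() {
            owner.invalidate(this);
        }
    }

    static class Function {
        final List<Input> inputs = new ArrayList<>();
        final List<Output> outputs = new ArrayList<>();

        /** Default: every output depends on every input. */
        boolean isDependent(Output output, Input input) {
            return true;
        }

        void invalidate(Input changed) {
            for (Output out : outputs) {
                if (isDependent(out, changed)) {
                    out.invalidate();
                }
            }
        }
    }
}
```

A function that overrides isDependent() so that each output depends on only one input would see only that output, and whatever is connected downstream of it, invalidated when the input changes.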

Circular dependencies are not allowed. That is, it must not be possible to find a cycle through one or more functions by following only paths for which isDependent() returns true. It is permissible, however, to have a cycle in connections between functions, so long as the dependencies are arranged so as to break the loop at some point. For example, a function could have one output which depends only on one input, and another output which depends only on a different input. It would then be possible to connect the first output directly to the second input, producing a cycle in connections but not a cycle in dependencies. (Care must be taken, however, that the update() method only attempts to update the output requested through its argument, as requesting data from the wrong input when a cycle exists could cause a deadlock.)

The running of update methods is handled by the FunctionManager class, statically nested within the Function class. Whenever data is requested from an output whose data is not valid, Function.updateData() is called, which then calls FunctionManager.requestUpdate(). This method will trace all of the invalid dependencies of that output, by following the Function's dependency lists and recursively requesting an update for each output so discovered. Each such request for an update is stored as an object which tracks which other requests must be fulfilled before it can run the update itself. Once all such requests have been generated, the FunctionManager begins running any updates which have no update dependencies, i.e. updating all requests for which all output dependencies for that request are valid. Each update() method is concurrently run in a separate thread, using a ThreadPool. Whenever an output receives data during the course of one of these update() methods, the corresponding update request object is removed. If this removal leaves any other requests without any remaining dependencies, those requests are immediately started in the thread pool (before the first update() method terminates). This ensures that every update is run as soon as possible and as concurrently as possible.
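The scheduling idea, running each update as soon as the updates it depends on are done, can be approximated with the standard java.util.concurrent classes. The sketch below is not the FunctionManager: the Request class and runAll() are hypothetical, it uses an ExecutorService rather than the system's own ThreadPool, and it releases dependent requests when an update finishes rather than at the earlier moment when the output receives its data. It also assumes the requests form no cycle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class UpdateSchedulerSketch {
    /** One pending update plus bookkeeping about what must finish before it. */
    static final class Request {
        final Runnable update;
        final List<Request> dependents = new ArrayList<>();
        final AtomicInteger pending = new AtomicInteger();

        Request(Runnable update) {
            this.update = update;
        }

        /** Declares that this request cannot run until prerequisite has finished. */
        void after(Request prerequisite) {
            prerequisite.dependents.add(this);
            pending.incrementAndGet();
        }
    }

    /** Runs every request as soon as its prerequisites are done; blocks until all finish. */
    static void runAll(List<Request> requests, int threads) {
        // Collect the initially ready requests before starting any, to avoid double submission.
        List<Request> ready = new ArrayList<>();
        for (Request r : requests) {
            if (r.pending.get() == 0) {
                ready.add(r);
            }
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(requests.size());
        try {
            for (Request r : ready) {
                submit(pool, r, done);
            }
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for updates", e);
        } finally {
            pool.shutdown();
        }
    }

    private static void submit(ExecutorService pool, Request r, CountDownLatch done) {
        pool.execute(() -> {
            try {
                r.update.run();
            } finally {
                // Start any dependent whose last prerequisite was this request.
                for (Request d : r.dependents) {
                    if (d.pending.decrementAndGet() == 0) {
                        submit(pool, d, done);
                    }
                }
                done.countDown();
            }
        });
    }
}
```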

Inputs and outputs can support more than one data type simultaneously. This allows the same information to be conveyed in multiple representations. (This feature should not be used to provide disparate information on the same input or output.) For example, an output producing a single value could provide the option of representing it as either a single- or double-precision floating-point number. Likewise, an input could accept data as either an int or a float. Each input and output contains a list of data types accepted or produced, respectively, ordered by preference, most preferable first. When an input and output are connected, the data type is chosen automatically. A single output can be connected to inputs using different data types simultaneously, and it will generate data in each format as requested by the inputs.

An input and an output can only be connected if they both support the same data type. In other words, the intersection of their data type lists must be non-empty. If the intersection has one element, this data type is stored and used thenceforth for all communication between this input and output pair. However, it may happen that a particular input and output have more than one data type in common. In this case, exactly one must be chosen. The data types in each list are ordered from most preferable (listed first) to least preferable (listed last). Out of the data types common to both lists, one is chosen for which the sum of the indices into each list is the smallest, thus choosing the type that is most mutually agreeable. For example, there might be two possible common data types between a given input and output. The first type occurs at index 2 in the input's list and index 3 in the output's list, while the second type is at indices 1 and 5 in the two lists, respectively. The sum of these indices is 5 for the first type and 6 for the second type. Thus, the first type will be chosen, despite the fact that the second type ranks higher than the first in the input's list.
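The selection rule reduces to a short loop. The following is a sketch under stated assumptions: the class and method names are hypothetical, and the tie-breaking rule (the type earliest in the input's list wins when two sums are equal) is a guess, since the text does not say how ties are resolved.

```java
import java.util.List;
import java.util.Optional;

final class TypeNegotiationSketch {
    /**
     * Picks the common type with the smallest sum of list indices.
     * Returns an empty Optional when the lists share no type, meaning
     * the input and output cannot be connected.
     */
    static Optional<Class<?>> choose(List<Class<?>> inputTypes, List<Class<?>> outputTypes) {
        Class<?> best = null;
        int bestSum = Integer.MAX_VALUE;
        for (int i = 0; i < inputTypes.size(); i++) {
            int j = outputTypes.indexOf(inputTypes.get(i));
            if (j >= 0 && i + j < bestSum) {
                best = inputTypes.get(i);
                bestSum = i + j;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Applied to the example above, a type at indices 2 and 3 (sum 5) is preferred over a type at indices 1 and 5 (sum 6).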

Function Plugins

Each function plugin extends the base Function class. It can override as much or as little of Function as it needs, with the minimum being no overriding.

All computation is done in the update() method. There are three flavors of this method available for overriding, so that unneeded arguments aren't passed in. The first version takes two arguments: the DataOutput that needs updating and the Class of data to update within that output. This is useful when a plugin may have multiple outputs with multiple data types each, where it may not be desirable to recompute all data types for all outputs every time the inputs change. For example, not all of the outputs may be connected and the computation involved may be relatively expensive. In this case, it would be best to perform only the calculation needed at the time, which can be accomplished by overriding the two-argument version of update().

The next version of the method omits the data type argument. This is useful when there are multiple outputs but they each support only one data type. Thus, only the requested output need be recomputed. The third and final version of update() takes no arguments. It is intended for cases where everything can be updated at once, because there are few outputs and known data types. It is also possible to not override update() at all if the outputs are never invalidated by the inputs, such as if all outputs contain constant data or are only changed by commands.
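One way the three flavors could coexist is for each more specific version to fall back, by default, on the next less specific one, so that a plugin overrides only the form it needs. The text does not confirm that the base class is built this way; the delegation chain and the class names below are assumptions made for illustration.

```java
final class UpdateOverloadsSketch {
    /** Minimal stand-in for a DataOutput. */
    static final class Output {
        final String name;

        Output(String name) {
            this.name = name;
        }
    }

    static class Function {
        /** Two-argument form: by default ignores the data type. */
        void update(Output output, Class<?> type) {
            update(output);
        }

        /** One-argument form: by default ignores the output. */
        void update(Output output) {
            update();
        }

        /** No-argument form: by default does nothing, as for constant outputs. */
        void update() {
        }
    }
}
```

Under this arrangement the caller can always invoke the two-argument form, and a plugin that overrides only update() still has its computation run.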

The canUpdate() method determines whether there is enough input available to the function for a call to update() to be possible. By default this method returns true iff all inputs return true to a call to isConnected(). This method should be overridden if, for example, a connection to a certain input is optional and computation can proceed without data from it. This is the case with Camera functions, where having nothing connected does not preclude the camera from outputting a blank image.

A Function must be able to create a series of commands that will re-create an exact copy of itself when executed in the same context. The default serializeCreate() and serializeSetup() methods take care of creating the function and setting its parameters, but some functions may have more complicated setup procedures. In this case the function can override serializeSetup() and add to or replace the parameter-setting commands it generates. For example, the UserFunction contains other functions inside it, which must also be created. Also, its inputs and outputs are dynamically created, so more commands are needed to create them. It also creates parameter inputs directly in order to set their default values. This covers everything the default implementation does, so the UserFunction needn't call the superclass implementation at all.

A basic DataInput returns a list of data in whatever type is requested. For the convenience of the programmer, there are subclasses of DataInput that provide simpler forms of the data. The MultiDataInput class can be used in cases where multiple data of a single type are expected as input. It provides a parameterized retrieval method, so that, for example, an input created as MultiDataInput<String> has a get() method that returns a List<String> rather than the more generic List<Object> returned by the base DataObject. Taking this one step further, for inputs taking a single datum of a single type, there is the SingleDataInput class. This class extracts the single value from the list automatically, providing a get() that, when called on, e.g. a SingleDataInput<String> object, returns a single String.
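The typed convenience classes can be illustrated with Java generics. This is a sketch, not the real classes: passing a Class token to the constructor, the feed() method standing in for connected outputs, and returning null from an empty single input are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

final class TypedInputSketch {
    /** Simplified base input; feed() stands in for data arriving from connected outputs. */
    static class DataInput {
        private final List<Object> received = new ArrayList<>();

        void feed(Object datum) {
            received.add(datum);
        }

        List<Object> getList(Class<?> type) {
            List<Object> result = new ArrayList<>();
            for (Object o : received) {
                if (type.isInstance(o)) {
                    result.add(o);
                }
            }
            return result;
        }
    }

    /** Many data of one type: get() returns a typed list. */
    static final class MultiDataInput<T> extends DataInput {
        private final Class<T> type;

        MultiDataInput(Class<T> type) {
            this.type = type;
        }

        List<T> get() {
            List<T> typed = new ArrayList<>();
            for (Object o : getList(type)) {
                typed.add(type.cast(o));
            }
            return typed;
        }
    }

    /** One datum of one type: get() unwraps the single value, or returns null if there is none. */
    static final class SingleDataInput<T> extends DataInput {
        private final Class<T> type;

        SingleDataInput(Class<T> type) {
            this.type = type;
        }

        T get() {
            List<Object> all = getList(type);
            return all.isEmpty() ? null : type.cast(all.get(0));
        }
    }
}
```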

For polymorphic inputs (those that can accept any of a number of data types) there are the PolyDataInput and MultiPolyDataInput classes. These accept either one or multiple items of data, respectively, from connected outputs.

Each function has its own EventManager, so other objects can listen for events from it. It fires the "setName" event when its name is changed, and it fires "newInput", "delInput", "newOutput", and "delOutput" when the associated actions occur. It also fires the "delete" event when it is being deleted. The function also sets up an EventDispatcher on itself, so that it can listen for its own events simply by defining methods such as:

public void onNewInput(Event e)
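A dispatcher of this kind can be built on Java reflection: derive the handler name from the event name and invoke it if the listener defines it. The sketch below shows the general technique only. The Event class, handlerName(), dispatch(), and the Demo listener are hypothetical, and the real EventDispatcher may work differently.

```java
import java.lang.reflect.Method;

final class EventDispatchSketch {
    /** Minimal stand-in for the Event class. */
    static final class Event {
        final String name;

        Event(String name) {
            this.name = name;
        }
    }

    /** Maps a non-empty event name to a handler name, e.g. "newInput" to "onNewInput". */
    static String handlerName(String eventName) {
        return "on" + Character.toUpperCase(eventName.charAt(0)) + eventName.substring(1);
    }

    /** Invokes the listener's public handler for the event, if any; returns whether one ran. */
    static boolean dispatch(Object listener, Event event) {
        try {
            Method handler = listener.getClass().getMethod(handlerName(event.name), Event.class);
            handler.setAccessible(true); // the listener class itself may not be public
            handler.invoke(listener, event);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // no handler defined for this event
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Example listener that handles only the "newInput" event. */
    static final class Demo {
        int seen;

        public void onNewInput(Event e) {
            seen++;
        }
    }
}
```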

Commands

Functions can specify in their static members commands that they are capable of handling. These are parsed by a SpracheParser object, which calls whatever function was specified in the code portion of the command definition. The first argument is always the Command object that triggered this call. The rest of the arguments are all of the variables specified in the command, in their appropriate types. An array can also be used instead of separate variable arguments.

For each of the parameter inputs the function possesses, a command definition is automatically generated that will allow that parameter to be set. For example, a parameter named "location" of type Point3d would produce this command definition:

set location <double x> <double y> <double z> { onSetParam }

This indicates that the receipt of a command resembling "set location 1 1.5 2" would result in the method onSetParam being called with four parameters: the Command followed by three doubles corresponding to the variables in the command. This method is defined by Function and can handle any parameter command whose arguments correspond to the data type of the parameter. The CommandDataHandler class provides the conversions between command parameters and data objects.

Any of the generated parameter commands can be overridden by the function. Any command definitions specified by the function that are found to match with a generated command will replace the generated command. One command definition matches another if all commands that would be accepted by the first command definition would also be accepted by the second. Thus, one could override the example command above by defining a new command

set location <double x  slider(-5, 5, 0.1)> <double y  slider(-5, 5, 0.1)> <double z  slider(-5, 5, 0.1)> { onSetLocation }

This would add sliders to all of the arguments and direct the processing to a different function.

Each function has a command prefix. This is a list of zero or more string tokens that must be prepended to a command sent to the system in order for the remainder of the command to be executed by the function. For example, if function F is inside group G, the command prefix would be "group G function F". If one wanted to send the "foo bar" command to function F, prepending the prefix would result in the complete command "group G function F foo bar". Without the prefix, the ViewControl would be unable to locate the function in order to give it the command. UserFunctions contain functions inside them, so they must also pass on a prefix to their sub-functions. If function F were inside a UserFunction named U, function F's prefix would be "function U function F".

Not only can functions be inside groups, but groups can be nested inside other groups. For example, suppose group A contains group B, which contains function F, and suppose we want to send the command "foo" to it. This necessitates a means for referring to objects several levels deep. One way to accomplish this is by naming each group individually in the command:

group A group B function F foo

When the root group parses this command, it sees that it starts with "group" and a group name, and immediately sends the command to group A. Group A then sees that the next two tokens are again "group" and a group name, so it sends the command on to group B. Group B, finally, sees the tokens "function F" and sends the "foo" command to function F. There is, however, a more succinct way of referring to nested groups. This is accomplished by separating the group names with dots. Group B can then be referred to simply as A.B, reducing the command to:

group A.B function F foo

Functions can also be referred to in the same way, by including the function name in the dotted path:

function A.B.F foo

This is the shortest possible form of the command.
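The equivalence of the three forms can be made concrete by rewriting the dotted forms into the fully spelled-out one. This is only a sketch of that rewriting, not how the system actually parses commands: the expand() method is hypothetical, and it assumes that every name before the last in a dotted path refers to a group.

```java
import java.util.ArrayList;
import java.util.List;

final class DottedPathSketch {
    /** Rewrites leading "group X.Y" and "function X.Y.F" tokens into one keyword and name per level. */
    static String expand(String command) {
        String[] tokens = command.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i + 1 < tokens.length
                && (tokens[i].equals("group") || tokens[i].equals("function"))) {
            String[] names = tokens[i + 1].split("\\.");
            for (int k = 0; k < names.length; k++) {
                boolean last = (k == names.length - 1);
                // Assumption: all but the last name in a dotted path are groups.
                out.add(last ? tokens[i] : "group");
                out.add(names[k]);
            }
            i += 2;
        }
        for (; i < tokens.length; i++) {
            out.add(tokens[i]);
        }
        return String.join(" ", out);
    }
}
```

Both "group A.B function F foo" and "function A.B.F foo" expand to "group A group B function F foo" under this sketch.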

Initialization

Function plugins are loaded from the viewcontrol/function/plugins directory as well as from any other locations specified at startup. When searching for available plugins, all classes in these directories and all subdirectories are considered. A class file is considered a plugin if it implements the Plugin interface (which the Function class does). Nested classes, i.e. those containing a dollar sign ($) in their filename, are ignored.

Plugins can be stored in the filesystem in a hierarchical fashion, so plugin names are hierarchical as well: the names of the directories containing the plugin, separated by dots, precede the class name. A class named Camera2D placed in the directory camera would thus have the fully-qualified name camera.Camera2D. However, some plugins may require auxiliary classes, which could clutter the directory already shared with other plugins in that category. To handle this, plugins are permitted to exist in their own directory without disrupting the hierarchical naming. The directory name must be the lower-case version of the class name. So, the camera.Camera3D plugin could reside equivalently in either camera/Camera3D.class or camera/camera3d/Camera3D.class.
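The naming rule can be expressed as a function from a class file path, relative to the plugin root, to a plugin name. The method pluginName() below is hypothetical and assumes forward slashes as separators. Note that the rule as stated is ambiguous when a class name lower-cases to the name of its category directory; this sketch treats that case as a plugin in its own directory.

```java
import java.util.Locale;

final class PluginNameSketch {
    /** Returns the dotted plugin name for a relative class file path, or null if the file is ignored. */
    static String pluginName(String relativePath) {
        String suffix = ".class";
        if (!relativePath.endsWith(suffix)) {
            return null;
        }
        String[] parts = relativePath
                .substring(0, relativePath.length() - suffix.length())
                .split("/");
        String className = parts[parts.length - 1];
        if (className.contains("$")) {
            return null; // nested classes are ignored
        }
        int dirs = parts.length - 1;
        // A plugin may sit in its own directory named after the lower-cased class name.
        if (dirs > 0 && parts[dirs - 1].equals(className.toLowerCase(Locale.ROOT))) {
            dirs--;
        }
        StringBuilder name = new StringBuilder();
        for (int i = 0; i < dirs; i++) {
            name.append(parts[i]).append('.');
        }
        return name.append(className).toString();
    }
}
```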

When loaded, a plugin is queried for its capabilities. This is done through the getCapabilities() method. Two versions of this method are searched for, with and without a Class argument indicating the class of the plugin. The one-argument version is tried first, for the benefit of Function subclasses. It needs this argument so that the getCapabilities() method inherited from the base class will know what class it is to retrieve the capabilities for. The return value is a PluginCapabilities object containing the plugin name (as above), display name, description, and authors. The Function returns an object of a subclass of that class, FunctionPluginCapabilities. This object contains function-specific information, specifying what inputs, outputs, and commands this plugin will have and accept, respectively.

The Function implementation of getCapabilities() uses Java reflection to gain access to information about the plugin. In particular, it attempts to call static methods and read static fields of the class. Normally the data will be in static variables, but the method forms are provided in case there is a need to generate this data on the fly.

There are eight fields in FunctionPluginCapabilities that must be initialized: name, category, displayName, description, authors, inputInfo, outputInfo, commands. The name field is derived from the class name. The category field defaults to the class's position in the package hierarchy relative to the plugin root package. The displayName and description fields are strings, retrieved by the getDisplayName() and getDescription() static methods, or, if those do not exist, from the displayName and description static fields, respectively. The authors field is an array of strings. The method getAuthors() is called to retrieve it, or, failing that, the author and authors fields are read. The difference is that authors contains a string array, while author contains a single string, allowing author to be used as shorthand when only one author exists.
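The method-then-field lookup can be sketched with java.lang.reflect. The helper names and the two demo plugin classes below are hypothetical, and the sketch inspects only members declared directly on the given class, which may differ from what the real getCapabilities() does.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class CapabilityLookupSketch {
    /** Calls a static no-argument method if the class declares one; otherwise returns null. */
    static Object callStatic(Class<?> c, String method) {
        try {
            Method m = c.getDeclaredMethod(method);
            if (!Modifier.isStatic(m.getModifiers())) {
                return null;
            }
            m.setAccessible(true);
            return m.invoke(null);
        } catch (NoSuchMethodException e) {
            return null;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads a static field if the class declares one; otherwise returns null. */
    static Object readStaticField(Class<?> c, String field) {
        try {
            Field f = c.getDeclaredField(field);
            if (!Modifier.isStatic(f.getModifiers())) {
                return null;
            }
            f.setAccessible(true);
            return f.get(null);
        } catch (NoSuchFieldException e) {
            return null;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** getDisplayName() first, then the displayName field. */
    static String displayName(Class<?> plugin) {
        Object v = callStatic(plugin, "getDisplayName");
        if (v == null) {
            v = readStaticField(plugin, "displayName");
        }
        return (String) v;
    }

    /** getAuthors() first, then the authors array field, then the single author field. */
    static String[] authors(Class<?> plugin) {
        Object v = callStatic(plugin, "getAuthors");
        if (v == null) {
            v = readStaticField(plugin, "authors");
        }
        if (v instanceof String[]) {
            return (String[]) v;
        }
        Object one = readStaticField(plugin, "author");
        return one instanceof String ? new String[] { (String) one } : new String[0];
    }

    /** Demo plugin using fields and the single-author shorthand. */
    static final class DemoA {
        static String displayName = "Demo A";
        static String author = "A. Author";
    }

    /** Demo plugin using a static method and an authors array. */
    static final class DemoB {
        static String[] authors = { "X", "Y" };

        static String getDisplayName() {
            return "Demo B";
        }
    }
}
```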

The final three fields, inputInfo, outputInfo, and commands, are Function-specific. They are each retrieved from like-named get methods or fields, and the fields can be either arrays or single objects. The inputs and outputs of the function are described by DataInput.Info and DataOutput.Info classes, respectively. Command definitions are retrieved in string form, and are parsed into CommandDefinition objects before being stored in the FunctionPluginCapabilities object.

When an instance of a plugin is created, the Function constructor is called. Among other things, it initializes the declared inputs and outputs. For each input declared in the capabilities object (through the inputInfo field), a corresponding member variable is searched for in the object. The name of the variable is the name of the input suffixed by "In". If this variable is not found, another attempt is made without the suffix. For example, for an input named "object" it would look for variables named "objectIn" and then "object", in that order.

If a variable is found, an appropriate DataInput object is created, according to the Info object. Care must be taken to ensure that the type of the variable matches that of the containing class of the Info object used. This object is both assigned to the variable and added to the function as an input, thus making it ready for use. If no applicable variable is found, the declaration is ignored. Outputs are initialized in the same way, using an "Out" suffix.
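The variable lookup for inputs could be implemented along these lines. The bindInput() helper, the simplified DataInput class, and the DemoPlugin are hypothetical, and skipping a field of an incompatible type is an assumption; the text only warns that the types must match.

```java
import java.lang.reflect.Field;

final class InputBindingSketch {
    /** Minimal stand-in for a DataInput. */
    static final class DataInput {
        final String name;

        DataInput(String name) {
            this.name = name;
        }
    }

    /**
     * Looks for a field named inputName + "In", then inputName, on the target object.
     * If one is found, a new DataInput is assigned to it and returned; otherwise
     * the declaration is ignored and null is returned.
     */
    static DataInput bindInput(Object target, String inputName) {
        for (String candidate : new String[] { inputName + "In", inputName }) {
            try {
                Field field = target.getClass().getDeclaredField(candidate);
                if (!field.getType().isAssignableFrom(DataInput.class)) {
                    continue; // assumption: a field of the wrong type is skipped
                }
                field.setAccessible(true);
                DataInput input = new DataInput(inputName);
                field.set(target, input);
                return input;
            } catch (NoSuchFieldException e) {
                // try the next candidate name
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return null;
    }

    /** Example plugin: one input uses the suffixed name, the other the bare name. */
    static final class DemoPlugin {
        DataInput objectIn;
        DataInput scale;
    }
}
```

Outputs would follow the same pattern with "Out" in place of "In".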

-- TimPeterson - 09 Jun 2006
