Karl Pietrzak's Computer Graphics Project: Parallelization of Ray Tracing
Description
My project will be the parallelization of my ray tracer, not just on a single machine with multiple CPUs, but across a network of heterogeneous nodes using the
Java Parallel Processing Framework.
Screenshots
Feasibility Study
Ray tracing is a trivially parallelizable task because each of the thousands of rays sent through the film plane can be traced independently of the others. Since the beginning of the quarter, I have abstracted the creation and management of rays so that rays can be created and managed in a variety of ways, including:
- one right after the other, on a single thread (default)
- creation of a thread for every ray
- using a thread pool
- sending off rays as a "task" to other nodes in a network
The
Java Parallel Processing Framework makes the last option practical for two main reasons:
- The Java JRE provides a very good abstraction over different machines running different operating systems, on different hardware, and on different networks.
- The Java Parallel Processing Framework already has the software not only for sending tasks to different nodes but also for system administration. Some advanced features include dynamic, over-the-network class loading, through well-designed use of Java's class loaders.
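As a rough illustration of the thread-pool option listed above, the following sketch renders an image by submitting one task per row to a fixed-size pool. It is not taken from the project: the class name ThreadPoolSketch, the tracePixel placeholder, and the choice of one task per row are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: none of these names come from the project's code base.
class ThreadPoolSketch {

    // Stand-in for tracing one ray through pixel (x, y); returns a packed 0xRRGGBB color.
    static int tracePixel(int x, int y) {
        int red = x & 0xFF;
        int green = y & 0xFF;
        return (red << 16) | (green << 8); // placeholder gradient, not real shading
    }

    // Renders a width x height image, one task per row, on a fixed-size thread pool.
    static int[] render(int width, int height, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<int[]>> rows = new ArrayList<>();
            for (int y = 0; y < height; y++) {
                final int row = y;
                rows.add(pool.submit(() -> {
                    int[] colors = new int[width];
                    for (int x = 0; x < width; x++) {
                        colors[x] = tracePixel(x, row);
                    }
                    return colors;
                }));
            }
            int[] image = new int[width * height];
            for (int y = 0; y < height; y++) {
                // Futures are read in submission order, so each row lands in its place.
                System.arraycopy(rows.get(y).get(), 0, image, y * width, width);
            }
            return image;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("rendering failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] image = render(500, 500, 4); // image size and thread count are arbitrary
        System.out.println("pixels rendered: " + image.length);
    }
}
```

Grouping a whole row into one task keeps the per-task overhead small compared with a thread or task per ray, which is the same trade-off that bundling rays addresses in the networked case.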
Mid-Quarter Status Update
Unfortunately, I have only had the opportunity to refactor and modernize my code base to better support the
Java Parallel Processing Framework, and I have not been able to integrate it yet.
After studying the documentation for the
Java Parallel Processing Framework, I expect the integration to be fairly trivial.
The difficult part of this project will not be the integration of the
Java Parallel Processing Framework, but testing. I will have to manually log into multiple machines in the computer science department, hope that the firewall does not block anything, and see how well I can scale my ray tracer.
Nevertheless, the benchmarks will be very interesting.
Approach
My project was to parallelize my ray tracer across a network. Writing my own distributed networking application would have been outside the scope of this project, so I decided to use the
Java Parallel Processing Framework. As described on the home page, the
Java Parallel Processing Framework contains many features that are of immense value to not only a distributed ray tracer, but any distributed networking application:
- an easy to use API to submit tasks for execution in parallel
- a set of APIs and user interface tools to administrate and monitor the servers
- scalability up to an arbitrary number of processing nodes
- the framework is deployment-free: no need to install your application code on a server, just connect to the server and any new or updated code is automatically loaded.
- built-in failover and recovery for all components of the framework (clients, servers and nodes)
- opportunistic grid capabilities with JPPF@Home (see screenshot)
- fully documented APIs, administration guide and developer guide
- runs on any platform supporting Java 2 Platform Standard Edition 5.0 (J2SE 1.5) or later
My ray tracer already met the technical requirements (e.g., J2SE 1.5), so no major overhauls were necessary. In fact, I designed my ray tracer from the beginning to support parallelization by means of good software engineering practices. Specifically, I designed my ray tracer around the concepts of a RayCreator and a RayExecutor.
- RayCreator: an interface from which Ray objects can be obtained; each implementing class provides some sort of ray-generation algorithm. Implementing classes include:
  - SimpleRayCreator: generates 1 ray for each pixel
  - SuperSamplingRayCreator: generates N rays for each pixel
- RayExecutor: an interface through which Ray objects are spawned and manipulated, and a color is returned. Implementing classes include:
  - SimpleRayExecutor: executes a single Ray at a time, then moves on to the next one (think for-loop)
  - SuperSamplingRayExecutor: executes a single Ray at a time, but collapses the results into a single color by averaging the returned colors
  - JPPFRayExecutor: wraps a single Ray into a JPPFTask for execution
  - JPPFMultipleRayExecutor: wraps multiple Rays into a single JPPFTask for execution
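The split between creating rays and executing them can be sketched roughly as follows. The interface and class names match the ones listed above, but every method signature, the fields of Ray, and the placeholder shade method are assumptions made for illustration; the project's real interfaces may look different, and the JPPF-backed executors are left out because they need the framework itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: signatures are guesses, not the project's actual API.
class RayPipelineSketch {

    // Minimal stand-in for a ray; here it only remembers which pixel it belongs to.
    static final class Ray {
        final int pixelX;
        final int pixelY;
        Ray(int pixelX, int pixelY) { this.pixelX = pixelX; this.pixelY = pixelY; }
    }

    interface RayCreator {
        List<Ray> createRays(int width, int height);
    }

    interface RayExecutor {
        // Returns one packed 0xRRGGBB color per ray, in input order.
        int[] execute(List<Ray> rays);
    }

    // One ray per pixel.
    static final class SimpleRayCreator implements RayCreator {
        public List<Ray> createRays(int width, int height) {
            List<Ray> rays = new ArrayList<>();
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    rays.add(new Ray(x, y));
                }
            }
            return rays;
        }
    }

    // N rays per pixel (sub-pixel offsets are omitted to keep the sketch short).
    static final class SuperSamplingRayCreator implements RayCreator {
        private final int samplesPerPixel;
        SuperSamplingRayCreator(int samplesPerPixel) { this.samplesPerPixel = samplesPerPixel; }

        public List<Ray> createRays(int width, int height) {
            List<Ray> rays = new ArrayList<>();
            for (Ray base : new SimpleRayCreator().createRays(width, height)) {
                for (int s = 0; s < samplesPerPixel; s++) {
                    rays.add(new Ray(base.pixelX, base.pixelY));
                }
            }
            return rays;
        }
    }

    // Executes rays one after the other on the calling thread.
    static final class SimpleRayExecutor implements RayExecutor {
        public int[] execute(List<Ray> rays) {
            int[] colors = new int[rays.size()];
            for (int i = 0; i < colors.length; i++) {
                colors[i] = shade(rays.get(i));
            }
            return colors;
        }
    }

    // Placeholder for intersection and illumination.
    static int shade(Ray ray) {
        return ((ray.pixelX & 0xFF) << 16) | ((ray.pixelY & 0xFF) << 8);
    }

    public static void main(String[] args) {
        RayCreator creator = new SimpleRayCreator();
        RayExecutor executor = new SimpleRayExecutor();
        int[] colors = executor.execute(creator.createRays(4, 3));
        System.out.println("colors computed: " + colors.length);
    }
}
```

Because the caller only depends on the two interfaces, swapping in a thread-pool or network-backed executor would not require changes to the ray-generation code.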
Future Enhancements
- The
DataProvider classes offered by the JPPF "provide a way for tasks to share common data". Better use of these services could lower the amount of bandwidth that my ray tracer generates. I already share World instances this way, but perhaps even the Rays themselves would benefit from this kind of data sharing.
- Benchmark and decide on the number of rays and the bundle size. Better benchmarking would make it possible to choose the optimal bundle size and the number of rays per bundle, limiting the amount of I/O that has to be done and maximizing CPU utilization while still keeping the work parallelizable.
- Currently, all Rays are generated at once in the RayCreator classes (e.g.,
edu.rit.cs.cg2.kp.ray.SimpleRayCreator). Especially with super sampling, this means that a huge number of Rays are generated and the Java Virtual Machine quickly runs out of memory. A piecemeal ray creation engine can be built on the java.lang.Iterable interface, so that rays are produced only as they are consumed. This work has already begun.
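A minimal sketch of the piecemeal idea, assuming one ray per pixel: the creator implements java.lang.Iterable and builds each Ray only when the consumer asks for it, so only one Ray needs to be alive at a time. LazyRayCreator and the two-field Ray class are hypothetical names for this example and are not the classes already started in the project.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch of on-demand ray creation; not the project's implementation.
class LazyRaySketch {

    static final class Ray {
        final int pixelX;
        final int pixelY;
        Ray(int pixelX, int pixelY) { this.pixelX = pixelX; this.pixelY = pixelY; }
    }

    // Yields one ray per pixel in row-major order without storing them all.
    static final class LazyRayCreator implements Iterable<Ray> {
        private final int width;
        private final long total;

        LazyRayCreator(int width, int height) {
            this.width = width;
            this.total = (long) width * height; // long avoids overflow for large images
        }

        @Override
        public Iterator<Ray> iterator() {
            return new Iterator<Ray>() {
                private long index = 0; // position of the next pixel to emit

                @Override
                public boolean hasNext() {
                    return index < total;
                }

                @Override
                public Ray next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    Ray ray = new Ray((int) (index % width), (int) (index / width));
                    index++;
                    return ray;
                }
            };
        }
    }

    public static void main(String[] args) {
        long count = 0;
        for (Ray ray : new LazyRayCreator(500, 500)) {
            count++; // a real consumer would trace the ray here
        }
        System.out.println("rays generated: " + count);
    }
}
```

The same iterator could hand out rays in fixed-size groups, which would fit naturally with sending bundles of rays to remote nodes.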
Results
- Variables: bundle size, number of threads, etc.
- Bundle size: 40,000
  - Local, single thread: 4.932s
  - Distributed, single node: 48.371s
  - Distributed, two nodes, five threads: 34.863s
The moral of the story is that these numbers are very sensitive to the configuration, and can be changed and optimized to a great extent.
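To put the three timings above in proportion, the short calculation below divides each distributed time by the local time, and the one-node time by the two-node time. With this configuration, the distributed runs come out roughly 9.8 and 7.1 times slower than the local run, and the two-node run is roughly 1.4 times faster than the single-node run. The class and method names are made up for the example, and the last comparison is only indicative because the two distributed runs may also differ in thread count.

```java
import java.util.Locale;

// Hypothetical helper; only the three timings come from the measurements above.
class TimingRatios {

    static final double LOCAL_SECONDS = 4.932;
    static final double ONE_NODE_SECONDS = 48.371;
    static final double TWO_NODE_SECONDS = 34.863;

    // How many times longer 'seconds' is than 'baselineSeconds'.
    static double ratio(double seconds, double baselineSeconds) {
        return seconds / baselineSeconds;
    }

    public static void main(String[] args) {
        System.out.println(String.format(Locale.ROOT,
                "one node vs. local: %.2fx slower", ratio(ONE_NODE_SECONDS, LOCAL_SECONDS)));
        System.out.println(String.format(Locale.ROOT,
                "two nodes vs. local: %.2fx slower", ratio(TWO_NODE_SECONDS, LOCAL_SECONDS)));
        System.out.println(String.format(Locale.ROOT,
                "two nodes vs. one node: %.2fx faster", ratio(ONE_NODE_SECONDS, TWO_NODE_SECONDS)));
    }
}
```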
Appendix
User's Guide
Using the Eclipse project file
The ZIP file that contains my source code is a packaged Eclipse project. This means it can be imported via File->Import->Existing Projects into Workspace...
Program Description
The only class with a main function is the
edu.rit.cs.cg2.kp.runner.RayTracerDriver class. Simply running this will start the ray tracing process using the
Java Parallel Processing Framework.
The choice of whether to use the
Java Parallel Processing Framework or simply execute the ray tracing process locally is made in the
edu.rit.cs.cg2.kp.Camera class. See lines 45 and 46.
Input
None.
Normal Output
ClassServerDelegate.init(): Attempting connection to the class server
ClassServerDelegate.init(): Reconnected to the class server
JPPFClient.init(): Attempting connection to the JPPF driver
JPPFClient.init(): Reconnected to the JPPF driver
current : 1000/250000
current : 2000/250000
current : 3000/250000
[...snip...]
current : 249000/250000
current : 250000/250000
It took: 23.709s
This output was trivial to implement and provides a way to see whether the ray tracing is actually progressing.
Technical Documentation
Program Description
The
Java Parallel Processing Framework provides an easy way to parallelize tasks and offers an extensive, highly sophisticated framework for doing so.
Since implementing a distributed framework would be vastly outside the scope of my project and the
Java Parallel Processing Framework is so easy to use, there was no reason for me to implement my own.
Overall System Structure
The overall system structure is that of an object-oriented ray tracer. I used the object-oriented approach described in Professor Geigel's slides (http://www.cs.rit.edu/~jmg/cgII/). This yields the following general class hierarchy:
- World
- Camera
- Ray
- IlluminationModel
My own contributions to this design are as follows. From the start, I recognized that adding distributed ray tracing later in this project would require a good design.
- RayCreator allows different classes to create rays in different ways while still providing a single interface through which to access them.
- RayExecutor creates a unified way to execute rays, even though they have been created in different ways and are executed in different ways.
Testing/Acceptance Criteria
The acceptance criteria for the project are trivial because the results should be exactly the same as those of a locally executed ray tracing project. The ray-traced image is the final proof of whether everything went all right, and it does work. The unit tests throughout the project (e.g.,
edu.rit.cs.cg2.kp.ray.RayUtilsTest,
edu.rit.cs.cg2.kp.ray.SimpleRayCreatorTest, etc.) provide a way to locally test a specific unit of functionality.
The only thing I could think of that might disturb the results of the ray tracing process is that different architectures use different floating point representations and optimizations. During my tests, for example, I ran JPPF nodes on x86 Athlon,
UltraSPARC, and Pentium 4 machines, and these architectures all handle floating point differently. However, the final image did not show any visible artifacts, which leads me to believe that the different floating point representations and optimizations are trivial in the grand scheme of things.
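The visual check described above could be supplemented with an automated comparison between a locally rendered image and a distributed one. The sketch below is only an assumption about how such a check might look: it treats each image as an array of packed 0xRRGGBB pixels, reports the largest per-channel difference, and accepts the distributed image if that difference stays within a chosen tolerance. The class, method names, and pixel layout are hypothetical and not part of the project.

```java
// Hypothetical acceptance check; the packed 0xRRGGBB pixel layout is an assumption.
class ImageCompareSketch {

    // Largest absolute difference in any color channel between two equally sized images.
    static int maxChannelDifference(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("image sizes differ");
        }
        int max = 0;
        for (int i = 0; i < a.length; i++) {
            for (int shift = 0; shift <= 16; shift += 8) { // blue, green, red
                int channelA = (a[i] >> shift) & 0xFF;
                int channelB = (b[i] >> shift) & 0xFF;
                max = Math.max(max, Math.abs(channelA - channelB));
            }
        }
        return max;
    }

    // True when no channel of any pixel differs by more than 'tolerance'.
    static boolean matches(int[] local, int[] distributed, int tolerance) {
        return maxChannelDifference(local, distributed) <= tolerance;
    }

    public static void main(String[] args) {
        int[] local = {0x102030, 0xFFFFFF};
        int[] distributed = {0x112030, 0xFFFEFF}; // each pixel is off by one in one channel
        System.out.println("max difference: " + maxChannelDifference(local, distributed));
        System.out.println("within tolerance 1: " + matches(local, distributed, 1));
    }
}
```

A tolerance of 0 would demand bit-identical output, while a small nonzero tolerance would allow for rounding differences between nodes.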