Experimental Results for the Test Problems of the 3rd IPC


This page contains experimental results showing the performance of LPG 1.0 on the test problems of the 3rd International Planning Competition (IPC): www.dur.ac.uk/d.p.long/competition.html. Note that the results presented here are not exactly the same as the official results of the competition, where for lack of time we were not able to run our system a sufficient number of times to obtain meaningful statistical data. In general, however, the new results are very similar to those of the competition, with considerable improvements in Satellite Complex and in the Rovers domains, where many problems could not be solved during the competition because of a bug in the parser of our planner, which was easily fixed right after the competition. Overall, our planner attempted 468 problems in the new tests (out of a total of 508 problems), with a success ratio of 94.4% (in the competition, LPG attempted 428 problems, with a success ratio of 87%). This is by far the highest success ratio among the fully-automated planners of the competition.

All tests were conducted on the official machine of the competition, an AMD Athlon(tm) MP 1800+ (1500MHz) with 1 Gbyte of RAM.
We ran LPG with the same default settings for every problem attempted: maximum number of search steps = 500; maximum number of restarts for each run = 50; inconsistency selection strategy = 22; noise factor = 0.1. The results for LPG correspond to median values over five runs for each problem considered. The CPU-time limit for each run was 5 minutes, after which termination was forced. When the CPU-time limit was exceeded in 3 or 4 of the five runs, we considered the worst of the remaining successful runs instead of the median value; this happened in 15 of the 424 problems solved using WALKPLAN.
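
The aggregation rule described above can be sketched in a few lines of Python. This is only an illustration, not the script actually used for the tests: the function name aggregate_runs is hypothetical, a run that exceeded the CPU-time limit is represented by None, lower values are assumed to be better (as for CPU-time), and we assume that when only 1 or 2 runs time out they are ranked as worse than any successful run when taking the median.

```python
import math
from statistics import median
from typing import Optional, Sequence


def aggregate_runs(values: Sequence[Optional[float]]) -> Optional[float]:
    """Summarise the runs of one problem as a single reported value.

    Each entry is the value measured in one run (e.g. CPU-milliseconds),
    or None if that run exceeded the CPU-time limit. Lower is assumed better.
    """
    successes = [v for v in values if v is not None]
    failures = len(values) - len(successes)
    if not successes:
        return None  # no run succeeded: the problem counts as unsolved
    if failures >= len(successes):
        # Most runs timed out (3 or 4 of five): the median would be a
        # timeout, so report the worst of the remaining successful runs.
        return max(successes)
    # Assumption: a timed-out run ranks as worse than any successful run.
    padded = successes + [math.inf] * failures
    return median(padded)
```

For example, with the made-up run times [100, None, None, None, 250] three of the five runs fail, so the function returns 250, the worst successful run; with [100, None, 300, None, 200] it returns the ordinary median, 300.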

The performance of LPG was tested in terms of both the CPU-time required to find a solution (LPG-speed) and the quality of the best plan computed (LPG-quality). In all plots, the x-axis shows the problem names (simplified to numbers). In the plots regarding LPG-speed, the y-axis shows CPU-milliseconds (logarithmic scale), while in the plots regarding LPG-quality it shows the quality of the plans computed, measured using the plan metric specified for the corresponding problem.

The plots showing the performance of LPG with respect to all other fully-automated planners of the 3rd IPC can be seen here:
    LPG compared with all other planners

In order to derive some general results about the performance of our approach with respect to all other planners of the competition, we have compared LPG with the best result over all the other planners. We refer to these results as if they were produced by a hypothetical "SuperPlanner".
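
As an illustration of how the SuperPlanner values are defined, the following Python sketch takes, for each problem, the best result over all planners other than LPG. The function name super_planner, the data layout, and the planner names and numbers in the example are all hypothetical; None marks an unsolved problem, and lower values are assumed to be better (true for CPU-time, and for plan quality only when the plan metric is to be minimized).

```python
from typing import Dict, Optional

# results[planner][problem] = measured value, or None if unsolved
Results = Dict[str, Dict[str, Optional[float]]]


def super_planner(results: Results, exclude: str = "LPG") -> Dict[str, float]:
    """Per-problem best value over all planners except `exclude`."""
    best: Dict[str, float] = {}
    for planner, per_problem in results.items():
        if planner == exclude:
            continue  # the SuperPlanner is built from the *other* planners
        for problem, value in per_problem.items():
            if value is None:
                continue  # this planner did not solve the problem
            if problem not in best or value < best[problem]:
                best[problem] = value
    return best


# Illustrative data only: planner names and numbers are made up.
example = {
    "LPG": {"p1": 120.0, "p2": 900.0},
    "PlannerA": {"p1": 300.0, "p2": None},
    "PlannerB": {"p1": 250.0, "p2": 1500.0},
}
print(super_planner(example))
```

A problem that no other planner solved simply does not appear in the returned dictionary.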

The plots showing the performance of LPG with respect to the SuperPlanner can be seen here:
    LPG compared with the SuperPlanner

The (gzip'ed) tar file containing all solutions found by LPG can be downloaded here

AG, 18 November 2002