Research Project: On the Influence of Randomness in Planning and its Exploitation via Restarts

Project still available?

The project is not yet taken by anybody.

It might, however, be that I have no capacity left in general. Please check the main page on research projects to see whether I'm maxed out. (I don't post this information here so that I don't have to update all my project descriptions in case I stop accepting students.)

Requirements

  • This project can be done as a 12 pt project (over two semesters) or as a 24 pt Honours thesis. (If really required, 12 pt over one semester might also be possible, but is strongly discouraged.)
  • You don't need a background in AI planning; learning the basics during the first two weeks should be easy.
  • This is primarily empirical work; it requires C++ coding skills as well as the ability to develop and run multiple scripts.
  • I don't require a specific mark average as I do for many of my other projects, but I still hope that the results of this work can be published at least at a B-ranked venue, so please don't take this project unless you are aiming at a very high HD mark for it. (And no, that doesn't come for free; it is the result of hard work.)

Project Description

TLDR:

In a nutshell, your task is to run multiple planners multiple times with different random seeds and to evaluate the performance differences, thus showing the impact of randomness. In a second step, depending on the outcome, this randomness should be exploited by restarting the respective planner after a certain (well-chosen) time. The usefulness of restarts should be put into perspective by investigating other problem-solving techniques such as SAT solving.

The slightly longer version:

The performance of planners based on search depends on many factors, such as dead-end detectors, the search algorithm, and the quality of heuristics. But if the search space contains millions (or more) of search nodes, some decisions -- e.g., which node to select next -- are bound not to be unique. Some programs are artificially made deterministic by choosing the newest or oldest node as a tie-breaker (although this can be regarded as a doubtful strategy, since this tie-breaker might not always carry a clear semantics). Even if such a strategy is not deployed, we still need deterministic results to make programs comparable. It would be quite bad if we tried to reproduce results reported in a scientific paper and got very different outcomes -- just due to randomness. Thus, in such cases programs are "artificially" made deterministic by always making the same random decision whenever multiple equivalent options exist. This is done with a random seed, which can often be specified as an input parameter (or is otherwise set in the code).

Since a search procedure (including the chosen heuristic and all other parameters) does not always lead to unique decisions throughout the entire search, it would be good practice to call each planner many times and to report the mean as well as the standard deviation. Since time is always limited, nobody ever does this (sadly) -- the experiments would simply take too long. So it is currently just not known how well a given search configuration actually performs on average, and which impact randomness has.

This project aims at closing this knowledge gap. The task is thus to conduct a comprehensive empirical evaluation on a large range of planning problems with multiple state-of-the-art planners. Both classical as well as hierarchical (HTN) planning systems should be chosen. The research may focus on planning via search, but other approaches, such as planning via compilation to SAT, can also be evaluated. In this evaluation, each planner is called many times, so that mean and standard deviation (most importantly for the number of generated search nodes) can be reported.

Depending on these values, it shall be investigated whether restarts might be beneficial. That is, rather than investing more time in the current search configuration, it could be cancelled and started again with a different random seed, in the hope that different (random) choices might solve the problem quicker than before. In principle, this can also be investigated in combination with iterative deepening search, so that different planning episodes have different search horizons (search depths) available.

Your task is:

  • Conduct a comprehensive empirical evaluation with multiple planning systems.
  • Evaluate the impact of randomness and investigate the potential of restarts (potentially in the context of iterative deepening search).
  • Investigate and try to transfer research results from other disciplines, such as SAT solving (where restarts are more common).
  • Report on the findings (using my LaTeX template for research projects).
  • Should the results be promising, assist me in putting together a scientific paper (or multiple ones, depending on the number of achieved results) for a fitting conference (most likely after the project).

Further Reading

  • This project is both on classical and on HTN planning. For both I have reading recommendations on my page with resources for students. But please don't invest more than a few minutes (if any) before your application, since I don't want you to waste any time before I formally agree to supervise your project.
  • Scientific papers on HTN planning:
    • All the ones mentioned in the advice for students section of my webpage.
    • Specifically for this project, my paper "Hybrid Planning Heuristics Based on Task Decomposition Graphs", SoCS 2014, is recommended. You don't actually have to read it, but look at the table! It shows the desired values over 50 runs each -- this is how I envision our empirical evaluation to look. (All newer papers on HTN planning algorithms and heuristics no longer do this and use only a single run instead.)
  • Scientific papers on restarts in SAT solving:
    • On the power of clause-learning SAT solvers as resolution engines
  • Scientific papers on restarts in Planning: