despot's Introduction

Approximate POMDP Planning Online (APPL Online) Toolkit

Copyright © 2014-2018 by National University of Singapore.

APPL Online is a C++ implementation of the DESPOT algorithm for online POMDP planning [1]. It takes as input a POMDP model in the POMDPX file format. It also provides an API for interfacing directly with a blackbox simulator.

For bug reports and suggestions, please email [email protected].

[1] N. Ye, A. Somani, D. Hsu, and W. Lee. DESPOT: Online POMDP planning with regularization. J. Artificial Intelligence Research, 58:231–266, 2017.

Requirements

Tested Operating Systems:

Linux, OS X

Tested Compilers: gcc | g++ 4.2.1 or above

Tested Hardware: Intel Core i7 CPU, 2.0 GB RAM

Download

Clone the repository from GitHub (Recommended):

$ git clone https://github.com/AdaCompNUS/despot.git

OR manually download the Zip files. For instructions, use this online GitHub README.

Installation

Compile using make:

$ cd despot
$ make

(Optional): If you prefer using CMake, see the CMakeLists section.

Quick Start

DESPOT can be used to solve a POMDP specified in the POMDPX format or a POMDP specified in C++ according to the API. We illustrate this on the Tiger problem.

1. (Deprecated) To run Tiger specified in POMDPX format, compile and run:

$ cd despot/examples/pomdpx_models
$ make
$ ./pomdpx -m ./data/Tiger.pomdpx --runs 2 

This command computes and simulates DESPOT's policy for N = 2 runs and reports the performance for the Tiger problem specified in POMDPX format. See doc/Usage.txt for more options. For more details on the POMDPX format, see this page.

2. To run Tiger specified in C++, compile and run:

$ cd despot/examples/cpp_models/tiger
$ make
$ ./tiger --runs 2

This command computes and simulates DESPOT's policy for N = 2 runs and reports the performance for the tiger problem specified in C++. See doc/Usage.txt for more options.

Most of the options in doc/Usage.txt can also be specified programmatically. See include/despot/config.h for the global parameters to use, and the InitializeDefaultParameters function in this section for an example.

Documentation

Documentation can be found in the "doc" directory.

For a description of our example domains and more POMDP problems, see the POMDP page.

Using DESPOT with External Systems

An example of integrating DESPOT with an external Gazebo simulator can be found in the DESPOT tutorials page.

Package Contents

Makefile                  Makefile for compiling the solver library
README.md                 Overview
include                   Header files
src/core                  Core data structures for the solvers
src/solvers               Solvers, including despot, pomcp and aems
src/pomdpx                Pomdpx and its parser
src/util                  Math and logging utilities
license                   Licenses and attributions
examples/cpp_models       POMDP models implemented in C++
examples/pomdpx_models    POMDP models implemented in pomdpx
doc/pomdpx_model_doc      Documentation for POMDPX file format
doc/cpp_model_doc         Documentation for implementing POMDP models in C++
doc/usage.txt             Explanation of command-line options
doc/eclipse_guide.md      Guide for using Eclipse IDE for development

CMakeLists

(Optional)

If you are interested in integrating DESPOT into an existing CMake project or using an IDE for editing, we provide a CMakeLists.txt.

To install DESPOT libraries and header files into your system directory:

$ cd despot
$ mkdir build; cd build
$ cmake ../
$ make
$ sudo make install

To integrate DESPOT into your project, add this to your CMakeLists.txt file:

find_package(Despot CONFIG REQUIRED)

add_executable("YOUR_PROJECT_NAME"
  <your_src_files>
)

target_link_libraries("YOUR_PROJECT_NAME"
  despot
)

More Resources

HyP-DESPOT: A parallel belief tree search algorithm that integrates DESPOT with both CPU and GPU parallelization. Check out the paper and the code.

Acknowledgements

The Pocman implementation and memorypool.h in the package are based on David Silver's POMCP code.

Bugs and Suggestions

Please use the issue tracker.

Release Notes

2015/09/28 Initial release.

2017/03/07 Public release. Revised documentation.

2018/09/20 New API release.

despot's People

Contributors

autonomobil, cindycia, davidyhsu, luo-yuanfu, mohitshridhar, nehagarg, yenan, zsunberg


despot's Issues

Possible bug in random root seed generation

Hi,

Thanks for the great repo!

I think I have found a possible bug in the generation of the pseudo-random root seed:

long millis = (long) get_time_second() * 1000;

This casts the result of get_time_second() to long before the x1000 operation, so the millisecond component is lost before the multiplication, resulting in the seeds always having 000 as the final 3 digits. I don't believe this is the intended behaviour because it should use the 3-digit millisecond component in the generated seed.

Putting additional parentheses around the calculation, so that the cast to long happens last, seems to fix the problem:

long millis = (long) (get_time_second() * 1000);

I have provided example executions below, run on Ubuntu 20.04.

Original output:

gettimeofday ran with tv_sec: 1668419612, tv_usec: 610524
Generated random root seed 419612000. millis: 1668419612000, range: 1000000000

After fix:

gettimeofday ran with tv_sec: 1668419694, tv_usec: 729435
Generated random root seed 419694729. millis: 1668419694729, range: 1000000000

Cheers!
Ricardo
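
The effect of the cast placement can be reproduced outside DESPOT. The following is a minimal sketch, not the repository's code: SeedCastFirst and SeedCastLast are hypothetical names, the argument stands in for the value returned by get_time_second(), and long long replaces long so that the result does not depend on the platform's width of long.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two expressions discussed above.
// `seconds` plays the role of get_time_second(): seconds since the epoch,
// including a fractional part.

// The cast binds tighter than the multiplication, so the fraction is
// discarded before scaling to milliseconds.
long long SeedCastFirst(double seconds) {
    assert(seconds >= 0.0);
    return (long long) seconds * 1000;
}

// Scale to milliseconds first, then truncate: the millisecond digits survive.
long long SeedCastLast(double seconds) {
    assert(seconds >= 0.0);
    return (long long) (seconds * 1000);
}
```

Called with the timestamp from the original output, 1668419612.610524, SeedCastFirst returns a value ending in 000, while SeedCastLast keeps the 610 millisecond component.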

Questions: How to define the planning horizon/tree height? Is it possible to get all actions from the branch with the highest reward?

Let's say we have a POMDP model with 9 actions, as well as a continuous state space.

  • First question: How do I define the planning horizon/tree height when using DESPOT with the cpp model?

  • Second question: Is it possible to get all the actions that lie on the branch with the highest reward?

To make things a bit clearer: This shows a planning horizon of 2 (two actions), correct?
image

Let's say the yellow branch has the highest reward, now I want an array with [a2, a2]. Is this possible?
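
The second question can be illustrated on a toy tree, independent of DESPOT's own data structures. This is only a sketch under stated assumptions: ToyNode, AddChild and BestActionSequence are hypothetical names, actions are numbered from 0, each child stores a single value estimate, and the observation branching that a real belief tree has between actions is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical toy tree: children[a] is the subtree reached by action a.
struct ToyNode {
    double value;
    std::vector<std::unique_ptr<ToyNode>> children;
    explicit ToyNode(double v = 0.0) : value(v) {}
};

// Appends a child with the given value estimate and returns it.
ToyNode& AddChild(ToyNode& parent, double value) {
    parent.children.push_back(std::unique_ptr<ToyNode>(new ToyNode(value)));
    return *parent.children.back();
}

// Follows the highest-valued child at every level and records its index.
std::vector<int> BestActionSequence(const ToyNode& root) {
    std::vector<int> actions;
    const ToyNode* node = &root;
    while (!node->children.empty()) {
        std::size_t best = 0;
        for (std::size_t a = 0; a < node->children.size(); ++a) {
            assert(node->children[a] != nullptr);
            if (node->children[a]->value > node->children[best]->value) {
                best = a;
            }
        }
        actions.push_back(static_cast<int>(best));
        node = node->children[best].get();
    }
    return actions;
}
```

For a tree of height 2 in which the second action has the highest value at both levels, BestActionSequence returns the indices of that action twice, which corresponds to the [a2, a2] array asked for above.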

DESPOT has incorrect termination condition

DESPOT accepts both a maximum number of scenarios to run and a timeout. I expected the planner to terminate when either condition was violated. However, that is not currently the case:

do {
  double start = clock();
  VNode* cur = Trial(root, streams, lower_bound, upper_bound, model, history, statistics);
  used_time += double(clock() - start) / CLOCKS_PER_SEC;

  start = clock();
  Backup(cur);
  if (statistics != NULL) {
    statistics->time_backup += double(clock() - start) / CLOCKS_PER_SEC;
  }
  used_time += double(clock() - start) / CLOCKS_PER_SEC;

  num_trials++;
} while (used_time * (num_trials + 1.0) / num_trials < timeout
  && (root->upper_bound() - root->lower_bound()) > 1e-6);

Building the DESPOT terminates only when (1) the gap between the upper and lower bounds at the root node is nearly zero or (2) the timeout expires. If the K scenarios are exhausted before then, the loop continues to iterate, seemingly doing nothing because the RandomStreams are depleted, until the timeout expires.
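
One possible way to make the loop stop on whichever limit is reached first is to add an explicit cap to the loop condition. The sketch below isolates that condition as a standalone predicate. ShouldContinueSearch and max_trials are hypothetical and not part of DESPOT; the other terms mirror the while condition quoted above.

```cpp
#include <cassert>

// Hypothetical helper, not DESPOT code. Returns true while all limits hold:
//  - the projected time after one more trial stays below the timeout,
//  - the gap between the root bounds is still above the tolerance,
//  - the number of trials is below an explicit cap (assumed parameter).
bool ShouldContinueSearch(double used_time, int num_trials, double timeout,
                          double root_gap, int max_trials) {
    assert(num_trials > 0);  // evaluated after at least one trial (do-while)
    double projected_time = used_time * (num_trials + 1.0) / num_trials;
    return projected_time < timeout
        && root_gap > 1e-6
        && num_trials < max_trials;
}
```

With this shape, the search ends as soon as any one of the three conditions fails, instead of idling until the timeout.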

How to specify a different sampling technique

I'm applying DESPOT to a robot grocery packing task. The robot's observation is represented by a vector of probabilities, one for each of the grocery items observed. Because of this, I find it difficult to represent the problem in a form to which I can apply DESPOT. My main issue currently is how to modify the DESPOT sampler to sample from the vector of probabilities for each of the observed grocery items. Do you have any idea how to go about this?
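
As a point of reference for this question, sampling an item from a probability vector can be written as inverse-CDF sampling driven by a single uniform random number. This is a generic sketch, not DESPOT's sampler: SampleIndex is a hypothetical helper, and it assumes the caller supplies a number u in [0, 1) and probabilities that sum to 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index i such that u falls into the i-th
// slice of the cumulative distribution built from probs.
// Assumes probs is non-empty, sums to 1, and that u lies in [0, 1).
std::size_t SampleIndex(const std::vector<double>& probs, double u) {
    assert(!probs.empty());
    double cumulative = 0.0;
    for (std::size_t i = 0; i < probs.size(); ++i) {
        cumulative += probs[i];
        if (u < cumulative) {
            return i;
        }
    }
    // Reached only if rounding leaves the cumulative sum slightly below u.
    return probs.size() - 1;
}
```

Because the result depends only on probs and u, the same random number always produces the same item, which keeps the sampling reproducible for a fixed random input.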

Reward based on state changes

Hi, I just had a quick question about designing the reward for a POMDP using the difference between the previous and current states, rather than the action. In the examples you have provided, all the reward matrices depend only on the previous state and the action. Is there a reward function in DESPOT that accepts a different set of arguments?
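
To make the question concrete, here is a toy step function whose reward is computed from the state before and after the transition. It is a sketch only and does not use the DESPOT API: ToyState, TransitionReward and ToyStep are hypothetical, and the 0.8 success probability is an arbitrary choice for illustration.

```cpp
#include <cassert>

// Hypothetical one-dimensional state.
struct ToyState {
    int position;
};

// Reward defined on the state change rather than on the action.
double TransitionReward(const ToyState& prev, const ToyState& next) {
    return static_cast<double>(next.position - prev.position);
}

// Toy simulator step: action 0 stays, action 1 moves forward with
// probability 0.8 (decided by random_num in [0, 1)).
// Returns true when the terminal position 3 is reached.
bool ToyStep(ToyState& state, double random_num, int action, double& reward) {
    assert(action == 0 || action == 1);
    const ToyState prev = state;  // keep a copy of the previous state
    if (action == 1 && random_num < 0.8) {
        state.position += 1;
    }
    reward = TransitionReward(prev, state);
    return state.position >= 3;
}
```

The copy of the previous state taken before the transition is what makes the state-difference reward possible in this sketch.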

POS_INFTY and NEG_INFTY are not infinite

The constants are currently defined as:

const double POS_INFTY = numeric_limits<double>::max();
const double NEG_INFTY = -POS_INFTY;

However, numeric_limits<>::max is defined to be:

Returns the maximum finite value representable by the numeric type T.

It would make more sense to use numeric_limits<double>::infinity(). I would be happy to make a pull request for this change. However, I want to check:

Is max() used instead of infinity() here intentionally? I.e. will something break if I make this change?
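
The two candidates do behave differently in arithmetic, which is relevant to the question of whether something could break. The sketch below uses only the standard library; kMaxFinite, kInfinity and Gap are hypothetical names, and Gap merely imitates a subtraction of two bounds without claiming that DESPOT computes it this way.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Largest finite double versus IEEE positive infinity.
const double kMaxFinite = std::numeric_limits<double>::max();
const double kInfinity = std::numeric_limits<double>::infinity();

// Width of the interval [lower, upper], as a bound-gap computation might use.
double Gap(double upper, double lower) {
    assert(!std::isnan(upper) && !std::isnan(lower));
    return upper - lower;
}
```

Gap(kMaxFinite, kMaxFinite) is exactly 0, whereas Gap(kInfinity, kInfinity) is NaN, so any code that subtracts two bounds of the same sign would see a different result after the change.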

<InitialStateBelief>: 1 entries expected but 4 found.

Hi, in my POMDP model I specified a latent time variable with 4 states, which is important for the rest of my problem. I first defined the time states in the XML file as follows:

<StateVar vnamePrev="time_0" vnameCurr="time_1" fullyObs="false">
<ValueEnum>4</ValueEnum>
</StateVar>

Then I initialized its state values as:

</CondProb>
<CondProb>
<Var>time_0</Var>
<Parent>null</Parent>
<Parameter type = "TBL">
<Entry>
<Instance>-</Instance>
<ProbTable>1.0 0.0 0.0 0.0</ProbTable>
</Entry>
</Parameter>
</CondProb>

The probability vector for the time variable is deterministic: 1 for the first state and 0 for the rest. However, I get the following error:

ERROR: ./data/StarMaze.pomdpx:58:
 In <InitialStateBelief>: 1 entries expected but 4 found.

The error message is vague and the documentation does not explain the details, so it is hard to figure out what causes this error. I would appreciate it if you could suggest why I get it. Thanks.
