
Data structure for Markov Decision Process

Original 2012-12-20 20:36:04 6 3 python/ artificial-intelligence/ markov

I have implemented the value iteration algorithm for a simple Markov decision process (Wikipedia) in Python. To store the structure (states, actions, transitions, rewards) of a particular Markov process and iterate over it, I have used the following data structures:

A dictionary of states and the actions available in each state:
SA = {'state A': {'action 1', 'action 2', ...}, ...}
A dictionary of transition probabilities:
T = {('state A', 'action 1'): {'state B': probability}, ...}
A dictionary of rewards:
R = {('state A', 'action 1'): {'state B': reward}, ...}
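For reference, value iteration over this layout could look roughly like the sketch below. The two-state MDP, its probabilities and rewards, the discount factor `gamma`, the tolerance `tol`, and the helper names `q_value` and `value_iteration` are all assumptions made up for illustration; only the shape of `SA`, `T` and `R` comes from the description above.

```python
# Hypothetical two-state MDP in the SA / T / R layout described above.
SA = {
    'state A': {'action 1', 'action 2'},
    'state B': {'action 1'},
}
T = {
    ('state A', 'action 1'): {'state A': 0.5, 'state B': 0.5},
    ('state A', 'action 2'): {'state B': 1.0},
    ('state B', 'action 1'): {'state B': 1.0},
}
R = {
    ('state A', 'action 1'): {'state A': 0.0, 'state B': 1.0},
    ('state A', 'action 2'): {'state B': 0.5},
    ('state B', 'action 1'): {'state B': 0.0},
}


def q_value(s, a, V, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (R[(s, a)][s2] + gamma * V[s2])
               for s2, p in T[(s, a)].items())


def value_iteration(gamma=0.9, tol=1e-9):
    """Repeat Bellman backups until the largest change drops below tol."""
    V = {s: 0.0 for s in SA}
    while True:
        new_V = {s: max(q_value(s, a, V, gamma) for a in SA[s]) for s in SA}
        delta = max(abs(new_V[s] - V[s]) for s in SA)
        V = new_V
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(SA[s], key=lambda a: q_value(s, a, V, gamma)) for s in SA}
    return V, policy


V, policy = value_iteration()
```

Every lookup in the inner loop is keyed by `(state, action)`, which is exactly how `T` and `R` are organised, so the dictionaries fit this access pattern without any conversion.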

My question is: is this the right approach? What are the most suitable data structures (in Python) for an MDP?

3 answers

I have implemented Markov decision processes in Python before and found the following code useful:

http://aima.cs.berkeley.edu/python/mdp.html

This code is taken from Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig.

Whether a data structure is suitable or not mostly depends on what you do with the data. You mention that you want to iterate over the process, so optimize your data structure for this purpose.

Transitions in Markov processes are often modeled by matrix multiplications. The transition probabilities P_a(s1, s2) and the rewards R_a(s1, s2) could be described by (potentially sparse) matrices P_a and R_a indexed by the states. I think this would have a few advantages:

If you use numpy arrays for this, indexing will probably be faster than with the dictionaries.
Also state transitions could then be simply described by matrix multiplication.
Process simulation with, for example, roulette wheel selection will be faster and clearer to implement, since you only need to pick the row or column of the transition matrix (depending on your indexing convention) that corresponds to the current state.
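As a rough illustration of the last two points, here is a minimal NumPy sketch. The three-state MDP, the action names, the helper `step`, and the row-stochastic convention (`P[a][s1, s2]`, so each row sums to 1) are assumptions chosen for the example, not something prescribed above.

```python
import numpy as np

# Hypothetical 3-state MDP with 2 actions. P[a][s1, s2] is the probability of
# moving from s1 to s2 under action a, so every row sums to 1.
P = {
    'action 1': np.array([[0.9, 0.1, 0.0],
                          [0.0, 0.5, 0.5],
                          [0.0, 0.0, 1.0]]),
    'action 2': np.array([[0.0, 1.0, 0.0],
                          [0.0, 0.0, 1.0],
                          [1.0, 0.0, 0.0]]),
}

# State transitions as matrix multiplication: a distribution over states
# (row vector) is pushed through the matrix of each chosen action.
dist = np.array([1.0, 0.0, 0.0])  # start in state 0 with certainty
for a in ['action 2', 'action 1']:
    dist = dist @ P[a]


def step(state, action, rng):
    """Sample the next state by roulette wheel selection over one row."""
    return int(rng.choice(len(P[action]), p=P[action][state]))


# Simulate a short trajectory that always takes 'action 1'.
rng = np.random.default_rng(0)
trajectory = [0]
for _ in range(5):
    trajectory.append(step(trajectory[-1], 'action 1', rng))
```

With this convention the sampling distribution for the current state is a single row lookup, `P[action][state]`; with the transposed convention it would be a column instead.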

There is a Python implementation of MDPs called pymdptoolbox. It is based on the MATLAB implementation called MDPToolbox. Both are worth a look.

Basically, the transition probabilities are represented as an (A × S × S) array and the rewards as an (S × A) matrix, where S and A are the number of states and the number of actions. The package also has special handling for sparse matrices.
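To make the array layout concrete, here is a plain NumPy sketch of value iteration over an (A × S × S) transition array and an (S × A) reward matrix. It does not call pymdptoolbox; the small MDP, the function name `value_iteration`, and the parameters `gamma` and `tol` are assumptions for illustration only.

```python
import numpy as np

# Hypothetical MDP with S = 2 states and A = 2 actions.
# P[a, s, s2] = probability of moving s -> s2 under action a; shape (A, S, S).
P = np.array([
    [[0.5, 0.5],
     [0.0, 1.0]],
    [[0.0, 1.0],
     [0.0, 1.0]],
])
# R[s, a] = expected immediate reward for action a in state s; shape (S, A).
R = np.array([
    [0.5, 0.5],
    [0.0, 0.0],
])


def value_iteration(P, R, gamma=0.9, tol=1e-9):
    """Vectorised Bellman backups; returns state values and a greedy policy."""
    V = np.zeros(P.shape[1])
    while True:
        # P @ V has shape (A, S); transpose it to line up with R's (S, A).
        Q = R + gamma * (P @ V).T
        new_V = Q.max(axis=1)
        if np.max(np.abs(new_V - V)) < tol:
            return new_V, Q.argmax(axis=1)
        V = new_V


V, policy = value_iteration(P, R)
```

Putting the action on the first axis means `P[a]` is an ordinary S × S transition matrix, so a single `P @ V` evaluates every action at once.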




solution1
9 2013-01-11 08:43:04

solution2
8 2012-12-20 21:09:12

solution3
0 2017-11-12 05:16:11
