A collection of posts related to "Markov decision process"
Algorithms like policy iteration and value iteration are often classified as dynamic programming methods that try to solve the Bellman optimality equations.
My current understanding of dynamic programming is this:
It is a method applied to optimization problems. DP problems exhibit optimal substructure, i.e., the optimal solution to a problem contains optimal solutions to its subproblems. These subproblems are not independent of each other but overlap. There are two approaches: one is bottom-up (tabulation), the other top-down (memoization). I have the following questions:
Is this understanding of DP comprehensive? Does every DP algorithm have an optimal substructure with overlapping subproblems?
How do policy iteration and value iteration fit into this scheme? Can we call them bottom-up or top-down?
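To make the question concrete, here is a minimal value-iteration sketch on a toy two-state MDP; the transition table, rewards, and discount factor are invented purely for illustration:

```python
# Toy 2-state, 2-action MDP; all probabilities and rewards are invented.
import numpy as np

gamma = 0.9  # assumed discount factor

# P[s, a, s'] = probability of landing in s' after taking action a in state s
P = np.array([
    [[0.8, 0.2], [0.1, 0.9]],   # transitions from state 0
    [[0.5, 0.5], [0.0, 1.0]],   # transitions from state 1
])
# R[s, a] = expected immediate reward for taking action a in state s
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

# Value iteration: repeatedly apply the Bellman optimality backup.
V = np.zeros(2)
for _ in range(1000):
    Q = R + gamma * (P @ V)       # Q[s, a] = one-step lookahead value
    V_new = Q.max(axis=1)         # greedy backup over actions
    converged = np.max(np.abs(V_new - V)) < 1e-8
    V = V_new
    if converged:
        break

print("V* =", V, "greedy policy =", Q.argmax(axis=1))
```

Each sweep reuses the values computed in the previous sweep, which is presumably why it gets filed under dynamic programming; whether that counts as bottom-up is exactly what I'm unsure about.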
Topic: Reinforcement Learning Math Discussion
Meeting Recording:
https://us02web.zoom.us/rec/share/xcdlLPLzrmxLfNbNuFHud4UtFaTVeaa823IYr6dYzUw-uzo3Q0gjSQwweD9oLgzf
How is the Markov Decision Process used in AI? All that has been explained in a simple fashion in Reinforcement learning Part 3: Introduction to Markov Process. Take a look: https://medium.com/ai%C2%B3-theory-practice-business/reinforcement-learning-part-3-the-markov-decision-process-9f5066e073a2
So in the final episode we get to see the code to get into HAP's Lab. I'm not just reading into every little thing; this was deliberate. The camera panned directly at it and he stood to the side.
268 #62
Now maybe this is just an easter egg or maybe I've been looking at this stuff for too long and everything can apply, but I see so many coincidences that are always somewhat applicable.
This led me to a book on Signal Processing for Cognitive Radios.
There it talks about infinite-horizon value functions and how they can be approximated by a piecewise linear finite-horizon value function when computed over a sufficiently long horizon.
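If I'm following the book, the reason a long enough horizon suffices is that the discounted tail beyond horizon $T$ shrinks geometrically (my paraphrase, with $R_{\max}$ standing for a bound on the per-step reward):

$$\bigl|\,V^{*}(s) - V_{T}(s)\,\bigr| \;\le\; \sum_{t=T}^{\infty} \gamma^{t} R_{\max} \;=\; \frac{\gamma^{T}}{1-\gamma}\, R_{\max}$$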
Also on that page it references something called the Markov Decision Process.
>A Markov decision process (MDP) is a discrete time stochastic control process. It provides a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying optimization problems solved via dynamic programming and reinforcement learning.
>They are used in many disciplines, including robotics, automatic control, economics and manufacturing.
>The probability that the process moves into its new state is influenced by the chosen action. The next state depends on the current state and the decision maker's action, but is conditionally independent of all previous states and actions.
It all sounds remarkably similar to the type of system that could propel forking life decisions and dimension jumps forward.
Also the whole robotics and automation stuff.
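As a toy illustration of that last property, here's how a transition that depends only on the current state and the chosen action might look; the "fork"/"jump" states and all the probabilities are hypothetical, nothing from the show:

```python
import random

# Hypothetical transition model: P[state][action] lists (next_state, prob)
# pairs. Only the current state and chosen action appear here, no history,
# which is exactly the conditional-independence property quoted above.
P = {
    "fork": {"stay": [("fork", 0.9), ("jump", 0.1)],
             "leap": [("fork", 0.3), ("jump", 0.7)]},
    "jump": {"stay": [("jump", 1.0)],
             "leap": [("fork", 0.5), ("jump", 0.5)]},
}

def step(state, action):
    """Sample the next state given only (state, action)."""
    next_states, probs = zip(*P[state][action])
    return random.choices(next_states, weights=probs)[0]

print(step("fork", "leap"))
```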
I have recently managed to set aside some time to clean up a couple of drafts that have been sitting around for the last couple of months.
It's a series of three articles on Markov Decision Processes, a piece of the mathematical framework underlying Reinforcement Learning techniques. A couple more are in the process of being written, but I believe the material could already be useful to anyone interested in taking a look at the "nitty gritty" math formulation.
Link to the first article: https://www.lpalmieri.com/posts/rl-introduction-00/
Link to the index: https://www.lpalmieri.com/
Did you like the professor? Who was it? What class was it?
Thanks!
Hi! I am currently taking the Reinforcement Learning specialization by the University of Alberta on Coursera (auditing, no money T_T).
Please refer to this image before reading the queries.
The Markov property states that the current state and action contain all the information needed to predict the next state.
Queries:
I am currently going through the University of Alberta's course on RL on Coursera.
Confusion:
In MDPs the next state and the reward associated with it are stochastic. Given the current state, every state in the set of possible states has some nonzero probability of occurring. How, then, do you choose an optimal policy? I understand that we are trying to maximise the expected discounted return (the sum of all future rewards).
Do you evaluate multiple policies over episodes and then choose one?
And how are actions picked if state transitions are stochastic?
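To show how I currently picture it: even though the realized next state is random, I think each action can be ranked by its expected one-step return plus discounted next-state value, something like this (all numbers invented):

```python
gamma = 0.9  # assumed discount factor

# Hypothetical one-state model: action -> list of
# (probability, immediate reward, estimated value of next state)
outcomes = {
    "left":  [(0.7, 1.0, 5.0), (0.3, 0.0, 2.0)],
    "right": [(0.5, 2.0, 1.0), (0.5, 0.0, 4.0)],
}

def q(action):
    """Expected immediate reward plus discounted next-state value."""
    return sum(p * (r + gamma * v) for p, r, v in outcomes[action])

best = max(outcomes, key=q)
print({a: round(q(a), 2) for a in outcomes}, "->", best)
```

Is that the right mental model, or does choosing a policy require more than this per-state expectation?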
Does anyone have any idea about Markov Decision Processes (MDPs)? I am making a voice assistant Android app and need middleware for the backend to link everything, so I was wondering whether an MDP would be a good option.
Okay, so I'm not exactly sure if this belongs here, but this is my problem: we have a music player that has different playlists and automatically suggests songs from the current playlist I'm in. What I want the program to learn is that if I skip a song, it should decrease the probability of that song being played in this playlist again. I think this is what's called reinforcement learning, and I've read a bit about the algorithms, deciding that an MDP seems to be exactly what we have here. I know that in an MDP there is more than one state, so I figured in this case the states would be the different playlists. Depending on the state (playlist) I'm in, it chooses the songs it thinks fit best and gets "punished" (by a skip) if it has chosen wrongly (I've sketched the idea below, after my questions).
So what I'm asking is: do you guys think this is the right approach? Or would you suggest a different algorithm? Does all of this even make sense? Should I provide more information?
If it does sound right, I'd like to ask for some tutorials or starting points for getting into MDPs in R. I've searched online but have only found the MDP Toolbox in R, and it kind of doesn't really make sense to me. Do you have any suggestions? I'm really grateful for any kind of advice. :)
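Here's roughly the behaviour I mean, sketched in Python since I haven't figured out the R tooling yet; the playlist, song names, and decay factors are just placeholders:

```python
import random

# Placeholder numbers: skipping halves a song's weight, listening to the
# end nudges it back up.
SKIP_DECAY, LISTEN_BOOST = 0.5, 1.1

# Per-playlist song weights (the "state" is the playlist I'm in)
weights = {"workout": {"song_a": 1.0, "song_b": 1.0, "song_c": 1.0}}

def suggest(playlist):
    """Sample a song with probability proportional to its weight."""
    songs = list(weights[playlist])
    return random.choices(songs, weights=[weights[playlist][s] for s in songs])[0]

def feedback(playlist, song, skipped):
    """Punish a skip, reward a full listen."""
    weights[playlist][song] *= SKIP_DECAY if skipped else LISTEN_BOOST

song = suggest("workout")
feedback("workout", song, skipped=True)  # skipped -> less likely next time
```

Strictly speaking this is just a weighted-sampling update rather than a full MDP solver, which is part of why I'm asking whether an MDP is even the right framing.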
I'm trying to implement this paper about Markov decision processes but am struggling with some of the formulas, for example the state-value function definition at the end of the second page. I can understand everything in it except the double-struck E[...] notation at the start; I've never seen it before, can't derive what it means from the formula, and don't know what to look up. The only thing I can think of is set-builder notation, considering the pipe "|" and the big double-struck letter, but that wouldn't make any sense here. Could anyone help me out with this? Thanks a lot for reading!
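For context, the standard state-value definition in RL texts looks like the following; I'm assuming the paper follows the same convention, where $\mathbb{E}_{\pi}[\,\cdot \mid \cdot\,]$ denotes a conditional expectation under policy $\pi$ (which would also explain the pipe):

$$v_{\pi}(s) \;=\; \mathbb{E}_{\pi}\!\left[\,\sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1} \;\middle|\; S_t = s \right]$$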