mcts | AlphaGo Zero Monte Carlo Tree Search for TicTacToe | Reinforcement Learning library

by RomainSa | Python | Version: Current | License: No License

kandi X-RAY | mcts Summary

mcts is a Python library typically used in Artificial Intelligence and Reinforcement Learning applications. mcts has no reported bugs, no reported vulnerabilities, a build file available, and low support. You can download it from GitHub.

Implementation of AlphaGo Zero Monte Carlo Tree Search for the TicTacToe and Oware games.

Support

              mcts has a low active ecosystem.
It has 6 stars and 0 forks. There is 1 watcher for this library.
              It had no major release in the last 6 months.
              mcts has no issues reported. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of mcts is current.

Quality

              mcts has no bugs reported.

Security

              mcts has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

              mcts does not have a standard license declared.
              Check the repository for any license declaration and review the terms closely.
              Without a license, all rights are reserved, and you cannot use the library in your applications.

Reuse

              mcts releases are not available. You will need to build from source code and install.
              Build file is available. You can build the component from source.
              Installation instructions are not available. Examples and code snippets are available.

            Top functions reviewed by kandi - BETA

kandi has reviewed mcts and identified the functions below as its top functions. This is intended to give you an instant insight into the functionality mcts implements and to help you decide whether it suits your requirements. A minimal sketch of how these pieces typically fit together follows the list.
            • Search the tree
            • Select a node from the tree
            • Expand all nodes of a tree
            • Perform a simulation
            • Backpropagate updates to node
            • Play a move
            • List of legal plays
• Return the winner for a given player
            • Show the tree
            • Show the board representation
            • Return the winner of the game
• Return the best play
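Taken together, these functions trace the classic MCTS loop: select, expand, simulate, backpropagate. As rough orientation only, here is a minimal self-contained Python sketch of that loop; the helpers legal_plays, play, winner, and current_player are assumed stand-ins for a game interface and are not the actual API of this repository.

import math
import random

class Node:
    """One tree node; just_moved is the player who made `move` to reach it."""
    def __init__(self, state, just_moved=None, parent=None, move=None):
        self.state, self.just_moved = state, just_moved
        self.parent, self.move = parent, move
        self.children = []
        self.untried = list(legal_plays(state))  # moves not yet expanded
        self.wins, self.visits = 0.0, 0

    def ucb1(self, c):
        # Exploitation (average reward) plus exploration bonus.
        return (self.wins / self.visits
                + c * math.sqrt(math.log(self.parent.visits) / self.visits))

def search(root_state, iterations=1000, c=math.sqrt(2)):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded, non-terminal nodes.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ch.ucb1(c))
        # 2. Expansion: add a single child for one untried move.
        if node.untried:
            move = node.untried.pop(random.randrange(len(node.untried)))
            mover = current_player(node.state)
            child = Node(play(node.state, move), mover, node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout until the game is decided
        #    (winner() is assumed to return None only while the game is live).
        state = node.state
        while winner(state) is None:
            state = play(state, random.choice(legal_plays(state)))
        result = winner(state)
        # 4. Backpropagation: credit each node from the viewpoint of the
        #    player who moved into it.
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if result == node.just_moved else 0.0
            node = node.parent
    # Robust final choice: the most-visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move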

            mcts Key Features

            No Key Features are available at this moment for mcts.

            mcts Examples and Code Snippets

Performs a Monte Carlo tree search on the root node.
Java · Lines of Code: 30 · License: Permissive (MIT License)
            public Node monteCarloTreeSearch(Node rootNode) {
                    Node winnerNode;
                    double timeLimit;
            
                    // Expand the root node.
                    addChildNodes(rootNode, 10);
            
                    timeLimit = System.currentTimeMillis() + TIME_LIMIT;
            
                    // Expl  

            Community Discussions

            QUESTION

            MCTS Agent making bad decisions on Tic-Tac-Toe
            Asked 2021-Feb-08 at 17:30

I've been working on an MCTS AI for a couple of days now. I tried to implement it on Tic-Tac-Toe, the least complex game I could think of, but for some reason my AI keeps making bad decisions. I've tried changing the value of UCB1's exploration constant, the number of iterations per search, and even the points awarded for winning, losing, and tying the game (trying to make a tie more rewarding, since this AI only plays second, so it should aim for a draw and win otherwise). As of now, the code looks like this:

            ...

            ANSWER

            Answered 2021-Feb-08 at 17:30

My mistake was choosing the node with the most visits in the expansion phase, when it should have been the one with the most potential according to the UCB1 formula. I also had errors in some of the if clauses, so not all of the losses were being counted.
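For reference, the UCB1 score the answer refers to balances a child's average reward against an exploration bonus. A minimal sketch (the function and field names are illustrative, not the asker's code):

import math

def ucb1(child_wins, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 value of a child node: exploitation term plus exploration bonus."""
    if child_visits == 0:
        return float("inf")  # force every move to be tried at least once
    exploit = child_wins / child_visits
    explore = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploit + explore

# During selection, pick the child with the highest UCB1 value,
# not simply the most-visited child:
# best = max(node.children, key=lambda ch: ucb1(ch.wins, ch.visits, node.visits))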

            Source https://stackoverflow.com/questions/66082171

            QUESTION

            How to restore previous state to gym environment
            Asked 2020-Jun-13 at 17:45

I'm trying to implement MCTS on OpenAI's Atari gym environments, which requires the ability to plan: acting in the environment and restoring it to a previous state. I read that this can be done with the RAM version of the games:

            recording the current state in a snapshot: snapshot = env.ale.cloneState()

            restoring the environment to a specific state recorded in snapshot: env.ale.restoreState(snapshot)

            so I tried using the ram version of breakout:

            ...

            ANSWER

            Answered 2020-Jun-13 at 17:45

For anyone who comes across this in the future: there IS a bug in the Arcade Learning Environment (ALE) used by the Atari gym environments. The bug is in the original code written in C: restoring a state from a snapshot changes the entire emulator state back to the original, WITHOUT changing back the observation's picture or RAM. Still, if you take another action after restoring the state, you get the next state with a correct image and RAM.

So basically, if you don't need to draw images from the game or save the RAM of a specific state, you can use restore without any problem. If you do need to see the image or RAM of the current state, to use for a learning algorithm, then this is a problem: you need to save the correct image when cloning, and use that saved image after restoring the state, instead of the image you get from getScreenRGB() after calling restoreState().
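As an illustration of that workaround, here is a hedged sketch using the old gym API quoted in the question; the environment id and action sequence are placeholders, and only cloneState(), restoreState(), and getScreenRGB() come from the discussion above.

import gym

env = gym.make("Breakout-ram-v0")   # placeholder id for the RAM version of Breakout
obs = env.reset()

snapshot = env.ale.cloneState()     # remember the emulator state before planning
saved_obs = obs.copy()              # cache the matching observation as well

# ... explore during planning ...
for action in (1, 1, 2):
    obs, reward, done, info = env.step(action)

env.ale.restoreState(snapshot)      # the emulator state is rolled back, but...
planning_obs = saved_obs            # ...use the cached observation here instead of
                                    # env.ale.getScreenRGB(), which is stale right
                                    # after restoreState()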

            Source https://stackoverflow.com/questions/62334284

            QUESTION

            Self play AI on the same MCTS?
            Asked 2020-Apr-04 at 18:53

I've recently been trying to play with an MCTS implementation for a simple board game. I'd like to make the AI play against itself to gather some sample playthroughs. I figured I could make both players use the same MCTS tree (for better performance). Or so it seems.

But would that be valid? Or do I need two separate trees, one per AI, each with its own win/play statistics, for it to behave correctly?

            ...

            ANSWER

            Answered 2020-Apr-04 at 18:53

            If you are doing self-play and building the tree exactly the same for both players there won't be any bias inherent in the tree - you can re-use it for both players. But, if the players build the MCTS tree in a way that is specific to a particular player, then you'll need to rebuild the tree. In this case you'd need to keep two trees, one for each player, and each player could re-use their own tree but nothing else.

            Some things to analyze if you're trying to figure this out:

            • Does the game have hidden information? (Something one player knows that the other player doesn't.) In this case you can't re-use the tree because you'd leak private information to the other player.
            • Do your playouts depend on the player at the root of the MCTS tree?
• Do you have any policies for pruning moves from either player that aren't applied symmetrically?
            • Do you evaluate states in a way that is not symmetric between players?
            • Do you perform any randomization differently for the players?

            If none of these are true, you can likely re-use the tree.
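As a small illustration of what re-using the tree in self-play might look like, assuming a Node class with children/move/parent fields and a play helper (illustrative names, not the asker's code): after each real move, re-root the tree at the chosen child and let the other player keep searching from there.

def advance_root(root, move_played):
    """Re-root the shared tree after a real move so both self-play agents
    keep reusing the same statistics (valid only if the checks above pass)."""
    for child in root.children:
        if child.move == move_played:
            child.parent = None   # detach the subtree and make it the new root
            return child
    # The move was never explored during search: start a fresh subtree.
    return Node(play(root.state, move_played))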

            Source https://stackoverflow.com/questions/60968702

            QUESTION

            Rust: Using a generic trait as a trait parameter
            Asked 2020-Mar-20 at 10:06

            How can I use related generic types in Rust?

            Here's what I've got (only the first line is giving me trouble):

            ...

            ANSWER

            Answered 2020-Mar-19 at 23:39

The concrete type of G cannot be determined based on the type of TreeNode; it is only known when expand is called. Note that expand could be called twice with different types for G.

            You can express this by constraining the type parameters for the method instead of the entire implementation block:

            Source https://stackoverflow.com/questions/60766088

            QUESTION

            Unity instant simulation of game
            Asked 2020-Feb-10 at 18:54

            I am looking for a method of simulating my game till win or lose, to feed into a Monte Carlo Tree Search algorithm.

My game is a turn-based, tile-based tactical RPG similar to Final Fantasy Tactics, Fire Emblem, etc.

            The idea is that the AI would perform thousands of playouts (or up until a threshold) until they determine the optimal next move.

            The Simulation

            Each AI and Player agent would make a random valid move, until the game is over.

            Why a MCTS Simulation? Why not minmax?

            I need to simulate the game as closely to the real thing for several reasons:

            1. Encoding the game state into a lower-dimensional structure is impossible as most actions are tightly coupled to Unity constructs like Colliders and Rays.
            2. It is fairly difficult, if not impossible to statically evaluate the game state at X moves ahead - without any knowledge of previous moves. Therefore I would need to carry out each move sequentially on a game state, to produce the next game state before anything can be evaluated.

To expand on point 2: using a simple minmax approach and statically evaluating the game state by looking at something like the current health of all players would be useful but not accurate, as not every action produces an immediate change in health.

            Example:

            Which produces a higher (max damage) dealt over 2 turns:

1. Move in front of player, attack -> Move behind player, attack

            OR

2. Move in front of player, use attack buff -> Attack for x4 damage

In this example, the minmax approach would never choose the 2nd option, even though it does more damage over 2 turns, because its static evaluation of the buff move scores it as 0, or perhaps even negatively.

            In order for it to select the 2nd option, it would need to retain knowledge of previous actions. Ie. it would need to simulate the game almost perfectly.

When we add in other elements like stage traps, destructible environments, and status effects, it becomes pretty much impossible to use a static evaluation.

            What I've tried

            Time.timeScale

This allows me to speed up physics and other interactions, which is exactly what I need. However, this is a global property, so the game will appear to run at super speed for a fraction of a second while the AI is "thinking".

            Increasing the speed of NavMesh Agents

All my movements take place on a NavMesh, so the only conceivable way of making these movements "instant" is to increase the agent speed. This is problematic because the movements are still not fast enough, and the increased velocity causes physics issues: sometimes characters spin out of control and fly off the map.

            For reference here is a screenshot of my (in active development) game.

             Question

            What I need is a method for "playing" my game extremely quickly.

            I just need to be able to run these simulations quickly and efficiently before every AI move.

            I would love to hear from someone with some experience doing something like this - but any input would be greatly appreciated!

            Thanks

            ...

            ANSWER

            Answered 2020-Feb-10 at 18:54
            Build an abstract model of your core mechanics

            For something to run quickly, we need it to be simple - that means stripping back the game to its basic mechanics, and representing that (and that only).

So, what does that mean? Well, first we have a tile-based world. A simple representation of that is a 2D array of Tile objects, like this:
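The code that followed in the original answer was C# for Unity; purely as an illustration of the same idea, a Python sketch of such a pure-data model could look like this (Unit, Tile, and BoardState are hypothetical names):

import copy
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Unit:
    name: str
    hp: int
    attack: int

@dataclass
class Tile:
    walkable: bool = True
    occupant: Optional[Unit] = None

@dataclass
class BoardState:
    """Pure-data snapshot of the battle: no Colliders, Rays, or NavMesh."""
    grid: List[List[Tile]]
    turn: int = 0

    def clone(self):
        # Cheap copying is what makes thousands of playouts per move feasible.
        return copy.deepcopy(self)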

            Source https://stackoverflow.com/questions/60155133

            QUESTION

            How to Allow Compilers to Find Libraries Installed via Brew
            Asked 2020-Jan-28 at 17:48

I am working on macOS Catalina and I want to compile some C++ code using CMake.

            This code needs the library boost that I installed via Brew (brew install boost). In my code I added #include to use it.

            When I compile I got the following error:

            ...

            ANSWER

            Answered 2020-Jan-28 at 17:11

            Usually you would use find_package to find and configure libraries with CMake. Boost provides a CMake configuration file FindBoost.

            Here is an example for one library using targets (currently the recommended way)

            Source https://stackoverflow.com/questions/59953609

            QUESTION

            How to understand the 4 steps of Monte Carlo Tree Search
            Asked 2019-Nov-20 at 14:02

From many blogs, and this one in particular (https://web.archive.org/web/20160308070346/http://mcts.ai/about/index.html), we know that the MCTS algorithm has 4 steps.

            1. Selection: Starting at root node R, recursively select optimal child nodes until a leaf node L is reached.

            What does leaf node L mean here? I thought it should be a node representing the terminal state of the game, or another word which ends the game. If L is not a terminal node (one end state of the game), how do we decide that the selection step stops on node L?

2. Expansion: If L is not a terminal node (i.e. it does not end the game), then create one or more child nodes and select one of them, C.

From this description I realise that my previous thought was obviously incorrect. Then, if L is not a terminal node, it implies that L should have children, so why not continue finding a child from L in the "Selection" step? Do we have the children list of L at this step?
            From the description of this step itself, when do we create one child node, and when do we need to create more than one child nodes? Based on what rule/policy do we select node C?

3. Simulation: Run a simulated playout from C until a result is achieved.

            Because of the confusion of the 1st question, I totally cannot understand why we need to simulate the game. I thought from the selection step, we can reach the terminal node and the game should be ended on node L in this path. We even do not need to do "Expansion" because node L is the terminal node.

4. Backpropagation: Update the current move sequence with the simulation result.

            Fine. Last question, from where did you get the answer to these questions?

            Thank you

            BTW, I also post the same question https://ai.stackexchange.com/questions/16606/how-to-understand-the-4-steps-of-monte-carlo-tree-search

            ...

            ANSWER

            Answered 2019-Nov-20 at 14:00

            What does leaf node L mean here?

            For the sake of explanation I'm assuming that all the children of a selected node are added during the expansion phase of the algorithm.

            When the algorithm starts, the tree is formed only by the root node (a leaf node).

            The Expansion phase adds all the states reachable from the root to the tree. Now you have a bigger tree where the leaves are the last added nodes (the root node isn't a leaf anymore).

            At any given iteration of the algorithm, the tree (the gray area of the picture) grows. Some of its leaves could be terminal states (according to the rules of the game/problem) but it's not granted.

            If you expand too much, you could run out of memory. So the typical implementation of the expansion phase only adds a single node to the existing tree.

In this scenario you could replace the word leaf with not fully expanded:

            Starting at root node R, recursively select optimal child nodes until a not fully expanded node L is reached

            Based on what rule/policy do we select node C?

            It's domain-dependent. Usually you randomly choose a move/state.

            NOTES

            1. Image from Multi-Objective Monte Carlo Tree Search for Real-Time Games (Diego Perez, Sanaz Mostaghim, Spyridon Samothrakis, Simon M. Lucas).

            Source https://stackoverflow.com/questions/58911784

            QUESTION

How do I make my AI algorithm play 9-board tic-tac-toe?
            Asked 2019-May-03 at 02:38

In order to make it easy for others to help me, I have put all the code here: https://pastebin.com/WENzM41k. It starts with two agents competing against each other.

I'm trying to implement Monte Carlo tree search to play 9-board tic-tac-toe in Python. The rules of the game are like regular tic-tac-toe, but with nine 3x3 sub-boards. The position where the last piece was placed decides which sub-board the next piece must be placed on. It's a bit like ultimate tic-tac-toe, except that the game ends as soon as one of the sub-boards is won.

            I'm trying to learn MCTS and I found some code on here: http://mcts.ai/code/python.html

I used the Node class and UCT class from the website and added my 9-board tic-tac-toe game state class and some other code. All the code is here:

            ...

            ANSWER

            Answered 2019-May-03 at 02:38

I spent some time reading about MCTS and more time catching the rest of the bugs:

1. I added OXOState (tic-tac-toe), so I can debug with a familiar and simple game. There was only one problem with the original source code from http://mcts.ai/code/python.html: it would continue to play after someone had won the game. So I fixed that.
2. For debugging and fun, I added a HumanPlayer.
3. To evaluate the level of play, I added RandomPlayer and NegamaxPlayer (the negamax algorithm, https://en.wikipedia.org/wiki/Negamax); a minimal sketch of the idea is shown below.
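For context, here is a minimal sketch of the negamax idea referenced in point 3; winner, legal_plays, play, and opponent are assumed helpers with the obvious tic-tac-toe semantics, not names from the linked code.

def negamax(state, depth, player):
    """Best achievable score for `player` from `state`: +1 win, -1 loss, 0 draw."""
    w = winner(state)
    if w is not None:
        return 0 if w == "draw" else (1 if w == player else -1)
    if depth == 0:
        return 0  # search horizon reached with no decision; treat as neutral
    best = -float("inf")
    for move in legal_plays(state):
        # The opponent's best result, negated, is our result for this move.
        best = max(best, -negamax(play(state, move), depth - 1, opponent(player)))
    return best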

            NegamaxPlayer vs UCT (Monte Carlo Tree Search)

            Source https://stackoverflow.com/questions/55824246

            QUESTION

            Java Heap Space Issue with my MCTS Gomoku player
            Asked 2018-Dec-01 at 00:44

            When I run my program I get this error:

            ...

            ANSWER

            Answered 2018-Nov-30 at 16:05

            I'm not 100% sure without seeing the code of methods like childrenLeft() and a few others, but I get the impression that you basically add b new nodes to the tree, where b is your branching factor. In other words, every iteration, you add a new, complete list of children to one node. This can probably indeed cause you to run out of memory quickly.

            By far the most common strategy is to expand your tree by only adding one single new node per iteration. Every node then needs:

            • A list of current children (corresponding to already-expanded actions)
            • A list of actions that have not yet been expanded

            Your Selection phase would then generally end once it reaches a node that has a non-empty list of actions to be expanded. MCTS would then randomly pick one action from that list, add a new node corresponding to that action (meaning your first list grows by one entry and the second list shrinks by one entry), and continue the rollout from there.
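As a compact illustration of that node layout (sketched in Python rather than Java, with legal_plays and play as assumed helpers):

import random

class Node:
    def __init__(self, state):
        self.state = state
        self.children = []                       # already-expanded actions
        self.untried = list(legal_plays(state))  # actions not yet expanded
        self.wins, self.visits = 0.0, 0

def expand_one(node):
    """Add a single child per iteration, so the tree grows by one node per
    playout instead of by the full branching factor b."""
    action = node.untried.pop(random.randrange(len(node.untried)))
    child = Node(play(node.state, action))
    node.children.append(child)
    return child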

            With such an implementation, it should be quite unlikely to run out of memory unless you allow your algorithm to search for very long times. If you do still run out of memory, you could look into things such as:

            • Optimizing the amount of memory required per node (it stores full game states, is the memory usage of game states optimized?)
            • Increasing the heap size of the JVM using command line arguments (see Increase heap size in Java)

            Source https://stackoverflow.com/questions/53558933

            QUESTION

            MCTS *tree* parallelization in Python - possible?
            Asked 2018-Oct-02 at 01:14

            I would like to parallelize my MCTS program. There are several ways to do this:

            1. Leaf parallelization, where each leaf is expanded and simulated in parallel.
2. Root parallelization, where each thread/process creates a separate tree, and when some number of simulations have finished, the trees are combined to give better statistics.
            3. Tree parallelization, where all threads/processes share the same tree and each thread/process explores different parts of the tree.

            (If my explanation is unclear, checkout this review paper on MCTS. On page 25, different methods on parallelizing MCTS are described in detail.)

            Question:

Since multiprocessing in Python has to create separate subprocesses, root parallelization (2) fits quite nicely, whereas I am assuming that tree parallelization (3) is not feasible, since all subprocesses would have to share the same tree, which is difficult to do in Python.

Am I correct? I skimmed through the multiprocessing documentation, and if I understood correctly, it seems like it is possible to pass information back and forth between subprocesses for some basic data types, but doing so is highly discouraged for speed reasons, etc.

            If so, tree parallelization in Python would be a bad idea right?

            ...

            ANSWER

            Answered 2018-Oct-01 at 15:03

            Yes, you're correct that Root Parallelization would be the easiest of those variants to implement. The different processes would essentially be able to run completely independent of each other. Only at the end of your search process would you have to aggregate results in whatever way you choose, which I don't believe should be problematic to implement.
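As an illustration of that aggregation step, here is a hedged Python sketch using the standard multiprocessing module; search_visit_counts is a hypothetical stand-in for a single-process MCTS that returns {move: visit_count} for the root's children.

import multiprocessing as mp
import random

def one_search(args):
    """Run one independent MCTS tree and return its root visit counts."""
    root_state, iterations, seed = args
    random.seed(seed)  # make the independent trees explore differently
    return search_visit_counts(root_state, iterations)

def root_parallel_search(root_state, iterations=10_000, workers=4):
    # Each worker searches its own tree; root_state must be picklable.
    jobs = [(root_state, iterations // workers, seed) for seed in range(workers)]
    with mp.Pool(workers) as pool:
        results = pool.map(one_search, jobs)
    # Combine: sum the visit counts per move across the independent trees.
    totals = {}
    for counts in results:
        for move, visits in counts.items():
            totals[move] = totals.get(move, 0) + visits
    return max(totals, key=totals.get)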

I am familiar enough with multiprocessing in Python to know it's... a little bit of a pain when you want more communication (the kind of communication the other two approaches need). I am not familiar enough with it to say with 100% certainty that it's really "impossible" or "highly discouraged", but there's certainly a clear difference in ease of implementation.

            Source https://stackoverflow.com/questions/52584142

Community Discussions and Code Snippets contain sources from the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install mcts

            You can download it from GitHub.
            You can use mcts like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.

            Support

For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
            CLONE
          • HTTPS

            https://github.com/RomainSa/mcts.git

          • CLI

            gh repo clone RomainSa/mcts

• SSH

            git@github.com:RomainSa/mcts.git



Try Top Libraries by RomainSa

• Saladvice (Python)
• BlockStats (Python)
• Nozama (Python)
• Visualization (Python)