xla | Enabling PyTorch on XLA Devices (e.g. Google TPU) | Machine Learning library

 by pytorch | C++ | Version: v2.0.0 | License: Non-SPDX

kandi X-RAY | xla Summary

xla is a C++ library typically used in Artificial Intelligence, Machine Learning, and PyTorch applications. xla has no bugs, no vulnerabilities, and medium support. However, xla has a Non-SPDX license. You can download it from GitHub.

PyTorch/XLA is a Python package that uses the XLA deep learning compiler to connect the PyTorch deep learning framework and Cloud TPUs. You can try it right now, for free, on a single Cloud TPU with Google Colab, and use it in production and on Cloud TPU Pods with Google Cloud.
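As a quick orientation (not taken from this page, and assuming a current torch_xla release), a minimal sketch of what using the package looks like: acquire the XLA device, move a model and data to it, and mark a step so the lazily recorded graph is compiled and executed.

import torch
import torch_xla.core.xla_model as xm

# Acquire the attached XLA device (e.g. a Cloud TPU core).
device = xm.xla_device()

# Move a toy model and a batch of data onto the XLA device.
model = torch.nn.Linear(10, 2).to(device)
x = torch.randn(4, 10, device=device)
y = model(x)

# Flush the lazily recorded graph so XLA compiles and executes it.
xm.mark_step()
print(y.cpu())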

            kandi-Support Support

              xla has a moderately active ecosystem.
              It has 1,996 stars, 339 forks, and 56 watchers.
              There was 1 major release in the last 12 months.
              There are 253 open issues and 1,141 closed issues. On average, issues are closed in 17 days. There are 114 open pull requests and 0 closed pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of xla is v2.0.0.

            kandi-Quality Quality

              xla has 0 bugs and 0 code smells.

            kandi-Security Security

              xla has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              xla code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              xla has a Non-SPDX License.
              Non-SPDX licenses may be open-source licenses that are not SPDX-compliant, or non-open-source licenses; review them closely before use.

            kandi-Reuse Reuse

              xla releases are available to install and integrate.
              Installation instructions are not available. Examples and code snippets are available.

            Top functions reviewed by kandi - BETA

            kandi's functional review helps you automatically verify the functionalities of the libraries and avoid rework.
            Currently covering the most popular Java, JavaScript, and Python libraries.

            xla Key Features

            No Key Features are available at this moment for xla.

            xla Examples and Code Snippets

            Create an XLA Makefile string.
            Python · 44 lines of code · License: Non-SPDX (Apache License 2.0)
            def _xla_makefile_string(output_prefix):
              """Returns a Makefile string with variables for using XLA binary object files.
            
              Attempts to identify the right include header paths when run from either
              an installed TensorFlow pip package, or from bazel  
            Removes unconnected ops from the XLA compile graph.
            Python · 30 lines of code · License: Non-SPDX (Apache License 2.0)
            def prune_unconnected_ops_from_xla(prune_graph: ops.Graph):
              """Prunes unconnected ops as listed in _UNCONNECTED_OPS_TO_PRUNE.
            
              Args:
                prune_graph: A tensorflow graph from which we wish to prune unconnected ops
                  as listed in _UNCONNECTED_O  
            Returns the enclosing XLA context.
            Python · 15 lines of code · License: Non-SPDX (Apache License 2.0)
            def _enclosing_xla_context():
              """Returns the XLAControlFlowContext, which exists inside a tpu.rewrite()."""
              graph = ops.get_default_graph()
              while graph is not None:
                # pylint: disable=protected-access
                context_ = graph._get_control_flow_c  

            Community Discussions

            QUESTION

            htaccess don't use index.html to send everything else to wordpress
            Asked 2022-Apr-07 at 21:14

            I had a site completely run in WordPress. I made a new site from scratch and saved it as index.html. I made the .htaccess file send all other URLs to WordPress. The only problem is that I want the home page to be url.com/ instead of url.com/index.html in the browser's address bar.

            How do I keep everything working, except this one little thing?

            ...

            ANSWER

            Answered 2022-Apr-07 at 21:14

            Set the following at the top of the .htaccess file:

            Source https://stackoverflow.com/questions/71786233

            QUESTION

            JAX(XLA) vs Numba(LLVM) Reduction
            Asked 2022-Apr-03 at 23:01

            Is it possible to make CPU only reductions with JAX comparable to Numba in terms of computation time?

            The compilers come straight from conda:

            ...

            ANSWER

            Answered 2022-Apr-01 at 18:31

            When performing these kinds of microbenchmarks with JAX, you have to be careful to ensure you're measuring what you think you're measuring. There are some tips in the JAX Benchmarking FAQ. Implementing some of these best practices, I find the following for your benchmarks:
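            For illustration, a minimal sketch of that advice with made-up shapes (compile once outside the timed region, and block on JAX's asynchronous results before stopping the clock):

            import timeit
            import jax
            import jax.numpy as jnp

            @jax.jit
            def reduce_sum(x):
                return jnp.sum(x)

            x = jnp.ones(1_000_000)
            reduce_sum(x).block_until_ready()   # warm-up call pays the JIT compilation cost

            # Time only steady-state execution, forcing completion of the async result.
            t = timeit.timeit(lambda: reduce_sum(x).block_until_ready(), number=100)
            print(f"{t / 100 * 1e6:.1f} us per call")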

            Source https://stackoverflow.com/questions/71701041

            QUESTION

            TensorFlow libdevice not found. Why is it not found in the searched path?
            Asked 2022-Mar-31 at 12:23

            Win 10 64-bit 21H1; TF2.5, CUDA 11 installed in environment (Python 3.9.5 Xeus)

            I am not the only one seeing this error; see also (unanswered) here and here. The issue is obscure and the proposed resolutions are unclear/don't seem to work (see e.g. here)

            Issue: using the TF Linear_Mixed_Effects_Models.ipynb example (downloaded from the TensorFlow GitHub), execution reaches the point of performing the "warm up stage" and then throws the error:

            ...

            ANSWER

            Answered 2021-Sep-20 at 15:41

            The diagnostic information is unclear and thus unhelpful; there is, however, a resolution.

            The issue was resolved by providing the file (as a copy) at this path:

            C:\Users\Julian\anaconda3\envs\TF250_PY395_xeus\Library\bin\nvvm\libdevice\

            Note that C:\Users\Julian\anaconda3\envs\TF250_PY395_xeus\Library\bin was the path given to XLA_FLAGS, but it seems XLA is not looking for the libdevice file there; it is looking for the \nvvm\libdevice\ subpath. This means that I can't just set a different value in XLA_FLAGS to point to the actual location of the libdevice file because, to coin a phrase, it's not (just) the file it's looking for.
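            For context, XLA's --xla_gpu_cuda_data_dir flag names a CUDA root directory under which it then expects to find nvvm\libdevice\. A hedged sketch (with a placeholder path) of how that flag is typically set before importing TensorFlow:

            import os

            # Placeholder path: point XLA at a CUDA root that has nvvm/libdevice underneath it.
            os.environ["XLA_FLAGS"] = "--xla_gpu_cuda_data_dir=/usr/local/cuda"

            import tensorflow as tf   # must be imported after XLA_FLAGS is set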

            The debug info earlier:

            Source https://stackoverflow.com/questions/68614547

            QUESTION

            What is XlaBuilder for?
            Asked 2022-Mar-20 at 18:41

            What's the XLA class XlaBuilder for? The docs describe its interface but don't provide a motivation.

            The presentation in the docs, and indeed the comment above XlaBuilder in the source code

            ...

            ANSWER

            Answered 2021-Dec-15 at 01:32

            XlaBuilder is the C++ API for building up XLA computations -- conceptually this is like building up a function, full of various operations, that you could execute over and over again on different input data.

            Some background: XLA serves as an abstraction layer for creating executable blobs that run on various target accelerators (CPU, GPU, TPU, IPU, ...), conceptually kind of an "accelerator virtual machine" with conceptual similarities to earlier systems like PeakStream or the line of work that led to ArBB.

            The XlaBuilder is a way to enqueue operations into a "computation" (similar to a function) that you want to run against the various set of accelerators that XLA can target. The operations at this level are often referred to as "High Level Operations" (HLOs).

            The returned XlaOp represents the result of the operation you've just enqueued. (Aside/nerdery: this is a classic technique used in "builder" APIs that represent the program in "Static Single Assignment" form under the hood, the operation itself and the result of the operation can be unified as one concept!)

            XLA computations are very similar to functions, so you can think of what you're doing with an XlaBuilder like building up a function. (Aside: they're called "computations" because they do a little bit more than a straightforward function -- conceptually they are coroutines that can talk to an external "host" world and also talk to each other via networking facilities.)

            So the fact XlaOps can't be used across XlaBuilders may make more sense with that context -- in the same way that when building up a function you can't grab intermediate results in the internals of other functions, you have to compose them with function calls / parameters. In XlaBuilder you can Call another built computation, which is a reason you might use multiple builders.

            As you note, you can choose to inline everything into one "mega builder", but often programs are structured as functions that get composed together and ultimately get called from a few different "entry points". XLA currently aggressively specializes for the entry points it sees API users using, but this is a design artifact similar to inlining decisions; XLA could conceptually reuse computations built up / invoked from multiple callers if it thought that was the right thing to do. Usually it's most natural to enqueue things into XLA however is convenient for your description from the "outside world", and allow XLA to inline and aggressively specialize the "entry point" computations you've built up as you execute them, in just-in-time compilation fashion.
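            XlaBuilder itself is the C++ API, but the same "stage a function into a computation of HLOs" idea can be seen from Python with JAX. A small sketch (the lowering API shown is the one in recent JAX releases and may differ in older ones):

            import jax
            import jax.numpy as jnp

            def f(x, y):
                return jnp.dot(x, y) + 1.0

            # Stage the function out into an XLA computation for these argument shapes.
            lowered = jax.jit(f).lower(jnp.ones((2, 3)), jnp.ones((3, 4)))
            print(lowered.as_text())   # textual form of the staged-out computation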

            Source https://stackoverflow.com/questions/70339753

            QUESTION

            Error upon compilation while using jax.jit
            Asked 2022-Feb-07 at 13:47

            Pardon me, I'm still a noob with the inner workings of JAX and trying to find my way around it. I have this code which works well without the jit, but when I try to jit it, it throws an error. I initially used an if/else statement within the code, which also did not work, and had to rewrite the code this way without an if/else statement. How do I get around this? MWE is below.

            ...

            ANSWER

            Answered 2022-Feb-07 at 13:47

            The issue is that indexing in JAX must be done with static values, and within JIT kvals[i] is not a static value (because it is computed from a JAX array).

            One easy way to fix this in your case is to make kvals a non-jax array; for example, when you define it, do this:
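            The asker's snippet isn't reproduced here; this is a minimal sketch of the general point with made-up names (values that must be static under jit, such as slice sizes, are kept in a plain NumPy array rather than a traced jax array):

            import numpy as np
            import jax
            import jax.numpy as jnp

            kvals = np.array([3, 5, 7])          # static: NumPy, not jnp.array([...])

            @jax.jit
            def head_sum(x):
                # kvals[0] is a concrete integer, so the slice size is static under jit.
                # If kvals were a jax array, x[: kvals[0]] would fail to trace.
                return jnp.sum(x[: kvals[0]])

            print(head_sum(jnp.arange(10.0)))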

            Source https://stackoverflow.com/questions/71016664

            QUESTION

            Getting an error while training RNN on TPU
            Asked 2022-Jan-13 at 17:59

            I get this error while training my simple RNN model on a TPU.

            ...

            ANSWER

            Answered 2022-Jan-13 at 14:08

            You can try setting the unroll parameter of the SimpleRNN layer to True:
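            A hedged sketch of that suggestion with illustrative shapes (unroll=True statically unrolls the recurrence, which the XLA/TPU compiler handles more readily than the default symbolic loop, at the cost of memory for long sequences):

            import tensorflow as tf

            model = tf.keras.Sequential([
                tf.keras.layers.SimpleRNN(32, unroll=True, input_shape=(20, 8)),
                tf.keras.layers.Dense(1),
            ])
            model.compile(optimizer="adam", loss="mse")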

            Source https://stackoverflow.com/questions/70697406

            QUESTION

            jax woes (on an NVIDIA DGX box, no less)
            Asked 2021-Oct-25 at 17:39

            I am trying to run jax on an nvidia dgx box, but am failing miserably, thus:

            ...

            ANSWER

            Answered 2021-Oct-25 at 17:39

            This means that your CUDA installation is not configured correctly, and can generally be fixed by ensuring that the CUDA toolkit binaries (including ptxas) are present in your $PATH. See https://github.com/google/jax/discussions/6843 and https://github.com/google/jax/issues/7239 for responses to users reporting similar issues.
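            A small sanity check (not part of the original answer) for confirming the fix before importing jax:

            import shutil

            ptxas = shutil.which("ptxas")        # looks for the CUDA toolkit's ptxas on PATH
            if ptxas is None:
                raise RuntimeError("ptxas not found on PATH; add <cuda install>/bin to PATH first")
            print("Using ptxas at:", ptxas)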

            Source https://stackoverflow.com/questions/69712084

            QUESTION

            failed to alloc X bytes unified memory; result: CUDA_ERROR_OUT_OF_MEMORY: out of memory
            Asked 2021-Sep-01 at 12:43

            I am trying to run a tensorflow project and I am encountering memory problems on the university HPC cluster. I have to run a prediction job for hundreds of inputs, with differing lengths. We have GPU nodes with different amounts of vmem, so I am trying to set up the scripts in a way that will not crash in any combination of GPU node - input length.

            After searching the net for solutions, I played around with TF_FORCE_UNIFIED_MEMORY, XLA_PYTHON_CLIENT_MEM_FRACTION, XLA_PYTHON_CLIENT_PREALLOCATE, and TF_FORCE_GPU_ALLOW_GROWTH, and also with tensorflow's set_memory_growth. As I understood, with unified memory, I should be able to use more memory than a GPU has in itself.

            This was my final solution (only relevant parts)

            ...

            ANSWER

            Answered 2021-Aug-29 at 18:26

            Probably this answer will be useful for you. The nvidia_smi Python module has some useful tools, like checking the GPU's total memory. Here I reproduce the code of the answer I mentioned earlier.
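            The linked code isn't reproduced here; as an equivalent sketch, the same total-memory query can be made with pynvml (NVML bindings comparable to the nvidia_smi module the answer mentions), so a script can size its requests per node:

            import pynvml

            pynvml.nvmlInit()
            handle = pynvml.nvmlDeviceGetHandleByIndex(0)          # first visible GPU
            info = pynvml.nvmlDeviceGetMemoryInfo(handle)
            print(f"total={info.total / 2**20:.0f} MiB, free={info.free / 2**20:.0f} MiB")
            pynvml.nvmlShutdown()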

            Source https://stackoverflow.com/questions/68902851

            QUESTION

            Add two tensors with different dimensions in tensorflow
            Asked 2021-Aug-19 at 19:24

            I am basically trying to add two tensors in TensorFlow; the crux is that they are of different lengths, a = [1, 2, 3, 4, 5] and b = [1, 2, 3], and I am looking for a function, which I am calling tf.myadd, as in the following

            ...

            ANSWER

            Answered 2021-Aug-19 at 19:24

            Broadcasting is the default for all tensor operations in tf. In this case, you are trying to avoid broadcasting since the 2 tensors ((5,) and (3,)) are NOT broadcastable along the axis=0 by the standard broadcasting rules. So what you need is an element-wise addition without broadcasting.

            What you can do as in this case is use post-padding on the smaller array such that the two 1D tensors have the same shape and then add them elementwise over axis=0.

            Like this -
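            The answerer's exact code isn't shown here; a minimal sketch of the post-padding idea with the values from the question:

            import tensorflow as tf

            a = tf.constant([1, 2, 3, 4, 5])
            b = tf.constant([1, 2, 3])

            # Pad b on the right so both tensors have shape (5,), then add elementwise.
            b_padded = tf.pad(b, [[0, a.shape[0] - b.shape[0]]])   # -> [1, 2, 3, 0, 0]
            result = a + b_padded                                   # -> [2, 4, 6, 4, 5]
            print(result)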

            Source https://stackoverflow.com/questions/68846981

            QUESTION

            How to test distributed layers on Tensorflow?
            Asked 2021-Jul-15 at 20:10

            I am trying to test a layer that I will later add to a distributed model; however, I want to be sure that it works beforehand.

            This is the layer in question:

            ...

            ANSWER

            Answered 2021-Jul-15 at 20:10

            The major reason why you got the error messages may be that tf.distribute.get_replica_context().all_reduce() does not always work in eager mode. It will work properly in graph mode (see the example code below).

            There are also some other potential problems in your code:

            1. Pass aggregation=tf.VariableAggregation.ONLY_FIRST_REPLICA to tf.Variable to make sure it is synchronized across replicas.
            2. strategy.reduce() should not be called inside train_step.

            Example codes:
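            The answerer's example isn't reproduced here; this is a hedged sketch of the two points above with illustrative values (the all_reduce runs inside a tf.function, i.e. graph mode, and the variable uses ONLY_FIRST_REPLICA aggregation):

            import tensorflow as tf

            strategy = tf.distribute.MirroredStrategy()

            with strategy.scope():
                # Point 1: keep the variable synchronized by taking the first replica's value.
                v = tf.Variable(0.0, aggregation=tf.VariableAggregation.ONLY_FIRST_REPLICA)

            @tf.function   # graph mode, where replica-context all_reduce behaves as expected
            def replica_fn(x):
                ctx = tf.distribute.get_replica_context()
                return ctx.all_reduce(tf.distribute.ReduceOp.SUM, x)

            # strategy.run executes the fn on every replica; each replica sees the reduced value.
            print(strategy.run(replica_fn, args=(tf.constant(2.0),)))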

            Source https://stackoverflow.com/questions/68383083

            Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install xla

            You can download it from GitHub.

            Support

            For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
            Find more information at:
