onnxruntime | ONNX Runtime: cross-platform, high-performance ML | Machine Learning library

by microsoft | C++ | Version: 1.17.3 | License: MIT

kandi X-RAY | onnxruntime Summary

onnxruntime is a C++ library typically used in Artificial Intelligence, Machine Learning, Deep Learning, PyTorch, and TensorFlow applications. onnxruntime has no reported bugs or vulnerabilities, has a permissive license, and has medium support. You can download it from GitHub.

ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, and XGBoost. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable, alongside graph optimizations and transforms. ONNX Runtime training can accelerate model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition to existing PyTorch training scripts.
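
As a rough illustration of the inference workflow described above (a minimal sketch using the Python API; the model file, input name, and shape are placeholders, not artifacts from this repository):

# Minimal inference sketch with the onnxruntime Python API.
# "model.onnx" and the dummy input are placeholders.
import numpy as np
import onnxruntime as ort

# Request hardware acceleration where available; ONNX Runtime falls back to CPU.
session = ort.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

input_name = session.get_inputs()[0].name
dummy_input = np.random.rand(1, 3, 224, 224).astype(np.float32)

outputs = session.run(None, {input_name: dummy_input})
print(outputs[0].shape)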

Support

onnxruntime has a moderately active ecosystem.
It has 9,569 stars and 2,174 forks. There are 221 watchers for this library.
There were 5 major releases in the last 6 months.
There are 1,558 open issues and 3,152 closed issues. On average, issues are closed in 88 days. There are 316 open pull requests and 0 closed pull requests.
It has a neutral sentiment in the developer community.
The latest version of onnxruntime is 1.17.3.

Quality

              onnxruntime has no bugs reported.

Security

              onnxruntime has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

              onnxruntime is licensed under the MIT License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

              onnxruntime releases are available to install and integrate.
Installation instructions, examples, and code snippets are available.


            onnxruntime Key Features

            No Key Features are available at this moment for onnxruntime.

            onnxruntime Examples and Code Snippets

            Onnx inference throws an error with numpy float32 datatype inside the streamlit framework
Python · 15 lines of code · License: Strong Copyleft (CC BY-SA 4.0)
            prediction = pred_out(img, model_selector, plant_model_dictionary)[0]
            
            print(prediction.argmax(axis=0))  # 6
            
            [2.8063682e-14 3.1059124e-05 5.7825161e-11 8.3977110e-09 2.5989549e-13
             1.2324781
            Getting a prediction from an ONNX model in python
Python · 9 lines of code · License: Strong Copyleft (CC BY-SA 4.0)
import onnxruntime as ort

sess = ort.InferenceSession("onnx_model.onnx")

input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name

# feed a dict mapping the input name to your (preprocessed) numpy array
pred = sess.run([label_name], {input_name: input_data})[0]
            Using RNN Trained Model without pytorch installed
Python · 57 lines of code · License: Strong Copyleft (CC BY-SA 4.0)
            import torch
            import torch.nn as nn
            import torch.nn.functional as F
            import torch.optim as optim
            import random
            
            torch.manual_seed(1)
            random.seed(1)
            device = torch.device('cpu')
            
            class RNN(nn.Module):
              def __init__(self, input_size, hidden_s
            onnx.load() | ALBert throws DecodeError: Error parsing message
Python · 3 lines of code · License: Strong Copyleft (CC BY-SA 4.0)
            configs.output_dir = "albert-base-v2-MRPC"
            configs.model_name_or_path = "albert-base-v2-MRPC"
            
            Converting PyTorch to ONNX model increases file size for ALBert
Python · 19 lines of code · License: Strong Copyleft (CC BY-SA 4.0)
import onnx
from onnxruntime.transformers.onnx_model import OnnxModel

model = onnx.load(path)
onnx_model = OnnxModel(model)
count = len(model.graph.initializer)
same = [-1] * count
for i in range(count - 1):
    if same[i] >= 0:
        continue
    for j in r
            Optimize Albert HuggingFace model
Python · 10 lines of code · License: Strong Copyleft (CC BY-SA 4.0)
pip install torch_optimizer

import torch
import torch_optimizer as optim

# model = ...
optimizer = optim.DiffGrad(model.parameters(), lr=0.001)
optimizer.step()

torch.save(model.state_dict(), PATH)
            
            TypeError: not a string | parameters in AutoTokenizer.from_pretrained()
Python · 2 lines of code · License: Strong Copyleft (CC BY-SA 4.0)
            tokenizer = AlbertTokenizer.from_pretrained('albert-base-v2')
            
Python · License: Strong Copyleft (CC BY-SA 4.0)
import json

json_filename = './MRPC/config.json'

with open(json_filename) as json_file:
    json_decoded = json.load(json_file)

json_decoded['model_type'] = # !!

with open(json_filename, 'w') as json_file:
    json.dump(json_decoded, json_file)
            ValueError: Unsupported ONNX opset version: 13
Python · 4 lines of code · License: Strong Copyleft (CC BY-SA 4.0)
            # Install or upgrade PyTorch 1.8.0 and OnnxRuntime 1.7.0 for CPU-only.
            
            pip install torch==1.10.0  # latest
            
            Use Quantization on HuggingFace Transformers models
Python · 14 lines of code · License: Strong Copyleft (CC BY-SA 4.0)
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

sequence = "Distilled models are smaller than the models they mimic. Using them instead of 

            Community Discussions

            QUESTION

            Getting a prediction from an ONNX model in python
            Asked 2022-Mar-27 at 20:02

            I can't find anyone who explains to a layman how to load an onnx model into a python script, then use that model to make a prediction when fed an image. All I could find were these lines of code:

            ...

            ANSWER

            Answered 2022-Mar-27 at 20:02

            Let's first start by going over the code you provided, to make everything clear.
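
The answer walks through the asker's own code; the general shape of such a script (a hedged sketch, where the file names are placeholders and the preprocessing must match whatever the model was trained with) looks roughly like this:

# Hypothetical end-to-end prediction script; "onnx_model.onnx", "example.jpg"
# and the preprocessing (size, scaling, layout) are illustrative assumptions.
import numpy as np
import onnxruntime as ort
from PIL import Image

sess = ort.InferenceSession("onnx_model.onnx")
input_name = sess.get_inputs()[0].name
output_name = sess.get_outputs()[0].name

img = Image.open("example.jpg").convert("RGB").resize((224, 224))
x = np.asarray(img, dtype=np.float32) / 255.0   # HWC, scaled to [0, 1]
x = np.transpose(x, (2, 0, 1))[np.newaxis, :]   # NCHW with batch dimension

scores = sess.run([output_name], {input_name: x})[0]
print("predicted class:", scores.argmax(axis=1))   # assumes output shape (1, num_classes)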

            Source https://stackoverflow.com/questions/71279968

            QUESTION

            Cannot create the calibration cache for the QAT model in tensorRT
            Asked 2022-Mar-14 at 21:20

I've trained a quantized model (with the help of the quantization-aware-training method in PyTorch). I want to create the calibration cache to do inference in INT8 mode with TensorRT. When creating the calibration cache, I get the following warning and the cache is not created:

            ...

            ANSWER

            Answered 2022-Mar-14 at 21:20

If the ONNX model has Q/DQ nodes in it, you may not need a calibration cache, because quantization parameters such as scale and zero point are included in the Q/DQ nodes. You can run the Q/DQ ONNX model directly with the TensorRT execution provider in OnnxRuntime (>= v1.9.0).
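
To illustrate the suggestion, a sketch of such a session (assuming an onnxruntime build with TensorRT support >= 1.9.0; the model file name is a placeholder):

# Sketch: run a Q/DQ ONNX model directly on the TensorRT execution provider,
# with no separate calibration cache. "qat_model.onnx" is a placeholder.
import onnxruntime as ort

session = ort.InferenceSession(
    "qat_model.onnx",
    providers=[
        "TensorrtExecutionProvider",   # consumes the Q/DQ quantization parameters
        "CUDAExecutionProvider",
        "CPUExecutionProvider",
    ],
)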

            Source https://stackoverflow.com/questions/71368760

            QUESTION

            Quantized model gives negative accuracy after conversion from pytorch to ONNX
            Asked 2022-Mar-06 at 07:24

I'm trying to train a quantized model in PyTorch and convert it to ONNX. I employ the quantization-aware-training technique with the help of the pytorch_quantization package. I used the code below to convert my model to ONNX:

            ...

            ANSWER

            Answered 2022-Mar-06 at 07:24

            After some tries, I found that there is a version conflict. I changed the versions accordingly:

            Source https://stackoverflow.com/questions/71362729

            QUESTION

            PyTorch to ONNX export, ATen operators not supported, onnxruntime hangs out
            Asked 2022-Mar-03 at 14:05

I want to export a roberta-base based language model to ONNX format. The model uses RoBERTa embeddings and performs a text classification task.

            ...

            ANSWER

            Answered 2022-Mar-01 at 20:25

Have you tried to export after defining the operator for ONNX? Something along the lines of the following code by Huawei.

On another note, when loading a model you can technically override anything you want. Setting a specific layer to your modified class, which inherits from the original, keeps the same behaviour (inputs and outputs) while its execution can be modified. You can try to use this to save the model with the problematic operators replaced, convert it to ONNX, and fine-tune it in that form (or even in PyTorch).

This generally seems best solved by the ONNX team, so a long-term solution might be to request support for that specific operator on the GitHub issues page (though that will probably be slow).
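
One way to "define the operator" before export is to register a custom symbolic for the unsupported ATen op. The snippet below is only a sketch of that mechanism; the operator name, symbolic function, and opset are illustrative assumptions, not taken from the question:

# Sketch: map an unsupported ATen operator to something the ONNX exporter
# understands before calling torch.onnx.export. Names are illustrative only.
import torch
import torch.onnx

def triu_symbolic(g, self, diagonal):
    # Express aten::triu through the Trilu operator available from opset 14.
    return g.op("Trilu", self, diagonal, upper_i=1)

torch.onnx.register_custom_op_symbolic("aten::triu", triu_symbolic, opset_version=14)

# torch.onnx.export(model, dummy_inputs, "model.onnx", opset_version=14, ...)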

            Source https://stackoverflow.com/questions/71220867

            QUESTION

            Pytorch to ONNX: Could not find an implementation for RandomNormalLike
            Asked 2022-Feb-15 at 07:05

I am trying to convert a fairly complex model from PyTorch into ONNX. The conversion succeeds without error, but I am encountering this error when loading the model:

            ...

            ANSWER

            Answered 2022-Feb-08 at 11:38

From checking online, I found a similar GitHub issue about conv (https://github.com/microsoft/onnxruntime/issues/3130); it could be that the types of the parameters used in torch are not compatible with the implementation of RandomNormalLike available in ONNX.

Could you check in Netron what's inside the RandomNormalLike node/nodes to see if they comply with the spec: https://github.com/onnx/onnx/blob/main/docs/Operators.md#RandomNormal or https://github.com/onnx/onnx/blob/main/docs/Operators.md#RandomNormalLike

Cheers

EDIT: it turns out the RandomNormal node has a type of 10, which corresponds to fp16, while the onnxruntime implementation only supports floats and doubles; see the source code here: https://github.com/microsoft/onnxruntime/blob/24e35fba3217bf33b0e4064bc71d271a61938ba0/onnxruntime/core/providers/cpu/generator/random.cc#L354

The solution here is either to run the whole model in fp32, or to explicitly ask RandomNormalLike to use floats or doubles, hoping that torch allows mixed computation on fp16 and fp32/fp64.
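
For the first suggested fix (running the whole model in fp32), a hedged sketch of what that could look like before export (model, dummy_input, and the output file name are placeholders):

# Sketch: cast the model and example input to fp32 before exporting, so
# RandomNormalLike is emitted with a float type the CPU provider supports.
import torch

model = model.float().eval()        # model: the trained fp16 module (placeholder)
dummy_input = dummy_input.float()   # dummy_input: example input tensor (placeholder)

torch.onnx.export(model, dummy_input, "model_fp32.onnx", opset_version=13)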

            Source https://stackoverflow.com/questions/71031604

            QUESTION

            onnx.load() | ALBert throws DecodeError: Error parsing message
            Asked 2022-Jan-31 at 18:53

            Goal: re-develop this BERT Notebook to use textattack/albert-base-v2-MRPC.

            Kernel: conda_pytorch_p36. PyTorch 1.8.1+cpu.

            I convert a PyTorch / HuggingFace Transformers model to ONNX and store it. DecodeError occurs on onnx.load().

Are my ONNX files corrupted? This seems to be a common cause, but I don't know how to check for it.

            ALBert Notebook and model files on Google Colab.

I've also filed this Git Issue, detailing the debugging.

The problem isn't:
• Quantisation: any quantisation code I try throws the same error.
• Optimisation: the error occurs with or without optimisation.

            Section 2.2 Quantize ONNX model:

            ...

            ANSWER

            Answered 2022-Jan-31 at 18:53

            The problem was with updating the config variables for my new model.

            Changes:

            Source https://stackoverflow.com/questions/70787151

            QUESTION

            OnnxRuntime vs OnnxRuntime+OpenVinoEP inference time difference
            Asked 2022-Jan-27 at 09:16

I'm trying to accelerate my model's performance by converting it to OnnxRuntime. However, I'm getting weird results when trying to measure inference time.

            While running only 1 iteration OnnxRuntime's CPUExecutionProvider greatly outperforms OpenVINOExecutionProvider:

            • CPUExecutionProvider - 0.72 seconds
            • OpenVINOExecutionProvider - 4.47 seconds

            But if I run let's say 5 iterations the result is different:

            • CPUExecutionProvider - 3.83 seconds
            • OpenVINOExecutionProvider - 14.13 seconds

            And if I run 100 iterations, the result is drastically different:

            • CPUExecutionProvider - 74.19 seconds
• OpenVINOExecutionProvider - 46.96 seconds

It seems to me that the inference time of the OpenVINO EP is not linear, but I don't understand why. So my questions are:

            • Why does OpenVINOExecutionProvider behave this way?
            • What ExecutionProvider should I use?

            The code is very basic:

            ...

            ANSWER

            Answered 2022-Jan-27 at 09:16

The use of ONNX Runtime with the OpenVINO Execution Provider enables inferencing of ONNX models through the ONNX Runtime API while the OpenVINO toolkit runs in the backend. This accelerates ONNX model performance on the same hardware compared to generic acceleration on Intel® CPU, GPU, VPU, and FPGA.

Generally, the CPU Execution Provider works best with small iteration counts, since its intention is to keep the binary size small. Meanwhile, the OpenVINO Execution Provider is intended for deep learning inference on Intel CPUs, Intel integrated GPUs, and Intel® Movidius™ Vision Processing Units (VPUs).

This is why the OpenVINO Execution Provider outperforms the CPU Execution Provider over larger iteration counts.

You should choose the Execution Provider that meets your requirements. If you are going to run a complex DL model for many iterations, go for the OpenVINO Execution Provider. For a simpler use case, where you need a smaller binary size and run fewer iterations, you can choose the CPU Execution Provider instead.

For more information, you may refer to the ONNX Runtime Performance Tuning documentation.
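
A hedged sketch of how such a comparison might be timed so that one-off session and warm-up costs do not dominate the measurement (provider availability depends on the installed package, e.g. onnxruntime-openvino; the model path and input shape are placeholders):

# Sketch: compare average per-iteration latency across execution providers.
import time
import numpy as np
import onnxruntime as ort

def benchmark(providers, iterations=100):
    sess = ort.InferenceSession("model.onnx", providers=providers)
    name = sess.get_inputs()[0].name
    x = np.random.rand(1, 3, 224, 224).astype(np.float32)
    sess.run(None, {name: x})                      # warm-up, excluded from timing
    start = time.perf_counter()
    for _ in range(iterations):
        sess.run(None, {name: x})
    return (time.perf_counter() - start) / iterations

print("CPU EP     :", benchmark(["CPUExecutionProvider"]))
print("OpenVINO EP:", benchmark(["OpenVINOExecutionProvider", "CPUExecutionProvider"]))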

            Source https://stackoverflow.com/questions/70844974

            QUESTION

            ONNX Runtime Inference | session.run() multiprocessing
            Asked 2022-Jan-21 at 16:56

            Goal: run Inference in parallel on multiple CPU cores

            I'm experimenting with Inference using simple_onnxruntime_inference.ipynb.

            Individually:

            ...

            ANSWER

            Answered 2022-Jan-21 at 16:56
            def run_inference(i):
                output_name = session.get_outputs()[0].name
                return session.run([output_name], {input_name: inputs[i]})[0]  # [0] bc array in list
            
            outputs = pool.map(run_inference, [i for i in range(test_data_num)])
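
The snippet above assumes a session, inputs, and pool created earlier in the notebook. A self-contained sketch of the same idea (model path, input shapes, and worker count are placeholder assumptions) might look like this:

# Sketch: run per-sample inference across CPU cores with multiprocessing.Pool.
# On Linux the forked workers inherit the globally created session.
import multiprocessing
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

test_data_num = 8
inputs = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(test_data_num)]

def run_inference(i):
    return session.run([output_name], {input_name: inputs[i]})[0]

if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        outputs = pool.map(run_inference, range(test_data_num))
    print(len(outputs))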
            

            Source https://stackoverflow.com/questions/70803924

            QUESTION

            Converting PyTorch to ONNX model increases file size for ALBert
            Asked 2022-Jan-21 at 12:30

            Goal: Use this Notebook to perform quantisation on albert-base-v2 model.

            Kernel: conda_pytorch_p36.

            Outputs in Sections 1.2 & 2.2 show that:

• converting vanilla BERT from PyTorch to ONNX keeps the model the same size, 417.6 MB.
• quantized models are smaller than vanilla BERT: PyTorch 173.0 MB and ONNX 104.8 MB.

            However, when running ALBert:

            • PyTorch and ONNX model sizes are different.
            • Quantized model sizes are bigger than vanilla.

I think this is the reason for the poorer model performance of both quantization methods of ALBert compared to vanilla ALBert.

            PyTorch:

            ...

            ANSWER

            Answered 2022-Jan-21 at 12:09
            Explanation

The ALBert model has shared weights among layers. torch.onnx.export outputs the weights to different tensors, which causes the model size to grow larger.

A number of Git Issues regarding this phenomenon have been marked Solved.

The most common solution is to remove the shared weights, that is, to remove tensor arrays that contain exactly the same values.

            Solutions

            Section "Removing shared weights" in onnx_remove_shared_weights.ipynb.

            Pseudo-code:
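
The notebook's pseudo-code is not reproduced here; as a rough, hypothetical sketch of the deduplication idea (compare initializer contents, re-point nodes at a single surviving copy, drop the duplicates; file names are placeholders):

# Rough sketch of removing duplicated (formerly shared) weights from an
# exported ONNX model. Not the notebook's exact code.
import onnx
from onnx import numpy_helper

model = onnx.load("albert.onnx")
inits = list(model.graph.initializer)

seen = {}      # (dtype, shape, bytes) -> name of the copy we keep
replace = {}   # duplicate initializer name -> kept name
for init in inits:
    key = (init.data_type, tuple(init.dims), numpy_helper.to_array(init).tobytes())
    if key in seen:
        replace[init.name] = seen[key]
    else:
        seen[key] = init.name

# Re-point every node input at the kept copy and drop the duplicates.
for node in model.graph.node:
    node.input[:] = [replace.get(name, name) for name in node.input]
for init in inits:
    if init.name in replace:
        model.graph.initializer.remove(init)

onnx.save(model, "albert_deduped.onnx")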

            Source https://stackoverflow.com/questions/70786010

            QUESTION

            Optimize Albert HuggingFace model
            Asked 2022-Jan-18 at 15:35

Goal: Amend this Notebook to work with the albert-base-v2 model.

            Kernel: conda_pytorch_p36.

Section 2.1 exports the finalised model. It, too, uses a BERT-specific function. However, I cannot find an equivalent for Albert.

            I've successfully implemented alternatives for Albert up until this section.

            Code:

            ...

            ANSWER

            Answered 2022-Jan-18 at 15:35

Optimise any PyTorch model using torch_optimizer.

            Installation:

            Source https://stackoverflow.com/questions/70740565

Community Discussions and Code Snippets contain sources from the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install onnxruntime

Usage documentation and tutorials: onnxruntime.ai/docs.
            ONNX Runtime Inferencing: microsoft/onnxruntime-inference-examples
            ONNX Runtime Training: microsoft/onnxruntime-training-examples

            Support

We welcome contributions! Please see the contribution guidelines. For feature requests or bug reports, please file a GitHub Issue. For general discussion or questions, please use GitHub Discussions.
Find more information at onnxruntime.ai/docs and in the example repositories listed above.

            Install
          • PyPI

            pip install onnxruntime

          • CLONE
          • HTTPS

            https://github.com/microsoft/onnxruntime.git

          • CLI

            gh repo clone microsoft/onnxruntime

          • sshUrl

            git@github.com:microsoft/onnxruntime.git
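
After installing the PyPI package listed above, a quick sanity check (a minimal sketch, not part of the official installation instructions) might be:

# Confirm the installed package and see which execution providers are available.
import onnxruntime as ort

print(ort.__version__)
print(ort.get_available_providers())   # e.g. ['CPUExecutionProvider', ...]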
