elasticsearch-py | Official Python client for Elasticsearch | REST library

 by elastic · Python · Version: v8.8.0 · License: Apache-2.0

kandi X-RAY | elasticsearch-py Summary

elasticsearch-py is a Python library typically used in Web Services and REST applications. elasticsearch-py has no reported bugs or vulnerabilities, has a build file available, has a permissive license, and has high support. You can install it with 'pip install elasticsearch' (the package is published on PyPI as elasticsearch) or download it from GitHub or PyPI.

Official Python client for Elasticsearch

            Support

              elasticsearch-py has a highly active ecosystem.
              It has 3,968 stars, 1,154 forks, and 396 watchers.
              There was 1 major release in the last 12 months.
              There are 37 open issues and 978 closed issues. On average, issues are closed in 138 days. There are 10 open pull requests and 0 closed pull requests.
              It has a positive sentiment in the developer community.
              The latest version of elasticsearch-py is v8.8.0.

            Quality

              elasticsearch-py has 0 bugs and 0 code smells.

            Security

              elasticsearch-py has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              elasticsearch-py code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            License

              elasticsearch-py is licensed under the Apache-2.0 License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            Reuse

              elasticsearch-py releases are available to install and integrate.
              A deployable package is available on PyPI.
              A build file is available, so you can build the component from source.
              It has 48,667 lines of code, 1,284 functions, and 132 files.
              It has high code complexity, which directly impacts the maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed elasticsearch-py and identified the functions below as its top functions. This is intended to give you instant insight into the functionality elasticsearch-py implements and to help you decide whether it suits your requirements.
            • Submit search results
            • Perform a request
            • Escape value
            • Updates the query by index
            • Performs a request
            • Return the current stack level
            • Lists jobs
            • Updates a job
            • Get data feed feeds
            • Search a template by index
            • Perform search
            • Performs a search query
            • Update an index
            • Execute a transformation
            • Get ML trained models
            • Create auto follow pattern
            • Put a mapping
            • Creates a new job
            • Update a datafeed
            • Creates a new data feed
            • Delete documents by index
            • Update index by query
            • Perform a search
            • Perform a search operation

            elasticsearch-py Key Features

            No Key Features are available at this moment for elasticsearch-py.

            elasticsearch-py Examples and Code Snippets

            tableschema-elasticsearch-py, Documentation, Usage overview
            Python · 62 lines of code · License: Permissive (MIT)
            import elasticsearch
            import jsontableschema_es
            
            INDEX_NAME = 'testing_index'
            
            # Connect to Elasticsearch instance running on localhost
            es=elasticsearch.Elasticsearch()
            storage=jsontableschema_es.Storage(es)
            
            # List all indexes
            print(list(storage.buck  
            tableschema-elasticsearch-py, Documentation, Mappings
            Python · 42 lines of code · License: Permissive (MIT)
            {
              "fields": [
                {
                  "name": "my-number",
                  "type": "number"
                },
                {
                  "name": "my-array-of-dates",
                  "type": "array",
                  "es:itemType": "date"
                },
                {
                  "name": "my-person-object",
                  "type": "object",
                  "e  
            requests-auth-aws-sigv4, Usage, Usage with the Elasticsearch client (elasticsearch-py)
            Python · 12 lines of code · License: Permissive (Apache-2.0)
            from elasticsearch import Elasticsearch, RequestsHttpConnection
            from requests_auth_aws_sigv4 import AWSSigV4
            
            es_host = 'search-service-foobar.us-east-1.es.amazonaws.com'
            aws_auth = AWSSigV4('es')
            
            # use the requests connection_class and pass in our   

            Community Discussions

            QUESTION

            Does the AsyncElasticsearch client use the same session for async actions?
            Asked 2022-Mar-01 at 14:20

            Does the AsyncElasticsearch client open a new session for each async request?

            AsyncElasticsearch (from elasticsearch-py) uses AIOHTTP. From what I understand, AIOHTTP recommends using a context manager for the aiohttp.ClientSession object, so as to not generate a new session for each request:

            ...

            ANSWER

            Answered 2022-Mar-01 at 14:20

            It turns out AsyncElasticsearch was not the right client to speed up bulk ingests in this case. I used the helpers.parallel_bulk() function instead.
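
            A minimal sketch of that alternative; the index name, document shape, and cluster address are all assumptions, and the server-dependent part is wrapped in a function that is not executed here:

```python
from typing import Iterable, Iterator

def gen_actions(docs: Iterable[dict], index: str) -> Iterator[dict]:
    # Build one bulk action per document; the document shape is hypothetical.
    for doc in docs:
        yield {"_index": index, "_id": doc["id"], "_source": {"text": doc["text"]}}

def run_example() -> None:
    # Not executed here: needs the elasticsearch package and a reachable cluster.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch("http://localhost:9200")  # address is an assumption
    docs = [{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}]
    # parallel_bulk fans chunks out over a thread pool, yielding (ok, info) pairs.
    for ok, info in helpers.parallel_bulk(es, gen_actions(docs, "my-index")):
        if not ok:
            print("failed:", info)
```

            parallel_bulk keeps the ingest client-side-parallel without needing the async client at all, which is why it fit this use case better.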

            Source https://stackoverflow.com/questions/70455506

            QUESTION

            "document_missing_exception" at Opensearch client.update (python)
            Asked 2022-Feb-17 at 12:03

            My query is simple: add field:value to an existing doc, but it fails with a document_missing_exception error. The code below is shown without parameters to make it easier to view. I use the opensearch-py client and set the index, doc_type (as that index), the id of the document, and the query body, as seen in the previous post How to update a document using elasticsearch-py?

            ...

            ANSWER

            Answered 2022-Feb-17 at 12:03

            What is the Elasticsearch version you are using?

            Please try by giving

            Source https://stackoverflow.com/questions/71144961

            QUESTION

            Elasticsearch - How to add range filter to search query
            Asked 2022-Feb-04 at 22:13

            I use elasticsearch-dsl to query Elasticsearch in Python.
            I want to search documents by a text field and get all documents whose created field is less than datetime.now().
            I execute the following query, but Elasticsearch raises an error.

            ...

            ANSWER

            Answered 2022-Feb-04 at 22:13

            You can't combine several types of queries like this; use a bool query:
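
            For instance, the match condition can go in the must clause and the range condition in the filter clause of a bool query. The field names (text, created) and index name below are assumptions; the elasticsearch-dsl part is wrapped in a function that is not executed here:

```python
from datetime import datetime

def build_query(text: str, now: datetime) -> dict:
    # Raw query body combining a match and a range inside one bool query.
    return {
        "query": {
            "bool": {
                "must": [{"match": {"text": text}}],
                "filter": [{"range": {"created": {"lt": now.isoformat()}}}],
            }
        }
    }

def run_example() -> None:
    # Not executed here: needs elasticsearch-dsl and a reachable cluster.
    from elasticsearch_dsl import Q, Search

    q = Q("bool",
          must=[Q("match", text="some text")],
          filter=[Q("range", created={"lt": datetime.now()})])
    s = Search(index="my-index").query(q)  # "my-index" is a placeholder
    s.execute()
```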

            Source https://stackoverflow.com/questions/70990983

            QUESTION

            elasticsearch on python gives name or service not known
            Asked 2022-Jan-29 at 11:01

            I'm working with an Elastic API at a URL like https://something.xyzw.eg/api/search/advance (not the real URL). The API works fine in Postman. The Python code generated by Postman also works fine and returns results. However, when using the elasticsearch-dsl package, I keep getting:

            Failed to establish a new connection: [Errno -2] Name or service not known)

            Here is my code, similar to the first example in the docs:

            ...

            ANSWER

            Answered 2022-Jan-29 at 09:46

            Can you try adding port=443, as in one of the examples from the doc you mentioned: https://elasticsearch-py.readthedocs.io/en/v7.16.3/#tls-ssl-and-authentication ?
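
            A sketch of that suggestion, using the v7-era keyword arguments that match the linked docs; the host is the asker's placeholder and the credentials are assumptions, and the server-dependent part is wrapped in a function that is not executed here:

```python
def connection_kwargs(host: str) -> dict:
    # Connection settings for an HTTPS endpoint on the standard TLS port.
    return {"host": host, "port": 443, "use_ssl": True}

def run_example() -> None:
    # Not executed here: needs the elasticsearch package and a reachable host.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(hosts=[connection_kwargs("something.xyzw.eg")],
                       http_auth=("user", "password"))  # credentials assumed
    print(es.ping())
```

            Without an explicit port, the client may default to 9200, which is often what produces "Name or service not known" style connection failures against HTTPS endpoints served on 443.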

            Source https://stackoverflow.com/questions/70903867

            QUESTION

            Periodically process and update documents in elasticsearch index
            Asked 2022-Jan-12 at 10:21

            I need to come up with a strategy to process and update documents in an elasticsearch index periodically and efficiently. I do not have to look at documents that I processed before.

            My setting is that I have a long-running process which continuously inserts documents into an index, say approx. 500 documents per hour (think of the common logging example).

            I need a solution to update some number of documents periodically (via a cron job, e.g.) by running some code on a specific field (a text field, e.g.) to enrich the document with a number of new fields. I want to do this to offer more fine-grained aggregations on the index. In the logging analogy, this could be, e.g.: I get the UserAgent string from a log entry (document), do some parsing on it, add some new fields back to that document, and index it.

            So my approach would be:

            1. Get some number of documents (or even all) that I haven't looked at before. I could query them by combining must_not and exists, for instance.
            2. Run my code on these documents (run the parser, compute some new stuff, whatever).
            3. Update the documents obtained previously (preferably via the bulk API).

            I know there is the Update by query API, but it does not seem right here, since I need to run my own code (which, by the way, depends on external libraries) on my server, not as a Painless script, which would not support the comprehensive tasks I need.

            I am accessing elasticsearch via python.

            The problem is that I don't know how to implement the above approach. E.g., what if the number of documents obtained in step 1 is larger than myindex.settings.index.max_result_window?

            Any ideas?

            ...

            ANSWER

            Answered 2022-Jan-12 at 10:21

            I considered @Jay's comment and ended up with this pattern, for the moment:
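
            The pattern itself isn't quoted above, but steps 1 to 3 from the question can be sketched with helpers.scan, which pages through results via the scroll API and so sidesteps max_result_window. The index name, field names, and the enrichment logic below are all hypothetical, and the server-dependent part is wrapped in a function that is not executed here:

```python
from typing import Iterable, Iterator

def unprocessed_query(flag_field: str) -> dict:
    # Step 1: select documents that don't yet have the enrichment field.
    return {"query": {"bool": {"must_not": [{"exists": {"field": flag_field}}]}}}

def enrich(source: dict) -> dict:
    # Step 2: compute new fields (placeholder for the real parser).
    return {"agent_parsed": source.get("user_agent", "").split("/")[0]}

def update_actions(hits: Iterable[dict], index: str) -> Iterator[dict]:
    # Step 3: yield partial-update actions for the bulk helper.
    for hit in hits:
        yield {"_op_type": "update", "_index": index,
               "_id": hit["_id"], "doc": enrich(hit["_source"])}

def run_enrichment() -> None:
    # Not executed here: needs the elasticsearch package and a reachable cluster.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch("http://localhost:9200")  # address is an assumption
    hits = helpers.scan(es, index="logs", query=unprocessed_query("agent_parsed"))
    helpers.bulk(es, update_actions(hits, "logs"))
```

            Because the enrichment writes the flag field back onto each document, the must_not/exists query naturally excludes already-processed documents on the next cron run.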

            Source https://stackoverflow.com/questions/70656110

            QUESTION

            Aggregation query fails using ElasticSearch Python client
            Asked 2021-Nov-07 at 14:39
            Here is an aggregation query that works as expected when I use Dev Tools in Elasticsearch:
            
               search_query = {
                  "aggs": {
                    "SHAID": {
                      "terms": {
                        "field": "identiferid",
                        "order": {
                          "sort": "desc"
                        },
                #         "size": 100000
                      },
                      "aggs": {
                        "update": {
                          "date_histogram": {
                            "field": "endTime",
                            "calendar_interval": "1d"
                          },
                          "aggs": {
                            "update1": {
                                  "sum": {
                                    "script": {
                                      "lang": "painless",
                                      "source":"""
                                          if (doc['distanceIndex.att'].size()!=0) { 
                                              return doc['distanceIndex.att'].value;
                                          } 
                                          else { 
                                              if (doc['distanceIndex.att2'].size()!=0) { 
                                              return doc['distanceIndex.att2'].value;
                                          }
                                          return null;
                                          }
                                          """
                                    }
                                  }
                                },
                            "update2": {
                                     "sum": {
                                    "script": {
                                      "lang": "painless",
                                      "source":"""
                                          if (doc['distanceIndex.att3'].size()!=0) { 
                                              return doc['distanceIndex.att3'].value;
                                          } 
                                          else { 
                                              if (doc['distanceIndex.att4'].size()!=0) { 
                                              return doc['distanceIndex.att4'].value;
                                          }
                                          return null;
                                          }
                                          """
                                    }
                                  }
                              },
                          }
                        },
                        "sort": {
                          "sum": {
                            "field": "time2"
                          }
                        }
                      }
                    }
                  },
                "size": 0,
                  "query": {
                    "bool": {
                      "filter": [
                        {
                          "match_all": {}
                        },
                        {
                          "range": {
                            "endTime": {
                              "gte": "2021-11-01T00:00:00Z",
                              "lt": "2021-11-03T00:00:00Z"
                            }
                          }
                        }
                      ]
                    }
                  }
                }
            
            ...

            ANSWER

            Answered 2021-Nov-07 at 14:39

            helpers.scan is a

            Simple abstraction on top of the scroll() api - a simple iterator that yields all hits as returned by underlying scroll requests.

            It's meant to iterate through large result sets and comes with a default keyword argument of size=1000

            To run an aggregation, use the es_client.search() method directly, passing in your query as body, and including "size": 0 in the query should be fine.
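
            A sketch of that call pattern; the index name and the aggregation body below are simplified placeholders rather than the asker's full query, and the server-dependent part is wrapped in a function that is not executed here:

```python
def aggs_only(body: dict) -> dict:
    # Force an aggregations-only request by zeroing the hit count.
    return dict(body, size=0)

def run_example() -> None:
    # Not executed here: needs the elasticsearch package and a reachable cluster.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # address is an assumption
    search_query = {"aggs": {"per_day": {"date_histogram": {
        "field": "endTime", "calendar_interval": "1d"}}}}
    # es.search (unlike helpers.scan) returns the "aggregations" key.
    resp = es.search(index="my-index", body=aggs_only(search_query))
    print(resp["aggregations"])
```

            helpers.scan only yields hits, so aggregation results are silently lost with it; calling search() directly is what exposes them.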

            Source https://stackoverflow.com/questions/69870574

            QUESTION

            Elasticsearch Bulk insert w/ Python - socket timeout error
            Asked 2021-May-18 at 13:29

            ElasticSearch 7.10.2

            Python 3.8.5

            elasticsearch-py 7.12.1

            I'm trying to do a bulk insert of 100,000 records to ElasticSearch using elasticsearch-py bulk helper.

            Here is the Python code:

            ...

            ANSWER

            Answered 2021-May-18 at 13:29
            TL;DR:

            Reduce the chunk_size from 10000 to the default of 500 and I'd expect it to work. You probably want to disable automatic retries, as they can give you duplicates.

            What happened?

            When creating your Elasticsearch object, you specified chunk_size=10000. This means that the streaming_bulk call will try to insert chunks of 10000 elements. The connection to Elasticsearch has a configurable timeout, which by default is 10 seconds. So, if your Elasticsearch server takes more than 10 seconds to process the 10000 elements you want to insert, a timeout will happen and this will be handled as an error.

            When creating your Elasticsearch object, you also specified retry_on_timeout as True, and in the streaming_bulk call you set max_retries=max_insert_retries, which is 3.

            This means that when such a timeout happens, the library will try reconnecting 3 times, however, when the insert still has a timeout after that, it will give you the error you noticed. (Documentation)

            Also, when the timeout happens, the library cannot know whether the documents were inserted successfully, so it has to assume they were not. Thus, it will try to insert the same documents again. I don't know what your input lines look like, but if they do not contain an _id field, this would create duplicates in your index. You probably want to prevent this, either by adding some kind of _id or by disabling the automatic retry and handling it manually.

            What to do?

            There are two ways you can go about this:

            • Increase the timeout
            • Reduce the chunk_size

            streaming_bulk has chunk_size set to 500 by default. Your 10000 is much higher. I don't think you can expect a big performance gain from increasing this beyond 500, so I'd advise you to just use the default of 500 here. If 500 still fails with a timeout, you may want to reduce it even further. This could happen if the documents you want to index are very complex.

            You could also increase the timeout for the streaming_bulk call, or, alternatively, for your es object. To only change it for the streaming_bulk call, you can provide the request_timeout keyword argument:
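
            A sketch of both knobs together; the values are illustrative, the index name and cluster address are assumptions, and the server-dependent part is wrapped in a function that is not executed here:

```python
# Keyword arguments for helpers.streaming_bulk; all values are illustrative.
BULK_KWARGS = {
    "chunk_size": 500,       # the helper's default; avoid very large chunks
    "request_timeout": 60,   # per-request timeout in seconds (default is 10)
    "max_retries": 0,        # no automatic retries, to avoid duplicate inserts
}

def run_example() -> None:
    # Not executed here: needs the elasticsearch package and a reachable cluster.
    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch("http://localhost:9200")  # address is an assumption
    actions = ({"_index": "my-index", "_source": {"n": i}} for i in range(1000))
    for ok, info in helpers.streaming_bulk(es, actions, **BULK_KWARGS):
        if not ok:
            print("failed:", info)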

            Source https://stackoverflow.com/questions/67522617

            QUESTION

            python3.6 async/await still works synchronously with fastAPI
            Asked 2021-May-13 at 17:37

            I have a FastAPI app that makes two requests; one of them is longer (if it helps, they're Elasticsearch queries and I'm using the AsyncElasticsearch module, which already returns coroutines). This is my attempt:

            ...

            ANSWER

            Answered 2021-Apr-02 at 09:30

            Yes, that's correct the coroutine won't proceed until the results are ready. You can use asyncio.gather to run tasks concurrently:
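
            A runnable illustration of the gather pattern; the two coroutines below are stand-ins (simple sleeps) for the real AsyncElasticsearch search awaits:

```python
import asyncio

async def short_query() -> str:
    # Stand-in for a quick AsyncElasticsearch search.
    await asyncio.sleep(0.01)
    return "short"

async def long_query() -> str:
    # Stand-in for the slower search.
    await asyncio.sleep(0.05)
    return "long"

async def main() -> list:
    # gather schedules both coroutines concurrently and preserves order,
    # so total wall time is roughly the longest query, not the sum.
    return await asyncio.gather(short_query(), long_query())

if __name__ == "__main__":
    print(asyncio.run(main()))  # ['short', 'long']
```

            Awaiting each call one after the other would serialize them; gather is what actually overlaps the two requests.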

            Source https://stackoverflow.com/questions/66916601

            QUESTION

            Connecting to remote Elasticsearch server with python's Elasticsearch package
            Asked 2021-Jan-20 at 19:06

            I want to use a remote Elasticsearch server for my website.

            I have used the elastic.co cloud service to create a remote Elasticsearch server. I can connect to/ping the remote Elasticsearch server using the following command (scrubbed of sensitive info): curl -u username:password https://55555555555bb0c30d1cba4e9e6.us-central1.gcp.cloud.es.io:9243

            After typing this command into the terminal, I receive the following response:

            ...

            ANSWER

            Answered 2021-Jan-20 at 19:06

            You need to connect using TLS/SSL and Authentication as described in the documentation.

            In your case you should use something like this.
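
            A sketch of such a connection; the endpoint is the asker's scrubbed URL, the credentials are placeholders, the keyword arguments follow the v7-era client, and the server-dependent part is wrapped in a function that is not executed here:

```python
def cloud_url(host: str, port: int = 9243) -> str:
    # Build the HTTPS endpoint URL (Elastic Cloud serves on port 9243 here).
    return f"https://{host}:{port}"

def run_example() -> None:
    # Not executed here: needs the elasticsearch package and valid credentials.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(
        cloud_url("55555555555bb0c30d1cba4e9e6.us-central1.gcp.cloud.es.io"),
        http_auth=("username", "password"),  # placeholders
    )
    print(es.info())
```

            This mirrors the curl command above: HTTPS transport plus the same basic-auth credentials, just expressed through the client.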

            Source https://stackoverflow.com/questions/65814910

            QUESTION

            Elasticsearch _bulk returns empty dict
            Asked 2021-Jan-18 at 14:25

            I see strange behavior on my ES 7.8 cluster: when inserting data using elasticsearch.helpers.streaming_bulk, I get this strange error:

            ...

            ANSWER

            Answered 2021-Jan-18 at 14:25

            Turns out it was my own mistake. I specified the filter_path=['hits.hits._id'] parameter all the way down during the bulk request.

            Thanks for the tip @Val.

            Source https://stackoverflow.com/questions/65774395

            Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install elasticsearch-py

            You can install it with 'pip install elasticsearch' (the package is published on PyPI as elasticsearch) or download it from GitHub or PyPI.
            You can use elasticsearch-py like any standard Python library. You will need a development environment consisting of a Python distribution (including header files), a compiler, pip, and git. Make sure that your pip, setuptools, and wheel are up to date. When using pip, it is generally recommended to install packages in a virtual environment to avoid changes to the system.
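
            A minimal first session after installing; the index name, document, and cluster address are illustrative, and the server-dependent part is wrapped in a function that is not executed here:

```python
DOC = {"title": "hello", "views": 1}  # illustrative document

def quickstart() -> None:
    # Not executed here: needs the elasticsearch package and a reachable cluster.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # address is an assumption
    es.index(index="test-index", id="1", document=DOC)  # v8 uses document=
    got = es.get(index="test-index", id="1")
    print(got["_source"])
```

            Note the v8 client takes document= on index(); the older v7 client used body= instead.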

            Support

            For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask on the Stack Overflow community page.
            CLONE
          • HTTPS

            https://github.com/elastic/elasticsearch-py.git

          • CLI

            gh repo clone elastic/elasticsearch-py

          • SSH

            git@github.com:elastic/elasticsearch-py.git



            Consider Popular REST Libraries

            • public-apis, by public-apis
            • json-server, by typicode
            • iptv, by iptv-org
            • fastapi, by tiangolo
            • beego, by beego

            Try Top Libraries by elastic

            • elasticsearch, by elastic (Java)
            • kibana, by elastic (TypeScript)
            • logstash, by elastic (Java)
            • beats, by elastic (Go)
            • eui, by elastic (TypeScript)