vespa | https : //vespa.ai

by vespa-engine Java Version: v8.171.43 License: Apache-2.0

X-Ray Key Features Code Snippets Community Discussions(10)Vulnerabilities Install Support

kandi X-RAY | vespa Summary

vespa is a Java library typically used in Big Data, Tensorflow, Docker, Spark applications. vespa has no bugs, it has no vulnerabilities, it has build file available, it has a Permissive License and it has high support. You can download it from GitHub, Maven.

The open big data serving engine - Store, search, organize and make machine-learned inferences over big data at serving time. This is the primary repository for Vespa where all development is happening. New production releases from this repository's master branch are made each weekday from Monday through Thursday.

Support

Quality

Security

License

Reuse

Support

vespa has a highly active ecosystem.

It has 4455 star(s) with 526 fork(s). There are 153 watchers for this library.

It had no major release in the last 12 months.

There are 147 open issues and 592 have been closed. On average issues are closed in 64 days. There are 16 open pull requests and 0 closed requests.

It has a positive sentiment in the developer community.

The latest version of vespa is v8.171.43

Quality

vespa has no bugs reported.

Security

vespa has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.

License

vespa is licensed under the Apache-2.0 License. This license is Permissive.

Permissive licenses have the least restrictions, and you can use them in most projects.

Reuse

vespa releases are available to install and integrate.

Deployable package is available in Maven.

Build file is available. You can build the component from source.

Installation instructions, examples and code snippets are available.

Top functions reviewed by kandi - BETA

kandi has reviewed vespa and discovered the below as its top functions. This is intended to give you an instant insight into vespa implemented functionality, and help decide if they suit your requirements.

Recursivelyurse through all tenants recursively .
Instantiates a URI item .
Convert a YARL expression tree into VESPA operators .
Export deployments from a collection of deployments .
Converts a GroupingExpression into an ExpressionNode
Write the definition of a class
Convert an indexed block to a StringBuilder
Returns the metrics for the search node .
Writes the logs in the specified time range to the specified output stream .
Reads a string from the input .

Get all kandi verified functions for this library.

vespa Key Features

No Key Features are available at this moment for vespa.

vespa Examples and Code Snippets

No Code Snippets are available at this moment for vespa.

Community Discussions

Trending Discussions on vespa

Dynamic addition of fields in vespa

Starting and stopping services in vespa

How to downgrade the version of the vespa engine installed in ec2 server

Versioning document changes in Vespa

Queries are very slow in local vespa

`vespa` tutorial : ./src/python/user_search.py U33527 10 KeyError: 'children'

Is there a way to reuse a struct in Vespa

Vespa response: Summary data is incomplete: Timed out waiting for summary data

Mismatch between content node storage ratio and server storage

Vespa.ai exploit multiple instances to answer queries

QUESTION

Dynamic addition of fields in vespa

Asked 2022-Mar-14 at 09:55

In Elastic Search, to add new fields while running the application we have to provide

"dynamic":true

More info about the same: https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic.html

Is there any functionality which can replicate same behaviour in Vespa? I was not able to find in vespa documentation. Kindly help me in this regard. Thank you.

...

ANSWER

Answered 2022-Mar-14 at 09:47

https://docs.vespa.ai/en/schemas.html#schema-modifications is the best place to start - just modify the schema with new fields and redeploy the application. The new fields can not have a default value, they are empty. It is not necessary to restart Vespa, this can be done on a running instance.

Source https://stackoverflow.com/questions/71465539

QUESTION

Starting and stopping services in vespa

Asked 2022-Feb-16 at 18:43

In the benchmarking page "https://docs.vespa.ai/en/performance/vespa-benchmarking.html" it is given that we need to restart the services after we increase the persearch thread using the commands vespa-stop-services and vespa-start-services. Could you tell us if we need to do this on all the content nodes or just the config nodes?

...

ANSWER

Answered 2022-Feb-16 at 18:43

When deploying a change that requires a restart, the deploy command will list the actions you need to take. For example when changing the global per search thread setting changing from 2 to 5 in the below example:

Source https://stackoverflow.com/questions/71147132

QUESTION

How to downgrade the version of the vespa engine installed in ec2 server

Asked 2022-Jan-31 at 09:45

I have Vespa version 7.534.29 compiled with go1.15.14 on linux/amd64 how can i downgrade it to 7.220.14 as its causing my custom searchers to not work in production

...

ANSWER

Answered 2022-Jan-31 at 07:23

Installing a new Vespa version is described at https://docs.vespa.ai/en/operations/live-upgrade.html. It could be an idea to try your custom searchers in an environment using the same Docker containers as in https://docs.vespa.ai/en/getting-started.html - to see if the problem is your build of Vespa or something else.

Source https://stackoverflow.com/questions/70914602

QUESTION

Versioning document changes in Vespa

Asked 2022-Jan-23 at 16:00

I would like to allow for versioning of text in Vespa. If a user changes certain fields over time the changes would be tracked and versions could be restored.

I imagine a solution running in parallel to Vespa would be the way to go, with version numbers being stored in the vespa doc as unindexed data.

Any recommendations on a solution to use to do this? Something like http://jsonpatch.com?

...

ANSWER

Answered 2022-Jan-23 at 16:00

I would just store each version as a separate document by including the version in the document id.

Source https://stackoverflow.com/questions/70822134

QUESTION

Queries are very slow in local vespa

Asked 2022-Jan-06 at 10:30

I am having difficulty executing correctly a vespa query. i want to query 2 different index fields with or between them, i want to to the equivalent of elastic match query.

i got a lot of soft timeouts so i increased timeout to get the true result and check how much time it took.

this is the query i sent:

...

ANSWER

Answered 2022-Jan-03 at 11:00

See the section on index versus attribute here and also fast-search doc https://docs.vespa.ai/en/performance/feature-tuning.html

By default, fields with attribute definitions are not fast searchable, that is likely the problem here. Adding fast-search attribute property will build B-tree structures for faster search.

Source https://stackoverflow.com/questions/70564334

QUESTION

`vespa` tutorial : ./src/python/user_search.py U33527 10 KeyError: 'children'

Asked 2021-Dec-14 at 10:36

I'm following step by step the Vespa tutorials: https://docs.vespa.ai/en/tutorials/news-5-recommendation.html

...

ANSWER

Answered 2021-Dec-14 at 10:36

The Vespa index has no user documents here, so most likely the user and news embeddings have not been fed to the system. After they are calculated in the previous step (https://docs.vespa.ai/en/tutorials/news-4-embeddings.html), be sure to feed them to Vespa:

Source https://stackoverflow.com/questions/70347106

QUESTION

Is there a way to reuse a struct in Vespa

Asked 2021-Nov-09 at 15:15

Is there a way to reuse a set of fields in a Vespa Schema?

If I had a set of fields that were structs and each struct had a set of identical fields is there a way to define the identical set once and reuse it in each struct through out the schema?

...

ANSWER

Answered 2021-Nov-09 at 14:54

No there is not. Pretty much everything else supports inheritance but not structs. I can add it in a little while.

Source https://stackoverflow.com/questions/69900079

QUESTION

Vespa response: Summary data is incomplete: Timed out waiting for summary data

Asked 2021-Aug-29 at 19:27

I am deploying a simple text retrieval system with Vespa. However, I found when setting topk to some large number, e.g. 40, the response will include the error message "Summary data is incomplete: Timed out waiting for summary data." and also some unexpected ids. The system works fine for some small topk like 10. The response was as follows:

{'root': {'id': 'toplevel', 'relevance': 1.0, 'fields': {'totalCount': 1983140}, 'coverage': {'coverage': 19, 'documents': 4053984, 'degraded': {'match-phase': False, 'timeout': True, 'adaptive-timeout': False, 'non-ideal-state': False}, 'full': False, 'nodes': 1, 'results': 1, 'resultsFull': 0}, 'errors': [{'code': 12, 'summary': 'Timed out', 'message': 'Summary data is incomplete: Timed out waiting for summary data. 1 responses outstanding.'}], 'children': [{'id': 'index:square_datastore_content/0/34b46b2e96fc0aa18ed4941b', 'relevance': 44.44359956427316, 'source': 'square_datastore_content'}, {'id': 'index:square_datastore_content/0/16dbc34c5e77684cd6f554fd', 'relevance': 43.94371735208669, 'source': 'square_datastore_content'}, {'id': 'index:square_datastore_content/0/9f2fd93f6d74e88f96d7014f', 'relevance': 43.298002713993384, 'source': 'square_datastore_content'}, {'id': 'index:square_datastore_content/0/76c4e3ee15dc684a78938a9d', 'relevance': 40.908658368905485, 'source': 'square_datastore_content'}, {'id': 'index:square_datastore_content/0/c04ceee4b9085a4d041d8c81', 'relevance': 36.13561898237115, 'source': 'square_datastore_content'}, {'id': 'index:square_datastore_content/0/13806c518392ae7b80ab4e4c', 'relevance': 35.688377118163714, 'source': 'square_datastore_content'}, {'id': 'index:square_datastore_content/0/87e0f13fdef1a1c404d3c8c6', 'relevance': 34.74150232183567, 'source': 'square_datastore_content'}, ...]}}

I am using the schema:

...

ANSWER

Answered 2021-Aug-29 at 19:27

The default Vespa timeout is 500 ms and can be adjusted by &timeout=x where x is given in seconds, e.g &timeout=2 would use an overall request timeout of 2 seconds.

A query is executed in two protocol phases:

Find the top k matches given the query/ranking profile combination, each node returns up to k results
The stateless container merges the results and finally asks for summary data (e.g the contents of only the top k results)

See https://docs.vespa.ai/en/performance/sizing-search.html for an explanation of this.

In your case you are hit by two things

A soft timeout at the content node (coverage is reported to be only 19%) so within the default timeout of 500ms it could retrieve and rank 19% of the available content. At 500ms minus a factor it timed out and returned what it had retrieved and rank up til the.
When trying to use the time left it also timed out waiting for the hits data for those documents which it managed to retrieve and rank within the soft timeout, this is the incomplete summary data response.

Generally, if you want cheap BM25 search use WAND (https://docs.vespa.ai/en/using-wand-with-vespa.html) If you want to search using embeddings, use ANN instead of brute force NN. We also have a complete sample application reproducing the DPR (Dense Passage Retrieval) here https://github.com/vespa-engine/sample-apps/tree/master/dense-passage-retrieval-with-ann

Source https://stackoverflow.com/questions/68972157

QUESTION

Mismatch between content node storage ratio and server storage

Asked 2021-Aug-19 at 10:56

I am using Vespa in a docker with one single content node on a Ubuntu server. The total storage is:

...

ANSWER

Answered 2021-Aug-19 at 10:56

In this case you are limited by memory:

Source https://stackoverflow.com/questions/68845159

QUESTION

Vespa.ai exploit multiple instances to answer queries

Asked 2021-Jun-23 at 12:50

I have Vespa.ai cluster with multiple container/content nodes. After Vespa is loaded with data, my app sends queries and gets the data from Vespa. I want to be sure that I utilize well all the nodes and I get the data as fast as possible. My app builds HTTP request and sends it to one of the nodes.
Which node/nodes should I direct my request to? How can I be sure that all instances participate in answering queries?
What should I do to utilize all the cluster nodes?
Does Vespa know to load balance these requests to other instances for better performance?

...

ANSWER

Answered 2021-Jun-23 at 07:56

Vespa is a 2-tier system:

The containers will load balance over the content nodes (if you have multiple groups), but since you are sending the requests to the containers, you need to load balance over those.

This can be done by code you write in your client, by VIP, by another tier of nodes you host yourself such as e.g Nginx, or by a hosted load balancer such as AWS ELB.

Source https://stackoverflow.com/questions/68095336

Community Discussions, Code Snippets contain sources that include Stack Exchange Network

Vulnerabilities

No vulnerabilities reported

Install vespa

Run your own Vespa instance: https://docs.vespa.ai/en/getting-started.html Or deploy your Vespa applications to the cloud service: https://cloud.vespa.ai.
Use this if you only need to build the Java modules, otherwise follow the complete development guide above.
See Building Vespa RPM for details.

Support

We welcome contributions! See CONTRIBUTING.md to learn how to contribute. If you want to contribute to the documentation, see https://github.com/vespa-engine/documentation.

Find more information at: