kandi X-RAY | janusgraph Summary
kandi X-RAY | janusgraph Summary
JanusGraph is a highly scalable graph database optimized for storing and querying large graphs with billions of vertices and edges distributed across a multi-machine cluster. JanusGraph is a transactional database that can support thousands of concurrent users, complex traversals, and analytic graph queries.
Top functions reviewed by kandi - BETA
- Returns the unique ID block
- Returns the block size of the given partition
- Returns the current current ID
- Compute the partition key
- Merge two pages
- Creates a buffer page from an entry array
- Processes edge indices
- Re - index an element
- Compares two relation types
- Compares two values
- Returns a String representation of the CVS
- Gets a slice of a key
- Reads the message
- Sets the HBase configuration
- Retrieves an object from the configuration
- Closes the log
- Creates an index if it doesn t already exist
- Creates a new segment buffer report
- Pre - load configuration
- Mutate Solr mutations
- Build store features wrapper
- Creates and returns an in - memory transaction
- Starts the worker thread
- Makes a batch of documents using the given indexes
- Create the elements
- Creates a composite index
janusgraph Key Features
janusgraph Examples and Code Snippets
Trending Discussions on janusgraph
I've been trying to create interact with my JanusGraph setup in docker. But after many tries I still don't succeed.
How I connect to JG....
ANSWERAnswered 2022-Mar-06 at 13:51
The GraphTraversal object is only a "plan" to be carried out. To have it take effect, you need a closing method like next, toList, etc., like you did for the count.
The confusion probably arose from the fact that the gremlin console automatically keeps nexting the traversal a configured number of times.
I am setting up Apache Atlas with Cassandra as backend and solr as indexer. Below are my configuration setting for Atlas-cassandra...
ANSWERAnswered 2022-Feb-24 at 17:11
Officially till Atlas 2.2.0 not support for authenticated Cassandra. But while building code we can update two files which are given in below PR and build Atlas. Which works as expected
While exploring the JanusGraph-core library, I seen the id generation part (StandardIDPool.nextID()), which seems to be id for a janus vertex is getting generated by the application logic. In this case, how can I horizontally scale an application that is using janusGraph, can't I get id conflicting problem while scaling the application?
What is the best approach to scale the app that use JanusGraph?...
ANSWERAnswered 2022-Jan-26 at 06:54
The JanusGraph instances for a graph select one instance that maintains an ID pool manager. The JanusGraph reference documentation says the following about optimizing ID allocation:
ID Block Size
Each newly added vertex or edge is assigned a unique id. JanusGraph’s id pool manager acquires ids in blocks for a particular JanusGraph instance. The id block acquisition process is expensive because it needs to guarantee globally unique assignment of blocks. Increasing ids.block-size reduces the number of acquisitions but potentially leaves many ids unassigned and hence wasted. For transactional workloads the default block size is reasonable, but during bulk loading vertices and edges are added much more frequently and in rapid succession. Hence, it is generally advisable to increase the block size by a factor of 10 or more depending on the number of vertices to be added per machine.
Rule of thumb: Set ids.block-size to the number of vertices you expect to add per JanusGraph instance per hour.
Important: All JanusGraph instances MUST be configured with the same value for ids.block-size to ensure proper id allocation. Hence, be careful to shut down all JanusGraph instances prior to changing this value.
ID Acquisition Process
When id blocks are frequently allocated by many JanusGraph instances in parallel, allocation conflicts between instances will inevitably arise and slow down the allocation process. In addition, the increased write load due to bulk loading may further slow down the process to the point where JanusGraph considers it failed and throws an exception. There are three configuration options that can be tuned to avoid this.
- ids.authority.wait-time configures the time in milliseconds the id pool manager waits for an id block application to be acknowledged by the storage backend. The shorter this time, the more likely it is that an application will fail on a congested storage cluster.
Rule of thumb: Set this to the sum of the 95th percentile read and write times measured on the storage backend cluster under load. Important: This value should be the same across all JanusGraph instances.
- ids.renew-timeout configures the number of milliseconds JanusGraph’s id pool manager will wait in total while attempting to acquire a new id block before failing.
Rule of thumb: Set this value to be as large feasible to not have to wait too long for unrecoverable failures. The only downside of increasing it is that JanusGraph will try for a long time on an unavailable storage backend cluster.
I'm following up with these 2 questions --
I'm viewing StackOverflow intensively(wanted to thank the community!) but unfortunately I didn't post/write a lot, so I don't even have enough reputation for posting a comment on the posts above...therefore I'm asking my questions here..
In 2nd post above, Hieu and I work together, and I want to provide a bit more background on the question.
As Stephen asked in the comment(for 2nd post), the reason that I want to chain
V() in the middle is simply because I want to start the traversal from the beginning, i.e. each and every node of the whole graph just like what
g.V() does, which appears at the beginning of most of the queries in gremlin documentation.
A bit more illustration: suppose I need 2 conditional filters on the results. Basically I want to write...
ANSWERAnswered 2021-Oct-01 at 20:41
In the middle of a traversal, a V() is applied to every traverser that has been created by the prior steps. Consider this example using the air-routes data set:
I tried following query with Gremlin and Tinkerpop...
ANSWERAnswered 2021-Sep-14 at 19:59
I believe what you are seeing is a special case condition with TinkerGraph where if you create two properties with the same key while creating the vertex as well, list cardinality is assumed. If you were to perform the same property additions on a vertex that already exists, single cardinality would be assumed and one would replace the other. For example:
I was reading about Janusgraph Cache in Janusgraph documentation. I have few doubts Regarding the transaction cache. I'm using an embedded janusgrah server in my application.
- If I'm only doing a read query for eg. - g. V().has("name","ABC") using gremlin HTTP endpoint, so will this value be cached in transaction cache or database level cache? because here I'm not opening any transaction.
- If it is stored in the transaction cache, how updated values will be fetched for this vertex if I have multi-node deployment?
ANSWERAnswered 2021-Sep-12 at 08:46
Regarding question 1:
If not created explicitly, transactions are created automatically. From the JanusGraph reference docs:
Every graph operation in JanusGraph occurs within the context of a transaction. According to the TinkerPop’s transactional specification, each thread opens its own transaction against the graph database with the first operation (i.e. retrieval or mutation) on the graph.
A vertex retrieved during a transaction is stored in both the transaction cache and database cache. After closing the transaction the vertex is still in the database cache (but note that since janusgraph-0.5.x the database cache is disabled by default).
Regarding question 2:
Indeed, a JanusGraph instance cannot know about modifications to vertices in the transaction caches of other instances. Only after these transactions have been closed and persisted to the storage and index backends, other instances can read modified vertices from the the backends. This also means that caches in other JanusGraph instances can be out of date, so if you want to be sure that you have the latest data from the backends, you should start a new transaction and disable the database cache (default setting).
The vertex caches are private members of JanusGraph and nowhere exposed to the user (not even in the debug logging). Cache hits in a traversal are only visible from a fast (sub-millisecond) return time.
If data consistency between transactions or janusgraph instances matters to you, you can take a look at:
- the new CacheVertex::refresh feature in janusgraph-0.6.0 (still undocumented).
I want to add/update vertex properties in through the following function to janusgraph with Gremlin.Net version=3.4.6; JanusGraph.Net version=0.2.2...
ANSWERAnswered 2021-Sep-03 at 10:14
The way to work with a traversal is by building it up iteratively first by concatenating the steps that you want to execute (like
has() and so on) and then to terminate the traversal with a terminal step like
iterate() which will execute the traversal.
You however use two terminal steps in your example which doesn't work. First you execute
HasNext() to verify that the vertex exists and then you try to modify its properties which you then want to execute via
The traversal was however already evaluated and its Bytecode was sent to the server when you executed
HasNext(). It is afterwards not possible any more to modulate the traversal object.
This becomes more clear when you try to do the same in the Gremlin Console:
I'm trying to test if janusgraph could be queried with sparql. So, I found one of plug-in 'SPARQL-Gremlin' could do that. (At least, the doc said it is working.) However, when I followed the doc https://tinkerpop.apache.org/docs/current/reference/#sparql-gremlin I found that if current storage is TinkGraph, then the sparql could be run successfully. However, if I change storage solution to the remote(janusgraph), then I got error message. So, is there anyone who have been succeed based on this plugin and janusgraph?...
ANSWERAnswered 2021-Aug-26 at 13:47
The comment form HadoopMarc is the likely lead to solving your problem. You need to be sure
sparql-gremlin is on your classpath for Gremlin Server when using this the way that you are in Gremlin Console. In other words, you are sending the Gremlin string of
g = traversal(SparqlTraversalSource).withGraph(graph) to be executed on the server and the server doesn't know anything about
sparql-gremlin is not in its classpath.
To get this working via Gremlin Console and
:remote (i.e. sending scripts) I would do try the following:
sparql-gremlinon your classpath for Gremlin Server (i.e. Janus Server). You can do that best with
bin/gremlin-server.sh install org.apache.tinkerpop sparql-gremlin 3.4.12(or whatever version you are using)
- Start the server with a
graphconfigured for JanusGraph in the server YAML file
- Connect with Gremlin Console using
:remoteas you did in your example
- Test to see if Gremlin Server picked up the package by sending just submitting a script of
SparqlTraversalSourcewhich should return the classname.
- If the previous step worked you should be able to do:
If you have that all working then a next step might be to configure a special traversal source in the Gremlin Server initialization script that is already setup with the
SparqlTraversalSource so that you can just reference it from your scripts directly.
In JanusGraph, we can specify
indexOnly() for a composite index like this:
ANSWERAnswered 2021-Aug-22 at 12:18
Good news, this got fixed in the upcoming v0.6.0 release of JanusGraph (already available as prerelease at https://github.com/JanusGraph/janusgraph/releases/tag/v0.6.0).
See the mgmt.getIndexOnlyConstraint(indexName) method at: https://github.com/JanusGraph/janusgraph/blob/v0.6.0/janusgraph-core/src/main/java/org/janusgraph/core/schema/JanusGraphManagement.java
I'm connecting to a remote JanusGraph server. There is no problem with the connection but I can't understand the Java API.
I'm running a Java Spring HTTP server. Inside the HTTP response method, I'm creating a graph traversal source like
GraphTraversalSource g = traversal().withRemote("conf/remote-graph.properties");
It seems fine. But When I want to get data about a Vertex I can not with the below code.
IT GIVES AN EMPTY RESPONSE! WHY?
If I do the same with gremlin shell, I see it. See the below picture
If I do
g.V(28712).valueMap(true).unfold().toList(); in Java, I see some results.
I just want to fetch data for a vertex or edge with Java. How should I do that?
ANSWERAnswered 2021-Aug-14 at 12:30
By default, when you connect to a Gremlin Server using one of the Gremlin clients and you ask for a vertex, what you get back is a reference vertex. A reference vertex just contains the ID and the label. To get some properties you need to include them using
project etc. Alternatively you can configure the Gremlin Server to return all properties. The default is set this way to reduce the amount of data that gets sent back to a client.
Please see the documentation for further details.
There is a link in that documentation to a post about why the decisions were made. For convenience, I am including that link here as well.
No vulnerabilities reported
You can use janusgraph like any standard Java library. Please include the the jar files in your classpath. You can also use any IDE and you can run and debug the janusgraph component as you would do with any other Java program. Best practice is to use a build tool that supports dependency management such as Maven or Gradle. For Maven installation, please refer maven.apache.org. For Gradle installation, please refer gradle.org .
Reuse Trending Solutions
Subscribe to our newsletter for trending solutions and developer bootcamps
Share this Page