giraph | Mirror of Apache Giraph
kandi X-RAY | giraph Summary
Giraph: large-scale graph processing on Hadoop. Web and online social graphs have been rapidly growing in size and scale during the past decade. In 2008, Google estimated that the number of web pages reached over a trillion. Online social networking and email sites, including Yahoo!, Google, Microsoft, Facebook, LinkedIn, and Twitter, have hundreds of millions of users and are expected to grow much more in the future. Processing these graphs plays a big role in delivering relevant and personalized information to users, such as results from a search engine or news in an online social networking site. Graph processing platforms that run large-scale algorithms (such as PageRank, shared connections, personalization-based popularity, etc.) have become quite popular. Some recent examples include Pregel and HaLoop. For general-purpose big data computation, the map-reduce computing
Top functions reviewed by kandi - BETA
- Populates the Giraph configuration.
- Starts the server.
- Runs a given test graph with vertex output.
- Coordinates a superstep.
- Instruments the sandbox.
- Performs a single input split.
- Starts the ZooKeeper server.
- Gets the next IOCommand.
- Saves the vertices.
- Connects all the tasks to their addresses.
Community Discussions
Trending Discussions on giraph
QUESTION
I have a cluster running Hadoop 1.2.1 with Giraph on top. The server runs ok, but when I stop it, I am unable to make it run again. In the datanode log I get the following error: ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Cannot lock storage /pathToFolder/data/datanode. The directory is already locked.
I have tried many solutions that I found online:
- Checking permissions of folders.
- Checking equal versions of VERSION file for namenode and datanode.
- Checking configuration files (core-site, hdfs-site, mapred-site, master, slaves, ...)
- Deleting / Changing the namenode and datanode data folders
- Removing hadoop temporary files
The bottom line is that everything seems fine, but the datanode still fails to start. A complete log file for the datanode is the following:
...
ANSWER
Answered 2020-Jun-29 at 08:23
I still haven't managed to get rid of the problem (the datanode not shutting down correctly), but I found a workaround. I used lsof +D /path to detect the active processes and killed them. The weird part is that the process was invisible to the top and jps commands.
QUESTION
This is my first foray into Java on Spark. The following error happens when using either Spark 1.X (tried 1.5.0) or 2.X (tried 2.2.0), with Java 1.8 and Scala 2.10:
ANSWER
Answered 2018-Apr-27 at 06:31
Both spark-core and giraph-core have a dependency on netty-all. You need to exclude it from giraph-core.
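The exclusion described in this answer might look roughly like the following pom.xml fragment. This is a sketch under assumptions, not the answerer's actual configuration: the io.netty group ID for netty-all and the ${giraph.version} property are assumed here, so check both against the output of mvn dependency:tree for your project.

```xml
<dependency>
  <groupId>org.apache.giraph</groupId>
  <artifactId>giraph-core</artifactId>
  <!-- Assumption: giraph.version is a property you define yourself -->
  <version>${giraph.version}</version>
  <exclusions>
    <!-- Drop Giraph's transitive netty-all so that only Spark's copy
         ends up on the classpath (group ID assumed to be io.netty) -->
    <exclusion>
      <groupId>io.netty</groupId>
      <artifactId>netty-all</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Excluding the artifact on the Giraph side means the netty version that spark-core resolves is the one used at runtime. Whether Giraph works with that version is something to test rather than assume.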
QUESTION
So, I've successfully executed the SimpleShortestPathComputation on my computer via the script shown here:
...
ANSWER
Answered 2018-Mar-14 at 03:32
You need to have the jar that contains the class GiraphAlgs.GiraphPBFS on the Hadoop classpath.
Also, verify that your classpath is set correctly by running bin/hadoop classpath.
Once, on Hadoop 2.7, setting the HADOOP_CLASSPATH variable didn't work for me, and I had to copy the jar into the Hadoop share lib directory: HADOOP_HOME/share/hadoop/mapreduce/lib.
QUESTION
I am new to Hadoop/Giraph and Java. As part of a task, I downloaded Cloudera Quickstart VM and Giraph on top of it. I am using this book named "Practical Graph Analytics with Apache Giraph; Authors: Shaposhnik, Roman, Martella, Claudio, Logothetis, Dionysios" from which I tried to run the first example on Page 111 (Twitter Followership Graph).
Below is the error I get when running with the modified pom.xml file, which uses the Hadoop version on the cluster, 2.6.0-mr1-cdh5.12.0:
...
ANSWER
Answered 2017-Dec-08 at 23:38
The pom.xml in your book's copy is outdated. Use this one instead. Source: book examples repository on GitHub.
You want to use a recent version of hadoop-core, but the most recent one that the Maven Central Repository (the default repository) offers is 1.2.1. You will need to use the Cloudera repository to get the most recent version of the library. To do that, add the repository to your pom.xml.
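The answer's original snippet is not preserved in this excerpt. As a hedged illustration, a repository entry of this kind usually looks like the fragment below. The id value is arbitrary, and the URL is the commonly published Cloudera repository address, assumed here rather than taken from the answer, so verify it against Cloudera's current documentation.

```xml
<repositories>
  <repository>
    <!-- Arbitrary identifier chosen for this sketch -->
    <id>cloudera</id>
    <!-- Assumption: Cloudera's public Maven repository URL -->
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
  </repository>
</repositories>
```

With the repository declared, Maven can resolve CDH-specific versions such as the 2.6.0-mr1-cdh5.12.0 mentioned in the question, provided that artifact is actually published there.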
QUESTION
I am trying to submit a giraph job to a hadoop 1.2.1 cluster. The cluster has a name node master, a map reduce master, and four slaves. The job is failing with the following exception:
java.util.concurrent.ExecutionException: java.lang.IllegalStateException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, must have only one worker since only 1 task at a time!
However, here is my mapred-site.xml file:
...
ANSWER
Answered 2017-Apr-20 at 17:21
The problem wasn't that Hadoop was running in local job mode; the problem was that Giraph, configured on another machine, assumed that Hadoop was running in local job mode.
I was submitting the job via Gremlin, and I needed to add the following line to its configuration file:
QUESTION
Is it possible for me to use Giraph if I have Spark clusters and Cassandra but no Hadoop clusters?
Currently, I am using GraphX and would like to use Giraph instead. Is this possible considering that I have Spark clusters and am using Cassandra?
...
ANSWER
Answered 2017-Apr-02 at 04:23
I have only limited experience with Giraph from years ago, and I never tried using it outside of a Hadoop cluster. But it looks like what you want is at least technically possible, if not necessarily easy.
This code is the companion to Practical Graph Analytics with Apache Giraph. As you can see, it requires Hadoop on the classpath for DoubleWritable and Text, for example, but it does nothing with a Hadoop cluster. Instead, it works with in-memory arrays. It looks like all you need to do is implement compute in the BasicComputation class to do whatever you need with Cassandra, as long as you keep Hadoop around as a dependency to help satisfy the type bounds for BasicComputation.
I never found Giraph terribly intuitive, but hopefully you can make this unconventional setup work.
QUESTION
I created a Maven project with Eclipse and built a jar file from it with the command below:
mvn package
When I try to check whether my Maven project configuration is correct with this command:
mvn exec:java -D exec.mainClass="giraph.helloworld.App"
I get this error:
failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java(default-cli) on project helloworld: An exception occured while executing the java class. null: InvocationTargetException: No arguments were provided
The pom.xml settings of the project are as follows. I would be grateful if anyone could help and explain the reason for this error.
...
ANSWER
Answered 2017-Feb-13 at 02:16
I think the problem is that your manifest file does not contain information about the entry point (the main class) of your jar. See Setting an Application entry point.
There are many ways to rectify this problem. You can use the Maven Assembly Plugin. For more details, check here.
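One possible way to set the entry point with the Maven Assembly Plugin is sketched below. It is not the answerer's exact configuration: the main class name is taken from the question, while the execution id and the choice of the jar-with-dependencies descriptor are assumptions made for illustration. The plugin element goes inside build/plugins in the pom.xml.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <archive>
      <manifest>
        <!-- Writes the Main-Class entry into the jar's manifest -->
        <mainClass>giraph.helloworld.App</mainClass>
      </manifest>
    </archive>
    <descriptorRefs>
      <!-- Built-in descriptor that bundles dependencies into one jar -->
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
  </configuration>
  <executions>
    <execution>
      <!-- Arbitrary id chosen for this sketch -->
      <id>make-assembly</id>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

With this in place, mvn package should also produce a jar whose name ends in -jar-with-dependencies in the target directory, and that jar can be started with java -jar.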
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install giraph
You can use giraph like any standard Java library. Include the jar files in your classpath. You can also use any IDE to run and debug the giraph component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, refer to maven.apache.org. For Gradle installation, refer to gradle.org.