incubator-gobblin | distributed data integration framework that simplifies
kandi X-RAY | incubator-gobblin Summary
Apache Gobblin is a universal data ingestion framework for extracting, transforming, and loading large volumes of data from a variety of data sources (databases, REST APIs, FTP/SFTP servers, filers, etc.) onto Hadoop. Apache Gobblin handles the routine tasks common to all data ingestion ETLs, including job/task scheduling, task partitioning, error handling, state management, data quality checking, and data publishing. Gobblin ingests data from different data sources in the same execution framework and manages the metadata of different sources in one place. This, combined with other features such as auto scalability, fault tolerance, data quality assurance, extensibility, and the ability to handle data model evolution, makes Gobblin an easy-to-use, self-serving, and efficient data ingestion framework.
Community Discussions
Trending Discussions on incubator-gobblin
QUESTION
I'm new to Gobblin. I am trying to build a distribution from the master branch of the project. I get the error below while following the instructions.
...
ANSWER
Answered 2020-May-06 at 17:11
Current Gobblin build scripts use features that are present in JDK 8 but were removed in newer JDK versions. Gradle can use the latest JDK installed on your machine, e.g. JDK 13. As a result, the build process can fail.
As a workaround, you can tell Gradle to use JDK 8.
For example, on Windows, this can be achieved by making a change in gradle.properties (given that you have jre1.8.0_202 installed):
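A minimal sketch of such a change, assuming Gradle's org.gradle.java.home property is the setting meant here. The helper name pin_gradle_jdk is hypothetical, and the JDK 8 location is whatever path you pass in; it is not taken from the original answer.

```shell
# Hypothetical helper: pin the JDK that Gradle runs on by writing
# org.gradle.java.home into a gradle.properties file.
pin_gradle_jdk() {
  jdk_home="$1"                          # install directory of your JDK 8 (assumption: you know it)
  props_file="${2:-gradle.properties}"   # defaults to the file in the current directory
  # Drop any earlier setting so the property appears only once.
  if [ -f "$props_file" ]; then
    grep -v '^org\.gradle\.java\.home=' "$props_file" > "$props_file.tmp" || true
    mv "$props_file.tmp" "$props_file"
  fi
  printf 'org.gradle.java.home=%s\n' "$jdk_home" >> "$props_file"
}
```

For example, pin_gradle_jdk /path/to/jdk8 run from the project root. On Windows, note that backslashes in a .properties value must be doubled or replaced with forward slashes.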
QUESTION
I am new to Gobblin. I built Gobblin from the incubator-gobblin GitHub master branch. Now I am trying the Wikipedia example from the getting started guide, but I get the following error.
WARN: HADOOP_HOME is not defined. Gobblin Hadoop libs will be used in classpath.
Error: Could not find or load main class org.apache.gobblin.runtime.cli.GobblinCli
With --show-classpath, it gives /mnt/c/users/name/incubator-gobblin/conf/classpath::
How can I solve this? Please let me know if anyone knows the solution.
ANSWER
Answered 2020-Feb-04 at 18:18
Make sure that you run this command in incubator-gobblin/build/gobblin-distribution/distributions/gobblin-dist and not in incubator-gobblin/gobblin-distribution
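The empty classpath in the question suggests that no jars were found next to the launcher script. A small guard can catch this before launching. This is only a sketch: the helper name has_gobblin_libs is hypothetical, and it assumes the unpacked distribution keeps its jars in a lib/ directory.

```shell
# Hypothetical check: succeed only if the given directory has at least
# one jar under lib/ (assumed layout of an unpacked distribution).
has_gobblin_libs() {
  dir="${1:-.}"
  for jar in "$dir"/lib/*.jar; do
    # If the glob matched nothing, "$jar" is the literal pattern and -f fails.
    [ -f "$jar" ] && return 0
  done
  return 1
}
```

Running has_gobblin_libs . || echo "no jars under lib/; wrong directory?" from the directory where you start Gobblin gives an early hint that the working directory is not the built distribution.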
QUESTION
I am trying to ingest data from MySQL into HDFS using Gobblin. I run mysql-to-gobblin.pull using the steps below:
1) start hadoop:
sbin\start-all.cmd
2) start mysql service:
sudo service mysql start
3) set GOBBLIN_WORK_DIR:
export GOBBLIN_WORK_DIR=/mnt/c/users/name/incubator-gobblin/GOBBLIN_WORK_DIR
4) set GOBBLIN_JOB_CONFIG_DIR
export GOBBLIN_JOB_CONFIG_DIR=/mnt/c/users/name/incubator-gobblin/GOBBLIN_JOB_CONFIG_DIR
5) Start standalone
bin/gobblin.sh service standalone start --jars /mnt/C/Users/name/incubator-gobblin/build/gobblin-sql/libs/gobblin-sql-0.15.0.jar
This gives the error below:
...
ANSWER
Answered 2020-Mar-06 at 12:23
The solution is to add the jar (or dependency) that contains the missing class, which gets rid of Caused by: java.lang.ClassNotFoundException: org.apache.gobblin.source.extractor.extract.jdbc.MysqlSource
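One way to find out which jar provides a missing class is to search the built jars for the class file. The sketch below assumes the unzip tool is installed; the helper names class_entry and find_jar_for_class are hypothetical, not part of Gobblin.

```shell
# Convert a fully qualified class name to its entry path inside a jar.
class_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print every jar under a directory (default: current directory)
# whose listing contains the given class.
find_jar_for_class() {
  entry="$(class_entry "$1")"
  search_dir="${2:-.}"
  find "$search_dir" -name '*.jar' | while IFS= read -r jar; do
    if unzip -l "$jar" 2>/dev/null | grep -F -q "$entry"; then
      printf '%s\n' "$jar"
    fi
  done
}
```

For example, find_jar_for_class org.apache.gobblin.source.extractor.extract.jdbc.MysqlSource build lists candidate jars under the build directory, which can then be passed to --jars.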
QUESTION
I want to install Apache Gobblin on my Mac OS X machine. For this, I downloaded version 0.14.0 and followed the steps here.
The first thing I did was this:
...
ANSWER
Answered 2018-Dec-14 at 09:59
I just dug a little into the code. Are you sure that Java 9 is supported by their build scripts?
Look at the line you have an issue with: globalDependencies.gradle:44. See ToolProvider.getSystemToolClassLoader(). Now let's look at its docs for Java 9:
Deprecated. This method is subject to removal in a future version of Java SE. Use the system tool provider or service loader mechanisms to locate system tools as well as user-installed tools. Returns a class loader that may be used to load system tools, or null if no such special loader is provided.
Implementation Requirements:
This implementation always returns null.
See that? It always returns null!
Things were different in Java 8, though:
Returns the class loader for tools provided with this platform. This does not include user-installed tools. Use the service provider mechanism for locating user installed tools.
So the script is calling getURLs on a null object and obviously throws an NPE. It probably needs to be fixed!
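Since the failure depends on the Java version, a pre-build check can flag the problem early. The following is a sketch only: java_major and current_java_version are hypothetical helper names, and the check looks at the java found on PATH, which may differ from the JDK that Gradle actually selects (for example through JAVA_HOME).

```shell
# Reduce a Java version string to its major version.
# Legacy strings start with "1." (1.8.0_202 means Java 8);
# newer ones start with the major version itself (9.0.4, 13).
java_major() {
  case "$1" in
    1.*) printf '%s\n' "$1" | cut -d. -f2 ;;
    *)   printf '%s\n' "$1" | cut -d. -f1 | cut -d- -f1 ;;
  esac
}

# Read the version string of the java on PATH (java prints it on stderr).
current_java_version() {
  java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

A guard such as [ "$(java_major "$(current_java_version)")" = "8" ] || echo "build scripts expect JDK 8" can then run before ./gradlew build.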
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install incubator-gobblin
Download gradle-wrapper.jar (version 2.13) and place it in the gradle/wrapper folder. See 'Instructions to download gradle wrapper' above.
Skip tests and build the distribution: run
./gradlew build -x findbugsMain -x test -x rat -x checkstyleMain
The distribution will be created in the build/gobblin-distribution/distributions directory.
(or)
Run tests and build the distribution (requires Maven): run
./gradlew build
The distribution will be created in the build/gobblin-distribution/distributions directory.