
metastore | Flexible Metadata, Data and Configuration information store

by pentaho | Java | Version: Current | License: No License


kandi X-RAY | metastore Summary

metastore is a Java library typically used in Utilities applications. metastore has no bugs and no reported vulnerabilities, it has a build file available, and it has low support. You can download it from GitHub or GitLab.
This project contains a flexible metadata, data and configuration information store. Anyone can use it, but it was designed for use within the Pentaho software stack. The meta-model is simple and very generic. The top-level entry is always a namespace; non-Pentaho companies can use their own namespace to keep their information separate from everyone else's. The next level in the meta-model is an Element Type. A very generic name was chosen on purpose to reflect the fact that you can store just about anything; the element type is, at this point in time, nothing more than a simple placeholder with an ID, a name and a description. Finally, each element type can have a series of Elements. Each element has an ID and a set of key/value pairs (called "id" and "value") as child attributes, and every attribute can have children of its own. An element also carries security information: an owner and a set of owner permissions describing who has which (CRUD) permission to use the element.
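
To make that hierarchy concrete, here is a minimal usage sketch in Java. It assumes the org.pentaho.metastore API (IMetaStore, MemoryMetaStore, and the new*/create* factory methods) as the author understands it; the namespace, element type, and attribute names are purely illustrative, and the exact signatures should be verified against the project sources.

import org.pentaho.metastore.api.IMetaStore;
import org.pentaho.metastore.api.IMetaStoreElement;
import org.pentaho.metastore.api.IMetaStoreElementType;
import org.pentaho.metastore.api.exceptions.MetaStoreException;
import org.pentaho.metastore.stores.memory.MemoryMetaStore;

public class MetaStoreModelSketch {
  public static void main(String[] args) throws MetaStoreException {
    // In-memory store used here purely for illustration.
    IMetaStore metaStore = new MemoryMetaStore();

    // Top level: a namespace keeps one party's metadata separate from everyone else's.
    String namespace = "acme"; // hypothetical namespace
    if (!metaStore.namespaceExists(namespace)) {
      metaStore.createNamespace(namespace);
    }

    // Next level: an element type is a generic placeholder with a name and a description.
    IMetaStoreElementType connectionType = metaStore.newElementType(namespace);
    connectionType.setName("Database connection"); // hypothetical element type
    connectionType.setDescription("Connection details for a database");
    metaStore.createElementType(namespace, connectionType);

    // Finally: an element holds key/value attributes, which can themselves be nested.
    IMetaStoreElement element = metaStore.newElement();
    element.setName("warehouse");
    element.addChild(metaStore.newAttribute("hostname", "db.example.com"));
    element.addChild(metaStore.newAttribute("port", "5432"));
    metaStore.createElement(namespace, connectionType, element);
  }
}

The same namespace / element type / element shape is meant to apply regardless of which backing store implementation is used.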

Support

  • metastore has a low active ecosystem.
  • It has 19 stars, 77 forks, and 70 watchers.
  • It had no major release in the last 12 months.
  • There is 1 open issue and 0 closed issues. There are 2 open pull requests and 0 closed pull requests.
  • It has a neutral sentiment in the developer community.
  • The latest version of metastore is current.

Quality

  • metastore has 0 bugs and 0 code smells.

Security

  • Neither metastore nor its dependent libraries have any reported vulnerabilities.
  • metastore code analysis shows 0 unresolved vulnerabilities.
  • There are 0 security hotspots that need review.

License

  • metastore does not have a standard license declared.
  • Check the repository for any license declaration and review the terms closely.
  • Without a license, all rights are reserved, and you cannot use the library in your applications.

Reuse

  • metastore releases are not available. You will need to build from source code and install.
  • A build file is available, so you can build the component from source.
  • Installation instructions, examples and code snippets are available.
Top functions reviewed by kandi - BETA

kandi has reviewed metastore and identified the functions below as its top functions. This is intended to give you an instant insight into the functionality metastore implements and to help you decide whether it suits your requirements. A hedged usage sketch follows the list.

  • Saves attributes in a meta store.
  • Deletes an element type.
  • Loads an attribute.
  • Appends the security element.
  • Deletes a folder and all its subdirectories.
  • Saves the metadata to a stream result file.
  • Deletes a child attribute with the given id.
  • Executes a locked operation.
  • Registers an element type with the given namespace.
  • Gets the text node value.
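
As a companion to that list, the sketch below shows how the load and delete operations might be driven through the public API, continuing the example from the summary above. The method names (getElementTypeByName, getElements, getChild, deleteElement, deleteElementType) are assumptions about the IMetaStore interface rather than verified signatures, and "Database connection" is the same hypothetical type name used earlier.

import java.util.List;

import org.pentaho.metastore.api.IMetaStore;
import org.pentaho.metastore.api.IMetaStoreAttribute;
import org.pentaho.metastore.api.IMetaStoreElement;
import org.pentaho.metastore.api.IMetaStoreElementType;
import org.pentaho.metastore.api.exceptions.MetaStoreException;

public class MetaStoreCleanupSketch {
  // Loads every element of a named type, reads one attribute, then deletes
  // the elements and finally the element type itself.
  static void cleanUp(IMetaStore metaStore, String namespace) throws MetaStoreException {
    IMetaStoreElementType type = metaStore.getElementTypeByName(namespace, "Database connection");
    if (type == null) {
      return; // nothing to clean up
    }
    List<IMetaStoreElement> elements = metaStore.getElements(namespace, type);
    for (IMetaStoreElement element : elements) {
      IMetaStoreAttribute hostname = element.getChild("hostname"); // nested attributes are read the same way
      if (hostname != null) {
        System.out.println(element.getName() + " -> " + hostname.getValue());
      }
      metaStore.deleteElement(namespace, type, element.getId());
    }
    // An element type can only be removed once its elements are gone.
    metaStore.deleteElementType(namespace, type);
  }
}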


                      metastore Key Features

                      Flexible Metadata, Data and Configuration information store

                      metastore Examples and Code Snippets


                      How to build

                      $ mvn clean install
                      

                      Not able to query AWS Glue/Athena views in Databricks Runtime ['java.lang.IllegalArgumentException: Can not create a Path from an empty string;']

import boto3
import time


def execute_blocking_athena_query(query: str, athenaOutputPath, aws_region):
    # Run an Athena query and poll until it finishes, raising on failure.
    athena = boto3.client("athena", region_name=aws_region)
    res = athena.start_query_execution(QueryString=query, ResultConfiguration={
        'OutputLocation': athenaOutputPath})
    execution_id = res["QueryExecutionId"]
    while True:
        res = athena.get_query_execution(QueryExecutionId=execution_id)
        state = res["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            return
        if state in ["FAILED", "CANCELLED"]:
            raise Exception(res["QueryExecution"]["Status"]["StateChangeReason"])
        time.sleep(1)


def create_cross_platform_view(db: str, table: str, query: str, spark_session, athenaOutputPath, aws_region):
    # Create the view once via Athena (Presto) and once via Spark, then merge
    # the Presto view text into the Spark view's Glue definition so that both
    # engines can read it.
    glue = boto3.client("glue", region_name=aws_region)
    glue.delete_table(DatabaseName=db, Name=table)
    create_view_sql = f"create view {db}.{table} as {query}"
    execute_blocking_athena_query(create_view_sql, athenaOutputPath, aws_region)
    presto_schema = glue.get_table(DatabaseName=db, Name=table)["Table"][
        "ViewOriginalText"
    ]
    glue.delete_table(DatabaseName=db, Name=table)

    spark_session.sql(create_view_sql).show()
    spark_view = glue.get_table(DatabaseName=db, Name=table)["Table"]
    for key in [
        "DatabaseName",
        "CreateTime",
        "UpdateTime",
        "CreatedBy",
        "IsRegisteredWithLakeFormation",
        "CatalogId",
    ]:
        if key in spark_view:
            del spark_view[key]
    spark_view["ViewOriginalText"] = presto_schema
    spark_view["Parameters"]["presto_view"] = "true"
    spark_view = glue.update_table(DatabaseName=db, TableInput=spark_view)


create_cross_platform_view("<YOUR DB NAME>", "<YOUR VIEW NAME>", "<YOUR VIEW SQL QUERY>", <SPARK_SESSION_OBJECT>, "<S3 BUCKET FOR OUTPUT>", "<YOUR-ATHENA-SERVICE-AWS-REGION>")

                      How to Set Log Level for Third Party Jar in Spark

                      log4j.logger.com.kinetica.spark=INFO
                      log4j.logger.com.kinetica.spark.LoaderParams=WARN
                      

                      Snowflake Pyspark: Failed to find data source: snowflake

                      docker run --interactive --tty \
                                      --volume /src:/src \
                                      --volume /data/:/root/data \
                                      --volume /jars:/jars \
                                      reports bash '-c' "cp -r /jars /opt/spark-3.1.1-bin-hadoop3.2/jars && cd /home && export PYTHONIOENCODING=utf8 && spark-submit \
                                      /src/reports.py \
                                      --jars net.snowflake:/jars/snowflake-jdbc-3.13.14.jar,net.snowflake:/jars/spark-snowflake_2.12-2.10.0-spark_3.1.jar \
                                      --partitions-output "4" \
                                      1> >(sed $'s,.*,\e[32m&\e[m,' >&2)" || true
                      
                      docker run --interactive --tty \
                                      --volume /src:/src \
                                      --volume /data/:/root/data \
                                      --volume /jars:/jars \
                                      reports bash '-c' "cp -r /jars /opt/spark-3.1.1-bin-hadoop3.2/jars && cd /home && export PYTHONIOENCODING=utf8 && spark-submit \
                                      --jars /jars/snowflake-jdbc-3.13.14.jar,/jars/spark-snowflake_2.12-2.10.0-spark_3.1.jar \
                                      /src/reports.py \
                                      --partitions-output "4" \
                                      1> >(sed $'s,.*,\e[32m&\e[m,' >&2)" || true
                      

                      Spark SQL queries against Delta Lake Tables using Symlink Format Manifest

                      delta.`<table-path>`
                      
                      spark.sql("""select * from delta.`s3://<bucket>/<key>/<table-name>/` limit 10""")
                      

Spark application syncing with Hive metastore - "There is no primary group for UGI spark" error

                      System.setProperty("HADOOP_USER_NAME", "root")
                      

                      How do you implement SASTokenProvider for per-container SAS token access?

                      %scala
                      package com.foo
                      
                      import org.apache.hadoop.fs.FileSystem
                      import org.apache.spark.sql.catalyst.DefinedByConstructorParams
                      
                      import scala.util.Try
                      
                      import scala.language.implicitConversions
                      import scala.language.reflectiveCalls
                      
                      trait DBUtilsApi {
                          type SecretUtils
                          type SecretMetadata
                          type SecretScope
                          val secrets: SecretUtils
                      }
                      
                      object ReflectiveDBUtils extends DBUtilsApi {
                          
                          private lazy val dbutils: DBUtils =
                              Class.forName("com.databricks.service.DBUtils$").getField("MODULE$").get().asInstanceOf[DBUtils]
                      
                          override lazy val secrets: SecretUtils = dbutils.secrets
                      
                          type DBUtils = AnyRef {
                              val secrets: SecretUtils
                          }
                      
                          type SecretUtils = AnyRef {
                              def get(scope: String, key: String): String
                              def getBytes(scope: String, key: String): Array[Byte]
                              def list(scope: String): Seq[SecretMetadata]
                              def listScopes(): Seq[SecretScope]
                          }
                      
                          type SecretMetadata = DefinedByConstructorParams { val key: String }
                      
                          type SecretScope = DefinedByConstructorParams { val name: String }
                      }
                      
                      class VaultTokenProvider extends org.apache.hadoop.fs.azurebfs.extensions.SASTokenProvider {
                        def getSASToken(accountName: String,fileSystem: String,path: String,operation: String): String = {
                          return ReflectiveDBUtils.secrets.get("scope", "SECRET")
                        }
                        def initialize(configuration: org.apache.hadoop.conf.Configuration, accountName: String): Unit = {    
                        }
                      }
                      
                      spark.conf.set("fs.azure.account.auth.type.bidbtests.dfs.core.windows.net", "SAS")
                      spark.conf.set("fs.azure.sas.token.provider.type.bidbtests.dfs.core.windows.net", "com.foo.VaultTokenProvider")
                      

                      How to delete data physically with Presto/Trino?

                      CREATE SCHEMA hive.xyz WITH (location = 'abfs://...');
                      CREATE TABLE hive.xyz.test AS SELECT (...);
                      
                      DELETE FROM hive.xyz.test WHERE TRUE;
                      
                      -- Data ARE physically deleted
                      
                      
                      CREATE SCHEMA hive.xyz;
                      CREATE TABLE hive.xyz.test 
                          WITH (external_location = 'abfs://...') 
                          AS SELECT (...);
                      
                      DELETE FROM hive.xyz.test WHERE TRUE;
                      
                      -- Data ARE NOT physically deleted.
                      

                      Spark Java append data to Hive table

                      df.registerTempTable("sample.temptable")
                      
                      sqlContext.sql("CREATE TABLE IF NOT EXISTS sample.test_table as select * from sample.temptable")
                      
                      sqlContext.sql("CREATE TABLE IF NOT EXISTS sample.test_table")
                      
                      sqlContext.sql("insert into table sample.test_table select * from sample.temptable")
                      
                      sqlContext.sql("DROP TABLE IF EXISTS sample.temptable")
                      

                      Apache Spark: broadcast join behaviour: filtering of joined tables and temp tables

                      df = df1.join(F.broadcast(df2),df1.some_col == df2.some_col, "left")
                      
                      == Physical Plan ==
                      *(1) BroadcastHashJoin [key#122], [key#111], Inner, BuildLeft, false
                      :- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#168]
                      :  +- LocalTableScan [key#122, df_a_column#123]
                      +- *(1) LocalTableScan [key#111, value#112]
                      
                      == Physical Plan ==
                      *(1) BroadcastHashJoin [key#122], [key#111], Inner, BuildRight, false
                      :- *(1) LocalTableScan [key#122, df_a_column#123]
                      +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#152]
                         +- LocalTableScan [key#111, value#112]
                      
                      == Physical Plan ==
                      *(1) BroadcastHashJoin [key#122], [key#111], Inner, BuildRight, false
                      :- *(1) LocalTableScan [key#122, df_a_column#123]
                      +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#184]
                         +- LocalTableScan [key#111, value#112]
                      

                      Performance of spark while reading from hive vs parquet

                      spark.read.option("basePath", "s3a://....").parquet("s3a://..../date_col=2021-06-20")
                      


Community Discussions

Trending Discussions on metastore
  • Bigquery as metastore for Dataproc
  • Not able to query AWS Glue/Athena views in Databricks Runtime ['java.lang.IllegalArgumentException: Can not create a Path from an empty string;']
  • Unable to run pyspark on local windows environment: org.apache.hadoop.io.nativeio.NativeIO$POSIX.stat(Ljava/lang/String;)Lorg/apache/hadoop/io/nativei
  • Confluent Platform - how to properly use ksql-datagen?
  • Spark-SQL plug in on HIVE
  • How to Set Log Level for Third Party Jar in Spark
  • Snowflake Pyspark: Failed to find data source: snowflake
  • Spark SQL queries against Delta Lake Tables using Symlink Format Manifest
  • How to run Spark SQL Thrift Server in local mode and connect to Delta using JDBC
  • Why Uncache table in spark-sql not working?

QUESTION

Bigquery as metastore for Dataproc

Asked 2022-Apr-01 at 04:00

We are trying to migrate a PySpark script, which creates and drops tables in Hive as part of its data transformations, from on-premises to the GCP platform.

Hive is replaced by BigQuery. In this case, the Hive reads and writes are converted to BigQuery reads and writes using the spark-bigquery-connector.

However, the problem lies with creating and dropping BigQuery tables via Spark SQL, because Spark SQL will by default run the CREATE and DROP queries against Hive, backed by the Hive metastore, not against BigQuery.

I wanted to check whether there is a plan to incorporate DDL statement support into the spark-bigquery-connector.

Also, from an architecture perspective, is it possible to base the metastore for Spark SQL on BigQuery, so that any CREATE or DROP statement can be run on BigQuery from Spark?

ANSWER

Answered 2022-Apr-01 at 04:00

I don't think Spark SQL will support BigQuery as a metastore, nor will the BQ connector support BQ DDL. On Dataproc, Dataproc Metastore (DPMS) is the recommended solution for the Hive and Spark SQL metastore.

In particular, for on-prem to Dataproc migration, it is more straightforward to migrate to DPMS; see this doc.

Source: https://stackoverflow.com/questions/71676161

Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

                      Vulnerabilities

                      No vulnerabilities reported

Install metastore

metastore uses the Maven framework. To build it, run the command shown in "How to build" above (mvn clean install). Optionally you can specify -Drelease to trigger obfuscation and/or uglification (as needed), and -Dmaven.test.skip=true to skip the tests (even though you shouldn't). The build result will be a Pentaho package located in target.

Prerequisites:
  • Maven, version 3+
  • Java JDK 11
  • This settings.xml in your ~/.m2 directory
  • Don't use IntelliJ's built-in Maven; make it use the same one you use from the command line (Project Preferences -> Build, Execution, Deployment -> Build Tools -> Maven -> Maven home directory).

Support

For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check and ask questions on the Stack Overflow community page.
