FiloDB | Distributed Prometheus time series database
FiloDB is an open-source distributed, real-time, in-memory, massively scalable, multi-schema time series / event / operational database with Prometheus query support and some Spark support as well.

The normal configuration for real-time ingestion is deployment as standalone processes in a cluster, ingesting directly from Apache Kafka. The processes form a cluster using peer-to-peer Akka Cluster technology.

See the docs folder for the overview presentation and design docs. To compile the .mermaid source files to .png files, install the Mermaid CLI.
Community Discussions
QUESTION
I need to iterate over the last 36 months based on an input date. Using Scala, I am getting the max value of a timestamp field from a DataFrame. For example:
ANSWER
Answered 2018-Jan-08 at 13:15
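The answer body was not captured above, but the 36-month walk itself can be sketched in plain Scala with java.time (the input date and helper names here are illustrative, not the accepted answer's code):

```scala
import java.time.{LocalDate, YearMonth}

object LastMonths {
  // Given the max date pulled from the DataFrame, return the n months
  // ending at (and including) that date's month, oldest first.
  def lastMonths(maxDate: LocalDate, n: Int = 36): Seq[YearMonth] = {
    val end = YearMonth.from(maxDate)
    (0 until n).map(i => end.minusMonths((n - 1 - i).toLong))
  }

  def main(args: Array[String]): Unit =
    // Example input standing in for the DataFrame's max timestamp.
    lastMonths(LocalDate.of(2018, 1, 8)).foreach(println)
}
```

Each YearMonth can then be turned back into a per-month filter range via atDay(1) and atEndOfMonth for the actual DataFrame queries.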
QUESTION
I'm using FiloDB 0.4 with a Cassandra 2.2.5 column and meta store, and I am trying to insert data into it using Spark Streaming 1.6.1 + Jobserver 0.6.2. I use the following code to insert data:
ANSWER
Answered 2017-Jan-26 at 21:23
@psyduck, this is most likely because, in the 0.4 version, data for each partition can only be ingested on one node at a time. So to stick with the current version, you would need to partition your data into multiple partitions and then ensure each worker only gets one partition. The easiest way to achieve this is to sort your data by partition key.
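A minimal sketch of that routing idea in plain Scala: group the records by partition key so each batch holds exactly one partition's data, mirroring what sorting the DataFrame by partition key achieves in Spark (Record and its field names are hypothetical, not FiloDB's API):

```scala
object PartitionRouting {
  // Hypothetical record shape; a real job would carry the dataset's row type.
  final case class Record(partitionKey: String, timestamp: Long, value: Double)

  // Group records so each FiloDB partition's rows form one contiguous batch,
  // so a single worker can ingest all rows for a given partition.
  def batchesByPartition(records: Seq[Record]): Map[String, Seq[Record]] =
    records.groupBy(_.partitionKey)

  def main(args: Array[String]): Unit = {
    val recs = Seq(
      Record("hostA", 1L, 0.1),
      Record("hostB", 1L, 0.2),
      Record("hostA", 2L, 0.3)
    )
    batchesByPartition(recs).foreach { case (key, rows) =>
      println(s"$key -> ${rows.size} rows")
    }
  }
}
```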
I would highly encourage you to move to the latest version though: master (Spark 2.x / Scala 2.11) or the spark1.6 branch (Spark 1.6 / Scala 2.10). The latest version has many changes that are not in 0.4 and that would solve your problem:
- Akka Cluster is used to automatically route your data to the right ingestion node. In this case, with the same model, your data would all go to the right node, ensuring no data loss
- TimeUUID-based chunkIDs, so even if multiple workers (say, during a split brain) somehow write to the same partition, data loss is avoided
- A new "segmentless" data model, so you don't need to define any segment keys; it is more efficient for both reads and writes
Feel free to reach out on our mailing list, https://groups.google.com/forum/#!forum/filodb-discuss
Community discussions and code snippets include sources from the Stack Exchange Network.
Vulnerabilities
No vulnerabilities reported
Install FiloDB
Clone the project and cd into the project directory:

$ git clone https://github.com/filodb/FiloDB.git
$ cd FiloDB

It is recommended that you use the last stable released version. To build, run filo-cli (see below) and also sbt spark/assembly.
Since FiloDB exposes a Prometheus-compatible HTTP API, it is possible to set up FiloDB as a Grafana data source.
- Set the data source type to "Prometheus"
- In the HTTP URL box, enter the FiloDB HTTP URL (usually the load balancer in front of all the FiloDB endpoints). Be sure to append /promql/timeseries/, substituting the name of your dataset for "timeseries" if it is named differently.