kiba | Data processing & ETL framework for Ruby | Data Migration library
kandi X-RAY | kiba Summary
Writing reliable, concise, well-tested & maintainable data-processing code is tricky. Kiba lets you define and run such high-quality ETL (Extract-Transform-Load) jobs using Ruby.
Top functions reviewed by kandi - BETA
- Creates a new Table instance.
- Converts an instance to a class instance.
- Runs the generator.
- Converts all definitions into a hash.
- Enumerates sources.
- Creates a new matrix with the given rows.
- Parses the context object.
- Closes all destinations.
- The destination list.
- Adds a source to the source list.
kiba Key Features
kiba Examples and Code Snippets
Community Discussions
Trending Discussions on kiba
QUESTION
I need to parse JSON that comes from a crypto exchange API. In this case, I need to parse my open orders. If there are no open orders, the JSON is:
...ANSWER
Answered 2022-Feb-17 at 15:40

Public Async Function ExecuteAsync() As Task
    Dim wc As New WebClient
    'you can await this task and get the json string result instead of adding it to a list first
    Dim json = Await wc.DownloadStringTaskAsync("https://api.hotbit.io/api/v1/order.pending?market=KIBA/USDT&offset=0&limit=100&api_key=44812d8f-66d3-01c0-94c3b29305040b03&sign=F3330B924E1873B9C8FAB40A25D7B851")
    'deserialize the json
    Dim rootObject = Example.FromJson(json)
    'navigate to Kibausdt
    Dim kibausdt = rootObject.result.KIBAUSDT
    'check total
    If kibausdt.total = 0 Then
        RichTextBox1.Text = "0 opened orders"
    Else
        'loop through records
        For Each record In kibausdt.records
            Dim side As String = record.side.ToString()
            Dim amount As Long = record.amount
            Dim price As String = record.price
            RichTextBox1.Text &= side & amount & price
        Next
    End If
End Function
QUESTION
I'm trying to parse this JSON
With the code:
...ANSWER
Answered 2022-Feb-13 at 17:34
Give a go at this..
- Paste this into a new class file:
QUESTION
I'm trying to parse this API,
which is basically an order book with "asks" and "bids".
How can I parse them, splitting the asks from the bids?
For example, the API starts with the asks property.
So if the JSON is {"asks":[["0.00001555","3264400"],["0.00001556","3662200"],["0.00001573","3264400"]
I'm expecting an output like:
[asks]
Price- Quantity
0.00001555 - 3264400
0.00001556 - 3662200
0.00001573 - 3264400
etc.
and after that there is "bids":[["0.00001325","300"],["0.00001313","100"],["0.00001312","1051400"],["0.00001311","1300000"],["0.0000131","9336700"]
so I'm expecting
[bids]
Price- Quantity
0.00001325- 300
0.00001313 - 100
0.00001312 - 1051400
0.00001311 - 1300000
0.0000131 - 9336700
etc.
I know how to parse each single value with the code:
...ANSWER
Answered 2022-Feb-13 at 17:19
Use JsonUtils, which will create the classes for you, and then you can copy these into your project.
In this instance it will create:
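Since this page is about a Ruby library, here is a minimal plain-Ruby sketch of the same idea, independent of the generated VB.NET classes: split the asks from the bids and format each side as "Price - Quantity" lines. The JSON shape ([[price, quantity], ...] per side) is taken from the question.

```ruby
require "json"

# Build a "[asks] ... [bids] ..." report from the order book JSON.
def format_order_book(json)
  book = JSON.parse(json)
  %w[asks bids].flat_map do |side|
    header = ["[#{side}]", "Price - Quantity"]
    rows = (book[side] || []).map { |price, quantity| "#{price} - #{quantity}" }
    header + rows
  end.join("\n")
end
```
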
QUESTION
ANSWER
Answered 2022-Jan-31 at 16:50
I chose to act on the output string instead of tackling the OCR API.
Fixing the issue within the OCR API would probably be a superior solution if possible, but I could not get your code properly referenced in my system.
So you can add this function to transpose the string
QUESTION
I am trying to get the values in table cells with the same class name through the WebView2 control.
It works for getting a single value, but now I want to take into consideration that there can be more than a single value. The table's full HTML code is:
...ANSWER
Answered 2022-Jan-02 at 01:05
This requires some more javascript to work. However, once implemented it should work better.
First save the following javascript to 'script.js' in your project's root directory (or a subdirectory, if you change the path in code). Make sure you select the file's properties and select Copy to output directory: Copy if newer. That copies the script file so WebView2 can find it.
Here's the javascript:
QUESTION
In Realm, I have a problem understanding (I'm new to Realm T,T) the implementation of LinkingObjects. Let's say a Person could have more than one Dog (a List of Dog), so I would write code such as below:
...ANSWER
Answered 2021-Jun-13 at 15:23
You can think of LinkingObjects almost as a computed property - it automagically creates an inverse link to the parent object when the child object is added to the parent object's List.
So when a Dog is added to a person's dogs list, a person reference is added to the Dog's walkers list. Keep in mind that it's a many-to-many relationship, so technically if Person A adds Doggo, and Person B adds Doggo, the Doggo's inverse relationship 'walkers' will contain Person A and Person B.
the app still can run normally without any diff
Which is true, it doesn't affect the operation of the app. However, the difference is that by removing the walkers LinkingObjects, there's no way to query Dogs for their Person and get Dog Results (i.e. you can't traverse the graph of the relationship back to the person).
In other words we can query Person for kinds of dog stuff
QUESTION
I have a Kiba job that takes a CSV file (with Kiba::Common::Sources::CSV), enriches its data, merges some rows (with the ChainableAggregateDestination destination described here) and saves it to another CSV file (with Kiba::Common::Destinations::CSV).
Now, I want to sort the rows differently (based on the first column) in my destination CSV. I can't find a way to write a transform that does this. I could use post_process to reopen the destination CSV, sort it and rewrite it, but I guess there is a cleaner way...
Can someone point me in the right direction?
...ANSWER
Answered 2021-May-14 at 18:44
To sort rows, a good strategy is to use an "aggregating transform", as explained in this article, to store all the rows in memory (although you could do it out of memory), then at transform "close" time, sort them and re-emit them in the pipeline.
This is the most flexible design IMO.
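A minimal sketch of such an aggregating transform, assuming Kiba v3+ (where a transform's close method can yield rows back into the pipeline); the sort key is passed in, and the actual field name used in your job is up to you:

```ruby
# Buffers every row in memory, then re-emits them sorted at close time.
class SortingTransform
  def initialize(sort_key)
    @sort_key = sort_key
    @rows = []
  end

  def process(row)
    @rows << row
    nil # swallow rows for now; they are re-emitted in close
  end

  def close
    # re-emit all buffered rows, sorted by the configured key
    @rows.sort_by { |row| row[@sort_key] }.each { |row| yield row }
  end
end
```

In the job definition this would be declared, for example, as `transform SortingTransform, :first_column` just before the CSV destination.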
QUESTION
I'm looking at writing one of our ETL (or ETL-like) processes in Kiba and I wonder how to structure it. The main question I have is about the overall architecture. The process works roughly like this:
- Fetch data from an HTTP endpoint.
- For each item returned from that API, make one more HTTP call
- Do some transformations for each of the items returned from step 2
- Send each item somewhere else
Now my question is: is it OK if only step one is a source and anything until the end is a transform? Or would it be better to somehow have each HTTP call be a source and then combine these somehow, maybe using multiple jobs?
ANSWER
Answered 2021-Mar-12 at 13:54
It is indeed best to use a single source, which you will use to fetch the main stream of the data.
General advice: try to work in batches as much as you can (e.g. pagination in the source, but also bulk HTTP lookup if the API supports it in step 2).
Source section
The source in your case could be a paginating HTTP resource, for instance.
A first option to implement it would be to write a dedicated class, as explained in the documentation.
A second option is to use Kiba::Common::Sources::Enumerable (https://github.com/thbar/kiba-common#kibacommonsourcesenumerable) like this:
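A sketch of the paginating idea in plain Ruby, where `fetch_page` is a hypothetical callable wrapping your HTTP client that returns an empty array once all pages are consumed:

```ruby
# Lazily walks the pages of an API, yielding each item into the pipeline.
def paginated_enumerator(fetch_page)
  Enumerator.new do |yielder|
    page = 1
    loop do
      items = fetch_page.call(page)
      break if items.empty? # no more pages
      items.each { |item| yielder << item }
      page += 1
    end
  end
end

# Inside the Kiba job definition, this could then be wired up as:
#   source Kiba::Common::Sources::Enumerable, -> { paginated_enumerator(my_client) }
```
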
QUESTION
A third-party system produces an HTML table of parent-teacher bookings:
...ANSWER
Answered 2021-Mar-10 at 08:52
Kiba author here!
I see at least two ways of doing this (no matter if you work with plain Ruby or with Kiba):
- converting your HTML to a table, then working from that data
- working directly with the HTML table (using Nokogiri & selectors), applicable only if the HTML is mostly clean
In all cases, because you are doing some scraping, I recommend that you have very defensive code (because HTML changes and can contain bugs or corner cases later), e.g. strong assertions on the fact that the lines/columns contain what you expect, verifications, etc.
If you go plain Ruby, then for instance you could do something like this (here modelling your data as text separated with commas to keep things clear):
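A small sketch of that defensive, plain-Ruby approach, modelling the bookings table as comma-separated text. The column names here are hypothetical; the point is the hard assertion on the header row, so the job fails fast if the table layout ever changes.

```ruby
require "csv"

# Hypothetical column layout of the bookings table.
EXPECTED_HEADERS = ["Teacher", "Time", "Parent"].freeze

def parse_bookings(text)
  rows = CSV.parse(text)
  headers = rows.shift
  # defensive assertion: stop immediately if the scraped layout changed
  raise "Unexpected headers: #{headers.inspect}" unless headers == EXPECTED_HEADERS
  rows.map do |teacher, time, parent|
    { teacher: teacher, time: time, parent: parent }
  end
end
```
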
QUESTION
I'm working on an ETL pipeline with Kiba which imports into multiple, related models in my Rails app. For example, I have records which have many images. There might also be collections which contain many records.
The sources of my data will be varied, including HTTP APIs and CSV files. I would like to make the pipeline as modular and reusable as possible, so that for each new type of source, I only have to create the source, and the rest of the pipeline definition stays the same.
Given multiple models in the destination, and possibly several API calls to get the data from the source, what's the standard pattern for this in Kiba?
I could create one pipeline where the destination is 'the application' and has responsibility for all these models, but this feels like the wrong approach because the destination would be responsible for saving data across different Rails models, uploading images, etc.
Should I create one master pipeline which triggers more specific ones, passing in a specific type of data (e.g. image URLs for import)? Or is there a better approach than this?
Thanks.
...ANSWER
Answered 2020-Sep-12 at 12:17
Kiba author here!
It is natural & common to look for some form of genericity, modularity and reusability in data pipelines. I would say though that, like for regular code, it can initially be hard to figure out the correct way to get that (it will depend quite a bit on your exact situation).
This is why my recommendation would be instead to:
- Start simple (on one specific job)
- Very important: make sure to implement end-to-end automated tests (use webmock or similar to stub out API requests and make tests completely isolated; create tests with 1 row from source to destination) - this will make it easy to refactor stuff later
- Once you have that (1 pipeline with tests), you can start implementing a second one, and refactor to extract interesting patterns as reusable bits, and iterate from there
Depending on your exact situation, maybe you will extract specific components, or maybe you will end up extracting a whole generic job, or generic families of jobs etc.
This approach works well even as you get more experience working with Kiba (this is how I gradually extracted the components that you will find in kiba-common and kiba-pro, too).
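The "completely isolated" testing idea above can be sketched in plain Ruby by injecting a fake client into a row-level transform (in a real suite, webmock would stub the HTTP layer the same way); `EnrichTransform` and `fetch_details` are hypothetical stand-ins for your own step 2 lookup:

```ruby
# A transform that enriches each row via an injected HTTP client,
# so tests can swap the real client for a fake one.
class EnrichTransform
  def initialize(http_client)
    @http_client = http_client
  end

  def process(row)
    # hypothetical per-row lookup on the injected client
    row.merge(details: @http_client.fetch_details(row[:id]))
  end
end

# A plain-Ruby fake standing in for the HTTP client in tests.
class FakeClient
  def initialize(details)
    @details = details
  end

  def fetch_details(_id)
    @details
  end
end
```

Because the client is a constructor argument, the same transform runs unchanged in production (with the real client) and in the 1-row end-to-end tests.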
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported