scriptella-etl | open source ETL and script | Data Migration library

by scriptella | Java | Version: scriptella-parent-1.2 | License: Apache-2.0

kandi X-RAY | scriptella-etl Summary

scriptella-etl is a Java library typically used in migration and data migration applications. It has no reported bugs or vulnerabilities, a build file is available, it has a permissive license, and it has low support. You can download it from GitHub or Maven.

Open source ETL (Extract-Transform-Load) and script execution tool written in Java. Its primary focus is simplicity. It doesn't require the user to learn another complex XML-based language to use it, but allows the use of SQL or another scripting language suitable for the data source to perform required transformations.

            kandi-support Support

              scriptella-etl has a low active ecosystem.
              It has 91 star(s) with 41 fork(s). There are 18 watchers for this library.
              It had no major release in the last 12 months.
              There are 19 open issues and 11 have been closed. On average issues are closed in 332 days. There are 4 open pull requests and 0 closed requests.
              It has a neutral sentiment in the developer community.
              The latest version of scriptella-etl is scriptella-parent-1.2

            kandi-Quality Quality

              scriptella-etl has 0 bugs and 0 code smells.

            kandi-Security Security

              scriptella-etl has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              scriptella-etl code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            kandi-License License

              scriptella-etl is licensed under the Apache-2.0 License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              scriptella-etl releases are available to install and integrate.
              Deployable package is available in Maven.
              Build file is available. You can build the component from source.
              Installation instructions are not available. Examples and code snippets are available.
              scriptella-etl saves you 14400 person hours of effort in developing the same functionality from scratch.
              It has 28818 lines of code, 1768 functions and 495 files.
              It has medium code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed scriptella-etl and discovered the below as its top functions. This is intended to give you an instant insight into scriptella-etl implemented functionality, and help decide if they suit your requirements.
            • Reads the next statement
            • Search for a C-style comment
            • Read a normalized character
            • Positions the end of line
            • Create a dataset based on the given properties
            • Sorts the tables
            • Append column names
            • Gets the columns for a table
            • Configures the element
            • Execute the script
            • Executes the given script
            • Attempts to load one of the specified drivers
            • Shut down HSQLDB
            • Executes a query on a text provider
            • Returns a string representation of the connection
            • Sets the value of the designated parameter
            • Parses a property as a URL parameter
            • Initializes EtlExecutor
            • Configures the dialect
            • Compiles queries into a list of patterns
            • Returns a string representation of this entry
            • Parses the URL and extracts the query parameters
            • Parses the ldif file
            • Parse an SQL statement
            • Configures the properties
            • Executes a JavaScript source

            scriptella-etl Key Features

            No Key Features are available at this moment for scriptella-etl.

            scriptella-etl Examples and Code Snippets

            No Code Snippets are available at this moment for scriptella-etl.

            Community Discussions

            QUESTION

            Migrating multiple Databases using Migration Assistant Tool - SQL SERVER to AZURE SQL Database
            Asked 2022-Mar-29 at 08:15

            I'm trying to migrate all the old databases from SQL Server to Azure SQL Database using the Database Migration tool, and I have been able to do so successfully.

            There are more than 100 databases to migrate, so running the tool and repeating the process for every database is a lot of work.

            Can someone help with migrating multiple databases in one go, or is doing one at a time the only solution?

            Thanks.

            ...

            ANSWER

            Answered 2022-Mar-29 at 08:15

            You can only select a single database from your source to migrate to the Azure SQL database using the Data Migration Assistant tool.

            You can create an Azure Database Migration Service resource in the Azure portal and select one or more databases to migrate from SQL Server to Azure SQL Database.

            Refer to this Microsoft documentation for the detailed process.

            Source https://stackoverflow.com/questions/71657444

            QUESTION

            Oracle to Postgres Table load using SSIS load performance improvement suggestions?
            Asked 2022-Feb-20 at 09:22

            I'm trying to load a table from Oracle to Postgres using SSIS, with ~200 million records. Oracle, Postgres, and SSIS are on separate servers.

            Reading data from Oracle

            To read data from the Oracle database, I am using an OLE DB connection using "Oracle Provider for OLE DB". The OLE DB Source is configured to read data using an SQL Command.

            In total there are 44 columns, mostly varchar, 11 numeric, and 3 timestamps.

            Loading data into Postgres

            To load data into Postgres, I am using an ODBC connection. The ODBC destination component is configured to load data in batch mode (not row-by-row insertion).

            SSIS configuration

            I created an SSIS package that only contains a straightforward Data Flow Task.

            Issue

            The load takes many hours to reach even a million rows. The source query returns results quickly when executed in SQL Developer, but when I tried to export the results, it threw a limit exceeded error.

            In SSIS, when I tried to preview the result of the Source SQL command it returned: The system cannot find message text for message number 0x80040e51 in the message file for OraOLEDB. (OraOLEDB)

            Noting that the source(SQL command) and target table don't have any indexes.

            Could you please suggest any methods to improve the load performance?

            ...

            ANSWER

            Answered 2022-Feb-20 at 09:22

            I will try to give some tips to help you improve your package performance. You should start troubleshooting your package systematically to find the performance bottleneck.

            Some provided links are related to SQL Server. No worries! The same rules are applied in all database management systems.

            1. Available resources

            First, you should ensure that you have sufficient resources to load the data from the source server into the destination server.

            Ensure that the available memory on the source, ETL, and destination servers can handle the amount of data you are trying to load. Besides, make sure that your network connection bandwidth is not decreasing the data transfer performance.

            Make sure that the following hardware issues are not occurring in any of the servers:

            1. Drive out of storage
            2. Server is out of memory

            Make sure that your machine is not running out of memory. You can simply use the Task Manager to identify the amount of available memory.

            2. The Data Source

            Make sure that the table is not a heap

            After checking the available resources, you should ensure that your data source is not a heap. At least it would be best if you had a primary key created on that table.

            Indexes

            If your SQL Command contains any filtering, ordering, or Joins, you should create the indexes needed by those operations.

            OLE DB provider

            Instead of using OLE DB source to connect to Oracle, try using the Microsoft Connector for Oracle (Previously known as Attunity connectors). Microsoft previously mentioned that it should provide faster performance than traditional OLE DB providers.

            Use the Oracle connection manager rather than OLE DB connection manager.

            To be honest, I am not sure whether this component makes the data load faster, since I have not tested it.

            Removing the destination and adding a dummy task

            The last thing to try is to remove the ODBC destination and add any dummy task. For example, use a row count component.

            Run the package; if the data is loaded faster, then loading data from Oracle is not decreasing the performance.

            3. SSIS configuration

            Now, let us start troubleshooting the SSIS package.

            Running in 64-bit mode

            First, try to execute the package in 64-bit mode. You can change this from the package configuration. Make sure that the Run64bitRuntime property is set to True.

            Data flow task buffer size/limits

            Using SSIS, data is loaded in memory while being transferred from source to destination. There are two properties in the data flow task that specify how much data is transferred in the memory buffers used by the SSIS pipelines.

            Based on the following Integration Services performance tuning white paper:

            DefaultMaxBufferRows – DefaultMaxBufferRows is a configurable setting of the SSIS Data Flow task that is automatically set at 10,000 records. SSIS multiplies the Estimated Row Size by the DefaultMaxBufferRows to get a rough sense of your dataset size per 10,000 records. You should not configure this setting without understanding how it relates to DefaultMaxBufferSize.

            DefaultMaxBufferSize – DefaultMaxBufferSize is another configurable setting of the SSIS Data Flow task. The DefaultMaxBufferSize is automatically set to 10 MB by default. As you configure this setting, keep in mind that its upper bound is constrained by an internal SSIS parameter called MaxBufferSize which is set to 100 MB and can not be changed.

            You should try to change those values and test your package performance each time you change them until the package performance increases.
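            As a rough illustration of how the two settings interact, the sketch below assumes a simplified model in which one buffer holds the smaller of DefaultMaxBufferRows and DefaultMaxBufferSize divided by the estimated row size. The class name, method name, and the 500-byte and 2,000-byte row sizes are hypothetical, and the real engine may apply further internal adjustments.

```java
public class BufferEstimate {

    /**
     * Rows per buffer under the simplified model:
     * min(DefaultMaxBufferRows, DefaultMaxBufferSize / estimated row size).
     */
    public static long rowsPerBuffer(long defaultMaxBufferRows,
                                     long defaultMaxBufferSizeBytes,
                                     long estimatedRowBytes) {
        if (estimatedRowBytes <= 0) {
            throw new IllegalArgumentException("estimatedRowBytes must be positive");
        }
        // How many whole rows fit into the byte budget of one buffer.
        long rowsThatFit = defaultMaxBufferSizeBytes / estimatedRowBytes;
        // The row cap and the byte cap both apply; the tighter one wins.
        return Math.min(defaultMaxBufferRows, rowsThatFit);
    }

    public static void main(String[] args) {
        long tenMb = 10L * 1024 * 1024; // default DefaultMaxBufferSize quoted above

        // Narrow rows (hypothetical 500 bytes): the 10,000-row cap is the limit.
        System.out.println(rowsPerBuffer(10_000, tenMb, 500));   // prints 10000

        // Wide rows (hypothetical 2,000 bytes): the 10 MB byte cap is the limit.
        System.out.println(rowsPerBuffer(10_000, tenMb, 2_000)); // prints 5242
    }
}
```

            Under this model, raising DefaultMaxBufferRows alone has no effect for wide rows until DefaultMaxBufferSize is raised as well, which is why the two values are best tuned together.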

            4. Destination Indexes/Triggers/Constraints

            You should make sure that the destination table does not have any constraints or triggers since they significantly decrease the data load performance; each batch inserted should be validated or preprocessed before storing it.

            Besides, the more indexes you have, the lower the data load performance.

            ODBC Destination custom properties

            The ODBC destination has several custom properties that can affect data load performance, such as BatchSize (rows per batch) and TransactionSize (available only in the advanced editor).
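            To get a feel for why the batch size matters at this scale, the sketch below counts how many batches a load needs for a given batch size. It assumes each batch costs one round trip to the destination, which is a simplification; the 200 million row count comes from the question, while the class name, method name, and the batch sizes tried are hypothetical.

```java
public class BatchCount {

    /** Number of batches needed to send totalRows rows, using ceiling division. */
    public static long batchesNeeded(long totalRows, long batchSize) {
        if (totalRows < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and batchSize > 0");
        }
        // Integer ceiling of totalRows / batchSize without floating point.
        return (totalRows + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        long totalRows = 200_000_000L; // row count mentioned in the question
        long[] batchSizes = {1, 1_000, 10_000}; // hypothetical settings to compare

        for (long batchSize : batchSizes) {
            System.out.println(batchSize + " rows per batch -> "
                    + batchesNeeded(totalRows, batchSize) + " batches");
        }
    }
}
```

            Fewer batches is not automatically faster, since larger batches use more memory per batch, so each setting is still worth measuring against the real destination.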

            Source https://stackoverflow.com/questions/71175983

            QUESTION

            How to seed users, roles when I have TWO DbContext, a) IdentityDbContext & b) AppDbContext
            Asked 2022-Jan-18 at 06:06

            Normally I would put the code for seeding users in OnModelCreating(). It's not an issue with one DbContext.

            Recently it was recommended to separate identity's DbContext from the App's DbContext, which has added two DbContexts and some confusion.
            1. public UseManagementDbContext(DbContextOptions options) : base(options) // security/users separation
            2. public AppDbContext(DbContextOptions options) : base(options)

            But when you have two separate DbContexts, How do I seed the user / role data?

            1. IdentityDbContext -- this has all the context for seeding

            2. AppDbContext -- this does NOT have context, but my Migrations use this context.

            Can you please help with how to seed the user and role data, and whether this is part of the migrations, startup, etc.?

            Update: using core 6 sample, @Zhi Lv - how do I retrofit into program.cs my seedData when the app is fired up

            My Program.cs was originally created from the ASP Core 3.1 template and looks like this. What should I refactor? (Oddly, in the link at MS there are no class name brackets, so where does my seed get set up and invoked from?)

            ...

            ANSWER

            Answered 2022-Jan-18 at 06:06

            To seed the user and role data, you can create a static class, like this:

            Source https://stackoverflow.com/questions/70734620

            QUESTION

            How to copy and find the last 125 row?
            Asked 2021-Oct-29 at 08:59

            I have a task where I need to copy the last 125 rows of data from an Excel workbook to another workbook. I want the user to select, from a file browser, the Excel file where the data is stored. The data will always be in the ranges C17:C2051, F17:F2051, and so on.

            Finally, I want to put two formulas above these ranges.

            These are the formulas:

            ...

            ANSWER

            Answered 2021-Oct-27 at 17:46

            This should get you started:

            Source https://stackoverflow.com/questions/69741255

            QUESTION

            Convert a file URL to file byte in using a stored procedure?
            Asked 2021-Oct-27 at 00:21

            My database has a table of person, and on that table there is a column named PersonImageUrl (which is hosted in a public container in azure). I have an upcoming migration to a new database. The new column in the new database table requires VARBINARY(max). I want to create a stored procedure to convert the contents of PersonImageUrl (file where the URL points to) into a byte array so that it would meet the requirements of my migration. Is this possible?

            ...

            ANSWER

            Answered 2021-Oct-27 at 00:13

            Azure SQL Database (and SQL Server 2017+) has a built-in integration to Azure BLOB storage, and for a public container you don't even need a credential, Eg:

            Source https://stackoverflow.com/questions/69729954

            QUESTION

            How Copy the Data from Azure CosmosDb to local using AzCopy tool?
            Asked 2021-Oct-06 at 09:07

            I want to copy the data from the Azure Cosmos DB database\container to my local machine.

            I am trying with the AzCopy tool. I tried the query as per this URL, but it's not working. The query I tried is below:

            ...

            ANSWER

            Answered 2021-Oct-06 at 09:07

            The above method is recommended for the Table API, as mentioned in the doc. If you want to migrate SQL API data, use the Data Migration Tool.

            Source https://stackoverflow.com/questions/69462596

            QUESTION

            ADF vs SSIS recommendation for data migration
            Asked 2021-Sep-28 at 13:44

            I hope this is ok to ask here. I have been looking through so many sites but am still unable to come to a decision. Here's the scenario. I have a legacy application that has its data in SQL Server database(s). A new application has now been created that will also store data in a SQL Server database. I now need to migrate the data from the legacy application to the new one. The legacy database structures have been modified in the new application to follow best practices and be more efficient (e.g. use of PK, FK, indexes, lookups, better table structures, etc.). So there will be a lot of transformation (lookups, data cleaning, merging/splitting data, etc.) happening from source to destination. Initially we will be migrating only 5 years' worth of data, but at a later point we may need to move the rest of the data across.
            The company uses Azure for storage and there are no on-prem resources.
            Given this situation, what would be the best option for Data Migration? SSIS or ADF? What are the advantages of one over the other (other than the fact that ADF is Azure cloud based and MS are probably moving to ADF more in the future). We will also need Dev/Test/Prod environments if that matters.

            ...

            ANSWER

            Answered 2021-Sep-28 at 11:33

            Considering the company doesn't have on-prem resources, I would be looking at implementing the data migration on Azure Data Factory. Below are few points to take into consideration:

            Pros:

            1. Integration between ADF and other Azure resources e.g. SQL Database is seamless and doesn't require connector setup, etc.
            2. You would take advantage of Microsoft's network to improve your data transfer, your data won't go over the network, everything is within MS data centers.
            3. More secure and reliable transfer, you could take advantage of ADF Managed Identity for authenticating to your source and destination.
            4. Since there will be a lot of changes, splits, etc, you can take advantage of ADF's ability to start from where the pipelines failed. On the other hand, in SSIS you'd need to start all over again.
            5. Better monitoring capabilities

            Cons (of using SSIS instead):

            1. You'd need an infrastructure to develop, deploy, and run your SSIS packages, which will increase implementation time and maintenance overhead.
            2. You could run SSIS packages using ADF, but it requires a much bigger implementation to host your packages and run them. Also, it'd be more costly.
            3. If the plan is to use a VM, there is an additional overhead to set up the VM and SSIS. Also, the cost associated with spinning up a new VM and SQL Server.
            4. Not great monitoring and retry capabilities

            Source https://stackoverflow.com/questions/69360356

            QUESTION

            How to migrate DB2 z/OS data to MySQL?
            Asked 2021-Sep-23 at 07:18

            As part of the data migration from DB2 z/OS (Mainframe) to Google cloud SQL, I don't see the direct service/connector provided by google or IBM. So, I am exploring the option to move the data to MySQL first and then to Cloud SQL.

            I could find a solution for migrating from MySQL to Cloud SQL, but not from DB2 to MySQL.

            I searched in google for this need but I could not find the resolution.

            Will it be connected based on JDBC connection or something else?

            ...

            ANSWER

            Answered 2021-Sep-23 at 07:18

            The approach of migrating first to CSV and then to Cloud SQL for MySQL sounds good. As you said, you will need to create a Cloud Storage Bucket [1], upload your CSV file to the bucket, and follow the steps here [2] corresponding to MySQL.

            [1] - https://cloud.google.com/storage/docs/creating-buckets

            [2] - https://cloud.google.com/sql/docs/mysql/import-export/import-export-csv#import

            Source https://stackoverflow.com/questions/69209914

            QUESTION

            how to delete using data_migration?
            Asked 2021-Jul-28 at 06:29

            I wanted to know how to complete this method to delete conversation classes from September 7th, 2021 onwards

            ...

            ANSWER

            Answered 2021-Jul-28 at 06:29

            You could get the time like so: Time.new(2021, 9, 7)

            Source https://stackoverflow.com/questions/68551455

            QUESTION

            Migrating Cosmos Db sql api from one container to another using DMT tool
            Asked 2021-Jul-14 at 09:41

            I am trying to copy my documents from one container of my db to another container in the same db. I followed this document https://docs.microsoft.com/en-us/azure/cosmos-db/cosmosdb-migrationchoices

            and tried using DMT tool. After verifying my connection string of source and target and on clicking Import, I get error as

            Errors":["The collection cannot be accessed with this SDK version as it was created with newer SDK version."]}".

            I simply created the target collection from the UI. I tried by both ways(inserting Partition Key and keeping it blank). What wrong am I doing?

            ...

            ANSWER

            Answered 2021-Jul-14 at 07:24

            What wrong am I doing?

            You're not doing anything wrong. It's just that the Cosmos DB SDK used by this tool is very old (Microsoft.Azure.DocumentDB version 2.4.1) which targets an older version of the Cosmos DB REST API. Since you created your container using a newer version of the Cosmos DB REST API, you're getting this error.

            If your container is pretty basic (in the sense that it does not make use of anything special like auto scaling etc.), what you can do is create the container from the Data Migration Tool UI itself. That way you will not run into compatibility issues.

            Source https://stackoverflow.com/questions/68373639

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install scriptella-etl

            You can download it from GitHub or Maven.
            You can use scriptella-etl like any standard Java library. Include the jar files in your classpath. You can also use any IDE to run and debug the scriptella-etl component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, refer to maven.apache.org. For Gradle installation, refer to gradle.org.

            Support

            Documentation is available in the docs/ directory. An up-to-date reference manual is available at http://scriptella.org/reference. Guidelines for developers are available at https://github.com/scriptella/scriptella-etl/wiki.
            CLONE
          • HTTPS

            https://github.com/scriptella/scriptella-etl.git

          • CLI

            gh repo clone scriptella/scriptella-etl

          • SSH

            git@github.com:scriptella/scriptella-etl.git


            Try Top Libraries by scriptella

            scriptella-examples

            by scriptella (Java)

            scriptella-mongodb

            by scriptella (Java)

            scriptella.github.io

            by scriptella (HTML)