datafactory | Java library for generating test data

 by andygibson | Java | Version: 0.8 | License: LGPL-3.0

kandi X-RAY | datafactory Summary

datafactory is a Java library for generating test data. It has no known bugs or reported vulnerabilities, a build file is available, it is released under a Weak Copyleft license (LGPL-3.0), and it has low community support. You can download it from GitHub or Maven.

Generate Test Data with DataFactory. Feb 8th, 2011 in Articles by Andy Gibson. DataFactory is a project I just released which allows you to easily generate test data. It was primarily written for populating databases for dev or test environments by providing values for names, addresses, email addresses, phone numbers, text, and dates. To add DataFactory to your Maven project, just add it as a dependency in your pom.xml file.
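The dependency declaration mentioned above might look like the following sketch. The coordinates shown (`org.fluttercode.datafactory`) are believed to match the artifact published to Maven Central for version 0.8, but verify them against the repository before use:

```xml
<dependency>
    <groupId>org.fluttercode.datafactory</groupId>
    <artifactId>datafactory</artifactId>
    <version>0.8</version>
</dependency>
```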

            Support

              datafactory has a low-activity ecosystem.
              It has 149 stars, 51 forks, and 15 watchers.
              It has had no major release in the last 12 months.
              There are 4 open issues and 2 closed issues; on average, issues are closed in 55 days. There are 6 open pull requests and 0 closed pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of datafactory is 0.8.

            Quality

              datafactory has 0 bugs and 0 code smells.

            Security

              datafactory has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              datafactory code analysis shows 0 unresolved vulnerabilities.
              There are 0 security hotspots that need review.

            License

              datafactory is licensed under the LGPL-3.0 License. This license is Weak Copyleft.
              Weak Copyleft licenses have some restrictions, but you can use them in commercial projects.

            Reuse

              datafactory releases are not available on GitHub; you will need to build from source code and install.
              A deployable package is available in Maven.
              Build file is available. You can build the component from source.
              Installation instructions are not available. Examples and code snippets are available.
              datafactory saves you 502 person hours of effort in developing the same functionality from scratch.
              It has 1180 lines of code, 95 functions and 9 files.
              It has medium code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed datafactory and lists the functions below as its top functions. This is intended to give you instant insight into the functionality datafactory implements and to help you decide if it suits your requirements.
            • Generates a random business name
            • Generates a random city value
            • Gets the boolean value
            • Returns a random item from an array of items
            • Generate an email address
            • Get a random first name
            • Get a random last name
            • Generates a random address
            • Returns a random suffix
            • Get a random street name
            • Generates a random birthdate
            • Generate a random date
            • Returns a random number within the specified range
            • Returns a random number between min and max values
            • Returns a set of numbers
            • Generates a random date between two dates
            • Gets the name and last name
            • Returns a random number
            • Get person prefix
            • Returns a suffix
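The function list above corresponds to the public API of the library's `DataFactory` class. A minimal usage sketch follows, assuming the entry point `org.fluttercode.datafactory.impl.DataFactory` and method names as they appear in the 0.8 sources; treat both as assumptions and verify against the jar:

```java
// Sketch of typical DataFactory usage; class and method names are
// assumed from the 0.8 release and should be verified against the jar.
import org.fluttercode.datafactory.impl.DataFactory;

public class DataFactoryDemo {
    public static void main(String[] args) {
        DataFactory df = new DataFactory();

        // Random person data: name and email address
        String name = df.getFirstName() + " " + df.getLastName();
        String email = df.getEmailAddress();

        // Random address data: street address and city
        String address = df.getAddress();
        String city = df.getCity();

        // Random integer within an inclusive range
        int age = df.getNumberBetween(18, 65);

        System.out.printf("%s <%s> lives at %s, %s, age %d%n",
                name, email, address, city, age);
    }
}
```

Each call draws from the library's built-in word lists, so repeated runs produce different but realistic-looking values, which is exactly what is wanted when seeding a dev or test database.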

            datafactory Key Features

            No Key Features are available at this moment for datafactory.

            datafactory Examples and Code Snippets

            No Code Snippets are available at this moment for datafactory.

            Community Discussions

            QUESTION

            Bicep - data factory identity property
            Asked 2022-Mar-21 at 12:24

            I'm running bicep 0.4.1318.

            I have a "main" bicep module that calls a sub module to provision data factory:

            ...

            ANSWER

            Answered 2022-Mar-21 at 12:24

            QUESTION

            Pipeline failed after implementing MFA
            Asked 2022-Mar-03 at 07:27

            I have made a few pipelines in Azure Data Factory which transfer and modify data from Blob Storage (Excel files) to Azure SQL. They were idle for about two months, and the company has since implemented MFA across the whole Azure Active Directory.

            After that, when I try to run the pipelines, I get only a "Failed" status. The error is the same for every pipeline and looks like this:

            Operation on target Data flow1 failed: {"StatusCode":"DFExecutorUserError","Message":"Job failed due to reason: java.lang.Exception: fail to reach https://we.frontend.clouddatahub.net/subscriptions/aa2d32bf-f0d0-4656-807b-7e929da73853/entities/99264214-3071-4faa-87c2-32d9dec7e5a4/identities/00000000-0000-0000-0000-000000000000/token?api-version=2.0 with status code:403, payload:{"error":{"code":"ManagedIdentityInvalidCredential","message":"Acquire MI token from AAD failed. ErrorCode: invalid_client, Message: A configuration issue is preventing authentication - check the error message from the server for details. You can modify the configuration in the application registration portal. See https://aka.ms/msal-net-invalid-client for details. Original exception: AADSTS700027: Client assertion failed signature validation.\r\nTrace ID: 4eef805e-a0ca-494e-bcc2-c01cd755f400\r\nCorrelation ID: f313ba30-9455-4065-90ab-a0fe28dadc99\r\nTimestamp: 2022-02-21 13:11:56Z","details":[],"additionalInfo":[]}}, CorrelationId:171b73ff-5721-45e5-bf95-2b29dc4dd1b4, RunId:887b22ec-6cae-42d3-9580-b93a98800b3c","Details":"java.lang.Exception: fail to reach https://we.frontend.clouddatahub.net/subscriptions/aa2d32bf-f0d0-4656-807b-7e929da73853/entities/99264214-3071-4faa-87c2-32d9dec7e5a4/identities/00000000-0000-0000-0000-000000000000/token?api-version=2.0 with status code:403, payload:{"error":{"code":"ManagedIdentityInvalidCredential","message":"Acquire MI token from AAD failed. ErrorCode: invalid_client, Message: A configuration issue is preventing authentication - check the error message from the server for details. You can modify the configuration in the application registration portal. See https://aka.ms/msal-net-invalid-client for details. 
Original exception: AADSTS700027: Client assertion failed signature validation.\r\nTrace ID: 4eef805e-a0ca-494e-bcc2-c01cd755f400\r\nCorrelation ID: f313ba30-9455-4065-90ab-a0fe28dadc99\r\nTimestamp: 2022-02-21 13:11:56Z","details":[],"additionalInfo":[]}}, CorrelationId:171b73ff-5721-45e5-bf95-2b29dc4dd1b4, RunId:887b22ec-6cae-42d3-9580-b93a98800b3c\n\tat com.microsoft.datafactory.dat"}

            Is there any way I can avoid this error without deactivating MFA?

            ...

            ANSWER

            Answered 2022-Mar-03 at 07:27

            Thank you David Browne - Microsoft for your valuable suggestion. Posting your suggestion as an answer to help other community members.

            Use either a managed identity or provision a service principal for authentication. Alternatively, switch the authentication to SQL auth for SQL Server and SAS/account key auth for Azure Storage.

            Source https://stackoverflow.com/questions/71209553

            QUESTION

            Moving already read Avro Files from One Directory to Another Using Azure DataFactory Pipeline
            Asked 2022-Mar-02 at 18:59

            I have an Event Hub Capture which emits Avro files every 15 minutes. I can successfully read these files with an Azure Data Factory Data Flow; however, how can I move these files once I have finished reading them, using an Azure Data Factory pipeline?

            There will be new files coming in, and I don't want to move the new files, only the already processed files, to an archive folder.

            I tried to do a GetMetadata activity and then a Delete, but it doesn't seem to be able to read Avro files in a directory. It could be I'm missing something.

            Thanks

            ...

            ANSWER

            Answered 2022-Mar-02 at 18:59

            Data flow can handle this organically:

            Source https://stackoverflow.com/questions/71326740

            QUESTION

            Trying to start several ADF triggers using asyncio with the Azure Python SDK, I get RuntimeError: got Future attached to a different loop
            Asked 2022-Feb-24 at 06:52

            I'm trying to write a simple Python script to start several triggers in my Azure Data Factory. However, when I tried using the aio variants of the Azure SDK classes, I get the following error:

            RuntimeError: Task <...> got Future attached to a different loop

            Here is what my script looks like:

            ...

            ANSWER

            Answered 2022-Feb-24 at 06:52

            You can try either of the following ways to resolve this "RuntimeError: Task <...> got Future attached to a different loop" error:

            1. Add `asyncio.get_event_loop().run_until_complete(main())` or `app.run(main())`

            2. Add `loop = asyncio.get_event_loop()` and `app.run(debug=1, loop=loop)`

            References: https://github.com/pyrogram/pyrogram/issues/391 and RuntimeError: Task got Future attached to a different loop

            Source https://stackoverflow.com/questions/71228555

            QUESTION

            Get Azure Data Factory logs
            Asked 2022-Feb-16 at 16:53

            I need to retrieve Azure Data Factory pipeline execution logs. I've tried with a Web Activity by using the following request:

            ...

            ANSWER

            Answered 2021-Oct-20 at 03:55

            You can get a list of pipeline runs using Pipeline Runs - Query By Factory.

            Next, you can Get a pipeline run by its run ID.

            Source https://stackoverflow.com/questions/69614957

            QUESTION

            Re-Run ADF activity from the point of failure using Azure CLI
            Asked 2022-Feb-15 at 10:28

            Is there a way to re-run from the failed activity in ADF using the Azure CLI? I read the documentation and only found a way to re-run the trigger.

            ...

            ANSWER

            Answered 2022-Feb-15 at 10:01

            This can be done using the following command

            Source https://stackoverflow.com/questions/71124086

            QUESTION

            Azure Data Factory, multi-level complex CSV structure
            Asked 2022-Feb-07 at 15:19

            We have to deliver a rather complex CSV structure, and we would like to use Data Factory for this. The structure has multiple levels, with a global header and trailer plus a subheader (per topic) and its detail lines. The first column defines which type of line it is. I've simplified the real format just to highlight the questions I have.

            HEADER - common data like export date and number sequence
            SUBHEADER - topic name 1
            DETAIL - detail line of above topic
            DETAIL - detail line of above topic
            DETAIL - detail line of above topic
            SUBHEADER - topic name 2
            DETAIL - detail line of above topic
            DETAIL - detail line of above topic
            DETAIL - detail line of above topic
            TRAILER - A closing line with total linecount

            The source data would be the detail lines + the topic name.

            There are 2 problems I'm unable to solve :

            1. How do I convert the source data into the complex SUBHEADER + DETAIL format? To be honest, I have no clue how to approach this.
            2. Is there a way to add the global header + trailer with total line count via Data Factory? An alternative would be doing this with an Azure function.

            All suggestions are welcome ...

            Regards, Sven Peeters

            ...

            ANSWER

            Answered 2022-Feb-07 at 15:19

            You have a couple of choices with Azure Data Factory:

            • take an ELT approach where you use some type of compute (e.g. a SQL database, Databricks, Azure Batch, an Azure Function, or Azure Synapse serverless SQL pools if you're working in Synapse) to do the hard work of structuring the file and outputting it. ADF is really just doing the orchestration (telling other processes what to do in what order) and handling the output. The compute is handling the fiddly bit.
            • take an ETL approach and use Mapping Data Flows. This is a low-code approach which uses on-demand Spark clusters in the background. You do not have to manage them.

            I would be tempted to use SQL to do this, particularly if you already have some in your infrastructure. A simplified example:

            Source https://stackoverflow.com/questions/71014927

            QUESTION

            Azure Databricks Execution Fail - CLOUD_PROVIDER_LAUNCH_FAILURE
            Asked 2022-Feb-07 at 14:09

            I'm using Azure DataFactory for my data ingestion and using an Azure Databricks notebook through ADF's Notebook activity.

            The Notebook uses an existing instance pool of Standard DS3_V2 (2-5 nodes autoscaled) with 7.3LTS Spark Runtime version. The same Azure subscription is used by multiple teams for their respective data pipelines.

            During the ADF pipeline execution, I'm facing a notebook activity failure frequently with the below error message

            ...

            ANSWER

            Answered 2022-Feb-07 at 13:26

            The problem arises from the fact that when your workspace was created, the network and subnet sizes weren't planned correctly (see docs). As a result, when you try to launch a cluster, there are not enough IP addresses in the given subnet, and you get this error.

            Unfortunately, it's currently not possible to expand the network/subnet size, so if you need a bigger network, you need to deploy a new workspace and migrate into it.

            Source https://stackoverflow.com/questions/71018067

            QUESTION

            Data Factory Deploy Managed Private Endpoint. Error: Invalid payload
            Asked 2022-Jan-25 at 02:49

            I have been using the new ADF CI/CD process as described here: ms doc. This worked well until I secured the linked services through managed private endpoints.

            A build pipeline generates an ARM template and parameters file based on what is deployed to the data factory in my "Dev" environment. The template and parameters file are then published from the build and made available to the release pipeline. At this point, the generated parameters file just contains placeholder values.

            The release pipeline executes the ARM template, taking template values from the "Override template parameters" text box:

            My problem is, when this runs I get the following error from the resource group deployment:

            "Invalid resource request. Resource type: 'ManagedPrivateEndpoint', Resource name: 'pe-ccsurvey-blob-001' 'Error: Invalid payload'."

            From the Azure Portal, I navigated to the resource group deployment, where I was able to view the template and parameters file used.

            Definition of the required private endpoint from the template file is shown below:

            ...

            ANSWER

            Answered 2021-Oct-31 at 11:17

            Going through the official Best practices for CI/CD:

            If a private endpoint already exists in a factory and you try to deploy an ARM template that contains a private endpoint with the same name but with modified properties, the deployment will fail.

            Source https://stackoverflow.com/questions/69765263

            QUESTION

            Syncing Metadata from Azure DataFactory to SQL Server database
            Asked 2022-Jan-06 at 12:07

            I am trying to sync metadata from an Azure DataFactory pipeline to a table in a SQL Server database.

            The output visible in the Read Metadata activity in Azure is as follows:

            ...

            ANSWER

            Answered 2022-Jan-06 at 12:07

            I figured out what I was doing wrong here. The parameter value has to be stored as a string (using parentheses).

            Source https://stackoverflow.com/questions/70602543

            Community Discussions and Code Snippets contain sources that include the Stack Exchange Network.

            Vulnerabilities

            No vulnerabilities reported

            Install datafactory

            You can download it from GitHub or Maven.
            You can use datafactory like any standard Java library. Include the jar files in your classpath. You can also use any IDE to run and debug the datafactory component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, refer to maven.apache.org; for Gradle installation, refer to gradle.org.

            Support

            For any new features, suggestions, and bugs, create an issue on GitHub. If you have any questions, check for and ask them on the Stack Overflow community page.
            Clone
          • HTTPS: https://github.com/andygibson/datafactory.git
          • GitHub CLI: gh repo clone andygibson/datafactory
          • SSH: git@github.com:andygibson/datafactory.git
