datafactory | Java library for generating test data
kandi X-RAY | datafactory Summary
Generate Test Data with DataFactory. Feb 8th, 2011 in Articles by Andy Gibson. DataFactory is a project I just released which allows you to easily generate test data. It was primarily written for populating databases for dev or test environments by providing values for names, addresses, email addresses, phone numbers, text, and dates. To add DataFactory to your Maven project, just add it as a dependency in your pom.xml file.
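For reference, the dependency entry might look like the following (the org.fluttercode.datafactory coordinates are the ones the project was published under on Maven Central; verify the current version before using it):

```xml
<dependency>
    <groupId>org.fluttercode.datafactory</groupId>
    <artifactId>datafactory</artifactId>
    <version>0.8</version>
</dependency>
```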
Top functions reviewed by kandi - BETA
- Generates a random business name
- Generates a random city value
- Gets the boolean value
- Returns a random item from an array of items
- Generate an email address
- Get a random first name
- Get a random last name
- Generates a random address
- Returns a random suffix
- Get a random street name
- Generates a random birthdate
- Generate a random date
- Returns a random number within the specified range
- Returns a random number between min and max values
- Returns a set of numbers
- Generates a random date between two dates
- Gets the name and last name
- Returns a random number
- Get person prefix
- Returns a suffix
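Most of the helpers above reduce to a few primitives: picking a random item from an array, a random number in a range, and a random date between two dates. As a rough illustration, here is a self-contained sketch of those primitives; this is not the library's actual implementation, and the class and method names are made up to mirror the list above:

```java
import java.time.LocalDate;
import java.util.Random;

/** Tiny sketch of a test-data generator; not the real DataFactory implementation. */
public class MiniDataFactory {
    private final Random random;

    public MiniDataFactory(long seed) {
        this.random = new Random(seed);
    }

    /** Returns a random item from an array of items. */
    public <T> T getItem(T[] items) {
        return items[random.nextInt(items.length)];
    }

    /** Returns a random number between min (inclusive) and max (exclusive). */
    public int getNumberBetween(int min, int max) {
        return min + random.nextInt(max - min);
    }

    /** Generates a random date between two dates: min inclusive, max exclusive. */
    public LocalDate getDateBetween(LocalDate min, LocalDate max) {
        long days = max.toEpochDay() - min.toEpochDay();
        return LocalDate.ofEpochDay(min.toEpochDay() + random.nextInt((int) days));
    }

    /** Builds an email address from a first and last name. */
    public String getEmailAddress(String firstName, String lastName) {
        return (firstName + "." + lastName + "@example.com").toLowerCase();
    }
}
```

Seeding the generator, as in the constructor here, makes runs reproducible, which is often what you want when populating a dev database.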
datafactory Key Features
datafactory Examples and Code Snippets
Community Discussions
Trending Discussions on datafactory
QUESTION
I'm running bicep 0.4.1318.
I have a "main" bicep module that calls a sub module to provision data factory:
...ANSWER
Answered 2022-Mar-21 at 12:24
You need to add
QUESTION
I have made a few pipelines in Azure Data Factory, which transfer and modify data from Blob Storage (Excel Files) to Azure SQL. They were off for like 2 month and the company has implemented MFA on whole Azure Active Directory.
After that, when I try to run the pipelines, I get only a "Failed" status. The error is the same for every pipeline. It looks like this:
Operation on target Data flow1 failed: {"StatusCode":"DFExecutorUserError","Message":"Job failed due to reason: java.lang.Exception: fail to reach https://we.frontend.clouddatahub.net/subscriptions/aa2d32bf-f0d0-4656-807b-7e929da73853/entities/99264214-3071-4faa-87c2-32d9dec7e5a4/identities/00000000-0000-0000-0000-000000000000/token?api-version=2.0 with status code:403, payload:{"error":{"code":"ManagedIdentityInvalidCredential","message":"Acquire MI token from AAD failed. ErrorCode: invalid_client, Message: A configuration issue is preventing authentication - check the error message from the server for details. You can modify the configuration in the application registration portal. See https://aka.ms/msal-net-invalid-client for details. Original exception: AADSTS700027: Client assertion failed signature validation.\r\nTrace ID: 4eef805e-a0ca-494e-bcc2-c01cd755f400\r\nCorrelation ID: f313ba30-9455-4065-90ab-a0fe28dadc99\r\nTimestamp: 2022-02-21 13:11:56Z","details":[],"additionalInfo":[]}}, CorrelationId:171b73ff-5721-45e5-bf95-2b29dc4dd1b4, RunId:887b22ec-6cae-42d3-9580-b93a98800b3c","Details":"java.lang.Exception: fail to reach https://we.frontend.clouddatahub.net/subscriptions/aa2d32bf-f0d0-4656-807b-7e929da73853/entities/99264214-3071-4faa-87c2-32d9dec7e5a4/identities/00000000-0000-0000-0000-000000000000/token?api-version=2.0 with status code:403, payload:{"error":{"code":"ManagedIdentityInvalidCredential","message":"Acquire MI token from AAD failed. ErrorCode: invalid_client, Message: A configuration issue is preventing authentication - check the error message from the server for details. You can modify the configuration in the application registration portal. See https://aka.ms/msal-net-invalid-client for details. 
Original exception: AADSTS700027: Client assertion failed signature validation.\r\nTrace ID: 4eef805e-a0ca-494e-bcc2-c01cd755f400\r\nCorrelation ID: f313ba30-9455-4065-90ab-a0fe28dadc99\r\nTimestamp: 2022-02-21 13:11:56Z","details":[],"additionalInfo":[]}}, CorrelationId:171b73ff-5721-45e5-bf95-2b29dc4dd1b4, RunId:887b22ec-6cae-42d3-9580-b93a98800b3c\n\tat com.microsoft.datafactory.dat"}
Is there any way I can avoid this error without deactivating MFA?
...ANSWER
Answered 2022-Mar-03 at 07:27
Thank you David Browne - Microsoft for your valuable suggestion. Posting your suggestion as an answer to help other community members.
Use either a managed identity or a provisioned service principal for authentication, or switch the authentication to SQL Auth for SQL Server and SAS/Account Key auth for Azure Storage.
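To illustrate the suggested switch, an Azure SQL linked service using SQL authentication is defined by a connection string with a user and password rather than a managed identity. A minimal sketch (the linked service name and placeholder values are illustrative, not from the original question):

```json
{
  "name": "AzureSqlLinkedService",
  "properties": {
    "type": "AzureSqlDatabase",
    "typeProperties": {
      "connectionString": "Server=tcp:<server>.database.windows.net,1433;Database=<db>;User ID=<user>;Password=<password>;"
    }
  }
}
```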
QUESTION
I have an Event Hub Capture which emits Avro files every 15 minutes. I can successfully read these files with an Azure Data Factory data flow, but how can I move these files once I have finished reading them using an Azure Data Factory pipeline?
New files will keep coming in, and I want to move only the already processed files to an archive folder, not the new ones.
I tried a GetMetadata activity followed by a Delete, but it doesn't seem to be able to read Avro files in a directory. It could be that I'm missing something.
Thanks
...ANSWER
Answered 2022-Mar-02 at 18:59
QUESTION
I'm trying to write a simple Python script to start several triggers in my Azure Data Factory. However, when I tried using the aio variants of the Azure SDK classes, I get the following error:
RuntimeError: Task <...> got Future <...> attached to a different loop
Here is what my script looks like:
...ANSWER
Answered 2022-Feb-24 at 06:52
You can try either of the following ways to resolve this "RuntimeError: Task <...> got Future <...> attached to a different loop" error:
- Add asyncio.get_event_loop().run_until_complete(main()) or app.run(main())
- Create the loop explicitly and pass it in: loop = asyncio.get_event_loop() followed by app.run(debug=1, loop=loop)
References: https://github.com/pyrogram/pyrogram/issues/391 and the Stack Overflow question "RuntimeError: Task got Future attached to a different loop".
QUESTION
I need to retrieve Azure Data Factory pipeline execution logs. I've tried a Web Activity using the following request:
...ANSWER
Answered 2021-Oct-20 at 03:55
You can get a list of pipeline runs using Pipeline Runs - Query By Factory
Next, you can Get a pipeline run by its run ID.
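For reference, Pipeline Runs - Query By Factory is a POST to the factory's queryPipelineRuns endpoint with a time window in the request body. A minimal sketch (the subscription, resource group, and factory names are placeholders):

```
POST https://management.azure.com/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.DataFactory/factories/<factory>/queryPipelineRuns?api-version=2018-06-01

{
  "lastUpdatedAfter": "2021-10-01T00:00:00Z",
  "lastUpdatedBefore": "2021-10-20T00:00:00Z"
}
```

The response contains the run IDs you can then pass to Get Pipeline Run.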
QUESTION
Is there a way to re-run a pipeline from a failed activity in ADF using the Azure CLI? I read the documentation and only found a way to re-run the trigger.
...ANSWER
Answered 2022-Feb-15 at 10:01
This can be done using the following command
QUESTION
We have to deliver a rather complex CSV structure and we would like to use Data Factory for this. The structure has multiple levels: a global header and trailer, plus a subheader per topic and its detail lines. The first column defines which type of line it is. I've simplified the real format just to highlight the questions I have.
HEADER - common data like export date and number sequence
SUBHEADER - topic name 1
DETAIL - detail line of above topic
DETAIL - detail line of above topic
DETAIL - detail line of above topic
SUBHEADER - topic name 2
DETAIL - detail line of above topic
DETAIL - detail line of above topic
DETAIL - detail line of above topic
TRAILER - A closing line with total linecount
The source data would be the detail lines + the topic name.
There are two problems I'm unable to solve:
- How do I convert the source data into the complex SUBHEADER + DETAIL format? To be honest, I have no clue how to approach this.
- Is there a way to add the global header + trailer with the total line count via Data Factory? An alternative would be doing this with an Azure Function.
All suggestions are welcome ...
Regards, Sven Peeters
...ANSWER
Answered 2022-Feb-07 at 15:19
You have a couple of choices with Azure Data Factory:
- take an ELT approach where you use some type of compute (eg a SQL database, Databricks, Azure Batch, Azure Function or Azure Synapse serverless SQL pools if you're working in Synapse) to do the hard work structuring the file and outputting it. ADF is really just doing the orchestration (telling other processes what to do in what order) and handling the output. The compute is handling the fiddly bit.
- take an ETL approach and use Mapping Data Flows. This is a low-code approach which uses on-demand Spark clusters in the background. You do not have to manage them.
I would be tempted to use SQL to do this, particularly if you already have some in your infrastructure. A simplified example:
QUESTION
I'm using Azure Data Factory for my data ingestion and running an Azure Databricks notebook through ADF's Notebook activity.
The notebook uses an existing instance pool of Standard DS3_V2 (2-5 nodes, autoscaled) with the 7.3 LTS Spark runtime. The same Azure subscription is used by multiple teams for their respective data pipelines.
During the ADF pipeline execution, the notebook activity frequently fails with the error message below:
...ANSWER
Answered 2022-Feb-07 at 13:26
The problem arises from the fact that when your workspace was created, the network and subnet sizes weren't planned correctly (see docs). As a result, when you try to launch a cluster, there are not enough IP addresses in the given subnet, and you get this error.
Unfortunately, right now it's not possible to expand the network/subnet size, so if you need a bigger network you need to deploy a new workspace and migrate into it.
QUESTION
I have been using the new ADF CI/CD process as described here: ms doc. This worked well until I secured the linked services through managed private endpoints.
A build pipeline generates an ARM template and parameters file based on what is deployed to the data factory in my "Dev" environment. The template and parameters file are then published from the build and made available to the release pipeline. At this point, the generated parameters file just contains placeholder values.
The release pipeline executes the ARM template, taking template values from the "Override template parameters" text box:
My problem is, when this runs I get the following error from the resource group deployment:
"Invalid resource request. Resource type: 'ManagedPrivateEndpoint', Resource name: 'pe-ccsurvey-blob-001' 'Error: Invalid payload'."
From the Azure Portal, I navigated to the resource group deployment, where I was able to view the template and parameters file used.
Definition of the required private endpoint from the template file is shown below:
...ANSWER
Answered 2021-Oct-31 at 11:17
Going through the official Best practices for CI/CD:
If a private endpoint already exists in a factory and you try to deploy an ARM template that contains a private endpoint with the same name but with modified properties, the deployment will fail.
QUESTION
I am trying to sync metadata from an Azure Data Factory pipeline to a table in a SQL Server database.
The output visible in the Read Metadata activity in Azure is as follows:
...ANSWER
Answered 2022-Jan-06 at 12:07
I figured out what I was doing wrong here. The parameter value has to be stored as a string (using parentheses).
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install datafactory
You can use datafactory like any standard Java library. Please include the jar files in your classpath. You can also use any IDE to run and debug the datafactory component as you would any other Java program. Best practice is to use a build tool that supports dependency management, such as Maven or Gradle. For Maven installation, please refer to maven.apache.org. For Gradle installation, please refer to gradle.org.
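For Gradle, the equivalent dependency declaration would be along these lines (as with the Maven coordinates, verify the group, artifact, and version on Maven Central before relying on them):

```
dependencies {
    implementation 'org.fluttercode.datafactory:datafactory:0.8'
}
```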