csv-schema | CSV file and generates database table schema | SQL Database library
kandi X-RAY | csv-schema Summary
Analyzes a CSV file and generates a database table schema, all within the browser. This application parses CSV files (including huge ones) within the browser. It analyzes each field to suggest the best database field type, the maximum length, and whether the field contains any null values. From there, you can rename fields, ignore them, override field types and lengths, and generate table-creation SQL for MySQL, MariaDB, Postgres, Oracle, or SQLite3.
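The per-field analysis described above can be illustrated with a small sketch. This is not csv-schema's code (the project itself runs in the browser); it is a hypothetical Python approximation that assumes three candidate types (INTEGER, FLOAT, VARCHAR), treats an empty field as null, and reads rows one at a time so that only per-column statistics are kept in memory. The names `suggest_schema` and `_parses` are made up for this example.

```python
import csv
import io


def _parses(cast, value):
    """Return True if cast(value) succeeds (hypothetical helper)."""
    try:
        cast(value)
        return True
    except ValueError:
        return False


def suggest_schema(lines, table="my_table"):
    """Stream CSV lines and suggest a CREATE TABLE statement.

    Assumptions: the first row is the header, an empty field means NULL,
    and only INTEGER, FLOAT and VARCHAR(n) are considered.
    """
    reader = csv.reader(lines)
    header = next(reader)
    stats = [
        {"int": True, "float": True, "max_len": 0, "nullable": False}
        for _ in header
    ]
    for row in reader:
        # zip() stops at the shorter sequence, so ragged rows are not handled.
        for stat, value in zip(stats, row):
            if value == "":
                stat["nullable"] = True
                continue
            stat["max_len"] = max(stat["max_len"], len(value))
            if stat["int"] and not _parses(int, value):
                stat["int"] = False
            if stat["float"] and not _parses(float, value):
                stat["float"] = False

    columns = []
    for name, stat in zip(header, stats):
        if stat["max_len"] == 0:
            sql_type = "TEXT"  # no non-empty values seen, nothing to infer
        elif stat["int"]:
            sql_type = "INTEGER"
        elif stat["float"]:
            sql_type = "FLOAT"
        else:
            sql_type = f"VARCHAR({stat['max_len']})"
        null_sql = "NULL" if stat["nullable"] else "NOT NULL"
        columns.append(f"  {name} {sql_type} {null_sql}")
    return f"CREATE TABLE {table} (\n" + ",\n".join(columns) + "\n);"


if __name__ == "__main__":
    sample = io.StringIO("id,name,score\n1,ann,2.5\n2,,3\n")
    print(suggest_schema(sample, table="people"))
```

A real tool would also need dialect-specific type names and date detection, which this sketch leaves out.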
Top functions reviewed by kandi - BETA
- Detect the type of a sample
- incoming function
- populate the header
- Check if a string is readable
- Parse the string delimited into a string
- this is the right logic
- return true if e is a function
- create interpolated function
- value - > integer
- setup worker
Community Discussions
Trending Discussions on csv-schema
QUESTION
I am getting a CSV file from a 3rd party. The schema for this file is dynamic; the only things I can be certain of are:
- each column with data will also have a header name.
- the file will always have a header.
- a header name will always be a string of letters with no spaces or dots (so, kind of "clean").
- values should be treated as strings, as I am not sure what they will be sending.
Now, to use this type of data in my system, I am thinking of using MongoDB as a staging area. Since the number of columns, the order of columns, and the column names are not constant from one load to another, I think MongoDB will serve as a good staging area.
I read about the ConvertRecord processor, which is ideal for CSV-to-JSON conversion, but I don't have a schema. I just want each row to go in as a document, with the header name as the key and the value as the value.
How should I go about it? Also, this file is going to be in the 25-30 GB range, so I do not want to bring down my system.
I thought of doing it with my own processor (in Java), and I was able to get what I am looking for, but it takes too much time and doesn't look optimal.
Let me know if this can be achieved with an existing processor.
Thanks, Rakesh
Updated on: 09/05/2018
NiFi template csv_to_json_no_schema_v1 (NiFi 1.6.0, template timestamp 09/05/2018 01:32:27 EDT), summarized from the flattened template XML.
Flow:
- GenerateFlowFile -> (success) -> ValidateCsv
- ValidateCsv -> (valid) -> ConvertRecord
- ValidateCsv -> (invalid) -> LogAttribute
- ConvertRecord -> (success) -> LogAttribute
- ConvertRecord -> (failure) -> LogAttribute
Controller services (both ENABLED):
- CSVReader (org.apache.nifi.csv.CSVReader): Skip Header Line = true, ignore-csv-header = true
- JsonRecordSetWriter (org.apache.nifi.json.JsonRecordSetWriter): Schema Write Strategy = no-schema
Processor settings:
- ConvertRecord: record-reader = CSVReader, record-writer = JsonRecordSetWriter
- ValidateCsv: validate-csv-schema = NotNull,ParseInt(),Optional(ParseInt()),Null; validate-csv-header = true; validate-csv-delimiter = ,; validate-csv-quote = "; validate-csv-eol = \n; validate-csv-strategy = Line by line validation
- LogAttribute (all three): Log Level = info; Log Payload = false; attributes-to-log-regex = .*; character-set = UTF-8
- GenerateFlowFile: File Size = 0B; Batch Size = 1; Data Format = Text; Unique FlowFiles = false; character-set = UTF-8; schema.name = empData; generate-ff-custom-text is the sample CSV below
name,age,int_val,address
Rakesh Prasad,0,99,"address 12 33333, 444441"
rakesh Prasad1,1,,"address 12 33333, 444442"
rakesh Prasad2,2,55,"address 12 33333, 444443"
rakesh Prasad3,,33,"address 12 33333, 444444"
ANSWER
Answered 2018-Sep-04 at 13:04
You can use ConvertRecord with a CSV Reader and in the CSV Reader choose "Use String Fields From Header" for the Schema Access Strategy. This will create a schema dynamically from the header.
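To make the intended result concrete, here is a minimal Python sketch of the transformation the answer describes: every row becomes one document whose keys come from the header and whose values are all strings. It is not NiFi code and does not reproduce how ConvertRecord is implemented; the function names `rows_as_documents` and `write_json_lines` are hypothetical. Rows are read and written one at a time, which is the property that matters for a 25-30 GB file.

```python
import csv
import io
import json


def rows_as_documents(lines):
    """Yield one dict per CSV row, keyed by header name.

    csv.DictReader takes the field names from the first row and never
    converts values, so every value stays a string. Ragged rows (more or
    fewer fields than the header) are not handled in this sketch.
    """
    for row in csv.DictReader(lines):
        yield dict(row)


def write_json_lines(lines, out):
    """Write each row as one JSON object per line; return the row count."""
    count = 0
    for doc in rows_as_documents(lines):
        out.write(json.dumps(doc) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    sample = io.StringIO(
        "name,age,int_val,address\n"
        'Rakesh Prasad,0,99,"address 12 33333, 444441"\n'
        'rakesh Prasad1,1,,"address 12 33333, 444442"\n'
    )
    out = io.StringIO()
    write_json_lines(sample, out)
    print(out.getvalue(), end="")
```

Because the header is read per file, the same code works when the column names or their order change from one load to the next. One JSON object per line is a common input format for bulk-loading documents into a store such as MongoDB.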
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported