split-folders | Split folders with files | Machine Learning library

by jfilter | Python | Version: 0.5.1 | License: MIT

kandi X-RAY | split-folders Summary

split-folders is a Python library typically used in Artificial Intelligence, Machine Learning, and Deep Learning applications. It has no bugs and no vulnerabilities, a permissive license, and low support. However, no build file is available. You can install it with 'pip install split-folders' or download it from GitHub or PyPI.

Split folders with files (e.g. images) into train, validation and test (dataset) folders.

            kandi-support Support

              split-folders has a low active ecosystem.
              It has 265 star(s) with 46 fork(s). There are 7 watchers for this library.
              It had no major release in the last 12 months.
              There are 0 open issues and 28 have been closed. On average issues are closed in 149 days. There are no pull requests.
              It has a neutral sentiment in the developer community.
              The latest version of split-folders is 0.5.1

            kandi-Quality Quality

              split-folders has 0 bugs and 10 code smells.

            kandi-Security Security

              split-folders has no vulnerabilities reported, and its dependent libraries have no vulnerabilities reported.
              split-folders code analysis shows 0 unresolved vulnerabilities.
              There is 1 security hotspot that needs review.

            kandi-License License

              split-folders is licensed under the MIT License. This license is Permissive.
              Permissive licenses have the least restrictions, and you can use them in most projects.

            kandi-Reuse Reuse

              split-folders releases are not available. You will need to build from source code and install.
              A deployable package is available on PyPI.
              split-folders has no build file. You will need to create the build yourself to build the component from source.
              Installation instructions, examples and code snippets are available.
              split-folders saves you 126 person hours of effort in developing the same functionality from scratch.
              It has 448 lines of code, 31 functions and 7 files.
              It has low code complexity. Code complexity directly impacts maintainability of the code.

            Top functions reviewed by kandi - BETA

            kandi has reviewed split-folders and discovered the below as its top functions. This is intended to give you an instant insight into split-folders implemented functionality, and help decide if they suit your requirements.
            • Parse command line arguments
            • Copy files from a fixed directory
            • Group files by prefix
            • Copy the files contained in the input directory
            • Split the class directory with fixed parameters
            • Copy files_type to output directory
            • Return a list of training and validation files
            • Checks input folder
            • Return a list of the files in the class directory
            • Splits the class directory with the given ratio
            • List all files in a directory
            • List all directories in a directory
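Some of the functions listed above can be approximated with the standard library. The following is a minimal sketch, not the library's actual code: the helper names `list_dirs`, `list_files` and `group_by_stem` are hypothetical, and treating "group files by prefix" as grouping by filename stem is an assumption.

```python
from collections import defaultdict
from pathlib import Path


def list_dirs(directory):
    """Return the sub-directories of `directory`, sorted by name."""
    return sorted(p for p in Path(directory).iterdir() if p.is_dir())


def list_files(directory):
    """Return the regular files directly inside `directory`, sorted by name."""
    return sorted(p for p in Path(directory).iterdir() if p.is_file())


def group_by_stem(files):
    """Group files that share a stem, e.g. img1.jpg and img1.txt (assumed meaning)."""
    groups = defaultdict(list)
    for f in files:
        groups[Path(f).stem].append(f)
    # Sort by stem so the result does not depend on directory listing order.
    return [tuple(groups[stem]) for stem in sorted(groups)]
```

Sorting the listings matters for reproducibility: a seeded shuffle only yields the same split if it starts from the same order every time.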

            split-folders Key Features

            No Key Features are available at this moment for split-folders.

            split-folders Examples and Code Snippets

            No Code Snippets are available at this moment for split-folders.

            Community Discussions

            QUESTION

            What is "seed" for in splitting test-val data in Python and how to come up with a correct number?
            Asked 2021-May-19 at 05:35

            I'm trying to split my image dataset so it can have a training set and a validation set. I found a Python library called split-folders. The syntax is easy to understand:

            splitfolders.ratio("input_folder", output="output", seed=1337, ratio=(.8, .1, .1), group_prefix=None)

            But I don't know what this seed parameter is or what it does. The description on the page only says that "a seed makes splits reproducible" and that "it shuffles the items", but that doesn't really explain anything to me. I have googled it, and none of the results gave me a clear answer. Can anyone give me a brief explanation?

            The default number is 1337, but why? What does it mean to have the seed set to 1337? How did they come up with that number? How do I find the correct seed for my dataset?

            ...

            ANSWER

            Answered 2021-May-19 at 05:34

            When you split your corpus into train, validation, and test sets, you randomly assign each data point to one of these three sets. Randomness is traceable using seeds.

            Imagine you have a random generator, a black box, that gives you a series of random numbers. For each given seed, the sequence it generates will always be identical. For example, for seed=1337, a random generator will always generate a sequence of random numbers like 12, 901, 110, 1, ... on the same computer.
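The black-box behaviour can be checked with Python's standard `random` module. This is only an illustration of seeding in general, not of what split-folders does internally; the helper name `draw` is hypothetical.

```python
import random


def draw(seed, count=5):
    """Return `count` pseudo-random integers from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator; the global state is untouched
    return [rng.randint(0, 999) for _ in range(count)]


first = draw(1337)
second = draw(1337)
other = draw(42)

print(first == second)  # same seed, so the two sequences are identical
print(first == other)   # a different seed almost certainly gives a different sequence
```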

            Why do we care about tracing the randomness, especially when dividing the corpus for training? Because most of the time you want to repeat the same experiment with the same data. If you do not set a seed value, each run of the same experiment will end up with a different split for training.
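To make the point concrete, here is a rough sketch of a seeded train/validation/test split over a list of file names. It is not the split-folders implementation; the function name `split_items` and the rule that leftover items go to the test set are assumptions made for the example.

```python
import random


def split_items(items, ratio=(0.8, 0.1, 0.1), seed=1337):
    """Shuffle `items` with a fixed seed and cut them into train/val/test lists."""
    shuffled = sorted(items)  # fixed starting order before shuffling
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * ratio[0])
    n_val = int(len(shuffled) * ratio[1])
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to test
    return train, val, test


files = [f"img_{i:03d}.jpg" for i in range(20)]
run_a = split_items(files)
run_b = split_items(files)

print(run_a == run_b)  # the same seed reproduces the same three lists
```

Calling `split_items(files, seed=7)` would still produce a valid 16/2/2 split, just with the files assigned differently, which is why the particular seed value matters less than keeping it fixed.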

            The seed value itself is not important, as long as it is a value you know stays fixed during your experiments. I personally set it to a prime number.

            Source https://stackoverflow.com/questions/67597367

            Community Discussions, Code Snippets contain sources that include Stack Exchange Network

            Vulnerabilities

            No vulnerabilities reported

            Install split-folders

            This package is Python only and has no external dependencies. Optionally, you may install tqdm to get a progress bar when moving files.
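An optional dependency like this is commonly handled with a guarded import. The sketch below shows that general pattern under the assumption that a plain loop is an acceptable fallback; it is not taken from the split-folders source, and the function name `process` is hypothetical.

```python
try:
    from tqdm import tqdm  # optional: only used for the progress bar
except ImportError:
    def tqdm(iterable, **kwargs):
        """Fallback that ignores progress-bar options and returns the iterable."""
        return iterable


def process(items):
    """Iterate over `items`, showing a progress bar only when tqdm is installed."""
    return [item.upper() for item in tqdm(items, desc="Processing")]
```

With this pattern the result of `process` is the same whether or not tqdm is present; only the progress display differs.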

            Support

            If you have a question, found a bug or want to propose a new feature, have a look at the issues page. Pull requests are especially welcomed when they fix bugs or improve the code quality.
            Install
          • PyPI

            pip install split-folders

          • CLONE
          • HTTPS

            https://github.com/jfilter/split-folders.git

          • CLI

            gh repo clone jfilter/split-folders

          • sshUrl

            git@github.com:jfilter/split-folders.git
