climactic | YAML-based test framework for command-line utilities | Testing library
kandi X-RAY | climactic Summary
A simple testing framework for running shell commands and verifying their behavior. Tests are written as YAML files which specify the commands to run along with assertions about the output (currently, stdout and file/directory contents). It is written in Python 3, but it can be used for testing any kind of shell-based application using just the climactic utility and your test files (by default, files matching **/test_*.yml).
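As a rough illustration of the default discovery pattern mentioned above, the following Python sketch collects files matching `**/test_*.yml` under a directory. The function name `find_test_files` is hypothetical and this is not climactic's own discovery code, only a minimal sketch of what such a glob matches.

```python
from pathlib import Path


def find_test_files(root, pattern="**/test_*.yml"):
    """Return the files under `root` whose path matches the glob pattern.

    Hypothetical helper for illustration; climactic's real discovery
    logic may differ.
    """
    # "**" matches zero or more directory levels, so files directly in
    # `root` are included along with files in nested directories.
    return sorted(p for p in Path(root).glob(pattern) if p.is_file())


if __name__ == "__main__":
    for path in find_test_files("."):
        print(path)
```

With this pattern, `test_cli.yml` in the project root and `tests/unit/test_io.yml` would both be picked up, while `notes.yml` would not.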
Top functions reviewed by kandi - BETA
- Use setuptools
- Download setuptools
- Issue a warning if the given version is available
- Build a Setuptools egg
- Generator from a path
- Parse a YAML stream
- Parse a file
- Run the script
- Log a message with TRACE
- Diff two dicts
- Download a file from the given URL
- Wrapper for subprocess
- Install Setuptools
- Build tags from a dictionary
- Build the install arguments
- Install the requirements files
- Return arguments for download
- Parses a file
- Runs the test case
- Return a dictionary mapping extras to extras
- Run the subprocess
- Load plugins
- Register a tag
- Parse command line options
- Determine verbosity and log level
- Return the dependency links
Community Discussions
Trending Discussions on climactic
QUESTION
I'm trying to build a naive Bayes classifier for 1000 positive and negative labeled IMDB reviews (txt_sentoken) using the Weka API for Java.
I wasn't aware of StringToWordVector, which basically provides a BagOfWords model that reaches 80% accuracy, so I did the vocabulary building and vector creation myself, with an accuracy of only 75% :(
Now I'm wondering why my solution is performing so much worse.
1) From my 2000 reviews, I build the BagOfWords:
...
ANSWER
Answered 2017-Dec-28 at 07:18
Reading through Weka's StringToWordVector documentation, there seem to be a couple of implementation details that differ from yours. Here are the top two, ranked by how likely I think they are to explain the performance difference you see:
- It seems that by default, the resulting vector is boolean (i.e. noting the existence of a word, rather than number of occurrences)
- If the class attribute is set before vectorizing the text, a separate dictionary is built for each class, then all dictionaries are merged.
While either of them (or other, more minor differences) could be the culprit, my bet is on the second point.
The built-in class allows setting and unsetting each of these options; you could try re-running the 80% version using StringToWordVector with the -C option to use the number of occurrences rather than a boolean value, and with -O to use a single dictionary across both classes.
This should allow you to verify whether any of these is indeed the culprit.
EDIT: Regarding the first point, i.e. counting occurrences vs. noting word existence (the multinomial and Bernoulli models, respectively), several academic papers in the 1990s looked into the differences, e.g. here and here. While the multinomial model usually works better, there are also opposite cases, depending on the corpus and the classification problem.
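To make the first point concrete, here is a minimal Python sketch of the two representations the answer contrasts: a 0/1 presence vector and a word-count vector over the same vocabulary. It is not Weka's StringToWordVector implementation; the helper names and the whitespace tokenization are assumptions made for illustration only.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index (sorted order)."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(document, vocabulary, counts=False):
    """Return a feature vector for one document.

    counts=False gives 0/1 word presence (Bernoulli-style features);
    counts=True gives occurrence counts (multinomial-style features).
    """
    freq = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for tok, n in freq.items():
        if tok in vocabulary:  # tokens outside the vocabulary are ignored
            vector[vocabulary[tok]] = n if counts else 1
    return vector


docs = ["good good film", "bad film"]
vocab = build_vocabulary(docs)  # {'bad': 0, 'film': 1, 'good': 2}
print(vectorize(docs[0], vocab))               # [0, 1, 1]
print(vectorize(docs[0], vocab, counts=True))  # [0, 1, 2]
```

The two vectors differ only where a word repeats within a document, which is why the choice can matter more for long reviews than for short ones.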
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install climactic
env: sets environment variables which can be used in run commands and assertions
run: runs one or more commands, behaving much like a bash script (performs environment variable substitution)
write-file-utf8: writes a string to a file
assert-output: compares the output of the most recent run command with an expected variable (performs environment variable substitution)
assert-tree: compares the directory structure of the directory being tested against an expected structure
assert-file-utf8: compares the contents of a file against an expected string
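The interplay of the env, run, and assert-output tasks listed above can be sketched in Python as follows. This is not climactic's code or its YAML syntax: `run_command` and `assert_output` are hypothetical helpers, a POSIX shell is assumed, and ignoring leading and trailing whitespace in the comparison is an assumption of this sketch.

```python
import os
import subprocess
from string import Template


def run_command(command, env):
    """Run a shell command with extra environment variables; return stdout."""
    result = subprocess.run(
        command,
        shell=True,                    # let the shell expand $VARS, as a script would
        env={**os.environ, **env},     # test variables layered over the real environment
        capture_output=True,
        text=True,
        check=True,                    # raise if the command exits non-zero
    )
    return result.stdout


def assert_output(actual, expected, env):
    """Compare stdout with an expected string after substituting $VARS from env."""
    wanted = Template(expected).safe_substitute(env)
    # Assumption: surrounding whitespace is not significant.
    if actual.strip() != wanted.strip():
        raise AssertionError(f"expected {wanted!r}, got {actual!r}")


env = {"GREETING": "hello"}
output = run_command('echo "$GREETING world"', env)
assert_output(output, "$GREETING world", env)  # passes silently
```

Substituting variables in both the command and the expected value keeps a test file free of hard-coded paths or names, which is presumably the motivation for the substitution the task descriptions mention.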