weka | A Python wrapper for Weka | Machine Learning library
kandi X-RAY | weka Summary
Provides a convenient wrapper for calling Weka classifiers from Python.
Top functions reviewed by kandi - BETA
- Query the classifier
- Escape a string
- Parse sparse data line
- Append line to the stream
- Train the classifier
- Return a shallow copy of the object
- Write the object to fout
- Train the model
- Load a classifier
- Load a pickled file
- Load a model from raw data
- Load a schema from a file
- Get requirements from files
- Save the document to a file
- Set the class name
weka Key Features
weka Examples and Code Snippets
from weka.core.classes import from_commandline, get_classname
from weka.attribute_selection import ASSearch
from weka.attribute_selection import ASEvaluation
search = from_commandline('weka.attributeSelection.GreedyStepwise -B -T -1.79769
multi = MultiSearch(options=["-S", "1"])
from wekapy import *
# CREATE NEW MODEL INSTANCE WITH A CLASSIFIER TYPE
model = Model(classifier_type = "bayes.BayesNet")
# LOAD A PREVIOUSLY TRAINED MODEL INTO OUR model OBJECT FOR TESTING AGAINST
model.load_model("/path/to/model.mod
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_selection import mutual_info_classif
from sklearn.feature_extraction.text import CountVectorizer
categories = ['talk.religion.misc',
'comp.graphics', 'sci.
sudo apt-get install python-pip python3-pip virtualenv build-essential
sudo apt-get install openjdk-8-source openjdk-8-jdk
DEBUG:weka.core.jvm:Adding bundled jars
DEBUG:weka.core.jvm:Classp
import arff
arff.dump('filename.arff'
, df.values
, relation='relation name'
, names=df.columns)
list_of_sel_atts = list(attsel.selected_attributes)
print(list_of_sel_atts)
from weka.filters import Filter
remove = Filter(classname="weka.filters.unsupervised.attribute.Remove", options=["-R", "1,2,3,4,5"])
line = '| | | | | "assembly" <= 0.5:'
parts = line.split('"')
indent = parts[0].count('| ')
predictor = parts[1]
# take everything between the last space and the trailing colon
splitpoint = float(parts[2][parts[2].rfind(' ') + 1:-1])
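The line-splitting approach above can also be written with a regular expression, which tends to be easier to extend to other operators and longer numbers. This is only a sketch: the line layout (depth markers, a quoted predictor, a comparison operator, a numeric threshold) is assumed from the single sample line, and `parse_tree_line` is a hypothetical helper name, not part of any Weka wrapper.

```python
import re

# Layout assumed from the sample line:  | | | | | "assembly" <= 0.5:
LINE_RE = re.compile(
    r'^(?P<indent>(?:\|\s*)*)'       # zero or more "| " depth markers
    r'"(?P<predictor>[^"]+)"\s*'     # quoted attribute name
    r'(?P<op><=|>=|<|>|=)\s*'        # comparison operator
    r'(?P<value>-?\d+(?:\.\d+)?)'    # numeric split point
)


def parse_tree_line(line):
    """Parse one line of a text-rendered decision tree (hypothetical helper)."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognised tree line: {line!r}")
    return {
        "indent": match.group("indent").count("|"),
        "predictor": match.group("predictor"),
        "operator": match.group("op"),
        "splitpoint": float(match.group("value")),
    }


print(parse_tree_line('| | | | | "assembly" <= 0.5:'))
```

Lines that do not follow the assumed layout raise a `ValueError` instead of silently yielding a wrong split point.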
Community Discussions
Trending Discussions on weka
QUESTION
My intention is to recreate a big model built in Weka using scikit-learn and other libraries.
I have this base model built with pyweka.
...ANSWER
Answered 2022-Mar-31 at 22:42
I've just released version 0.0.5 of sklearn-weka-plugin, with which you can do the following:
QUESTION
I built a model like this:
...ANSWER
Answered 2022-Mar-16 at 02:30
I have turned your code snippet into one with imports and fixed the MultiSearch setup for Bagging (mparam.prop = "numIterations" instead of mparam.prop = "numOfBoostingIterations"), allowing it to be executed.
Since I do not have access to your data, I just used the UCI dataset vote.arff.
Your code was a bit odd, as it did a 70/30 train/test split, trained the classifier and then performed cross-validation on the test data. For cross-validation you do not train the classifier, as this happens within the internal cross-validation loop (each trained classifier inside that loop gets discarded, as cross-validation is only used for gathering statistics).
The code below has therefore three parts:
- your original evaluation code, but commented out
- performing proper cross-validation
- performing train/test evaluation
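The difference between the second and third parts can be sketched without Weka. The following is a library-free illustration, not the answerer's script: the majority-class "classifier", the helper names and the toy labels are all invented for the example. It shows that cross-validation trains a fresh model inside each fold and keeps only the statistics, whereas a train/test split trains once and keeps the model.

```python
import random
from collections import Counter


def train_majority(labels):
    """Stand-in 'classifier': remembers the most common training label."""
    return Counter(labels).most_common(1)[0][0]


def cross_validate(labels, folds=10, seed=1):
    """k-fold CV: one model per fold, each discarded after scoring."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for k in range(folds):
        test_idx = idx[k::folds]
        held_out = set(test_idx)
        train = [labels[i] for i in idx if i not in held_out]
        fold_model = train_majority(train)  # trained inside the loop
        correct += sum(1 for i in test_idx if labels[i] == fold_model)
        # fold_model is not kept; only the running statistics survive
    return correct / len(labels)


def train_test_evaluate(labels, train_fraction=0.7, seed=1):
    """Single split: train once, evaluate on the held-out remainder."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_fraction)
    train = [labels[i] for i in idx[:cut]]
    test = [labels[i] for i in idx[cut:]]
    model = train_majority(train)
    accuracy = sum(1 for y in test if y == model) / len(test)
    return model, accuracy


labels = ["yes"] * 70 + ["no"] * 30  # invented toy data
cv_accuracy = cross_validate(labels)
model, holdout_accuracy = train_test_evaluate(labels)
print(cv_accuracy, model, holdout_accuracy)
```

Note that `cross_validate` never receives a pre-trained model, which mirrors the point that training before cross-validating serves no purpose.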
I do not use Jupyter notebooks and tested the code successfully in a regular virtual environment on my Linux Mint:
- Python: 3.8.10
- Output of pip freeze:
QUESTION
I am trying to build a prediction model using WEKA. For example, I plan to build a model for classifying whether something is leading to A or P using Database 1. I will then use another testing and training dataset to predict whether something is A or B using this model. Are there any prediction packages for that? I saw forecasting packages, but those are for numeric data. I am looking for prediction packages that could help me classify data and make predictions. Also, can I save this model and make it publicly available so that others can use it as well and make similar predictions? Is this possible in WEKA?
...ANSWER
Answered 2022-Mar-15 at 21:11
Yes, of course you can make predictions with a model. Since it is not clear whether you want to do that from the command-line, in the GUI or from code, I recommend you have a look at the following Weka wiki article:
https://waikato.github.io/weka-wiki/making_predictions/
If you want to use the API, check the articles listed here:
https://waikato.github.io/weka-wiki/using_the_api/
Also, every Weka installation comes with code examples, aptly named wekaexamples.zip.
QUESTION
I'm translating a model built in Weka to python-weka-wrapper3 and I don't know how to set an evaluator and search options on AttributeSelectedClassifier.
This is the model on weka:
...ANSWER
Answered 2022-Mar-14 at 20:20
You need to instantiate ASSearch and ASEvaluation objects. If you have command-lines, you can use the from_commandline helper method like this:
QUESTION
For some reason I am using the WEKA API...
I have generated tf-idf scores for a set of documents,
...ANSWER
Answered 2022-Mar-13 at 19:53
The StringToWordVector filter uses the weka.core.DictionaryBuilder class under the hood for the TF/IDF computation.
As long as you create a weka.core.Instance object with the text that you want to have converted, you can do that using the builder's vectorizeInstance(Instance) method.
Edit 1:
Below is an example based on your code (but with Weka classes), which shows how to either use the filter or the DictionaryBuilder for the TF/IDF transformation. Both get serialized, deserialized and re-used as well to demonstrate that these classes are serializable:
QUESTION
Hi guys, I have this question because I'm trying to use Weka's ADTree from Python using pyweka like this:
...ANSWER
Answered 2022-Mar-09 at 20:06
python-weka-wrapper3, as the name suggests, is a light-weight wrapper around Weka classes. However, it mostly only wraps abstract superclasses, not all of the thousands of classes that make up Weka.
The weka.classifiers.Classifier class is the wrapper to use for classifiers and you need to specify the Java classname of the actual classifier that you want to wrap (in your case weka.classifiers.trees.ADTree). And, yes, just like Python, Java is also case sensitive.
Furthermore, if you require classes from packages, then you need to start the JVM with package support (default is no package support).
Below is an example that outputs the command-line of ADTree with its default parameters. If necessary, it installs the package first.
QUESTION
I am doing a data science project. My dataset is imbalanced. I am using Weka for classification.
The dataset has 1273 instances. Among them, 174 belong to the yes class and 1099 to the no class. Therefore, the dataset is biased towards the no class.
I am using resample filter to maintain a ratio among yes class and no class. I am sharing a result below.
The parameter that I have tweaked to see various yes:no ratios is "bias to uniform class". As per Weka's documentation, the definition of "bias to uniform class" is: Whether to use bias towards a uniform class. A value of 0 leaves the class distribution as-is, a value of 1 ensures the class distribution is uniform in the output data.
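To make the two documented endpoints concrete, here is a small sketch that blends the original class counts with a uniform distribution. The linear interpolation for values between 0 and 1 is an assumption chosen to agree with the documented behaviour at 0 and 1; the real filter also rounds to whole instances and samples at random, which this sketch ignores. `resample_class_sizes` is a hypothetical helper, not a Weka API.

```python
def resample_class_sizes(class_counts, bias, sample_percent=100.0):
    """Expected per-class sizes for a given bias towards a uniform class.

    bias=0 keeps the original distribution, bias=1 makes it uniform.
    Values in between are interpolated linearly (an assumption).
    """
    total = sum(class_counts.values())
    num_classes = len(class_counts)
    sizes = {}
    for label, count in class_counts.items():
        blended = (1.0 - bias) * count + bias * total / num_classes
        sizes[label] = blended * sample_percent / 100.0
    return sizes


# Class counts taken from the question: 174 yes, 1099 no (1273 in total).
counts = {"yes": 174, "no": 1099}
for bias in (0.0, 0.5, 1.0):
    print(bias, resample_class_sizes(counts, bias))
```

Under this assumption the total stays at 1273 for every bias value; only the split between the two classes moves.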
ANSWER
Answered 2022-Feb-27 at 01:20
I'm not quite sure why you have a problem with the output of the filter. Maybe the following will make it apparent to you how the filter functions.
Here is your data in one table, slightly rearranged:
QUESTION
I have this code for gridsearch:
...ANSWER
Answered 2022-Feb-21 at 22:40
The exceptions that you listed do not affect the script execution, as they get handled internally by pww3 (error output could not be suppressed, unfortunately, despite catching the exceptions; this gets output by the underlying javabridge library).
A bit of background: Since pww3 can run with and without package support, it first tries to load classes using the Java classloader. If that fails (that's the error message that you see), it will try loading them using Weka's mechanism for loading classes.
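The try-then-fall-back behaviour described above follows a common pattern, sketched here in plain Python. This is not pww3's implementation: the two loader functions and the class sets are hypothetical stand-ins that only mimic "the Java classloader knows core classes" and "Weka's mechanism also knows package classes".

```python
# Hypothetical stand-ins for the classes each mechanism can see.
CORE_CLASSES = {"weka.classifiers.trees.J48"}
PACKAGE_CLASSES = {"weka.classifiers.trees.ADTree"}


def java_classloader(classname):
    """Pretend Java classloader: only finds core classes."""
    if classname not in CORE_CLASSES:
        raise LookupError(f"class not found: {classname}")
    return classname


def weka_package_loader(classname):
    """Pretend Weka loader: finds core and package classes."""
    if classname not in CORE_CLASSES | PACKAGE_CLASSES:
        raise LookupError(f"class not found: {classname}")
    return classname


LOADERS = [("java", java_classloader), ("weka", weka_package_loader)]


def load_class(classname, loaders=LOADERS):
    """Try each loader in order and report which one succeeded."""
    errors = []
    for name, loader in loaders:
        try:
            return name, loader(classname)
        except LookupError as exc:
            errors.append(f"{name}: {exc}")  # handled, but still recorded
    raise LookupError("; ".join(errors))


print(load_class("weka.classifiers.trees.J48"))
print(load_class("weka.classifiers.trees.ADTree"))
```

The first failure is caught and the second loader succeeds, which is why an error from the first attempt can show up even though the script keeps running.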
The just-released version 0.2.7 of pww3 approaches this a bit more intelligently and avoids the output of these exceptions.
Final note: you need to drop the "classifier." prefix in your property names when using MultiSearch.
QUESTION
I am working on the Pima Indians Diabetes Database in Weka. I noticed that the J48 decision tree is smaller than the one built by RandomTree. I am unable to understand why this is. Thank you.
...ANSWER
Answered 2022-Feb-21 at 19:57
Though they both are decision trees, they employ different algorithms for constructing the tree, which will (most likely) give you a different outcome:
- J48 prunes the tree by default after it built its tree (Wikipedia).
- RandomTree (when using default parameters) inspects a maximum of log2(num_attributes) attributes for generating splits.
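For a sense of scale, the number of attributes considered per split can be worked out for the Pima data, which has 8 predictors plus the class. The sketch below uses the formula from Weka's option help for RandomTree as I recall it, int(log2(#predictors) + 1), which is slightly more than the rounded figure given above; treat the exact formula as an assumption to verify against your Weka version. The helper name is hypothetical.

```python
import math
import random


def default_random_tree_k(num_predictors):
    """Attributes inspected per split when K is left at its default.

    Assumes the int(log2(#predictors) + 1) rule from Weka's option help.
    """
    return int(math.log2(num_predictors) + 1)


pima_predictors = 8  # predictor attributes, excluding the class
k = default_random_tree_k(pima_predictors)

# At each node only a random subset of k attribute indices is considered.
rng = random.Random(42)
candidates = rng.sample(range(pima_predictors), k)
print(k, sorted(candidates))
```

Because each node only sees a random subset of attributes and no pruning is applied, the resulting tree usually needs more splits than a pruned J48 tree on the same data.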
QUESTION
It is in ARFF format. If you're not familiar with ARFF, it's basically that everything under the @data marker is in CSV.
For clarification, I am trying to use the dataset on Weka but the option to use Naïve Bayes is greyed out.
...ANSWER
Answered 2022-Jan-04 at 21:03
Every classifier, clusterer, filter, etc. in Weka can only handle certain types of data, i.e., its capabilities (which you can check in the GUI). These capabilities are then compared against the data. In case of a mismatch, the GUI won't allow you to apply the algorithm.
Long story short: the dport attribute is of type string which NaiveBayes can't handle. You can convert that attribute into a nominal one using the StringToNominal filter.
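The capability check can be imitated outside Weka by reading the ARFF header and comparing each declared attribute type against the types an algorithm accepts. This is a simplified sketch: the sample ARFF text is invented (only the dport attribute name comes from the question), the supported-type set for NaiveBayes is an assumption for illustration, and the parser does not handle quoted attribute names that contain spaces.

```python
SAMPLE_ARFF = """\
@relation traffic
@attribute duration numeric
@attribute dport string
@attribute label {normal,attack}
@data
1.5,'80',normal
0.2,'443',attack
"""


def attribute_types(arff_text):
    """Return {attribute_name: type} read from the ARFF header."""
    types = {}
    for raw in arff_text.splitlines():
        line = raw.strip()
        lowered = line.lower()
        if lowered.startswith("@data"):
            break  # everything after @data is CSV rows, not declarations
        if lowered.startswith("@attribute"):
            _, name, declared = line.split(None, 2)
            if declared.startswith("{"):
                kind = "nominal"
            elif declared.lower() in ("numeric", "real", "integer"):
                kind = "numeric"
            else:
                kind = declared.lower()
            types[name.strip("'\"")] = kind
    return types


def unsupported_attributes(types, supported):
    """Names of attributes whose type is not in the supported set."""
    return [name for name, kind in types.items() if kind not in supported]


# Assumed for illustration: attribute types NaiveBayes can handle.
NAIVE_BAYES_TYPES = {"numeric", "nominal"}

types = attribute_types(SAMPLE_ARFF)
print(unsupported_attributes(types, NAIVE_BAYES_TYPES))
```

An empty result would mean every attribute type is acceptable; any name listed is a candidate for conversion, for example with StringToNominal.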
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install weka