awesome-machine-learning | curated list of awesome Machine Learning frameworks | Machine Learning library
kandi X-RAY | awesome-machine-learning Summary
A curated list of awesome machine learning frameworks, libraries and software (by language). Inspired by awesome-php.
Community Discussions
Trending Discussions on awesome-machine-learning
QUESTION
In sklearn, when we pass sentences to algorithms, we can use text feature extractors such as CountVectorizer, TfidfVectorizer, etc., and we get an array of floats.
But what do we get when we pass vowpal wabbit an input file like this one:
...
ANSWER
Answered 2017-Mar-30 at 21:47
There are two separate questions here:
Q1: Why can't you (and shouldn't you) use transformations like tf-idf when using vowpal wabbit?
A1: vowpal wabbit is not a batch learning system; it is an online learning system. In order to compute measures like tf-idf (term frequency in each document vs. the whole corpus) you need to see all the data (the corpus) first, and sometimes make multiple passes over the data. vowpal wabbit, as an online/incremental learning system, is designed to also work on problems where you don't have the full data ahead of time. See this answer for a lot more details.
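To make the point in A1 concrete, here is a minimal sketch of why tf-idf needs the whole corpus. It assumes the plain textbook definition idf = log(N / df); libraries such as scikit-learn use smoothed variants, and the function name idf_table and the toy corpus are made up for this illustration.

```python
import math
from collections import Counter


def idf_table(docs):
    """Return {term: log(N / df)} for a corpus given as lists of tokens.

    Assumption: unsmoothed textbook idf, not any particular library's formula.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        # Count each term once per document, however often it occurs there.
        doc_freq.update(set(doc))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


corpus = [["cat", "sat"], ["dog", "sat"]]
idf_before = idf_table(corpus)

# One more document arrives, and the idf of an already-seen term changes.
idf_after = idf_table(corpus + [["bird", "flew"]])

print(idf_before["sat"], idf_after["sat"])
```

The idf of "sat" shifts as soon as a new document arrives, so weights computed for earlier examples would be stale. That is exactly the situation an online learner cannot revisit.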
Q2: How does vowpal wabbit "transform" the features it sees?
A2: It doesn't. It simply maps each word feature on-the-fly to its hashed location in memory. The online learning step is driven by a repetitive optimization loop (SGD or BFGS) example by example, to minimize the modeling error. You may select the loss function to optimize for.
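The idea in A2 can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not vowpal wabbit's actual code: zlib.crc32 stands in for the real hash function, the table size of 2**18 and the learning rate are arbitrary choices, squared loss with plain SGD is just one of the possible configurations, and the names feature_index, predict and sgd_step are invented for this sketch.

```python
import zlib

NUM_BITS = 18                # assumption: weight table with 2**18 slots
TABLE_SIZE = 1 << NUM_BITS


def feature_index(token):
    """Map a word to a slot in the weight table (crc32 is a stand-in hash)."""
    return zlib.crc32(token.encode("utf-8")) % TABLE_SIZE


def predict(weights, tokens):
    """Linear prediction: sum of the weights at the hashed locations."""
    return sum(weights[feature_index(t)] for t in tokens)


def sgd_step(weights, tokens, label, learning_rate=0.1):
    """One online update for squared loss on a single example."""
    error = predict(weights, tokens) - label
    for t in tokens:
        # Each occurrence contributes a feature value of 1 to the gradient.
        weights[feature_index(t)] -= learning_rate * error
    return error


weights = [0.0] * TABLE_SIZE
stream = [
    (["good", "great", "film"], 1.0),
    (["bad", "awful", "film"], -1.0),
]

# Examples are consumed one at a time; no vocabulary is built in advance.
for _ in range(20):
    for tokens, label in stream:
        sgd_step(weights, tokens, label)

print(predict(weights, ["good", "great", "film"]))
print(predict(weights, ["bad", "awful", "film"]))
```

No corpus-wide statistics are needed at any point: a word is turned into a memory location the moment it is seen, and only the weights at those locations are touched.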
However, if you already have the full data you want to train on, nothing prevents you from transforming it (using any other tool) before feeding the transformed values to vowpal wabbit. It's your choice. Depending on the particular data, a transformation pre-pass may give better or worse results than running multiple passes with vowpal wabbit itself without preliminary transformations (check out the vw --passes option).
To complete the answer, let's add another related question:
Q3: Can I use pre-transformed (e.g. tf-idf) data with vowpal wabbit?
A3: Yes, you can. Use the post-transformation form: instead of words, use integers as feature IDs, and since any feature can have an optional explicit weight, put the tf-idf floating-point value after the : separator, as in the typical SVMlight format, so that each feature is written as id:weight.
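As a minimal sketch of producing such lines, assuming the tf-idf values for one document are already available as a mapping from integer feature ID to weight (the function name to_vw_line and the sample numbers are made up for illustration):

```python
def to_vw_line(label, weights):
    """Format one example as 'label | id:weight id:weight ...'.

    `weights` maps integer feature IDs to precomputed values such as tf-idf.
    Zero weights are skipped, since omitting a feature means the same thing.
    """
    features = " ".join(
        f"{feature_id}:{value:g}"
        for feature_id, value in sorted(weights.items())
        if value != 0
    )
    return f"{label} | {features}"


# Hypothetical tf-idf weights for a single document.
example = to_vw_line(1, {3: 0.5, 7: 1.25, 12: 0.0})
print(example)
```

Each resulting line can be written to a file and passed to vw as training input, one example per line.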
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install awesome-machine-learning
awesome-machine-learning is a curated list, not an installable Python library, so there is nothing to install; clone or browse the repository to read it. To try a Python library from the list, you will need a development environment consisting of a Python distribution including header files, a compiler, pip, and git. Make sure that your pip, setuptools, and wheel are up to date. When using pip, it is generally recommended to install packages in a virtual environment to avoid changes to the system.