text-classification | Chinese text classification (with API deployment support)
kandi X-RAY | text-classification Summary
Chinese text classification (with API deployment support)
Top functions reviewed by kandi - BETA
- Generate training data
- Convert a sequence of words into a sequence of ids
- Reads the data files in the input_data_path
- Create a list of target ids
- Get a list of vocab
- Train the model
- Embedding layer
- Make a batch of input data
- Shuffle a batch
- Copy the configuration to the given path
- Check if a path exists
- Train the v2 model
- Build the model
- Generate test data
- Build a server session
- Generate infer data
- Runs test
Community Discussions
Trending Discussions on text-classification
QUESTION
Using the TFBertForQuestionAnswering.from_pretrained() function, we get a predefined head on top of BERT together with a loss function that are suitable for this task.
My question is how to create a custom head without relying on TFAutoModelForQuestionAnswering.from_pretrained().
I want to do this because there is no place where the architecture of the head is explained clearly. By reading the code here we can see the architecture they are using, but I can't be sure I understand their code 100%.
"How to Fine-tune HuggingFace BERT model for Text Classification" is a good starting point. However, it covers only the classification task, which is much simpler.
'start_positions' and 'end_positions' are created following this tutorial.
So far, I've got the following:
...
ANSWER
Answered 2022-Apr-03 at 22:24
For future reference, I actually found a solution, which is just editing the TFBertForQuestionAnswering class itself. For example, I added an additional layer in the following code and trained the model as usual and it worked.
QUESTION
Goal: Amend this Notebook to work with distilbert-base-uncased model
Error occurs in Section 1.3.
Kernel: conda_pytorch_p36. I did Restart & Run All, and refreshed file view in working directory.
Section 1.3:
...
ANSWER
Answered 2022-Jan-14 at 10:23
A developer explains this predicament in this Git issue.
The Notebook experiments with BERT, which uses token_type_ids.
DistilBERT does not use token_type_ids for training.
So this would require reworking the notebook: removing, or making conditional, every mention of token_type_ids for this model specifically.
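One way to make the mentions of token_type_ids conditional is to filter the batch before it reaches the model. The following is only a minimal sketch in plain Python: the function name `select_model_inputs` and the flag `uses_token_type_ids` are hypothetical and do not come from the notebook or from any library.

```python
def select_model_inputs(batch, uses_token_type_ids):
    """Return only the entries of `batch` that the model accepts.

    `batch` is assumed to be a dict of tokenizer outputs. The flag is a
    hypothetical switch: True for BERT-style models, False for DistilBERT.
    """
    keys = ["input_ids", "attention_mask"]
    if uses_token_type_ids:
        keys.append("token_type_ids")
    # Skip keys that are missing from the batch instead of raising.
    return {key: batch[key] for key in keys if key in batch}


# Toy batch with placeholder values standing in for tensors.
batch = {
    "input_ids": [[101, 2023, 102]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],
}
bert_inputs = select_model_inputs(batch, uses_token_type_ids=True)
distilbert_inputs = select_model_inputs(batch, uses_token_type_ids=False)
```

With a filter like this, the rest of the training loop can stay the same for both models.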
QUESTION
I am currently trying to replicate the article https://towardsdatascience.com/text-classification-with-bert-in-pytorch-887965e5820f to get an introduction to PyTorch and BERT.
I used my own sample corpus and corresponding targets as practice, but the code throws the following:
...
ANSWER
Answered 2022-Jan-12 at 14:00
You're creating a list of length 33 in your __getitem__ call, which is one more than the length of the labels list, hence the out-of-bounds error. In fact, you create the same list each time this method is called. You're supposed to fetch the y associated with the X found at idx.
If you replace batch_y = np.array(range(...)) with batch_y = np.array(self.labels[idx]), you'll fix your error. Indeed, this is already implemented in your get_batch_labels method.
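The fix described above can be sketched with a small dataset class. This is an illustrative, dependency-free version: the class name `TextDataset` and the helper `get_batch_texts` are assumptions, and plain Python values stand in for the np.array and tokenizer outputs used in the asker's code.

```python
class TextDataset:
    """Minimal map-style dataset pairing each text with its label."""

    def __init__(self, texts, labels):
        if len(texts) != len(labels):
            raise ValueError("texts and labels must have the same length")
        self.texts = texts
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def get_batch_labels(self, idx):
        # Fetch the label stored at idx rather than building a new range.
        return self.labels[idx]

    def get_batch_texts(self, idx):
        return self.texts[idx]

    def __getitem__(self, idx):
        # Return the X found at idx together with its associated y.
        return self.get_batch_texts(idx), self.get_batch_labels(idx)


dataset = TextDataset(["good film", "bad film", "fine film"], [1, 0, 1])
text, label = dataset[1]
```

Because `__getitem__` indexes into `self.labels`, any idx that is valid for the texts is also valid for the labels.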
QUESTION
I am getting the following error: AttributeError: 'DataFrame' object has no attribute 'data_type'.
I am trying to recreate the code from this link, which is based on this article, with my own dataset, which is similar to the one in the article.
ANSWER
Answered 2022-Jan-10 at 08:41
The error means you have no data_type column in your dataframe because you missed this step
QUESTION
I'm using this Notebook, where section Apply DocumentClassifier is altered as below.
Jupyter Labs, kernel: conda_mxnet_latest_p37.
The error appears to be a standard ML response. However, I pass and create the same parameters and variable names as the original code, so it must have something to do with their values in my code.
My Code:
...
ANSWER
Answered 2021-Dec-08 at 21:05
Reading the official docs, and given that the error is raised when calling .predict(docs_to_classify), I recommend trying basic tests such as using the parameter labels = ["negative", "positive"], and checking whether the error is caused by string values from the external file. Optionally, you should also check where the docs indicate the use of pipelines.
QUESTION
I am trying code from this page. I ran up to the part LR (tf-idf) and got similar results.
After that I decided to try GridSearchCV. My questions below:
1)
...
ANSWER
Answered 2021-Dec-09 at 23:12
You get the precision error because some of your penalization values are too strong for this model. If you check the results, you get an F1 score of 0 when C = 0.001 and C = 0.01.
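To see why an F1 score of 0 goes together with a precision problem, it may help to compute the metrics by hand. The sketch below assumes that the over-penalized model never predicts the positive class. That is an interpretation, not something stated in the answer. The function `precision_recall_f1` is a hypothetical pure-Python helper, not a library API.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for one class.

    Undefined ratios (zero denominators) are reported as 0.0.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # With no positive predictions, tp + fp == 0 and precision is undefined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


y_true = [1, 0, 1, 0]
all_negative = [0, 0, 0, 0]  # assumed behaviour of an over-penalized model
scores = precision_recall_f1(y_true, all_negative)
```

In this toy case all three scores come out as 0.0, because there are no predicted positives to divide by.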
QUESTION
I'm fairly new to Machine Learning. I've successfully solved errors to do with parameters and model setup.
I'm using this Notebook, where section Apply DocumentClassifier is altered as below.
Jupyter Labs, kernel: conda_mxnet_latest_p37.
The error seems to be more about my laptop's hardware than about my code being broken.
Update: I changed to batch_size=4; it ran for ages, only to crash.
What should be my standard approach to solving this error?
My Code:
...
ANSWER
Answered 2021-Dec-09 at 11:53
Reducing the batch_size helped me:
QUESTION
I have the following dataset (I will upload only a sample of 4 rows, the real one has 15,000 rows):
...
ANSWER
Answered 2021-Dec-03 at 17:36
I don't think it's really meaningful to compute the chi-squared statistic without having the classes attached. The code chi2(X_train, y_neutral) is asking "Assuming that class and the parameter are independent, what are the odds of getting this distribution?" But all of the examples you're showing it are the same class.
I would suggest this instead:
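The answerer's own code is not included in this excerpt. Separately, as a rough illustration of why a single class is uninformative, the sketch below computes a simplified chi-squared statistic for one non-negative feature against the class labels. The function `chi2_feature` is a hypothetical pure-Python helper, not scikit-learn's implementation.

```python
from collections import defaultdict


def chi2_feature(values, labels):
    """Chi-squared statistic of one non-negative feature against class labels.

    Observed: the feature total within each class. Expected: the overall
    feature total split in proportion to class frequency (independence).
    """
    total = sum(values)
    n = len(labels)
    observed = defaultdict(float)
    class_counts = defaultdict(int)
    for value, label in zip(values, labels):
        observed[label] += value
        class_counts[label] += 1
    stat = 0.0
    for label, count in class_counts.items():
        expected = total * count / n
        if expected > 0:
            stat += (observed[label] - expected) ** 2 / expected
    return stat


# One class only: observed always equals expected, so the statistic is 0.
single_class = chi2_feature([1, 2, 3], ["neutral", "neutral", "neutral"])
# Two classes: the feature occurs only in class "a".
two_classes = chi2_feature([3, 3, 0, 0], ["a", "a", "b", "b"])
```

With a single class there is nothing to compare against, which is why the labels for all classes need to be passed together.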
QUESTION
I am trying to implement this article https://towardsdatascience.com/bert-text-classification-using-pytorch-723dfb8b6b5b, but I have the following problem.
...
ANSWER
Answered 2021-Nov-01 at 02:55
Try
QUESTION
I am doing sentiment analysis using BERT. I want to convert the result to DataFrame format, but I don't know how. If anyone knows, please let me know.
The related web page is https://huggingface.co/transformers/main_classes/pipelines.html
...
ANSWER
Answered 2021-Oct-22 at 06:37
Try this:
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install text-classification
You can use text-classification like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.