RecurrentNeuralNetwork | Recurrent Neural Network from scratch using Python and Numpy | Machine Learning library
kandi X-RAY | RecurrentNeuralNetwork Summary
Recurrent Neural Network from scratch using Python and Numpy
Top functions reviewed by kandi - BETA
- Return the sigmoid function
- Sigmoid prime function
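The two functions listed above are not reproduced on this page. As a minimal sketch of what a sigmoid and its derivative usually look like in NumPy (the names `sigmoid` and `sigmoid_prime` are assumptions, not necessarily the repository's own identifiers):

```python
import numpy as np


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), applied element-wise."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_prime(z):
    """Derivative of the sigmoid: s(z) * (1 - s(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)
```

This version is not hardened against overflow for very large negative inputs; it is only meant to illustrate the two definitions.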
Community Discussions
Trending Discussions on RecurrentNeuralNetwork
QUESTION
I am currently working on an application for segmentation-free handwritten text recognition. To that end, text lines are extracted from the input document and then recognized.
For development purposes I use the IAM Handwriting Database. It provides text line images along with the corresponding ASCII text.
For the recognition I adapt the approaches found in the papers "An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition" and "Can We Build Language-independent OCR Using LSTM Networks?".
Basically, I use a bidirectional GRU architecture and a forward-backward algorithm to align transcripts with the output of the neural network.
An image from the database looks like this:
The images are presented as a 1D sequence of pixel values; more precisely, the images are first scaled to a height of 32 pixels.
The NumPy array for the above image, which measures 597 x 32 pixels, has the shape (597, 32).
The NumPy array representing all n training images has the shape (n, w, 32), where w is the variable width of the line images (for example, 597).
The following code shows how the training images and the transcription are represented:
...
ANSWER
Answered 2018-Mar-26 at 22:17
OK, I wasn't able to explain this within the 600 characters available in the comment section, so I will do it in an answer; I am ignoring your Q2, however.
The code for the paper you mentioned can be found at: https://github.com/bgshih/crnn It is a good starting point for handwritten text recognition. However, the CRNN implementation recognizes text at word level, whereas you want to do it at line level, so you need larger input images; I used 800x64px and a maximum text length of 100, for example. As already said, stretching images to the desired size does not work very well. In my experiments the accuracy increased when using padding instead (randomize the positions a little bit; it's an easy way to do data augmentation).
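The padding idea can be sketched as follows. This is only an illustration, not the answerer's code: the function name `pad_to_target`, the `max_shift` parameter, and the white background value of 255 are assumptions, and the array layout (width, height) follows the convention used in the question. The image is assumed to have been scaled already so that it fits inside the target size.

```python
import numpy as np


def pad_to_target(img, target_w=800, target_h=64, max_shift=10, fill=255, rng=None):
    """Place img (shape (w, h)) on a blank canvas of shape (target_w, target_h).

    The position is shifted by a small random offset, which acts as a
    cheap form of data augmentation.
    """
    rng = np.random.default_rng() if rng is None else rng
    w, h = img.shape
    if w > target_w or h > target_h:
        raise ValueError("image is larger than the target; downscale it first")

    # Random offset, bounded by max_shift and by the free space on the canvas.
    x0 = int(rng.integers(0, min(max_shift, target_w - w) + 1))
    y0 = int(rng.integers(0, min(max_shift, target_h - h) + 1))

    canvas = np.full((target_w, target_h), fill, dtype=img.dtype)
    canvas[x0:x0 + w, y0:y0 + h] = img
    return canvas
```

For a line image of shape (597, 32), the result has shape (800, 64), with the original pixels copied unchanged and the rest filled with the background value.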
There is a relationship between the maximum text length L and the input image width W: the neural network (NN) downsizes the input image by a fixed scaling factor f, so L=W/f (in my example: W=800px, L=100, f=8). The attached illustrations show the input image (800x64px) and the character probability matrix (a probability for each of the 80 possible characters at each of the 100 time-steps). The NN maps the input image to this character probability matrix, which serves as input for the CTC. As there are L time-steps in the matrix, there can be at most L characters. This of course holds for decoding, but the loss calculation must also align the ground truth text with this matrix, and a text with L+1 characters cannot be aligned with just the L time-steps contained in the matrix. Note that inside the CTC calculation, repeated characters (like in "piZZa") must be separated by a special character; therefore the possible text length decreases by 1 for each repetition.
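The length constraint can be checked with a few lines of Python. This is a sketch of the arithmetic only; the helper names are hypothetical and not part of any CTC library.

```python
def max_time_steps(image_width, scale_factor):
    """Number of time-steps L the network outputs: L = W / f."""
    return image_width // scale_factor


def min_ctc_steps(text):
    """Minimum time-steps CTC needs to align text.

    Each character needs one time-step, and each pair of identical
    adjacent characters needs one extra step for the separating blank.
    """
    repeats = sum(1 for a, b in zip(text, text[1:]) if a == b)
    return len(text) + repeats


def fits(text, image_width=800, scale_factor=8):
    """True if text can be aligned with the available time-steps."""
    return min_ctc_steps(text) <= max_time_steps(image_width, scale_factor)
```

With W=800 and f=8 there are 100 time-steps. The word "piZZa" has 5 characters but needs at least 6 time-steps because of the repeated "Z", so ground truth texts should be checked with `fits` (or an equivalent rule) before they are passed to the loss.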
I think with this explanation you should be able to figure out how all those length-variables in your code are related to each other.
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install RecurrentNeuralNetwork
You can use RecurrentNeuralNetwork like any standard Python library. You will need to make sure that you have a development environment consisting of a Python distribution including header files, a compiler, pip, and git installed. Make sure that your pip, setuptools, and wheel are up to date. When using pip it is generally recommended to install packages in a virtual environment to avoid changes to the system.