pynini | Read-only mirror of Pynini
kandi X-RAY | pynini Summary
Pynini is a Python extension module which allows the user to compile, optimize, and apply grammar rules. Rules can be compiled into weighted finite state transducers, pushdown transducers, or multi-pushdown transducers. For general information and a detailed tutorial, see pynini.opengrm.org.
Top functions reviewed by kandi - BETA
- Run the generator
- Generate a mapping of target designator
- Close the FST file
- Check if ostring matches a given rule
- Rewrite the lattice
- Checks if the input matches the given string
- Return the top - level rewrite
- Convert a lattice to the shortest path
- Convert a lattice to a list of strings
- Find optimal rewrite for a string
- Convert a lattice to fst
- Transform a sentence into a string
- Rewrite a string
- Transform a string into one - top - top - top - level rewriting
- Return a list of possible rewrite rules
- Rewrites the given string
- Return a list of strings representing the top - left transformation
- Rewrite a top - level string
- Create a new Fst with the given weight
- Rewrite the input string
- Delete a cross expression
- Reduce a rule
- Generate markdown
- Returns a fst with zero padding
- Tag a word
- Get the version string
Community Discussions
Trending Discussions on pynini
QUESTION
I have created a word-level text generator using an LSTM model. But in my case, not every word is suitable to be selected. I want them to match additional conditions:
- Each word has a map: if a character is a vowel, it is written as 1; if not, as 0 (for instance, overflow would be 10100010). Then, the generated sentence needs to meet a given structure, for instance, 01001100 (hi 01 and friend 001100).
- The last vowel of the last word must be the one provided. Let's say it is e (friend will do the job, then).
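The two constraints in the question can be written down as a small check. This is only a minimal sketch: the function names (`vowel_pattern`, `last_vowel`, `satisfies`) are hypothetical, and it assumes that only a, e, i, o, u count as vowels, which the question does not state explicitly.

```python
# Assumption: only these five letters count as vowels.
VOWELS = set("aeiou")


def vowel_pattern(word):
    """Return '1' for each vowel and '0' for every other character."""
    return "".join("1" if ch in VOWELS else "0" for ch in word.lower())


def last_vowel(word):
    """Return the last vowel in the word, or None if it has none."""
    for ch in reversed(word.lower()):
        if ch in VOWELS:
            return ch
    return None


def satisfies(words, structure, final_vowel):
    """Check both constraints for a candidate sentence (a list of words)."""
    if not words:
        return False
    pattern = "".join(vowel_pattern(w) for w in words)
    return pattern == structure and last_vowel(words[-1]) == final_vowel


print(vowel_pattern("overflow"))                      # 10100010
print(satisfies(["hi", "friend"], "01001100", "e"))   # True
```

Precomputing the pattern and last vowel for every word once (for example as extra columns of the data frame) would let the generator filter candidates without recomputing them at each step.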
Thus, to handle this scenario, I've created a pandas dataframe with the following structure:
...
ANSWER
Answered 2020-Apr-12 at 02:05
If you are happy with your approach, the easiest way might be to train your LSTM on the reversed sequences, so that it learns to give the weight of the previous word rather than the next one. In that case you can use the method you already employ, except that the first subset of words would be the words satisfying the last-vowel constraint. I don't believe that this is guaranteed to produce the best result.
Now, if that reversal is not possible or if, after reading my answer further, you find that it doesn't find the best solution, then I suggest using a pathfinding algorithm. This is similar to reinforcement learning, but not statistical, as the weights computed by the trained LSTM are deterministic. What you currently use is essentially a depth-first greedy search which, depending on the LSTM output, might even be optimal: say, if the LSTM gives you a guaranteed monotonic increase in the sum that doesn't vary much between the acceptable subsequent words (as the difference between the N-1 and N sequences is much larger than the difference between the different options for the Nth word). In the general case, when there is no clear heuristic to help you, you will have to perform an exhaustive search. If you can come up with an admissible heuristic, you can use A* instead of Dijkstra's algorithm in the first option below, and the better your heuristic is, the faster it will run.
I suppose it is clear, but just in case: your graph connectivity is defined by your constraint sequence. The initial node (a 0-length sequence with no words) is connected to any word in your data frame that matches the beginning of your constraint sequence. So you do not have the graph as a data structure, just its compressed description in the form of this constraint.
EDIT: As requested in the comments, here are additional details, in the form of a couple of options:
Apply Dijkstra's algorithm multiple times. Dijkstra's search finds the shortest path between 2 known nodes, while in your case we only have the initial node (0-length sequence with no words) and the final words are unknown.
- Find all acceptable last words (those that satisfy both the pattern and vowel constraints).
- Apply Dijkstra's search for each one of those, finding the largest word sequence weight sum for each of them.
- Dijkstra's algorithm is tailored to finding the shortest path, so to apply it directly you will have to negate the weights on each step and pick the smallest one among the nodes that haven't been visited yet.
- After finding all solutions (sentences that end with one of those last words that you identified initially), select the smallest solution (this is going to be exactly the largest weight sum among all solutions).
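The first option could be sketched roughly as follows. Several things here are assumptions rather than part of the answer: the function `best_sentence`, the `prob` callback standing in for the LSTM, and the toy vocabulary are all hypothetical; the score is assumed to depend only on the previous word; and each score is assumed to be a probability in (0, 1]. Because Dijkstra's algorithm is only guaranteed to be correct with non-negative edge costs, the sketch uses -log(probability) as the cost instead of plainly negating the weights, so it maximizes the product of probabilities rather than a raw weight sum. It also treats every complete sentence as a goal, so a single run covers all acceptable last words.

```python
import heapq
import math

VOWELS = set("aeiou")  # assumption: only these letters count as vowels


def vowel_pattern(word):
    return "".join("1" if ch in VOWELS else "0" for ch in word)


def last_vowel(word):
    return next((ch for ch in reversed(word) if ch in VOWELS), None)


def best_sentence(vocab, structure, final_vowel, prob):
    """Dijkstra over implicit states (position in structure, previous word).

    prob(prev, word) is a hypothetical stand-in for the LSTM output and is
    assumed to return a probability in (0, 1]; the edge cost is -log(prob).
    """
    patterns = {w: vowel_pattern(w) for w in vocab if w}
    start = (0, None)
    dist = {start: 0.0}
    parent = {}
    heap = [(0.0, 0, start)]  # (cost, tie-breaker, state)
    counter = 1
    while heap:
        cost, _, state = heapq.heappop(heap)
        if cost > dist.get(state, math.inf):
            continue  # stale queue entry
        pos, prev = state
        if pos == len(structure):
            # The first complete state popped is the cheapest one.
            words = []
            while state != start:
                words.append(state[1])
                state = parent[state]
            return words[::-1], cost
        for word, pat in patterns.items():
            if not structure.startswith(pat, pos):
                continue  # word does not fit the structure here
            end = pos + len(pat)
            if end == len(structure) and last_vowel(word) != final_vowel:
                continue  # a closing word must carry the required vowel
            nxt = (end, word)
            new_cost = cost - math.log(prob(prev, word))
            if new_cost < dist.get(nxt, math.inf):
                dist[nxt] = new_cost
                parent[nxt] = state
                heapq.heappush(heap, (new_cost, counter, nxt))
                counter += 1
    return None, math.inf


# Toy scores (made up for illustration only).
TABLE = {(None, "hi"): 0.6, (None, "go"): 0.4,
         ("hi", "friend"): 0.5, ("go", "friend"): 0.9}


def toy_prob(prev, word):
    return TABLE.get((prev, word), 0.1)


words, cost = best_sentence(["hi", "go", "me", "friend", "bleach"],
                            "01001100", "e", toy_prob)
print(words)  # ['go', 'friend']
```

In this toy data a greedy search would start with "hi" (0.6) and end at a product of 0.30, while the search above finds "go friend" at 0.36. If the real LSTM score depends on the whole history rather than the previous word, the state would have to include that history, and the search degenerates towards the exhaustive option.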
Modify your existing depth-first search to do an exhaustive search.
- Perform the search as you described in the OP and find a solution if the last step gives one (that is, if a last word with the correct vowel is available at all); record the weight.
- Roll back one step to the previous word and pick the second-best option among the previous words. You might be able to discard all the words of the same length on the previous step if there was no solution at all. If there was a solution, it depends on whether your LSTM provides different weights depending on the previous word. Likely it does, and in that case you have to perform that operation for all the words in the previous step.
- When you run out of words on the previous step, move one step up and restart down from there.
- You keep the current winner all the time as well as the list of unvisited nodes on every step and perform exhaustive search. Eventually, you will find the best solution.
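The second option, an exhaustive depth-first search with backtracking, might look like the sketch below. The names `exhaustive_search` and `toy_score`, the toy vocabulary, and the scores are hypothetical; `score(history, word)` stands in for the LSTM weight of a word given all previous words, and the quantity maximized is the sum of weights, as described above.

```python
VOWELS = set("aeiou")  # assumption: only these letters count as vowels


def vowel_pattern(word):
    return "".join("1" if ch in VOWELS else "0" for ch in word)


def last_vowel(word):
    return next((ch for ch in reversed(word) if ch in VOWELS), None)


def exhaustive_search(vocab, structure, final_vowel, score):
    """Depth-first search with backtracking that keeps the current winner.

    score(history, word) is a hypothetical stand-in for the LSTM weight of
    `word` given the list of previously chosen words.
    """
    patterns = {w: vowel_pattern(w) for w in vocab if w}
    best = {"words": None, "total": float("-inf")}

    def dfs(pos, history, total):
        if pos == len(structure):
            # Complete sentence: keep it only if it ends on the right vowel
            # and beats the current winner.
            if (history and last_vowel(history[-1]) == final_vowel
                    and total > best["total"]):
                best["words"], best["total"] = list(history), total
            return
        # Words whose pattern fits the structure at this position,
        # tried best-first (the greedy choice), then the alternatives.
        scored = [(score(history, w), w) for w, pat in patterns.items()
                  if structure.startswith(pat, pos)]
        scored.sort(key=lambda item: item[0], reverse=True)
        for weight, word in scored:
            history.append(word)
            dfs(pos + len(word), history, total + weight)
            history.pop()  # roll back one step and try the next option

    dfs(0, [], 0.0)
    return best["words"], best["total"]


# Toy scores (made up for illustration only).
TABLE = {(None, "hi"): 0.6, (None, "go"): 0.4,
         ("hi", "friend"): 0.5, ("go", "friend"): 0.9,
         ("hi", "bleach"): 0.9, ("go", "bleach"): 0.9}


def toy_score(history, word):
    prev = history[-1] if history else None
    return TABLE.get((prev, word), 0.1)


words, total = exhaustive_search(["hi", "go", "me", "friend", "bleach"],
                                 "01001100", "e", toy_score)
print(words)  # ['go', 'friend']
```

The search visits every sequence that fits the structure, so its running time grows quickly with the vocabulary size and the length of the structure; pruning branches that cannot beat the current winner would need an upper bound on the remaining weight, which this sketch does not assume.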
Community Discussions, Code Snippets contain sources that include Stack Exchange Network
Vulnerabilities
No vulnerabilities reported
Install pynini
Linux (x86) and Mac OS X users who already have conda can install Pynini and all its dependencies with a single conda command.
Alternatively, Pynini can be built from source, which requires:
- A standards-compliant C++17 compiler (GCC >= 7 or Clang >= 700)
- A compatible recent version of OpenFst (see NEWS for this) built with the far, pdt, mpdt, and script extensions (i.e., built with ./configure --enable-grm), plus headers
- Python 3.6+ and headers
Once these are installed, run the installation command. To confirm successful installation, run python test/pynini_test.py; if all tests pass, the final line will read OK. Pynini source installation for the current version has been tested on Debian Linux 5.7.17-1 on x86_64, GCC 10.2.0, and Python 3.8.5.