In this solution, we merge a sequence of tokens in a Doc into a single token using spaCy in Python. spaCy is a widely used open-source Python library for natural language processing. The solution relies on the Doc.retokenize context manager and its Retokenizer.merge method, which marks a span for merging; the merge is applied when the context manager exits. In this solution kit, I am sharing the code snippet and library that I use to merge particular tokens in Python, which can be executed directly in the IDE.
Preview of the output that you will get on running this code from your IDE
This solution uses the Retokenizer.merge method from spaCy.
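import spacy

# Load the small English pipeline and tokenize the sample sentence
nlp = spacy.load("en_core_web_sm")
doc = nlp("sydney is a cool town")

# Tokens before merging
print([(idx, tok) for idx, tok in enumerate(doc)])
# [(0, sydney), (1, is), (2, a), (3, cool), (4, town)]

# Mark the first three tokens as one span to merge;
# the merge is applied when the with block exits
with doc.retokenize() as retokenizer:
    retokenizer.merge(doc[0:3])

# Tokens after merging
print([(idx, tok) for idx, tok in enumerate(doc)])
# [(0, sydney is a), (1, cool), (2, town)]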
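As a small variation on the snippet above, the following sketch queues two merges in the same retokenize block and sets a lemma on the first merged token through the attrs argument of Retokenizer.merge. It is only an illustration under a few assumptions: it uses spacy.blank("en") so that no trained model needs to be downloaded, and the lemma string and the variable names tokens_inside_block and merged_tokens are made up for this example.

```python
import spacy

# Assumption: a blank English pipeline (tokenizer only) is enough here,
# so no trained model such as en_core_web_sm has to be installed.
nlp = spacy.blank("en")
doc = nlp("sydney is a cool town")

with doc.retokenize() as retokenizer:
    # Both slices use the ORIGINAL token indices, because merges are
    # only queued here and applied when the with block exits.
    # The lemma value is an illustrative choice, not part of the original solution.
    retokenizer.merge(doc[0:3], attrs={"LEMMA": "sydney is a"})
    retokenizer.merge(doc[3:5])
    # Inside the block the Doc still has its original tokenization.
    tokens_inside_block = len(doc)

# After the block, the queued merges have been applied.
merged_tokens = [tok.text for tok in doc]

print(tokens_inside_block)
print(merged_tokens)
print(doc[0].lemma_)
```

The spans passed to merge in one block must not overlap, since spaCy cannot decide how overlapping merges should be resolved and raises an error in that case.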
I hope you found this useful. I have added links to the dependent libraries and version information in the following sections.
I found this code snippet by searching for "Merge spacy tokens into a Doc" in kandi. You can try any such use case!
I tested this solution in the following versions. Be mindful of changes when working with other versions.
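When working with a different environment, a quick check like the one below can show which spaCy version is installed and whether its Doc class offers the retokenize method that this solution depends on. This is a minimal sketch; the variable name has_retokenize is chosen only for illustration.

```python
import spacy
from spacy.tokens import Doc

# Report the installed spaCy version
print("spaCy version:", spacy.__version__)

# Check that the Doc class in this installation provides retokenize,
# which the merging snippet in this solution relies on
has_retokenize = hasattr(Doc, "retokenize")
print("Doc.retokenize available:", has_retokenize)
```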
Using this solution, we can merge tokens in a Doc with the help of spaCy's Retokenizer.merge method. This process also facilitates an easy-to-use, hassle-free way to create a hands-on working version of code that merges tokens in Python.