Text Segmenter (text_seg)
The text_seg module implements text segmentation based on "Topical Change Detection in Documents via Embeddings of Long Sequences".
Walk through
1. Import the text segmenter (text_seg) module in Jac
- To start Jaseci, open a terminal and run the following command:
    jsctl -m
- Load the text_seg module in Jac with the following command:
    actions load module jac_nlp.text_seg
2. Get segments
For this tutorial, we use the text segmenter (text_seg) module to split a passage of text into segments by topic.
- Creating a Jac program for the text segmenter (text_seg)
  - Create a file named segment.jac.
  - Create the nodes model_dir and text_seg in the segment.jac file.

      node model_dir;
      node text_seg{};
  - Initialize the text_seg node and import the text_seg.load_model and text_seg.get_segments abilities inside it.

      # import ability
      can text_seg.load_model, text_seg.get_segments;
  - Define the load_model and get_segments abilities inside the text_seg node.

      # loading model
      can load_model with text_segment entry{
          text_seg.load_model(
              model_name = visitor.model_name
          );
      }

      # text segmentation
      can get_segments with text_segment entry{
          data = file.load_json(visitor.data);
          report text_seg.get_segments(
              text = data["text"],
              threshold = data["threshold"]
          );
      }
    load_model loads one of the available models for text segmentation. get_segments finds the different topics in the provided text, given a threshold.

    Parameter details
    - Input
      - model_name (String): name of the transformer model to load. Options are:
        - wiki: trained on Wikipedia data
        - legal: trained on legal documents
      - Input data (the JSON file named by the walker's data parameter, text.json by default)
{ "text": "There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude. The King of England was at war with him and had led a great army into Scotland to drive him out of the land. Battle after battle had been fought. Six times Bruce had led his brave little army against his foes and six times his men had been beaten and driven into flight. At last his army was scattered, and he was forced to hide in the woods and in lonely places among the mountains. One rainy day, Bruce lay on the ground under a crude shed listening to the patter of the drops on the roof above him. He was tired and unhappy. He was ready to give up all hope. It seemed to him that there was no use for him to try to do anything more. As he lay thinking, he saw a spider over his head making ready to weave her web. He watched her as she toiled slowly and with great care. Six times she tried to throw her frail thread from one beam to another, and six times it fell short. 'Poor thing,' said Bruce: 'you, too, know what it is to fail. But the spider did not lose hope with the sixth failure. With still more care, she made ready to try for the seventh time. Bruce almost forgot his own troubles as he watched her swing herself out upon the slender line. Would she fail again? No! The thread was carried safely to the beam and fastened there.", "threshold": 0.65 }
        - text (String): the text that contains the entire context
        - threshold (Float): a value between 0 and 1; a threshold of 1 makes each sentence its own segment
    - Output
      - List of text segments, each covering one of the topics found in the context
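The text_segment walker in this tutorial reads its input from the file named by its data parameter (text.json by default), loaded with file.load_json. The following is a minimal Python sketch for producing such a file. It assumes the file should sit in the directory where the walker is run; the helper name write_input is hypothetical, and the sample text is shortened for brevity.

```python
import json
from pathlib import Path


def write_input(path, text, threshold):
    """Write a JSON file with the "text" and "threshold" keys used in segment.jac.

    write_input is a hypothetical helper, not part of Jaseci or jac_nlp.
    """
    # The parameter details above state that threshold lies between 0 and 1.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    payload = {"text": text, "threshold": threshold}
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return payload


if __name__ == "__main__":
    # Shortened sample; replace with the full passage to segment.
    sample = (
        "There was once a king of Scotland whose name was Robert Bruce. "
        "One rainy day, Bruce lay on the ground under a crude shed."
    )
    write_input("text.json", sample, 0.65)
```

Writing the file with json.dumps avoids hand-escaping quotes inside the text, which is easy to get wrong for long passages.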
  - Add an edge named seg_model to the segment.jac file to connect the nodes inside the graph.

      # adding edge
      edge seg_model {
          has model_type;
      }
  - Add a graph named text_seg_graph to initialize the nodes.

      graph text_seg_graph {
          has anchor seg_model_dir;
          spawn {
              seg_model_dir = spawn node::model_dir;
              text_seg_node = spawn node::text_seg;
              seg_model_dir -[seg_model(model_type="text_seg")]-> text_seg_node;
          }
      }
  - Define the init walker, which spawns the graph.

      walker init {
          root {
              spawn here ++> graph::text_seg_graph;
          }
      }
  - Create a walker named text_segment that takes its parameters from the context (or uses the defaults) and triggers the load_model and get_segments abilities.

      # declaring walker for text segmentation
      walker text_segment{
          has model_name="wiki";
          has data="text.json";
          root {
              take --> node::model_dir;
          }
          model_dir {
              take -->;
          }
      }
  Final segment.jac program

      node model_dir;
      node text_seg{
          # import all module ability
          can text_seg.load_model, text_seg.get_segments;

          # loading model
          can load_model with text_segment entry{
              text_seg.load_model(
                  model_name = visitor.model_name
              );
          }

          # text segmentation
          can get_segments with text_segment entry{
              data = file.load_json(visitor.data);
              report text_seg.get_segments(
                  text = data["text"],
                  threshold = data["threshold"]
              );
          }
      }

      # adding edge
      edge seg_model {
          has model_type;
      }

      # adding graph
      graph text_seg_graph {
          has anchor seg_model_dir;
          spawn {
              seg_model_dir = spawn node::model_dir;
              text_seg_node = spawn node::text_seg;
              seg_model_dir -[seg_model(model_type="text_seg")]-> text_seg_node;
          }
      }

      # declare init graph
      walker init {
          root {
              spawn here ++> graph::text_seg_graph;
          }
      }

      # declaring walker for text segmentation
      walker text_segment{
          has model_name="wiki";
          has data="text.json";
          root {
              take --> node::model_dir;
          }
          model_dir {
              take -->;
          }
      }
- Steps for running the segment.jac program
  - Run the following command to build segment.jac:
      jac build segment.jac
  - Run the following command to activate the sentinel:
      sentinel set -snt active:sentinel -mode ir segment.jir
    Note: If you get the error ValueError: badly formed hexadecimal UUID string, run the following command once:
      sentinel register -set_active true -mode ir segment.jir
  - Run the text_segment walker with its default parameters for the text segmentation (text_seg) module with the following command:
      walker run text_segment
    After the text_segment walker runs, the result is shown on the console.
Result
"report": [ [ "There was once a king of Scotland whose name was Robert Bruce. He needed to be both brave and wise because the times in which he lived were wild and rude. The King of England was at war with him and had led a great army into Scotland to drive him out of the land. Battle after battle had been fought. Six times Bruce had led his brave little army against his foes and six times his men had been beaten and driven into flight. At last his army was scattered, and he was forced to hide in the woods and in lonely places among the mountains.", "One rainy day, Bruce lay on the ground under a crude shed listening to the patter of the drops on the roof above him. He was tired and unhappy. He was ready to give up all hope. It seemed to him that there was no use for him to try to do anything more. As he lay thinking, he saw a spider over his head making ready to weave her web. He watched her as she toiled slowly and with great care. Six times she tried to throw her frail thread from one beam to another, and six times it fell short. ' Poor thing,' said Bruce: 'you, too, know what it is to fail. But the spider did not lose hope with the sixth failure. With still more care, she made ready to try for the seventh time.", "Bruce almost forgot his own troubles as he watched her swing herself out upon the slender line. Would she fail again? No! The thread was carried safely to the beam and fastened there." ]
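As rough intuition for how a threshold can turn a run of sentences into segments like the ones above, here is a toy Python sketch. It is not the text_seg module's implementation: the function name split_by_threshold is hypothetical, and the scores are hand-picked values standing in for whatever the model computes between neighbouring sentences. The sketch assumes that a new segment starts whenever the score for an adjacent pair falls below the threshold, which is consistent with the documented behaviour that a threshold of 1 makes each sentence its own segment.

```python
def split_by_threshold(sentences, scores, threshold):
    """Group consecutive sentences into segments.

    scores[i] is a hypothetical similarity score between sentences[i]
    and sentences[i + 1]; a score below the threshold starts a new segment.
    """
    if not sentences:
        return []
    if len(scores) != len(sentences) - 1:
        raise ValueError("need exactly one score per adjacent sentence pair")
    segments = [[sentences[0]]]
    for sentence, score in zip(sentences[1:], scores):
        if score < threshold:
            segments.append([sentence])    # topic change: open a new segment
        else:
            segments[-1].append(sentence)  # same topic: extend the current one
    return [" ".join(segment) for segment in segments]


if __name__ == "__main__":
    sentences = [
        "A king ruled Scotland.",
        "He fought many battles.",
        "A spider spun a web.",
        "It tried seven times.",
    ]
    scores = [0.9, 0.3, 0.8]  # hand-picked illustrative values
    for segment in split_by_threshold(sentences, scores, 0.65):
        print(segment)
```

With these illustrative scores, a threshold of 0.65 yields two segments, while a threshold of 1 puts every sentence in its own segment because every score is below 1.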