Fasttext Classifier
The `fast_enc` module uses Facebook's fastText, a library for efficient learning of word representations and sentence classification. This tutorial shows you how to train a fasttext classifier with a custom training loop to categorize sentences. It covers:
- Preparing the dataset
- Importing the fasttext (`fast_enc`) module in Jac
- Training the model
- Evaluating the model's effectiveness
- Using the trained model to make predictions
Walkthrough

1. Preparing the dataset
For this tutorial, we are going to leverage the fasttext classifier for sentence classification, which means categorizing incoming text into one of a set of predefined intents. For demonstration purposes, we use the SNIPS dataset.

SNIPS is a popular intent classification dataset that covers intents such as ["BookRestaurant", "ComparePlaces", "GetDirections", "GetPlaceDetails", "GetTrafficInformation", "GetWeather", "RequestRide", "SearchPlace", "ShareCurrentLocation", "ShareETA"].

We need to do a little data format conversion to create a version of SNIPS that works with our fasttext classifier implementation. For this part, we are going to use Python.
- Import the dataset from the Hugging Face `datasets` library.

```python
# import library
from datasets import load_dataset

# load dataset
dataset = load_dataset("snips_built_in_intents")
print(dataset["train"][:2])
```
If imported successfully, you should see data in a format like this:

```json
{"text": ["Share my location with Hillary's sister", "Send my current location to my father"], "label": [5, 5]}
```
- Convert the out-of-the-box SNIPS format into the format that can be ingested by the fasttext classifier, then split the data into train and test sets and save them to disk as `train.json` and `test.json`.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
import json

# get label names
lab = dataset["train"].features["label"].names

# create labels dictionary
label_dict = {v: k for v, k in enumerate(lab)}

# dataset
dataset = dataset["train"]

# create dataset function
def CreateData(data):
    # Create dataframe
    df = pd.DataFrame(data)
    # Map labels dict on label column
    df["label"] = df["label"].apply(lambda x: label_dict[x])
    # grouping text on basis of label
    df = df.groupby("label").agg({"text": "\t".join, "label": "\t".join})
    df["label"] = df["label"].apply(lambda x: x.split("\t")[0])
    # Create data dictionary
    data_dict = {}
    for i in range(len(df)):
        data_dict[df["label"][i]] = df["text"][i].split("\t")
    return data_dict

# Split dataset in train and test set
train, test = train_test_split(dataset, test_size=0.2, random_state=42)

# Create train dataset and write it to 'train.json'
train_data = CreateData(train)
with open("train.json", "w", encoding="utf8") as f:
    f.write(json.dumps(train_data, indent=4))

# Create test dataset and write it to 'test.json'
test_data = CreateData(test)
data = {
    "contexts": [],
    "labels": []
}
for itm in test_data:
    data["labels"].append(itm)
    data["contexts"].extend(test_data[itm])
with open("test.json", "w", encoding="utf8") as f:
    f.write(json.dumps(data, indent=4))
```
The example result format should look something like this.

- train.json

```json
{
    "BookRestaurant": [
        "Book me a table for 2 people at the sushi place next to the show tomorrow night",
        "Find me a table for four for dinner tonight"
    ],
    "ComparePlaces": [
        "What's the cheapest between the two restaurants the closest to my hotel?"
    ]
}
```
- test.json

```json
{
    "contexts": [
        "We are a party of 4 people and we want to book a table at Seven Hills for sunset",
        "Book a table at Saddle Peak Lodge for my diner with friends tonight",
        "How do I go to Montauk avoiding tolls?",
        "What's happening this week at Smalls Jazz Club?",
        "Will it rain tomorrow near my all day event?",
        "Send my current location to Anna",
        "Share my ETA with Jo",
        "Share my ETA with the Snips team"
    ],
    "labels": [
        "BookRestaurant",
        "ComparePlaces",
        "GetDirections",
        "GetPlaceDetails",
        "GetTrafficInformation",
        "GetWeather",
        "RequestRide",
        "SearchPlace",
        "ShareCurrentLocation",
        "ShareETA"
    ]
}
```
2. Import the fasttext (fast_enc) module in Jac
- Open a terminal and run Jaseci with the command:

```
jsctl -m
```

- Load the `fast_enc` module in Jac with the command:

```
actions load module jac_nlp.fast_enc
```
3. Train the model
For this tutorial, we are going to train and test the `fast_enc` module for intent classification: it is trained on the SNIPS train dataset and tested on the test dataset, categorizing incoming text into one of the predefined intents.
- Creating the Jac program (train and test fast_enc)

- Create a file named `fasttext.jac`.

- Create the nodes `model_dir` and `fasttext` in the `fasttext.jac` file:

```jac
node model_dir;
node fasttext {};
```
- Initialize the `fasttext` node and import the `train` and `infer` abilities inside the node:

```jac
# import train and infer ability
can fast_enc.train, fast_enc.predict;
```
- Initialize the `train` and `test` abilities inside the `fasttext` node. `fast_enc.train` takes the training arguments and starts training the `fast_enc` module:

```jac
can train_fasttext with train_and_test_fasttext entry {
    # Code snippet for training the model
    train_data = file.load_json(visitor.train_file);
    std.out("fasttext training started...", train_data.type, visitor.train_with_existing.bool);
    report fast_enc.train(
        traindata = train_data,
        train_with_existing = visitor.train_with_existing
    );
}

can tests with train_and_test_fasttext exit {
    std.out("fasttext validation started...");
    # Use the model to perform inference
    # returns the list of contexts with the suitable intents
    test_data = file.load_json(visitor.test_file);
    resp_data = fast_enc.predict(
        sentences = test_data["contexts"]
    );
    fn = "fasttext_val_result.json";
    file.dump_json(fn, resp_data);
}
```
- Initialize the ability for predicting intents on new text:

```jac
can predict with predict_fasttext entry {
    # Use the model to perform inference
    resp_data = fast_enc.predict(
        sentences = file.load_json(visitor.test_file)["text"]
    );
    # the infer action returns all the labels with the probability scores
    report [resp_data];
}
```
Parameter details

- `train`: used to train the fasttext module on a custom dataset
  - Input:
    - `traindata` (Dict): dictionary of candidates and supporting contexts for each candidate
    - `train_with_existing` (bool): if set to `false`, trains the model from scratch; otherwise trains incrementally
- `infer`: used to predict the most suitable candidate for a provided context; takes text or embedding
  - Input:
    - `contexts` (list of strings): contexts which need to be classified
  - Return: a dictionary of probability scores for each candidate and context
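To make those shapes concrete, the sketch below (plain Python rather than Jac, and not part of the original tutorial) builds a minimal `traindata` dictionary in the same form as `train.json` and shows the kind of per-sentence response structure illustrated by the sample results later in this tutorial; the values are illustrative only.

```python
# Illustration of the data shapes described above (assumed, not an exact API
# transcript): traindata maps each candidate intent to its supporting contexts,
# mirroring the train.json example.
traindata = {
    "BookRestaurant": [
        "Book me a table for 2 people at the sushi place next to the show tomorrow night",
        "Find me a table for four for dinner tonight",
    ],
    "GetWeather": [
        "Will it rain tomorrow near my all day event?",
    ],
}

# The prediction response (see the sample results further down) is keyed by
# sentence, each entry carrying an intent and a probability score.
example_response = {
    "Will it rain tomorrow near my all day event?": [
        {
            "sentence": "Will it rain tomorrow near my all day event?",
            "intent": "GetWeather",
            "probability": 0.97,  # illustrative value only
        }
    ]
}
```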
- Add an edge named `enc_model` in the `fasttext.jac` file for connecting nodes inside the graph:

```jac
# adding edge
edge enc_model {
    has model_type;
}
```
- Add a graph named `encoder_graph` for initializing the nodes:

```jac
graph encoder_graph {
    has anchor enc_model_dir;
    spawn {
        enc_model_dir = spawn node::model_dir;
        fasttext_node = spawn node::fasttext;
        enc_model_dir -[enc_model(model_type="fasttext")]-> fasttext_node;
    }
}
```
- Initialize `walker init` for spawning the graph:

```jac
walker init {
    root {
        spawn here ++> graph::encoder_graph;
    }
}
```
- Create a walker named `train_and_test_fasttext` that takes its parameters from the context (or uses the defaults) and calls the `train` and `test` abilities:

```jac
# Declaring the walker:
walker train_and_test_fasttext {
    # the parameters required for training
    has train_with_existing=false;
    has train_file="train.json";
    has test_file="test.json";

    root {
        take --> node::model_dir;
    }
    model_dir {
        take -->;
    }
}
```
Default parameters for train and test fasttext:

- `train_file`: local path of the train.json file
- `train_with_existing`: false
- `test_file`: local path of the test.json file
- Declare a walker for predicting intents on new text:

```jac
# Declaring walker for predicting intents on new text
walker predict_fasttext {
    has test_file = "test_dataset.json";

    root {
        take --> node::model_dir;
    }
    model_dir {
        take -->;
    }
}
```
Final fasttext.jac program

```jac
node model_dir;
node fasttext {
    # import train and infer ability
    can fast_enc.train, fast_enc.predict;

    can train_fasttext with train_and_test_fasttext entry {
        # Code snippet for training the model
        train_data = file.load_json(visitor.train_file);
        std.out("fasttext training started...", train_data.type, visitor.train_with_existing.bool);
        report fast_enc.train(
            traindata = train_data,
            train_with_existing = visitor.train_with_existing
        );
    }

    can tests with train_and_test_fasttext exit {
        std.out("fasttext validation started...");
        # Use the model to perform inference
        # returns the list of contexts with the suitable candidates
        test_data = file.load_json(visitor.test_file);
        resp_data = fast_enc.predict(
            sentences = test_data["contexts"]
        );
        # the infer action returns all the candidates with the confidence scores
        fn = "fasttext_val_result.json";
        file.dump_json(fn, resp_data);
    }

    can predict with predict_fasttext entry {
        # Use the model to perform inference
        resp_data = fast_enc.predict(
            sentences = file.load_json(visitor.test_file)["text"]
        );
        # the infer action returns all the labels with the probability scores
        report [resp_data];
    }
}

# adding edge
edge enc_model {
    has model_type;
}

graph encoder_graph {
    has anchor enc_model_dir;
    spawn {
        enc_model_dir = spawn node::model_dir;
        fasttext_node = spawn node::fasttext;
        enc_model_dir -[enc_model(model_type="fasttext")]-> fasttext_node;
    }
}

walker init {
    root {
        spawn here ++> graph::encoder_graph;
    }
}

# Declaring the walker:
walker train_and_test_fasttext {
    # the parameters required for training
    has train_with_existing=false;
    has train_file="train.json";
    has test_file="test.json";

    root {
        take --> node::model_dir;
    }
    model_dir {
        take -->;
    }
}

# declaring walker for predicting intents on new text
walker predict_fasttext {
    has test_file = "test_dataset.json";

    root {
        take --> node::model_dir;
    }
    model_dir {
        take -->;
    }
}
```
Steps for running the fasttext.jac program:

- Run the following command to build `fasttext.jac`:

```
jac build fasttext.jac
```
- Run the following command to activate the sentinel:

```
sentinel set -snt active:sentinel -mode ir fasttext.jir
```

Note: If you get the error `ValueError: badly formed hexadecimal UUID string`, execute the following command once:

```
sentinel register -set_active true -mode ir fasttext.jir
```
- Run the following command to execute the walker `train_and_test_fasttext` with its default parameters and train the `fast_enc` module:

```
walker run train_and_test_fasttext
```
- You'll find logs like the following on the console.

Training logs:

```
jaseci > walker run train_and_test_fasttext -ctx "{\"train_file\":\"train.json\",\"test_file\":\"test.json\",\"train_with_existing\":\"false\"}"
Training...
Wrote 261 sentences to <file_location>.singh\anaconda3\envs\pytorch\lib\site-packages\jac_nlp\jac_nlp\fasttext\pretrained_model\train.txt
Read 0M words
Number of words:  577
Number of labels: 10
Progress: 100.0% words/sec/thread: 105638 lr: 0.000000 avg.loss: 1.230422 ETA: 0h 0m 0s
Saving...
Model saved to <file_location>.singh\anaconda3\envs\pytorch\lib\site-packages\jac_nlp\jac_nlp\fasttext\pretrained_model\model.ftz.
LABELS (10):
- BookRestaurant
- GetPlaceDetails
- GetWeather
- GetDirections
- SearchPlace
- RequestRide
- ShareETA
- GetTrafficInformation
- ComparePlaces
- ShareCurrentLocation
fasttext validation started...
{
  "success": true,
  "report": [
    "Model training Completed"
  ]
}
```
4. Evaluate the model's effectiveness
- Model effectiveness on the test.json dataset:

```
Model testing Accuracy : 0.82
Model testing F1_Score : 0.76
Model classification Report

                        precision    recall  f1-score   support

        BookRestaurant       1.00      1.00      1.00        14
         ComparePlaces       1.00      0.25      0.40         4
         GetDirections       0.70      1.00      0.82         7
       GetPlaceDetails       0.56      0.90      0.69        10
 GetTrafficInformation       0.80      1.00      0.89         4
            GetWeather       0.88      0.78      0.82         9
           RequestRide       1.00      1.00      1.00         5
           SearchPlace       0.00      0.00      0.00         6
  ShareCurrentLocation       1.00      1.00      1.00         3
              ShareETA       1.00      1.00      1.00         4

              accuracy                           0.82        66
             macro avg       0.79      0.79      0.76        66
          weighted avg       0.78      0.82      0.78        66
```
Sample result data:

```json
{
    "I want a table for friday 8pm for 2 people at Katz's Delicatessen": [
        {
            "sentence": "I want a table for friday 8pm for 2 people at Katz's Delicatessen",
            "intent": "BookRestaurant",
            "probability": 0.9759947061538696
        }
    ],
    "I want a table in a good japanese restaurant near Trump tower": [
        {
            "sentence": "I want a table in a good japanese restaurant near Trump tower",
            "intent": "BookRestaurant",
            "probability": 0.830787181854248
        }
    ],
    "Book a table at a restaurant near Times Square for 2 people tomorrow night": [
        {
            "sentence": "Book a table at a restaurant near Times Square for 2 people tomorrow night",
            "intent": "BookRestaurant",
            "probability": 0.9866142272949219
        }
    ],
    "Book a table for today's lunch at Eggy's Diner for 3 people": [
        {
            "sentence": "Book a table for today's lunch at Eggy's Diner for 3 people",
            "intent": "BookRestaurant",
            "probability": 0.9936538934707642
        }
    ]
}
```
5. Use the trained model to make predictions
- Create new input data for prediction and store it in a file, for example test_dataset.json.

Input data:

```json
{
    "text": [
        "We are a party of 4 people and we want to book a table at Seven Hills for sunset",
        "Is Waldorf Astoria more luxurious than the Four Seasons?"
    ]
}
```
- Run the following command to execute the walker `predict_fasttext`:

```
walker run predict_fasttext
```
Output result:

```json
{
    "We are a party of 4 people and we want to book a table at Seven Hills for sunset": [
        {
            "sentence": "We are a party of 4 people and we want to book a table at Seven Hills for sunset",
            "intent": "BookRestaurant",
            "probability": 0.9151427149772644
        }
    ],
    "Is Waldorf Astoria more luxurious than the Four Seasons?": [
        {
            "sentence": "Is Waldorf Astoria more luxurious than the Four Seasons?",
            "intent": "GetPlaceDetails",
            "probability": 0.34331175684928896
        }
    ]
}
```