Introduction to the Mistral model
Mistral 7B is a decoder-only language model with about 7 billion parameters, designed to deliver both efficiency and high performance in real-world applications.
Using Sliding Window Attention, Mistral 7B was trained with an 8k context length and a fixed cache size, which gives it a theoretical attention span of 128K tokens. This lets the model attend to the relevant context efficiently while keeping memory use bounded. The model also incorporates Grouped Query Attention (GQA), which reduces the cache size and accelerates inference. Additionally, its byte-fallback tokenizer ensures a consistent representation of characters, so no input maps to an out-of-vocabulary token.
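The 128K figure follows from stacking layers: with a window of W tokens, each layer lets information travel one more window back, so after k layers a token can be influenced by roughly k × W earlier tokens. The sketch below illustrates that arithmetic, a small sliding-window mask, and the cache saving from GQA. The constants (window of 4096, 32 layers, 32 query heads, 8 key/value heads) are assumptions taken from the published configuration of the original Mistral 7B release and are not stated in this guide; the function names are illustrative only.

```python
# Illustrative arithmetic only; the constants are assumptions taken from the
# published Mistral 7B configuration, not from arcgis.learn.
WINDOW_SIZE = 4096      # tokens each layer can attend back to
NUM_LAYERS = 32         # transformer layers
NUM_QUERY_HEADS = 32    # attention (query) heads
NUM_KV_HEADS = 8        # key/value heads shared across query heads (GQA)


def sliding_window_mask(seq_len, window):
    """Return mask[i][j] == True when position i may attend to position j.

    Position i sees itself and the previous window - 1 positions (causal).
    """
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def theoretical_attention_span(window, num_layers):
    """Each stacked layer extends the reach by one window."""
    return window * num_layers


def kv_cache_reduction(num_query_heads, num_kv_heads):
    """Factor by which GQA shrinks the key/value cache compared with
    keeping one key/value head per query head."""
    return num_query_heads // num_kv_heads


# Print a tiny mask: "x" marks positions a token may attend to.
mask = sliding_window_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

span = theoretical_attention_span(WINDOW_SIZE, NUM_LAYERS)
print(span, span // 1024)   # 131072 tokens, i.e. 128K
print(kv_cache_reduction(NUM_QUERY_HEADS, NUM_KV_HEADS))  # 4
```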
These design choices equip Mistral 7B for strong performance, particularly in language comprehension and generation tasks. This guide shows how to use the Mistral LLM for text classification and named entity recognition.
Mistral Implementation in arcgis.learn
Install the model backbone
Follow these steps to download and install the Mistral model backbone:
- Download the mistral model backbone.
- Extract the downloaded zip file.
- Open the Anaconda prompt and move to the folder that contains arcgis_mistral_backbone-1.0.0-py_0.tar.bz2
- Run:
conda install --offline arcgis_mistral_backbone-1.0.0-py_0.tar.bz2
Mistral with the TextClassifier model
Import the TextClassifier class from the arcgis.learn.text module
from arcgis.learn.text import TextClassifier
Initialize the TextClassifier model with a databunch
Prepare the databunch for the TextClassifier model using the prepare_textdata method in arcgis.learn.
from arcgis.learn import prepare_textdata
data = prepare_textdata("path_to_data_folder", task="classification", train_file="input_csv_file.csv",
                        text_columns="text_column", label_columns="label_column")
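The call above points at a CSV file inside the data folder and selects the text and label columns by name. A minimal sketch of a matching file, written with the standard library, is shown below. The file and column names reuse the placeholders from the call; the temporary folder and the two rows are invented for illustration, and a real training set would need many more rows.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical sample rows; replace with your own labeled text.
rows = [
    {"text_column": "The parcel arrived two days early.", "label_column": "positive"},
    {"text_column": "The customer support was unhelpful.", "label_column": "negative"},
]

# A temporary folder stands in for "path_to_data_folder".
data_folder = Path(tempfile.mkdtemp())
train_file = data_folder / "input_csv_file.csv"

with train_file.open("w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["text_column", "label_column"])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to confirm the header matches the names that would be
# passed as text_columns and label_columns.
with train_file.open(newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    header = reader.fieldnames
    loaded = list(reader)

print(header)
print(len(loaded))
```

The folder path could then be passed as the first argument to prepare_textdata, with train_file="input_csv_file.csv".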
Once the data is prepared, the TextClassifier model object can be instantiated with the following parameters:
data: The databunch created using the prepare_textdata method.
backbone: To use mistral as the model backbone, use backbone="mistral".
prompt: Text string describing the task and its guardrails. This is an optional parameter.
classifier_model = TextClassifier(
    data=data,
    backbone="mistral",
    prompt="Classify all the input sentences into the defined labels, do not make up your own labels."
)
Initialize the TextClassifier model without a databunch
A TextClassifier model with a mistral backbone can also be created without a large dataset, using only a few examples.
The following parameters are passed to TextClassifier:
backbone: To use mistral as the model backbone, use backbone="mistral".
examples: User-defined examples to provide to the mistral model, in Python dictionary format:
{
    "label_1": ["input_text_example_1", "input_text_example_2", ...],
    "label_2": ["input_text_example_1", "input_text_example_2", ...],
    ...
}
prompt: Text string describing the task and its guardrails. This is an optional parameter.
classifier_model = TextClassifier(
    data=None,
    backbone="mistral",
    prompt="Classify all the input sentences into the defined labels, do not make up your own labels.",
    examples={
        "positive": ["Good! It was a wonderful experience!", "I really adore your work"],
        "negative": ["The customer support was unhelpful", "I don't like your work"]
    }
)
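When the few-shot samples start out as a flat list of labeled sentences, they need to be grouped by label before they fit the examples format. The following is a minimal sketch of that grouping; build_examples and labeled_samples are hypothetical names introduced here for illustration and are not part of arcgis.learn.

```python
from collections import defaultdict

# Hypothetical labeled samples as (text, label) pairs.
labeled_samples = [
    ("Good! It was a wonderful experience!", "positive"),
    ("The customer support was unhelpful", "negative"),
    ("I really adore your work", "positive"),
    ("I don't like your work", "negative"),
]


def build_examples(samples):
    """Group texts by label into {label: [text, ...]}, preserving input order."""
    grouped = defaultdict(list)
    for text, label in samples:
        # Strip stray whitespace so it does not leak into the prompt examples.
        grouped[label].append(text.strip())
    return dict(grouped)


examples = build_examples(labeled_samples)
print(examples)
```

The resulting dictionary has the shape described for the examples parameter and could be passed as examples=examples.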
Classify text using the mistral model
To classify text using the mistral model, use the predict method from the TextClassifier class. The input to the method can be a text string or a list of text strings.
classifier_model.predict("input_text_sentence")
Load the model
To load a saved mistral model, use the from_model method from the TextClassifier class.
classifier_model = TextClassifier.from_model(r'path_to_dlpk_file')
Save the model
The following method saves the model weights and creates a Deep Learning Package (.dlpk).
classifier_model.save("name_of_the_mistral_model")
Mistral with an EntityRecognizer model
Import the EntityRecognizer class from the arcgis.learn.text module
from arcgis.learn.text import EntityRecognizer
Initialize the EntityRecognizer model with a databunch
Prepare the databunch for the EntityRecognizer model using the prepare_textdata method in arcgis.learn.
from arcgis.learn import prepare_textdata
data = prepare_textdata("path_to_data_file", task="entity_recognition", dataset_type='ner_json')
Once the data is prepared, the EntityRecognizer model object can be instantiated with the following parameters:
data: The databunch created using the prepare_textdata method.
backbone: To use mistral as the model backbone, use backbone="mistral".
prompt: Text string describing the task and its guardrails. This is an optional parameter.
entity_recognizer_model = EntityRecognizer(
    data=data,
    backbone="mistral",
    prompt="Tag the input sentences in the named entity for the given classes, no other class should be tagged."
)
Initialize the EntityRecognizer model without a databunch
An EntityRecognizer model with a mistral backbone can also be created without a large dataset, using only a few examples.
The following parameters are passed to EntityRecognizer:
backbone: To use mistral as the model backbone, use backbone="mistral".
examples: User-defined examples for the mistral model, in Python list format:
[
    ("input_text_sentence",
        {
            "class_1": ["Named entity", ...],
            "class_2": ["Named entity", ...],
            ...
        }
    ),
    ...
]
Note: The EntityRecognizer class, using the "Mistral" backbone, needs at least six examples to work effectively.
prompt: Text string describing the task and its guardrails. This is an optional parameter.
entity_recognizer_model = EntityRecognizer(
    data=None,
    backbone="mistral",
    prompt="Tag the input sentences in the named entity for the given classes, no other class should be tagged.",
    examples=[
        (
            'Jim stays in London',
            {
                'name': ['Jim'],
                'location': ['London']
            }
        ),
        ...
    ]
)
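Because the few-shot examples are written by hand, it can help to check them before passing them to the model. The sketch below checks the six-example minimum mentioned in the note above and confirms that every listed entity appears verbatim in its sentence. The second check is an assumption added here as a sanity test, not a documented requirement of EntityRecognizer, and check_ner_examples, ner_examples, and the sample sentences are hypothetical.

```python
MIN_EXAMPLES = 6  # minimum number of examples stated in the guide's note


def check_ner_examples(examples, min_examples=MIN_EXAMPLES):
    """Return a list of problems found in few-shot NER examples (empty if none)."""
    problems = []
    if len(examples) < min_examples:
        problems.append(
            f"need at least {min_examples} examples, got {len(examples)}"
        )
    for index, (sentence, entities) in enumerate(examples):
        for label, mentions in entities.items():
            for mention in mentions:
                # Assumed sanity check: the entity text should occur in the sentence.
                if mention not in sentence:
                    problems.append(
                        f"example {index}: {mention!r} ({label}) not found in sentence"
                    )
    return problems


# Hypothetical examples in the (sentence, {class: [entities]}) format.
ner_examples = [
    ("Jim stays in London", {"name": ["Jim"], "location": ["London"]}),
    ("Maria flew to Tokyo", {"name": ["Maria"], "location": ["Tokyo"]}),
    ("Ahmed works in Cairo", {"name": ["Ahmed"], "location": ["Cairo"]}),
    ("Li moved to Berlin", {"name": ["Li"], "location": ["Berlin"]}),
    ("Sara visited Lima", {"name": ["Sara"], "location": ["Lima"]}),
    ("Tom lives near Oslo", {"name": ["Tom"], "location": ["Oslo"]}),
]

print(check_ner_examples(ner_examples))      # no problems expected
print(check_ner_examples(ner_examples[:2]))  # reports too few examples
```

A list that passes both checks could then be supplied as examples=ner_examples.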
Extract entities using the mistral model
To extract named entities using the mistral model, use the extract_entities method from the EntityRecognizer class. The input to the method can be a text string or a list of text strings.
entity_recognizer_model.extract_entities("input_text_sentence")
Load the model
To load a saved mistral model, use the from_model method from the EntityRecognizer class.
entity_recognizer_model = EntityRecognizer.from_model(r'path_to_dlpk_file')
Save the model
This method saves the model weights and creates a Deep Learning Package (.dlpk).
entity_recognizer_model.save("name_of_the_mistral_model")
Conclusion
In this guide we demonstrated the steps to initialize and perform inference using the Mistral LLM as a backbone with the TextClassifier and EntityRecognizer models in arcgis.learn.
References
Mistral-7B HuggingFace: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2
Mistral-7B MistralAI: https://mistral.ai/news/announcing-mistral-7b