Commit 5749330

Added zero-shot classification
1 parent daf045c commit 5749330

2 files changed: +60 -22 lines changed

‎README.md

60 additions & 22 deletions
@@ -188,6 +188,7 @@ SELECT pgml.transform(
 ## Text Classification
 
 Text classification involves assigning a label or category to a given text. Common use cases include sentiment analysis, natural language inference, and the assessment of grammatical correctness.
+
 ![text classification](pgml-docs/docs/images/text-classification.png)
 
 ### Sentiment Analysis
@@ -217,7 +218,7 @@ The default <a href="https://huggingface.co/distilbert-base-uncased-finetuned-ss
 
 *Using specific model*
 
-To use one of the over 19,000 models available on Hugging Face, include the name of the desired model and its associated task as a JSONB object in the SQL query. For example, if you want to use a RoBERTa <a href="https://huggingface.co/models?pipeline_tag=text-classification" target="_blank">model</a> trained on around 40,000 English tweets and that has POS (positive), NEG (negative), and NEU (neutral) labels for its classes, include this information in the JSONB object when making your query.
+To use one of the over 19,000 models available on Hugging Face, include the name of the desired model and `text-classification` task as a JSONB object in the SQL query. For example, if you want to use a RoBERTa <a href="https://huggingface.co/models?pipeline_tag=text-classification" target="_blank">model</a> trained on around 40,000 English tweets and that has POS (positive), NEG (negative), and NEU (neutral) labels for its classes, include this information in the JSONB object when making your query.
 
 ```sql
 SELECT pgml.transform(
@@ -276,7 +277,7 @@ NLI, or Natural Language Inference, is a type of model that determines the relat
 
 The GLUE dataset is the benchmark dataset for evaluating NLI models. There are different variants of NLI models, such as Multi-Genre NLI, Question NLI, and Winograd NLI.
 
-If you want to use an NLI model, you can find them on the :hugs: Hugging Face model hub. Look for models with "nli" or "mnli".
+If you want to use an NLI model, you can find them on the :hugs: Hugging Face model hub. Look for models with "mnli".
 
 ```sql
 SELECT pgml.transform(
@@ -324,7 +325,7 @@ SELECT pgml.transform(
 ### Quora Question Pairs (QQP)
 The Quora Question Pairs model is designed to evaluate whether two given questions are paraphrases of each other. This model takes the two questions and assigns a binary value as output. LABEL_0 indicates that the questions are paraphrases of each other and LABEL_1 indicates that the questions are not paraphrases. The benchmark dataset used for this task is the Quora Question Pairs dataset within the GLUE benchmark, which contains a collection of question pairs and their corresponding labels.
 
-If you want to use an QQP model, you can find them on the :hugs: Hugging Face model hub. Look for models with "qqp".
+If you want to use an QQP model, you can find them on the :hugs: Hugging Face model hub. Look for models with `qqp`.
 
 ```sql
 SELECT pgml.transform(
@@ -349,7 +350,7 @@ SELECT pgml.transform(
 ### Grammatical Correctness
 Linguistic Acceptability is a task that involves evaluating the grammatical correctness of a sentence. The model used for this task assigns one of two classes to the sentence, either "acceptable" or "unacceptable". LABEL_0 indicates acceptable and LABEL_1 indicates unacceptable. The benchmark dataset used for training and evaluating models for this task is the Corpus of Linguistic Acceptability (CoLA), which consists of a collection of texts along with their corresponding labels.
 
-If you want to use a grammatical correctness model, you can find them on the :hugs: Hugging Face model hub. Look for models with "cola".
+If you want to use a grammatical correctness model, you can find them on the :hugs: Hugging Face model hub. Look for models with `cola`.
 
 ```sql
 SELECT pgml.transform(
@@ -369,23 +370,60 @@ SELECT pgml.transform(
     {"label": "LABEL_1", "score": 0.9576480388641356}
 ]
 ```
-### Token Classification
-### Table Question Answering
-### Question Answering
-### Zero-Shot Classification
-### Translation
-### Summarization
-### Conversational
-### Text Generation
-### Text2Text Generation
-### Fill-Mask
-### Sentence Similarity
-
-## Regression
-## Classification
-
-## Applications
-### Text
+
+## Zero-Shot Classification
+Zero Shot Classification is a task where the model predicts a class that it hasn't seen during the training phase. This task leverages a pre-trained language model and is a type of transfer learning. Transfer learning involves using a model that was initially trained for one task in a different application. Zero Shot Classification is especially helpful when there is a scarcity of labeled data available for the specific task at hand.
+
+![zero-shot classification](pgml-docs/docs/images/zero-shot-classification.png)
+
+In the example provided below, we will demonstrate how to classify a given sentence into a class that the model has not encountered before. To achieve this, we make use of `args` in the SQL query, which allows us to provide `candidate_labels`. You can customize these labels to suit the context of your task. We will use `facebook/bart-large-mnli` model.
+
+Look for models with `mnli` to use a zero-shot classification model on the :hugs: Hugging Face model hub.
+
+```sql
+SELECT pgml.transform(
+    inputs => ARRAY[
+        'I have a problem with my iphone that needs to be resolved asap!!'
+    ],
+    task => '{
+        "task": "zero-shot-classification",
+        "model": "facebook/bart-large-mnli"
+    }'::JSONB,
+    args => '{
+        "candidate_labels": ["urgent", "not urgent", "phone", "tablet", "computer"]
+    }'::JSONB
+) AS zero_shot;
+```
+*Result*
+
+```sql
+zero_shot
+------------------------------------------------------
+[
+    {
+        "labels": ["urgent", "phone", "computer", "not urgent", "tablet"],
+        "scores": [0.503635, 0.47879, 0.012600, 0.002655, 0.002308],
+        "sequence": "I have a problem with my iphone that needs to be resolved asap!!"
+    }
+]
+```
+## Token Classification
+## Table Question Answering
+## Question Answering
+
+## Translation
+## Summarization
+## Conversational
+## Text Generation
+## Text2Text Generation
+## Fill-Mask
+## Sentence Similarity
+
+# Regression
+# Classification
+
+<!-- # Applications
+## Text
 - AI writing partner
 - Chatbot for customer support
 - Social media post analysis
@@ -404,7 +442,7 @@ SELECT pgml.transform(
 - Ease of fine tuning and why
 - Rust based extension and its benefits
 - Problems with HTTP serving and how PML enables microsecond latency
-- Pgcat for horizontal scaling
+- Pgcat for horizontal scaling -->
 
 ## Concepts
 - Database
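
The zero-shot `Result` block added in this commit returns `labels` and `scores` as two parallel arrays. Here is a minimal sketch of how they might be unpacked into one row per label. It assumes `pgml.transform` returns JSONB shaped like that result. The JSONB literal stands in for the function call, so the query relies only on built-in PostgreSQL JSONB functions. The names `result`, `l`, `s`, `idx`, and `score_text` are illustrative and not part of the README.

```sql
-- Sketch only: the literal below copies the Result block from the diff and
-- stands in for the output of pgml.transform(...), assumed to be JSONB.
WITH result AS (
    SELECT '[
        {
            "labels": ["urgent", "phone", "computer", "not urgent", "tablet"],
            "scores": [0.503635, 0.47879, 0.012600, 0.002655, 0.002308],
            "sequence": "I have a problem with my iphone that needs to be resolved asap!!"
        }
    ]'::JSONB AS zero_shot
)
SELECT l.label,
       s.score_text::FLOAT8 AS score
FROM result
-- Expand both arrays of the first (and only) input, keeping each element's position.
CROSS JOIN LATERAL jsonb_array_elements_text(result.zero_shot -> 0 -> 'labels')
    WITH ORDINALITY AS l(label, idx)
-- Pair each label with the score at the same position.
JOIN LATERAL jsonb_array_elements_text(result.zero_shot -> 0 -> 'scores')
    WITH ORDINALITY AS s(score_text, idx) USING (idx)
ORDER BY score DESC;
```

Ordering by `score` puts the highest-scoring candidate label first, which for the result above is `urgent`. To use this against a live model, the literal could be replaced by the `pgml.transform(...)` call from the diff, provided the return type matches the assumption stated here.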