Commit a2bcd1d

committed
readme for token classification
1 parent 5749330 commit a2bcd1d

2 files changed, +55 -3 lines changed
‎README.md

Lines changed: 55 additions & 3 deletions
@@ -19,7 +19,7 @@
 </h2>

 <p align="center">
-Simple machine learning with
+Generative AI with
 <a href="https://www.postgresql.org/" target="_blank">PostgreSQL</a>
 </p>

@@ -408,9 +408,61 @@ SELECT pgml.transform(
 ]
 ```
 ## Token Classification
-## Table Question Answering
-## Question Answering
+Token classification is a task in natural language understanding, where labels are assigned to certain tokens in a text. Some popular subtasks of token classification include Named Entity Recognition (NER) and Part-of-Speech (PoS) tagging. NER models can be trained to identify specific entities in a text, such as individuals, places, and dates. PoS tagging, on the other hand, is used to identify the different parts of speech in a text, such as nouns, verbs, and punctuation marks.
+
+![token classification](pgml-docs/docs/images/token-classification.png)

+### Named Entity Recognition
+Named Entity Recognition (NER) is a task that involves identifying named entities in a text, such as the names of people, locations, or organizations. Each token that belongs to a named entity is labeled with the class of that entity, and every other token is labeled with a class named "O" (for "outside"). In this task, the input is text, and the output is the text annotated with named entities.
+
+```sql
+SELECT pgml.transform(
+    inputs => ARRAY[
+        'I am Omar and I live in New York City.'
+    ],
+    task => 'token-classification'
+) as ner;
+```
+*Result*
+```sql
+ner
+------------------------------------------------------
+[[
+    {"end": 9, "word": "Omar", "index": 3, "score": 0.997110, "start": 5, "entity": "I-PER"},
+    {"end": 27, "word": "New", "index": 8, "score": 0.999372, "start": 24, "entity": "I-LOC"},
+    {"end": 32, "word": "York", "index": 9, "score": 0.999355, "start": 28, "entity": "I-LOC"},
+    {"end": 37, "word": "City", "index": 10, "score": 0.999431, "start": 33, "entity": "I-LOC"}
+]]
+```
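
The NER result above is reported per token, so "New", "York", and "City" arrive as three separate `I-LOC` entries. The following is a minimal Python sketch, not part of PostgresML, of how a client could merge adjacent tokens of the same entity type into one span once it has fetched the JSON. The helper name `group_entities` is hypothetical, and the merge rule (same label, consecutive `index`, no `B-` prefix) is an assumption about how the tags should be combined.

```python
import json

text = "I am Omar and I live in New York City."

# JSON copied from the result shown above (outer list: one entry per input).
result = json.loads("""
[[
  {"end": 9, "word": "Omar", "index": 3, "score": 0.997110, "start": 5, "entity": "I-PER"},
  {"end": 27, "word": "New", "index": 8, "score": 0.999372, "start": 24, "entity": "I-LOC"},
  {"end": 32, "word": "York", "index": 9, "score": 0.999355, "start": 28, "entity": "I-LOC"},
  {"end": 37, "word": "City", "index": 10, "score": 0.999431, "start": 33, "entity": "I-LOC"}
]]
""")


def group_entities(text, tokens):
    """Merge adjacent tokens of the same entity type into one span (hypothetical helper)."""
    spans = []
    for tok in tokens:
        # "I-LOC" -> prefix "I", label "LOC"; a bare label such as "O" keeps an empty prefix.
        prefix, _, label = tok["entity"].rpartition("-")
        last = spans[-1] if spans else None
        continues = (
            last is not None
            and prefix != "B"  # assumption: a B- tag always opens a new entity
            and last["label"] == label
            and tok["index"] == last["last_index"] + 1
        )
        if continues:
            last["end"] = tok["end"]
            last["last_index"] = tok["index"]
        else:
            spans.append({"label": label, "start": tok["start"],
                          "end": tok["end"], "last_index": tok["index"]})
    # Slice the original text so spacing and casing are preserved.
    return [(text[s["start"]:s["end"]], s["label"]) for s in spans]


entities = group_entities(text, result[0])
print(entities)  # [('Omar', 'PER'), ('New York City', 'LOC')]
```

Slicing the input with `start` and `end` rather than joining the `word` fields keeps the original spacing intact, which matters once tokens are sub-word pieces.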
+
+### Part-of-Speech (PoS) Tagging
+PoS tagging is a task that involves identifying the parts of speech, such as nouns, pronouns, adjectives, or verbs, in a given text. In this task, the model labels each word with a specific part of speech.
+
+Look for models with `pos` to use a PoS tagging model on the :hugs: Hugging Face model hub.
+```sql
+SELECT pgml.transform(
+    inputs => ARRAY[
+        'I live in Amsterdam.'
+    ],
+    task => '{"task": "token-classification",
+              "model": "vblagoje/bert-english-uncased-finetuned-pos"
+    }'::JSONB
+) as pos;
+```
+*Result*
+```sql
+pos
+------------------------------------------------------
+[[
+    {"end": 1, "word": "i", "index": 1, "score": 0.999, "start": 0, "entity": "PRON"},
+    {"end": 6, "word": "live", "index": 2, "score": 0.998, "start": 2, "entity": "VERB"},
+    {"end": 9, "word": "in", "index": 3, "score": 0.999, "start": 7, "entity": "ADP"},
+    {"end": 19, "word": "amsterdam", "index": 4, "score": 0.998, "start": 10, "entity": "PROPN"},
+    {"end": 20, "word": ".", "index": 5, "score": 0.999, "start": 19, "entity": "PUNCT"}
+]]
+```
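
In the PoS result above the `word` values are lowercased ("i", "amsterdam"), presumably because the model is an uncased one. A small Python sketch, again outside PostgresML and with a hypothetical helper name `tag_words`, shows how the `start` and `end` character offsets could be used to pair each tag with the original, correctly cased text.

```python
text = "I live in Amsterdam."

# Token entries copied from the result shown above.
tokens = [
    {"end": 1, "word": "i", "index": 1, "score": 0.999, "start": 0, "entity": "PRON"},
    {"end": 6, "word": "live", "index": 2, "score": 0.998, "start": 2, "entity": "VERB"},
    {"end": 9, "word": "in", "index": 3, "score": 0.999, "start": 7, "entity": "ADP"},
    {"end": 19, "word": "amsterdam", "index": 4, "score": 0.998, "start": 10, "entity": "PROPN"},
    {"end": 20, "word": ".", "index": 5, "score": 0.999, "start": 19, "entity": "PUNCT"},
]


def tag_words(text, tokens):
    """Pair each tag with the original surface form, recovered via offsets (hypothetical helper)."""
    return [(text[t["start"]:t["end"]], t["entity"]) for t in tokens]


tagged = tag_words(text, tokens)
print(" ".join(f"{word}/{tag}" for word, tag in tagged))
# I/PRON live/VERB in/ADP Amsterdam/PROPN ./PUNCT
```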
+## Question Answering
+## Table Question Answering
 ## Translation
 ## Summarization
 ## Conversational