bancodobrasil/bbrc


BBRC: Brazilian Banking Regulation Corpora

Machine learning experiments for the article "BBRC: Brazilian Banking Regulation Corpora", which describes a collection of NLP corpora (a dataset) with the same name. The article was published at the 7th Financial Technology and Natural Language Processing workshop (FinNLP 2024), held within LREC-COLING 2024.

We present BBRC, a collection of 25 corpora on banking regulatory risk from different departments of Banco do Brasil (BB). The individual corpora cover topics such as investments, insurance, human resources, security, technology, treasury, loans, accounting, fraud, credit cards, payment methods, agribusiness, and risks, among others. Experts annotated each regulatory document with a binary label indicating whether it contains regulatory risk that may require changes to the products, processes, services, or channels of a bank department. The corpora are in Portuguese and contain documents from 26 Brazilian regulatory authorities in the financial sector. In total, there are 61,650 annotated documents, most of them between half a page and three pages long. The corpora come from a Natural Language Processing (NLP) application that has been in production since 2020, and their total size is 1.6 GB.
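Because every document carries a binary label and belongs to one of several corpora, a first step in most experiments is to check how balanced the labels are in each corpus. The following is a minimal sketch of that check, not code from this repository. The field names `corpus`, `text`, and `label` and the toy records are assumptions made for illustration; the actual column names in the published data may differ.

```python
from collections import Counter, defaultdict


def label_balance(records, corpus_key="corpus", label_key="label"):
    """Return, per corpus, the fraction of documents labelled 1 (risk).

    `corpus_key` and `label_key` are hypothetical field names; adjust
    them to match the real schema of the data you load.
    """
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec[corpus_key]][int(rec[label_key])] += 1
    return {
        name: c[1] / sum(c.values())
        for name, c in counts.items()
    }


# Toy records standing in for annotated regulatory documents (invented).
toy_records = [
    {"corpus": "insurance", "text": "document A", "label": 1},
    {"corpus": "insurance", "text": "document B", "label": 0},
    {"corpus": "loans", "text": "document C", "label": 0},
]

if __name__ == "__main__":
    for name, share in label_balance(toy_records).items():
        print(f"{name}: {share:.0%} of documents flagged as regulatory risk")
```

The function accepts any iterable of dictionary-like records, so it can be applied to rows read from the published data once the real field names are substituted.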

The article (paper): https://aclanthology.org/2024.finnlp-1.15.pdf

Hugging Face link to the data: https://huggingface.co/datasets/bancodobrasil/bbrc_brazilian_banking_regulation_corpora

Presentation video: https://drive.google.com/file/d/1Lk_xVno8odMJJK2yskEe9azQov4Y7vcv/view?usp=sharing

Presentation: https://drive.google.com/file/d/1vxKThA_CqDIX6XalFk68yb8WeTwDtek7/view?usp=sharing

FinNLP: https://sites.google.com/nlg.csie.ntu.edu.tw/finnlp-kdf-2024/home

LREC-COLING 2024: https://lrec-coling-2024.org/

LinkedIn post: https://www.linkedin.com/feed/update/urn:li:activity:7199492778874015745/
