Article
A four-stage cascade methodology for classifying NCM codes using NLP techniques
Main author: | PINHEIRO, Pedro Luiz Braga |
---|---|
Type: | Article |
Published in: | 2023 |
Subjects: | |
Online access: | https://bdm.ufpa.br:8443/jspui/handle/prefix/5010 |
Abstract: |
---|
This work aims to develop a process for classifying the product descriptions found in electronic invoices (NF-e). The classification is based on the 8 digits of the Mercosur Common Nomenclature (NCM), separated into 4 parts: Chapter, Position, Subheading, and Item/Subitem. The classification was performed using the Support Vector Machine (SVM) and Naïve Bayes algorithms together with Natural Language Processing (NLP) techniques, applied to a database of 340,000 different products. The data were split into 80% for training and 20% for testing, and an accuracy of 90% was obtained over a total of 98 classes. |
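
To make the four-part structure of the code concrete, the following is a minimal sketch, not the author's implementation, of how an 8-digit NCM code could be split into the parts named in the abstract. It assumes that each part takes two digits, in order, and that each cascade stage predicts a progressively longer prefix of the code. The names `NcmParts`, `split_ncm`, and `cascade_targets` are hypothetical, and no classifier is trained here.

```python
from typing import NamedTuple


class NcmParts(NamedTuple):
    """The four parts of an NCM code (two digits each, by assumption)."""
    chapter: str       # digits 1-2
    position: str      # digits 3-4
    subheading: str    # digits 5-6
    item_subitem: str  # digits 7-8


def split_ncm(code: str) -> NcmParts:
    """Split an 8-digit NCM code, with or without dots, into four parts."""
    digits = code.replace(".", "").strip()
    if len(digits) != 8 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected an 8-digit NCM code, got {code!r}")
    return NcmParts(digits[0:2], digits[2:4], digits[4:6], digits[6:8])


def cascade_targets(code: str) -> list[str]:
    """Return the cumulative prefix that each of four cascade stages
    would be asked to predict (an assumed reading of the cascade)."""
    digits = "".join(split_ncm(code))
    return [digits[:n] for n in (2, 4, 6, 8)]
```

For example, `split_ncm("8471.30.12")` yields the parts `84`, `71`, `30`, and `12`, and `cascade_targets` turns the same code into one label per stage, each extending the previous one by two digits.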