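Konzeption und Realisierung eines Modells zur Multi-Label-Textklassifikation und Named Entity Recognition unter Verwendung von künstlicher Intelligenz [Conception and Realisation of a Model for Multi-Label Text Classification and Named Entity Recognition Using Artificial Intelligence]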

Hiller, Paul David

Bachelorarbeit_Hiller_Paul_David.pdf (4.110Mb)

Reviewers

Priefer, Dennis

Kneisel, Peter

Date

2023

DOI

10.25716/thm-282

Bachelor's thesis, Open Access

Abstract

Integrating artificial intelligence into business processes is a crucial step for companies that want to survive in a competitive environment. Text processing is one of the most important applications of artificial intelligence in this growing sector. This thesis focuses on two text processing tasks: multi-label text classification and named entity recognition. It investigates how both tasks can be applied, implemented and evaluated using artificial intelligence. To this end, it first explains the basics of neural networks in the context of both tasks, along with the associated metrics. A quantitative research approach and a structured literature review are then used to identify the current state of research. Based on this, two neural networks are implemented, merged into one system and evaluated. The named entity recognition network consists of a BERT and an ELMo encoder, a bidirectional long short-term memory and conditional random fields. The multi-label text classification network is based on the Universal Sentence Encoder, with a bidirectional long short-term memory, a fully connected layer and individual heads for classifying the text into several labels. The metrics and methods identified in the structured literature review are summarised in an evaluation concept, which is used to evaluate the implemented models. On a Reuters-21578 dataset reduced to 20 labels, multi-label text classification achieved micro and macro F1 scores of 73% and 56% respectively. On the CoNLL03 dataset, named entity recognition achieved micro and macro F1 scores of 94% and 93% respectively.
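The abstract reports both micro and macro F1 scores. As a minimal sketch of how these two averages differ in a multi-label setting, the following Python example computes both from per-label counts. The function names, the toy label names and the toy predictions are illustrative assumptions; they are not taken from the thesis and do not reproduce its results.

```python
from typing import Dict, List, Set, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(
    y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]
) -> Tuple[float, float]:
    """Return (micro F1, macro F1) for multi-label predictions."""
    # Per-label counts stored as [tp, fp, fn].
    counts: Dict[str, List[int]] = {label: [0, 0, 0] for label in labels}
    for gold, pred in zip(y_true, y_pred):
        for label in labels:
            if label in pred and label in gold:
                counts[label][0] += 1
            elif label in pred:
                counts[label][1] += 1
            elif label in gold:
                counts[label][2] += 1
    # Micro: pool the counts over all labels, then compute one F1.
    totals = [sum(c[i] for c in counts.values()) for i in range(3)]
    micro = f1(totals[0], totals[1], totals[2])
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    return micro, macro


# Toy data (hypothetical): one frequent label and two rare ones.
labels = ["earn", "grain", "ship"]
y_true = [{"earn"}, {"earn"}, {"earn", "grain"}, {"ship"}]
y_pred = [{"earn"}, {"earn"}, {"earn"}, {"earn"}]

micro, macro = micro_macro_f1(y_true, y_pred, labels)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

In this toy case the model only ever predicts the frequent label, so the micro average stays well above the macro average. A gap between the two, such as the one reported for the classification task, is often a sign that rare labels are recognised less reliably than frequent ones, although the abstract itself does not state the cause.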

Keywords

Named Entity Recognition
Multi-Label Text Classification
Natural Language Processing
Artificial Intelligence
Deep Learning

Extent

II, 83 pp.

Link to publication

https://publikationsserver.thm.de/xmlui/handle/123456789/335

Collections

Mathematik, Naturwissenschaften und Informatik (MNI) [18]

The following license terms apply to this resource:
Protected by copyright