Date of Award

Fall 2024

Project Type

Dissertation

Program or Major

Computer Science

Degree Name

Doctor of Philosophy

First Advisor

Laura Dietz

Second Advisor

Marek Petrik

Third Advisor

Dongpeng Xu

Abstract

Many artificial intelligence systems benefit from automatic text understanding algorithms that go beyond mere pattern recognition. For instance, understanding web pages helps Information Retrieval (IR) systems retrieve relevant information. In their most common form, IR systems retrieve relevant information as entities or documents in response to keyword-based queries. Traditional IR models use lexical matching between query terms and document terms to identify relevant documents. With the emergence of neural networks, Neural IR approaches use deep learning techniques to learn high-dimensional representations of documents and queries that go beyond term matching.

On the other hand, with the development of symbolic knowledge graphs (KGs), IR models leverage semantic information about entities or concepts (e.g., relations, name, description, and type) to retrieve relevant information. Entities are physical objects or things, such as people, places, and organizations, that represent world knowledge. Traditionally, lexical matching between the query terms and the semantic information of entities (e.g., related entities, name, description, and type) is used to identify the relevant information.

This thesis aims to advance the state of the art in text understanding by developing novel neuro-symbolic algorithms that combine the strengths of two paradigms: symbolic knowledge graphs and textual documents. Our goal is to leverage the interplay between textual documents and the semantic information of entities to retrieve relevant information. In the first part of this thesis, we devise algorithms to understand how best to use the relations between symbolic entities in textual documents to identify relevant entities. In the second part, we explore the connections between symbolic knowledge and textual documents to generate neuro-symbolic representations for identifying relevant information in the form of entities and documents. To achieve this, we develop novel neuro-symbolic models that identify relevant connections between symbolic knowledge and textual documents and leverage these connections to retrieve relevant information.
