In natural language, hyperreferences are one of the main challenges for comprehension. Hyperreferences are words that refer to other words they are nested within. For example: “The police searched the park and found the knife.” In this case, the word “police” is hyperreferenced by the word “search”: it is a child word nested inside another child word. These hyperreferences require a reader to navigate through nested hierarchies in order to find out which word is referred to by another. In many cases, hyperreferences can be resolved using logical inference or topic modeling techniques. Generally speaking, logical inference algorithms identify parent-child relationships among hyperreferences by finding attributes that the parent and child have in common. Topic models then capture the underlying semantic relationships among words by identifying the topics that contain those words. In this blog post we will explore how you can use the LogicalXpress Graph Transformation Language (GTL) extension of R to resolve hyperreferences in your text documents automatically, with little training and no domain-specific knowledge required from your end users, so even non-programmers can make use of it.
Overview of logical inference algorithms
Logical inference algorithms are an automated way to create connections between words and sentences. The connection between words is described using parent-child phrases, which are related by certain attributes. For example, if the word “police” is followed by the word “search”, we can infer that the word “police” is the child of the word “search”. Natural language processing tools can detect this pattern automatically. There are several logical inference algorithms. The most popular ones include:
– Chain of Implication: Given an assumption (such as the fact that a word is followed by another) and a specific example, the chain of implication algorithm can infer the unstated assumption that led to the specific example.
– Co-Occurrence: Given a specific example and a list of similar words, the co-occurrence algorithm can infer that one of the similar words is also in the list.
– Contextual Similarity: Given a specific example and a list of words that are similar only in context, the contextual similarity algorithm can infer that a specific word is also similar in context.
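To make the “followed by” rule concrete, here is a minimal Python sketch (Python rather than GTL, since no GTL syntax is shown in this post). It treats each content word as the child of the content word that follows it. The function name `adjacent_parent_child_pairs` and the small stop-word list are assumptions made for illustration only.

```python
import re

# Hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "and", "a", "an", "of"}

def adjacent_parent_child_pairs(sentence):
    """Return (child, parent) pairs: each content word is the child of the next one."""
    words = [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP_WORDS]
    return list(zip(words, words[1:]))

pairs = adjacent_parent_child_pairs("The police searched the park and found the knife.")
print(pairs)
```

The first pair is `("police", "searched")`, which matches the “police” and “search” example above. Note that this sketch does no stemming, so “searched” is kept as written.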
GTL extension in R for hyperreference resolution
The logical inference algorithms required to resolve hyperreferences can be implemented with the LogicalXpress Graph Transformation Language (GTL). GTL is a declarative schema language that can be easily integrated into applications. For example, you can use GTL to implement the chain of implication algorithm to resolve hyperreferences in your text documents. Consider the hyperreferencing noun “park”, which is hyperreferenced by the generic word “knife”. To resolve this example, the chain of implication algorithm finds parent-child relationships between “park” and “knife”, and infers that “knife” is the child of “park”.
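As a rough illustration of the “park” and “knife” example, the Python sketch below follows child-to-parent links until it reaches a word with no parent. This is only one possible reading of the chain of implication idea and it is not GTL code. The helper `chain_to_root` and the `parent_of` mapping are hypothetical names introduced here.

```python
def chain_to_root(word, parent_of):
    """Follow child -> parent links from `word` until a word with no parent is reached."""
    chain = [word]
    seen = {word}
    while chain[-1] in parent_of:
        nxt = parent_of[chain[-1]]
        if nxt in seen:  # guard against cycles in the mapping
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain

# Assumed relationship taken from the example: "knife" is the child of "park".
parent_of = {"knife": "park"}
print(chain_to_root("knife", parent_of))  # ['knife', 'park']
```

Adding more entries to `parent_of` would extend the chain, so deeper nesting resolves the same way.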
Train the model with log files
To train the model, you will need to collect log files from the process being hyperreferenced. This may seem tedious, but it is simpler than you would expect. For example, if the hyperreferenced word is “police”, you can record the log files of any process related to police searches. Once you have the log files, you can use them to train the model with a machine learning algorithm such as Logistic Regression, Linear Regression, or K-Nearest Neighbors.
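The training step could look roughly like the following Python sketch, which uses K-Nearest Neighbors written with the standard library only. The two features (token distance between a word pair and a same-sentence flag) and every training row are made up for illustration; the post does not say which features are extracted from the log files.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up rows: (token distance, same-sentence flag) -> is this pair a hyperreference?
train = [
    ((1, 1), True), ((2, 1), True), ((3, 1), True),
    ((8, 0), False), ((9, 0), False), ((12, 0), False),
]
print(knn_predict(train, (2, 1)))
print(knn_predict(train, (10, 0)))
```

With this toy data, a nearby pair in the same sentence is classified as a hyperreference and a distant pair is not.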
Sample data preparation
The next step is to collect examples of hyperreferences. You can do this by creating a text file that contains the hyperreferenced text and a list of hyperreferences. The list should contain the hyperreference between the hyperreferenced words and the log file containing the original text.
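One possible layout for such a list is a tab-separated file with one hyperreference per line. The Python sketch below reads that layout. The column order (child, parent, log file), the file name `search_log.txt`, and the function `load_hyperreferences` are all assumptions for illustration, not a format required by GTL.

```python
import csv
import io

# Hypothetical file contents: child<TAB>parent<TAB>log file, one hyperreference per line.
SAMPLE = "police\tsearch\tsearch_log.txt\nknife\tpark\tsearch_log.txt\n"

def load_hyperreferences(handle):
    """Read (child, parent, log_file) rows from a tab-separated file handle."""
    reader = csv.reader(handle, delimiter="\t")
    return [
        {"child": row[0], "parent": row[1], "log_file": row[2]}
        for row in reader
        if len(row) == 3  # skip blank or malformed lines
    ]

examples = load_hyperreferences(io.StringIO(SAMPLE))
print(examples)
```

In practice you would pass an opened file instead of `io.StringIO`, for example `open(path, newline="")`.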
Understand the text and build the graph
Now that you have the log files, you can build the graph from them to resolve the hyperreferences. To do this, you first need to understand how each hyperreference is defined in the graph. To that end, you will build a graph model that takes the following data as inputs:
– The log files containing the original textual data
– The hyperreferences defined in the graph model that use the log files as inputs
– The hyperreferences between the children of the same parent in the graph model
– A penalty function
– A function that resolves the hyperreferences in the graph
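The inputs listed above can be sketched in Python as a small graph with a penalty on each edge. The penalty used here, the token distance between child and parent, is an assumption chosen for illustration; the post does not specify the actual penalty function. The names `distance_penalty`, `build_graph`, and `resolve` are hypothetical.

```python
from collections import defaultdict

def distance_penalty(child_pos, parent_pos):
    """Assumed penalty: candidate parents farther from the child cost more."""
    return abs(child_pos - parent_pos)

def build_graph(tokens, candidates):
    """Map each child word to a list of (penalty, parent) edges."""
    position = {word: i for i, word in enumerate(tokens)}
    graph = defaultdict(list)
    for child, parent in candidates:
        penalty = distance_penalty(position[child], position[parent])
        graph[child].append((penalty, parent))
    return graph

def resolve(graph):
    """Pick the lowest-penalty parent for every child."""
    return {child: min(edges)[1] for child, edges in graph.items()}

tokens = ["police", "searched", "park", "found", "knife"]
candidates = [("knife", "park"), ("knife", "police"), ("police", "searched")]
graph = build_graph(tokens, candidates)
print(resolve(graph))
```

Here “knife” has two candidate parents, and the closer one, “park”, wins because its penalty is lower.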
Run the model and evaluate results
Now that you have built the graph model, you can run the model to resolve the hyperreferences. To do this, follow these steps:
– Load the graph model created with the above steps
– Run the model with the data (i.e., the graph model and original text)
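The two steps above, plus a simple evaluation, might look like the following Python sketch. The JSON layout of the saved model, the hand-labelled `expected` answers, and the use of accuracy as the metric are all assumptions for illustration; the post does not name an evaluation metric.

```python
import json

# Hypothetical saved model: child -> list of [penalty, parent] edges.
SAVED_MODEL = '{"knife": [[2, "park"], [4, "police"]], "police": [[1, "searched"]]}'

def run_model(graph):
    """Resolve each child to its lowest-penalty parent."""
    return {child: min(edges, key=lambda e: e[0])[1] for child, edges in graph.items()}

def accuracy(predicted, expected):
    """Fraction of expected hyperreferences that the model resolved correctly."""
    correct = sum(predicted.get(child) == parent for child, parent in expected.items())
    return correct / len(expected)

graph = json.loads(SAVED_MODEL)                      # step 1: load the graph model
predicted = run_model(graph)                         # step 2: run it on the data
expected = {"knife": "park", "police": "searched"}   # hand-labelled for illustration
print(predicted, accuracy(predicted, expected))
```

Comparing the resolved pairs against a small hand-labelled set like this gives a quick check that the model behaves as intended before it is run on larger documents.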
In this blog post, you saw how to resolve hyperreferences in your text documents using the LogicalXpress Graph Transformation Language (GTL) extension in R. GTL is a declarative schema language that can be easily integrated into applications.