What is Event Extraction?

Event extraction transforms unstructured text into a structured description of what happened. Add semantic modeling and a knowledge graph and you get knowledge discovery through complex querying and faceted search.

Extracting events from unstructured text aims to answer the who, what, when, where, why, and how questions about an occurrence in a structured manner. Let’s see what it is, where it’s applied, why it matters, and how it works.

What is an event?

The Linguistic Data Consortium (LDC) gives the following definition of an event: 

An Event is a specific occurrence involving participants. An Event is something that happens. An Event can frequently be described as a change of state.

Consider the following example:

“Heidi Cruz, Ted’s wife, became managing director of Goldman Sachs. They oversee the Texas utilities.”

The first sentence contains an event of type Start-Position, indicated most clearly by the word “became”. We can also extract additional details about the event, such as the Person starting the position – Heidi Cruz, the Position itself – managing director, and the Entity granting the position – Goldman Sachs.

In the above example, we can see how an event can be broken down into the following components:

  • The event extent is the span of text (usually a sentence) containing the event
  • The event type identifies the kind of occurrence according to a predefined schema
  • The event trigger is the keyword within the event extent that most clearly indicates the mention of the occurrence
  • The event arguments describe additional characteristics of the event such as participants, location and more. The types of arguments are specific to the event type. 
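These four components map naturally onto a simple data structure. The sketch below is illustrative only – the field and role names are not part of any standard schema or API:

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    """One extracted event mention (illustrative, not a standard schema)."""
    extent: str                     # span of text containing the event
    event_type: str                 # kind of occurrence per a predefined schema
    trigger: str                    # keyword that most clearly signals the event
    arguments: dict = field(default_factory=dict)  # role -> filler

start_position = Event(
    extent="Heidi Cruz, Ted's wife, became managing director of Goldman Sachs.",
    event_type="Start-Position",
    trigger="became",
    arguments={
        "Person": "Heidi Cruz",
        "Position": "managing director",
        "Entity": "Goldman Sachs",
    },
)
```

Note how the argument roles (Person, Position, Entity) are specific to the Start-Position event type; a different event type would define different roles.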

What are event detection and event extraction?

Event detection consists of identifying the event extent, type and trigger. When event types are classified according to a predefined schema we speak of closed-domain event detection. Alternatively, when a fitting event schema isn’t available, events can be clustered according to a suitable similarity metric in an approach known as open-domain event detection. 

Both approaches have their pros and cons and which one is more suitable depends on the context. Ontotext’s Event Detector for Narrative Analysis (EDNA), for example, uses a predefined event schema, while Ontotext’s Relation and Event Detector (RED) allows users to define additional event types on the fly.

Event extraction, on the other hand, starts with event detection as a first step and then proceeds with argument extraction. Like the trigger, arguments provide additional information about the event in a structured format that conforms to the schema. The types of arguments that can be extracted about an event depend on what type of event it is, but in general provide the answers to the who, what, where, when, why, how questions. 
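A toy sketch can make the two-step structure concrete: detection finds the extent, trigger, and type, and argument extraction then fills the roles allowed by that type. The trigger lexicon and role heuristics below are hypothetical, in the spirit of the early rule-based systems discussed later:

```python
# Toy closed-domain event extraction: detection first, then arguments.
# The trigger lexicon and role heuristics are hypothetical examples.
TRIGGER_LEXICON = {"became": "Start-Position", "resigned": "End-Position"}

def detect_event(sentence: str):
    """Detection: identify the event extent, trigger, and type."""
    for token in sentence.replace(",", "").split():
        if token.lower() in TRIGGER_LEXICON:
            return {"extent": sentence, "trigger": token,
                    "type": TRIGGER_LEXICON[token.lower()]}
    return None  # no event detected

def extract_arguments(event: dict):
    """Argument extraction: fill the roles defined for the detected type."""
    if event["type"] == "Start-Position":
        before, _, after = event["extent"].partition(event["trigger"])
        event["arguments"] = {"Person": before.split(",")[0].strip(),
                              "Position": after.strip().rstrip(".")}
    return event

sentence = "Heidi Cruz, Ted's wife, became managing director of Goldman Sachs."
event = extract_arguments(detect_event(sentence))
```

Real systems replace both steps with learned models, but the dependency is the same: the roles that argument extraction looks for depend on the event type that detection produced.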

Why does event extraction matter?

Well-structured descriptions of events can reveal hidden patterns and inform decision making. Such descriptions are, however, rarely readily available and manual extraction of events from raw text is a time-consuming and laborious task. A trained annotator would spend about a minute annotating an event, while an AI model is able to perform the same task in seconds. 

If we want to benefit from automated solutions, we need to pay attention to the quality of the extracted data. Only then will automated event extraction, whose accuracy is comparable to that of human annotators, increase our productivity by letting us focus on analyzing the events rather than looking for them in the text.

The structured nature of the extracted events pairs well with knowledge graphs. Knowledge graphs are a natural fit for representing and storing events as they provide the ability to set constraints on the modeled data via SHACL rules. This allows us to ensure that the event instances extracted from the data conform with the chosen event schema and satisfy any constraints about the extracted event types.

On top of that, knowledge graphs offer a lot of flexibility in terms of how data is interlinked. By using a data model that links various forms of online content with extracted events and their arguments, we can find patterns and connections within the data that were previously hidden. Such interlinking of structured information also allows us to design complex faceted queries based on said events, agents, locations, and more.
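As a minimal illustration of what such a faceted query does, the sketch below filters a small in-memory collection of events by any combination of facets. In practice this would be a SPARQL query against a knowledge graph; the sample events (including “Jane Doe” and “Acme Corp”) are invented for the example:

```python
# In-memory stand-in for faceted search over extracted events.
# A real system would run a SPARQL query against a knowledge graph.
events = [
    {"type": "Start-Position", "Person": "Heidi Cruz", "Entity": "Goldman Sachs"},
    {"type": "End-Position",   "Person": "Heidi Cruz", "Entity": "Goldman Sachs"},
    {"type": "Start-Position", "Person": "Jane Doe",   "Entity": "Acme Corp"},
]

def faceted_search(events, **facets):
    """Return the events matching every requested facet (type, Person, ...)."""
    return [e for e in events
            if all(e.get(key) == value for key, value in facets.items())]

hits = faceted_search(events, type="Start-Position", Entity="Goldman Sachs")
```

Each keyword argument narrows the result set along one facet, which is exactly the interaction pattern faceted search interfaces expose to users.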

We can derive even more powerful insights when we add on a suitable entity linking solution, such as Ontotext’s Common English Entity Linking (CEEL) or Multilingual Entity Linking (MEL). Such integration can provide additional context about events based on the entities involved. We can also use it to explore temporal changes to a particular entity based on the events in which it’s involved. 

Real world applications

Like entity linking or extractive question answering, event extraction is a task relevant to a wide range of industries and use cases. While the types of events to be extracted can be very industry-specific, the approaches to the problem across industries are very similar.

The following are some examples of event extraction applications in different industries:

 

Approaches to event detection

Historically, approaches to event detection have followed the development of novel software and hardware solutions. We can broadly divide these approaches into the following categories:

  • The earliest approaches relied heavily on manually crafted rule-based solutions that require domain expertise. [1]
  • Those were followed by classical machine learning approaches such as support vector machines, random forests, naive Bayes classifiers, and more. [2]
  • As neural-based computation became more affordable and thus scalable, various deep-learning approaches were adopted such as Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and more recently Graph Neural Networks (GNNs). [3]
  • The next bottleneck to emerge was the availability of labeled data, which gave rise to methods that rely on very little labeled data (and a lot of unlabeled data), collectively known as semi-supervised learning. [4]
  • This task has also been addressed by unsupervised methods that don’t require any labeled data. [5]
  • Finally, large language models (LLMs) have also been used for event extraction, especially for use cases with little to no labeled data. Many LLMs use a technique known as reinforcement learning from human feedback (RLHF) to correct their output based on user input. [6]

How are models trained?

Event extraction is most commonly broken up into four tasks – trigger identification, event classification, argument identification and argument classification. When a model performs these tasks in a sequential (dependent) manner we speak of a pipeline-based model. In contrast, a joint model performs these tasks in parallel (independent of each other). 

In the pipeline-based method [7], each task is informed by the output of the previous one. First, the trigger is identified, which informs the next task – event classification. The type of event is then used to inform the task of argument extraction. This pipeline-based method has a significant drawback – error propagation. Errors in trigger identification result in less accurate event classification, which negatively affects the quality of argument extraction.

The alternative approach, which eliminates error propagation, is known as the joint method [2]. The trigger and arguments are extracted independently of each other and are then fitted according to the event schema. A drawback of this approach is that argument identification and classification can’t benefit from knowledge about the extracted trigger.
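The contrast between the two methods can be sketched with dummy components, where the only thing that matters is which step's output feeds which:

```python
# Contrast of pipeline-based vs joint extraction, using dummy components.
def identify_trigger(text: str):
    """Trivial stand-in for a trigger identification model."""
    return "became" if "became" in text else None

def classify_event(trigger):
    """Trivial stand-in for an event classification model."""
    return "Start-Position" if trigger == "became" else "Unknown"

def pipeline_extract(text: str):
    # Pipeline: each step consumes the previous step's output,
    # so an error in step 1 propagates into step 2.
    trigger = identify_trigger(text)       # step 1
    event_type = classify_event(trigger)   # step 2 depends on step 1
    return {"trigger": trigger, "type": event_type}

def joint_extract(text: str):
    # Joint: trigger and arguments are predicted independently
    # and only afterwards fitted to the event schema.
    trigger = identify_trigger(text)
    arguments = {"Person": text.split(",")[0]}  # does not use the trigger
    return {"trigger": trigger, "arguments": arguments}
```

In the pipeline version a wrong trigger yields a wrong event type; in the joint version the argument step cannot be misled by the trigger step, but it also cannot benefit from it.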

Training vs fine-tuning

Training an event extraction model from scratch requires significant computational resources and a lot of high quality annotated data. In comparison, fine-tuning is the process of adapting an existing model to perform a specific task. It requires much lower resources and less data, but also has its limitations. 

Whenever a pretrained model is fine-tuned for a new task, the model’s performance on its previous task(s) usually decreases. This is a problem if, for example, we want to extend an event extraction model to detect and extract new event types while still handling the ones it was originally trained on. The extent to which the model forgets the old event types can be mitigated through proper design of the new event types and the selection of high-quality annotated data on which the model is fine-tuned.
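One common mitigation, often called rehearsal or replay, is to mix annotated examples of the old event types back into the fine-tuning set. The sketch below shows only the data-preparation step; the example texts are elided and the “Product-Launch” event type is hypothetical:

```python
import random

# Rehearsal-style fine-tuning data: mix examples of already-learned event
# types into the new-type training set so the model sees both during
# fine-tuning and is less likely to forget the old types.
old_examples = [("...", "Start-Position")] * 100  # types the model knows
new_examples = [("...", "Product-Launch")] * 20   # new type to be added

random.seed(0)
replay = random.sample(old_examples, k=len(new_examples))  # replay buffer
finetune_set = new_examples + replay
random.shuffle(finetune_set)
```

The size of the replay buffer is a tuning knob: too small and the old types are forgotten, too large and the new type is underrepresented.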

LLMs for event extraction

LLMs have also been employed for the purposes of extracting events. For example, Ontotext’s RED uses an LLM to recognize relationships and events in text. RED’s model-driven extraction approach is also easily adaptable and customizable to new event types. In fact, one benefit of using LLMs for event extraction is that they can take on new tasks with little to no task-specific training data. Many LLMs are trained on multilingual datasets, which means that the task would also be adaptable to other languages.

LLM solutions are no free lunch and their potential drawbacks must also be considered. Use cases that involve proprietary or personal data require proper privacy measures, which off-the-shelf solutions may not provide. Hosting the solution on your own premises might resolve this problem, but it requires infrastructure and know-how. LLMs are also not computationally efficient, which raises cost and sustainability concerns. These considerations should be taken into account when deciding whether an LLM approach is the best solution. Other considerations around LLMs include the consistency and explainability of their outputs.

Summary

Descriptions of events contain valuable information that can remain hidden when presented in an unstructured format. Extracting events from raw text in an automated manner allows us to uncover such information, especially when the solution leverages knowledge graph technology and entity linking. A wide range of approaches to event extraction are available and the right choice depends on the domain, use case, available data, and computational infrastructure.

Want to learn more about Event Extraction?

Dive into our AI in Action series!

 

References

[1] Riloff, E. (1993, July). Automatically constructing a dictionary for information extraction tasks. In AAAI (Vol. 1, No. 1, pp. 2-1).

[2] Patwardhan, S., & Riloff, E. (2009, August). A unified model of phrasal and sentential evidence for information extraction. In Proceedings of the 2009 conference on empirical methods in natural language processing (pp. 151-160).

[3] Chen, Y., Xu, L., Liu, K., Zeng, D., & Zhao, J. (2015, July). Event extraction via dynamic multi-pooling convolutional neural networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) (pp. 167-176).

[4] Zhou, D., & Zhong, D. (2015). A semi-supervised learning framework for biomedical event extraction based on hidden topics. Artificial intelligence in medicine, 64(1), 51-58.

[5] Araki, J., & Mitamura, T. (2018, August). Open-domain event detection using distant supervision. In Proceedings of the 27th international conference on computational linguistics (pp. 878-891).

[6] Wang, X., Li, S., & Ji, H. (2022). Code4struct: Code generation for few-shot event structure prediction. arXiv preprint arXiv:2210.12810.

[7] Ahn, D. (2006, July). The stages of event extraction. In Proceedings of the Workshop on Annotating and Reasoning about Time and Events (pp. 1-8).

 
