
Entity, Relation, and Event Extraction with Contextualized Span Representations


Transcription of Entity, Relation, and Event Extraction with Contextualized Span Representations

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, pages 5784-5789, Hong Kong, China, November 3-7, 2019. Association for Computational Linguistics.

Entity, Relation, and Event Extraction with Contextualized Span Representations
David Wadden, Ulme Wennberg, Yi Luan, Hannaneh Hajishirzi
Paul G. Allen School of Computer Science & Engineering, University of Washington; Google AI Language; Allen Institute for Artificial Intelligence

Abstract

We examine the capabilities of a unified, multi-task framework for three information extraction tasks: named entity recognition, relation extraction, and event extraction. Our framework (called DYGIE++) accomplishes all tasks by enumerating, refining, and scoring text spans designed to capture local (within-sentence) and global (cross-sentence) context. Our framework achieves state-of-the-art results across all tasks, on four datasets from a variety of domains.

We perform experiments comparing different techniques to construct span representations. Contextualized embeddings like BERT perform well at capturing relationships among entities in the same or adjacent sentences, while dynamic span graph updates model long-range cross-sentence relationships. For instance, propagating span representations via predicted coreference links can enable the model to disambiguate challenging entity mentions. Our code is publicly available and can be easily adapted for new tasks or datasets.

1 Introduction

Many information extraction tasks, including named entity recognition, relation extraction, event extraction, and coreference resolution, can benefit from incorporating global context across sentences or from non-local dependencies among tokens. For example, knowledge of a coreference relationship can provide information to help infer the type of a difficult-to-classify entity mention.

In event extraction, knowledge of the entities present in a sentence can provide information that is useful for predicting event arguments.

To model global context, previous works have used pipelines to extract syntactic, discourse, and other hand-engineered features as inputs to structured prediction models (Li et al., 2013; Yang and Mitchell, 2016; Li and Ji, 2014) and neural scoring functions (Nguyen and Nguyen, 2019), or as a guide for the construction of neural architectures (Peng et al., 2017; Zhang et al., 2018; Sha et al., 2018; Christopoulou et al., 2018). Recent end-to-end systems have achieved strong performance by dynamically constructing graphs of spans whose edges correspond to task-specific relations (Luan et al., 2019; Lee et al., 2018; Qian et al., 2018).

Figure 1: Overview of our framework: DYGIE++. Shared span representations are constructed by refining contextualized word embeddings via span graph updates, then passed to scoring functions for three IE tasks.

Meanwhile, contextual language models (Dai and Le, 2015; Peters et al., 2017, 2018; Devlin et al., 2018) have proven successful on a range of natural language processing tasks (Bowman et al., 2015; Sang and De Meulder, 2003; Rajpurkar et al., 2016). Some of these models are also capable of modeling context beyond the sentence boundary. For instance, the attention mechanism in BERT's transformer architecture can capture relationships between tokens in nearby sentences.

In this paper, we study different methods to incorporate global context in a general multi-task IE framework, building upon a previous span-based IE method (Luan et al., 2019). Our DYGIE++ framework, shown in Figure 1, enumerates candidate text spans and encodes them using contextual language models and task-specific message updates passed over a text span graph. Our framework achieves state-of-the-art results across three IE tasks, leveraging the benefits of both contextualization methods.

We conduct experiments and a thorough analysis of the model on named entity, relation, and event extraction. Our findings are as follows: (1) Our general span-based framework produces state-of-the-art results on all tasks and all but one subtask across four text domains, with relative error reductions on each. (2) BERT encodings are able to capture important within- and adjacent-sentence context, achieving improved performance by increasing the input window size. (3) Contextual encoding through message passing updates enables the model to incorporate cross-sentence dependencies, improving performance beyond that of BERT alone, especially on IE tasks in specialized domains.

2 Task and Model

Our DYGIE++ framework extends a recent span-based model for entity and relation extraction (Luan et al., 2019) as follows: (1) We perform event extraction as an additional task and propagate span updates across a graph connecting event triggers to their arguments. (2) We build span representations on top of multi-sentence BERT encodings.

2.1 Task definitions

The input is a document represented as a sequence of tokens D, from which our model constructs spans S = {s_1, ..., s_T}, the set of all possible within-sentence phrases (up to a threshold length) in the document.
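As a concrete illustration (not taken from the paper's released code), the following sketch enumerates all within-sentence candidate spans up to a length threshold; the function name and the particular `max_span_width` value are assumptions for illustration.

```python
from typing import List, Tuple

def enumerate_spans(sentence_tokens: List[str], max_span_width: int = 8) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) token indices for every candidate span in one
    sentence, mirroring S = {s_1, ..., s_T}: all within-sentence phrases up to a
    threshold length."""
    spans = []
    n = len(sentence_tokens)
    for start in range(n):
        for end in range(start, min(start + max_span_width, n)):
            spans.append((start, end))
    return spans

# Example: every span of width 1 to 3 in a 5-token sentence.
print(enumerate_spans(["The", "virus", "infects", "human", "cells"], max_span_width=3))
```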

Named Entity Recognition involves predicting the best entity type label e_i for each span s_i. For all tasks, the best label may be a null label. Relation Extraction involves predicting the best relation type r_ij for all span pairs (s_i, s_j). For the datasets studied in this work, all relations are between spans within the same sentence. The coreference resolution task is to predict the best coreference antecedent c_i for each span s_i. We perform coreference resolution as an auxiliary task, to improve the representations available for the main three tasks.

Event Extraction involves predicting named entities, event triggers, event arguments, and argument roles. Specifically, each token d_i is predicted as an event trigger by assigning it a label t_i. Then, for each trigger d_i, event arguments are assigned to this event trigger by predicting an argument role a_ij for all spans s_j in the same sentence as d_i. Unlike most work on event extraction, we consider the realistic setting where gold entity labels are not available. Instead, we use predicted entity mentions as argument candidates.
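To make the notation concrete, the per-sentence prediction targets can be pictured roughly as below. The container layout and the specific label strings are illustrative assumptions (loosely modeled on ACE-style labels), not the paper's data format.

```python
# Hypothetical prediction targets for one sentence, keyed by the notation above.
predictions = {
    # entity type e_i per span (start, end); unlisted spans receive the null label
    "entities": {(0, 1): "PER", (5, 6): "ORG"},
    # relation type r_ij per within-sentence span pair (s_i, s_j)
    "relations": {((0, 1), (5, 6)): "ORG-AFF"},
    # coreference antecedent c_i per span (an earlier span in the document)
    "coref_antecedents": {(0, 1): (12, 13)},
    # trigger label t_i per token index d_i
    "triggers": {3: "Personnel.End-Position"},
    # argument role a_ij per (trigger token d_i, argument span s_j) pair
    "argument_roles": {(3, (0, 1)): "Person"},
}
```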

2.2 DYGIE++ Architecture

Figure 1 depicts the four-stage architecture. For more details, see Luan et al. (2019).

Token encoding: DYGIE++ uses BERT for token representations using a sliding window approach, feeding each sentence to BERT together with a size-L neighborhood of surrounding sentences.

Span enumeration: Spans of text are enumerated and constructed by concatenating the tokens representing their left and right endpoints, together with a learned span width embedding.
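A minimal sketch of that span construction step, assuming contextualized token embeddings have already been produced by BERT over the sentence plus its surrounding window; the dimensions, module name, and width-embedding scheme here are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SpanRepresentation(nn.Module):
    """Build a span embedding by concatenating the contextualized embeddings of the
    span's left and right endpoint tokens with a learned span-width embedding."""

    def __init__(self, width_dim: int = 20, max_span_width: int = 8):
        super().__init__()
        self.width_embedding = nn.Embedding(max_span_width, width_dim)

    def forward(self, token_embeddings: torch.Tensor, spans: torch.LongTensor) -> torch.Tensor:
        # token_embeddings: (num_tokens, token_dim), e.g. BERT output for the sentence
        # and its size-L neighborhood; spans: (num_spans, 2) inclusive (start, end).
        starts, ends = spans[:, 0], spans[:, 1]
        widths = ends - starts  # width-1 spans map to embedding index 0
        return torch.cat(
            [token_embeddings[starts], token_embeddings[ends], self.width_embedding(widths)],
            dim=-1,
        )

# Usage with random vectors standing in for BERT token embeddings:
tokens = torch.randn(12, 768)
spans = torch.tensor([[0, 0], [0, 2], [3, 5]])
reps = SpanRepresentation()(tokens, spans)  # shape: (3, 768 + 768 + 20)
```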

Span graph propagation: A graph structure is generated dynamically based on the model's current best guess at the relations present among the spans in the document. Each span representation g_j^t is updated by integrating span representations from its neighbors in the graph according to three variants of graph propagation. In coreference propagation, a span's neighbors in the graph are its likely coreference antecedents. In relation propagation, neighbors are related entities within a sentence. In event propagation, there are event trigger nodes and event argument nodes; trigger nodes pass messages to their likely arguments, and arguments pass messages back to their probable triggers. The whole procedure is trained end-to-end, with the model learning simultaneously how to identify important links between spans and how to share information between those spans.

More formally, at each iteration t the model generates an update u_x^t(i) for span g_i^t ∈ R^d:

u_x^t(i) = Σ_{j ∈ B_x(i)} V_x^t(i, j) ⊙ g_j^t,    (1)

where ⊙ denotes elementwise multiplication and V_x^t(i, j) is a measure of similarity between spans i and j under task x (for instance, a score indicating the model's confidence that span j is the coreference antecedent of span i). For relation extraction, we use a ReLU activation to enforce sparsity. The final updated span representation g_j^{t+1} is computed as a convex combination of the previous representation and the current update, with weights determined by a gating function.

Multi-task classification: The re-contextualized representations are input to scoring functions which make predictions for each of the end tasks. We use a two-layer feedforward neural network (FFNN) as the scoring function. For trigger and named entity prediction for span g_i, we compute FFNN_task(g_i). For relation and argument role prediction, we concatenate the relevant pair of embeddings and compute FFNN_task([g_i, g_j]).
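A minimal sketch of these three pieces (the Eq. 1 message update, the gated combination, and the two-layer FFNN scorers), assuming the neighbor sets B_x(i) and similarity gates V_x^t(i, j) are computed elsewhere; all tensor shapes, hidden sizes, and the per-dimension gate parameterization are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

def propagation_update(span_reps: torch.Tensor, neighbors: list, similarity: torch.Tensor) -> torch.Tensor:
    """u_x^t(i) = sum over j in B_x(i) of V_x^t(i, j) ⊙ g_j^t (elementwise product).

    span_reps:  (num_spans, d) current span representations g^t
    neighbors:  neighbors[i] is the list of span indices j in B_x(i)
    similarity: (num_spans, num_spans, d) gates V_x^t(i, j), e.g. ReLU-activated
                for relation propagation to encourage sparsity
    """
    updates = torch.zeros_like(span_reps)
    for i, neigh in enumerate(neighbors):
        for j in neigh:
            updates[i] += similarity[i, j] * span_reps[j]  # elementwise product
    return updates

class GatedUpdate(nn.Module):
    """g^{t+1} = f * g^t + (1 - f) * u^t: a convex combination of the previous
    representation and the current update, with the gate f predicted per dimension."""

    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, prev_rep: torch.Tensor, update: torch.Tensor) -> torch.Tensor:
        f = torch.sigmoid(self.gate(torch.cat([prev_rep, update], dim=-1)))
        return f * prev_rep + (1.0 - f) * update

def make_scorer(in_dim: int, hidden_dim: int, num_labels: int) -> nn.Module:
    """Two-layer FFNN scorer: applied to g_i for trigger/entity prediction and to the
    concatenation [g_i, g_j] for relation and argument-role prediction."""
    return nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, num_labels))

# Usage sketch with random span representations:
d = 1556
g = torch.randn(10, d)
u = propagation_update(g, [[1, 2]] + [[] for _ in range(9)], torch.rand(10, 10, d))
g_next = GatedUpdate(d)(g, u)                       # re-contextualized spans
entity_scores = make_scorer(d, 150, 8)(g_next)      # per-span label scores
pair = torch.cat([g_next[2], g_next[7]], dim=-1)
relation_scores = make_scorer(2 * d, 150, 7)(pair)  # per-pair label scores
```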

3 Experimental Setup

Data: We experiment on four different datasets: ACE05, SciERC, GENIA and WLPC. (Statistics and details on all datasets and splits can be found in Appendix A.)

