Kernerman Dictionary News • Number 13 • June 2005
Towards Hebrew FrameNet
In the
context of a research project that investigates the universality of the
semantic frame, initial steps have been taken towards the development
of Hebrew FrameNet (HFN), an on-line lexical resource for contemporary Hebrew.
HFN will provide the semantic and syntactic combinatorial
possibilities, or valences, for each item analyzed, through the
manual annotation of example sentences in a newspaper corpus (and,
eventually, the automatic capture and organization of the annotation
results for web-based viewing and querying). With advances in computer
technology and the existence of (searchable) corpora, the work of
lexicography has changed dramatically in recent years. The fine-grained
semantic classification and syntagmatic information of the sort to be
provided by Hebrew FrameNet will make the HFN database an invaluable
resource for lexicographers and advanced language teachers/learners, as
well as researchers in linguistics and natural language processing
(NLP).

In accord
with FrameNet, the first computational lexicography project
of its kind (Fontenelle 2003), HFN is based on the principles of Frame
Semantics (FS; Fillmore 1977, 1985, Petruck 1996), at the heart of
which is the semantic frame, an experience-based schematization
of the speaker’s world against which word meaning can be understood. In
Frame Semantics, a linguistic unit evokes a frame, whose frame
elements (

To
illustrate, consider the three predicates (magi'im, nirshamim, and meshamshim) in the first
sentence of the initial corpus used for Hebrew FrameNet:

esrot anashim magi'im mi-tailand le-israel kshe-hem nirshamim ke-mitnadvim ax le-ma'ase meshamshim sxirim zolim

The verb magi'a
[reach] evokes an Arriving frame, characterizing a situation in
which a Theme moves in the direction of a Goal, the latter either
expressed explicitly or implied by the verb. The noun phrase esrot
anashim fills the role of Theme, and functions as the subject of
the clause; the Goal is expressed by the prepositional phrase
complement le-israel; the example sentence also includes an
optional Source expression in the prepositional phrase mi-tailand.
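
The analysis of magi'im above can be written down as a small data structure. The following is a minimal Python sketch, not the FrameNet DeskTop's own format: the class names `FrameElement` and `Annotation`, the `optional` flag, and the helper `bracketed` are hypothetical names introduced here for illustration. The frame, the frame element labels, and the phrase types (NP, PP) are the ones given in the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FrameElement:
    name: str             # frame element label, e.g. "Theme"
    text: str             # transliterated span that fills the role
    phrase_type: str      # "NP" or "PP", as described in the text
    optional: bool = False


@dataclass(frozen=True)
class Annotation:
    target: str           # the frame-evoking word as it appears in the clause
    lemma: str            # citation form of the target
    frame: str            # name of the evoked frame
    elements: Tuple[FrameElement, ...]


# The first clause of the example sentence, annotated as in the text.
CLAUSE = "esrot anashim magi'im mi-tailand le-israel"

arriving = Annotation(
    target="magi'im",
    lemma="magi'a",
    frame="Arriving",
    elements=(
        FrameElement("Theme", "esrot anashim", "NP"),
        FrameElement("Source", "mi-tailand", "PP", optional=True),
        FrameElement("Goal", "le-israel", "PP"),
    ),
)


def bracketed(clause: str, annotation: Annotation) -> str:
    """Render a clause with each frame element wrapped as [span Label].

    Naive on purpose: each span is replaced once, left to right, which is
    enough for a clause in which no span occurs twice.
    """
    for fe in annotation.elements:
        clause = clause.replace(fe.text, f"[{fe.text} {fe.name}]", 1)
    return clause


print(bracketed(CLAUSE, arriving))
```

Under these assumptions the call prints `[esrot anashim Theme] magi'im [mi-tailand Source] [le-israel Goal]`. Keeping the phrase type next to each label is what later allows the syntactic realizations of a frame element to be summarized per word.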
The verb nirsham [register]
evokes a Registration frame, describing a scene in which a Registrant
puts an Entity on record at an Institution as belonging to a Category
or as Licensed for a specific purpose or state. The noun phrase kshe-hem
expresses the Registrant and functions as the subject of the clause;
the noun phrase ke-mitnadvim fills the Category role. Finally, meshamshim
evokes the Function_as frame, in which an Entity serves a Function or
Purpose, the former for activities and the latter for states of
affairs. Although the Entity is not expressed in the maximal clause of the verb meshamshim, it
is clear what fills that role (hem, also indicated by the
3rd-person masculine plural ending -im on the verb); the object
noun phrase sxirim zolim expresses the Purpose. Table 1, below,
provides the definitions for the evoked frames and their respective
instantiated frame elements.
Table 1: Evoked Frames and Instantiated Frame Elements

Frame Element annotation for each of the three predicates is given in (1).

(1) [esrot anashim Theme] magi'im [mi-tailand Source] [le-israel Goal]
    tens (of) people reach from-thailand to-israel
    [kshe-hem Registrant] nirshamim [ke-mitnadvim Category]
    as/when-they register as-volunteers
    ax le-ma'ase meshamshim [ovdim sxirim zolim Purpose]
    but in-fact they function workers hired cheap
    Tens of people arrive in

Such Frame
Semantic analyses are useful for research in crosslinguistic lexicology
(Subirats and Petruck 2003) and in the advanced foreign language
classroom (Sato 2004). For instance, whereas the Hebrew verb meshamesh
expresses the Purpose role as a direct object noun phrase (sxirim
zolim), English serve expresses it as a prepositional phrase
complement (as cheap labor). The availability of such
information via the internet will facilitate studies in Hebrew
linguistics as well as Hebrew language teaching/learning.

An initial
goal of HFN is to produce full annotation for frame-evoking elements
in the newspaper corpus. This serves as a means of (1)
creating the infrastructure for using the FrameNet DeskTop for the
analysis of Hebrew texts and (2) determining the level of linguistic
description and computational representation at which the lexicon of
Modern Hebrew can be characterized in terms of existing frame semantic
concepts. Adapting the FrameNet DeskTop (FNDT; a suite of tools used
for defining frames, FEs, and words, and annotating illustrative
example sentences) for HFN will demonstrate the feasibility of using
the software for a non-Indo-European language. Investigating
the linguistic expression of events and scenarios through the same or
different frames will also document the different lexicalization
patterns of Hebrew and English (Talmy 2000).

As with
FrameNets for other languages (e.g. Spanish), the HFN
database will function as both a dictionary and a thesaurus. The
dictionary-like features include definitions, tables summarizing the
patterns of syntactic realizations of FEs that occur with a word, and
sets of annotated sentences from the corpus showing the semantic
information associated with each syntactic pattern. As in a thesaurus,
words are linked to the semantic frames in which they participate, and
frames are linked to other collections of words as well as to related
frames. Once sufficient coverage is attained, HFN data will serve the
needs of research in NLP for Hebrew, contributing deep semantic
information for a variety of tasks, including word sense
disambiguation, machine translation, information extraction, and
question answering (Litkowski 2004).
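
The dictionary-like and thesaurus-like views described above can both be derived from the same annotation records. The following Python sketch illustrates the idea under stated assumptions: the flat tuple layout and the function names `valence_table` and `frame_index` are hypothetical, not part of FrameNet or HFN. The rows are limited to the roles and phrase types the text assigns in its analysis of the example sentence.

```python
from collections import defaultdict

# (lemma, frame, frame element, phrase type), following the analysis of the
# example sentence; the tuple layout itself is an assumption of this sketch.
ANNOTATIONS = [
    ("magi'a", "Arriving", "Theme", "NP"),
    ("magi'a", "Arriving", "Source", "PP"),
    ("magi'a", "Arriving", "Goal", "PP"),
    ("nirsham", "Registration", "Registrant", "NP"),
    ("nirsham", "Registration", "Category", "NP"),
    ("meshamesh", "Function_as", "Purpose", "NP"),
]


def valence_table(annotations):
    """Dictionary-like view: for each lemma, the phrase types that realize
    each frame element in the annotated data."""
    table = defaultdict(lambda: defaultdict(set))
    for lemma, _frame, element, phrase_type in annotations:
        table[lemma][element].add(phrase_type)
    return table


def frame_index(annotations):
    """Thesaurus-like view: for each frame, the lemmas that evoke it."""
    index = defaultdict(set)
    for lemma, frame, _element, _phrase_type in annotations:
        index[frame].add(lemma)
    return index


if __name__ == "__main__":
    for lemma, realizations in valence_table(ANNOTATIONS).items():
        for element, phrase_types in realizations.items():
            print(lemma, element, sorted(phrase_types))
    for frame, lemmas in frame_index(ANNOTATIONS).items():
        print(frame, sorted(lemmas))
```

With only one annotated sentence, each frame lists a single lemma and each frame element a single phrase type. The sets are there so that further annotated sentences can add alternative realizations and additional words per frame without changing the structure.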
1. http://mila.cs.technion.ac.il/website/english/resources/corpora/2000sentences/index.html.

Fillmore, C.J. 1977. Topics in Lexical Semantics. In R. Cole (ed.) Current Issues in Linguistic Theory. 76-138.
Fillmore, C.J. 1985. Frames and the Semantics of Understanding. Quaderni di Semantica 6.2: 222-254.
Fontenelle, T. (ed.) 2003. International Journal of Lexicography 16.3, September 2003. (Special issue devoted to FrameNet).
Litkowski, K.C. 2004. Senseval-3 Task: Automatic Labeling of Semantic Roles. In Proceedings of Senseval-3: The Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, ACL:
Petruck, M.R.L. 1996. Frame Semantics. In J. Verschueren, J-O. Östman, J. Blommaert, and C. Bulcaen (eds.). Handbook of Pragmatics.
Sato, H. 2004. FrameNets and Language Teaching. Presentation at Crosslingual FrameNet Group Meeting. October, 2004, ICSI,
Subirats-Rüggeberg, C. and M.R.L. Petruck. 2003. Surprise: Spanish FrameNet! Presentation at Workshop on Frame Semantics, International Congress of Linguists. July, 2003,
Talmy, L. 2000. Lexicalization Patterns. In L. Talmy. Toward a Cognitive Semantics. Vol. 2: Typology and Process in Concept Structuring.
K Dictionaries Ltd