Inductive Learning
• Morphological analysis: breaking words into their linguistic components (morphemes).
• Morphemes are the smallest meaningful units of language.
• Ambiguity: a word may have more than one analysis.
– flies: fly+VERB+3SG (third-person singular verb) or fly+NOUN+PLU (plural noun)
– adam: adam+ACC - the man (accusative); adam+P1SG - my man; ada+P1SG+ACC - my island (accusative)
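The ambiguity described above can be illustrated with a minimal sketch of a morphological analyzer. The stem lexicon, the suffix rules, and the tag names (3SG for the verbal reading of "flies") are assumptions made for this example, not a real analyzer; it only strips one suffix and undoes the y -> ie spelling change.

```python
# Toy stem lexicon: stem -> categories it can belong to (illustrative only).
STEMS = {"fly": {"NOUN", "VERB"}, "duck": {"NOUN", "VERB"}}

# Toy suffix rules: (surface suffix, category it attaches to, tag it adds).
SUFFIXES = [("s", "NOUN", "PLU"), ("s", "VERB", "3SG"), ("ing", "VERB", "PROG")]


def candidate_stems(base):
    """Yield possible stems for a suffix-stripped base, undoing y -> ie."""
    yield base
    if base.endswith("ie"):
        yield base[:-2] + "y"


def analyze(word):
    """Return every analysis the toy lexicon and rules license for `word`."""
    analyses = []
    # A bare stem is analyzed as itself, once per category.
    for cat in sorted(STEMS.get(word, ())):
        analyses.append(f"{word}+{cat}")
    # Otherwise try stripping each suffix and looking up the remaining stem.
    for suffix, cat, tag in SUFFIXES:
        if not word.endswith(suffix):
            continue
        base = word[: -len(suffix)]
        for stem in candidate_stems(base):
            if cat in STEMS.get(stem, ()):
                analyses.append(f"{stem}+{cat}+{tag}")
    return analyses


print(analyze("flies"))  # two analyses: one nominal, one verbal
```

The analyzer returns all licensed analyses and does not choose among them; that choice is left to later stages.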
Parts-of-Speech (POS) Tagging
• Each word has a part-of-speech tag that describes its category.
• The part-of-speech tag of a word is one of the major word groups (or one of their subgroups).
– open classes -- nouns, verbs, adjectives, adverbs
– closed classes -- prepositions, determiners, conjunctions, pronouns, particles
• POS taggers try to find the POS tags for the words in a sentence.
• Is duck a verb or a noun? (A morphological analyzer cannot make that decision.)
• A POS tagger may make that decision by looking at the surrounding words.
– Duck! (verb)
– Duck is delicious for dinner. (noun)
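As a rough illustration of deciding by context, here is a minimal rule-based sketch. The mini-lexicon, the tag names, and the two hand-written rules are assumptions chosen to cover only the duck examples; real taggers typically learn such decisions from annotated data.

```python
import re

# Hypothetical mini-lexicon: word -> possible tags.
LEXICON = {
    "duck": {"NOUN", "VERB"},
    "is": {"VERB"},
    "delicious": {"ADJ"},
    "for": {"PREP"},
    "dinner": {"NOUN"},
}
PUNCT = {".", "!", "?"}


def tokenize(sentence):
    """Lowercase the sentence and split it into words and punctuation."""
    return re.findall(r"[a-z]+|[.!?]", sentence.lower())


def tag(sentence):
    """Tag each token, using the next token to resolve ambiguous words."""
    tokens = tokenize(sentence)
    tagged = []
    for i, tok in enumerate(tokens):
        if tok in PUNCT:
            tagged.append((tok, "PUNCT"))
            continue
        candidates = LEXICON.get(tok, {"UNK"})
        if len(candidates) == 1:
            choice = next(iter(candidates))
        else:
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if nxt is not None and LEXICON.get(nxt) == {"VERB"}:
                choice = "NOUN"  # a word right before an unambiguous verb
            elif i == 0 and nxt in (None, "!"):
                choice = "VERB"  # a one-word command
            else:
                choice = sorted(candidates)[0]  # arbitrary but deterministic
        tagged.append((tok, choice))
    return tagged


print(tag("Duck!"))
print(tag("Duck is delicious for dinner."))
```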
Lexical Processing
• The purpose of lexical processing is to determine the meanings of individual words.
• The basic method is to look each word up in a database of meanings, the lexicon.
• We should also identify non-words such as punctuation marks.
• Word-level ambiguity -- words may have several meanings, and the correct one cannot be chosen based solely on the word itself.
– bank in English
• Solution -- resolve the ambiguity on the spot by POS tagging (if possible), or pass the ambiguity on to the other levels.
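A minimal sketch of this lookup step follows. The lexicon entries, the sense glosses, and the `lookup` helper are hypothetical and exist only for illustration. When a POS tag is available it filters the senses; whatever ambiguity remains is returned to the caller rather than resolved here.

```python
# Hypothetical sense lexicon: word -> list of (part of speech, sense) entries.
SENSE_LEXICON = {
    "bank": [
        {"pos": "NOUN", "sense": "financial institution"},
        {"pos": "NOUN", "sense": "side of a river"},
        {"pos": "VERB", "sense": "tilt while turning"},
    ],
    "robbed": [{"pos": "VERB", "sense": "stole from"}],
}
PUNCTUATION = set(".,;:!?")


def lookup(token, pos=None):
    """Return the senses of `token`, filtered by `pos` when one is given."""
    if token in PUNCTUATION:
        return {"token": token, "kind": "non-word", "senses": []}
    entries = SENSE_LEXICON.get(token.lower(), [])
    if pos is not None:
        entries = [e for e in entries if e["pos"] == pos]
    kind = "word" if entries else "unknown"
    return {"token": token, "kind": kind, "senses": [e["sense"] for e in entries]}


print(lookup("bank"))          # all senses: fully ambiguous
print(lookup("bank", "VERB"))  # the POS tag resolves the ambiguity
print(lookup("bank", "NOUN"))  # two senses remain and are passed on
```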
Syntactic Processing
• Parsing -- converting a flat input sentence into a hierarchical structure that corresponds to the units of meaning in the sentence.
• There are different parsing formalisms and algorithms.
• Most formalisms have two main components:
– grammar -- a declarative representation describing the syntactic structure of sentences in the language.
– parser -- an algorithm that analyzes the input and outputs its structural representation (its parse) consistent with the grammar specification.
• Context-free grammars (CFGs) are at the center of many parsing mechanisms, but they are complemented by additional features that make the formalism more suitable for handling natural languages.
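To make the grammar/parser split concrete, here is a minimal sketch that pairs a tiny context-free grammar with a CKY-style chart parser. The grammar rules, category names, and lexicon are assumptions for this example; CKY is just one of many parsing algorithms, and it requires the grammar to be in Chomsky normal form (binary rules plus lexical rules).

```python
from collections import defaultdict

# Grammar (declarative): binary rules keyed by their right-hand side.
BINARY = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("Det", "N"): ["NP"],
}
# Lexical rules: word -> categories.
LEXICAL = {"i": ["NP"], "robbed": ["V"], "the": ["Det"], "bank": ["N"]}


def parse(words):
    """Return one parse tree rooted in S as nested tuples, or None."""
    n = len(words)
    # chart[(i, j)] maps a category to one tree spanning words[i:j].
    chart = defaultdict(dict)
    for i, word in enumerate(words):
        for cat in LEXICAL.get(word, []):
            chart[(i, i + 1)][cat] = (cat, word)
    # Build longer spans from pairs of adjacent shorter spans.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for lcat, ltree in chart[(i, k)].items():
                    for rcat, rtree in chart[(k, j)].items():
                        for parent in BINARY.get((lcat, rcat), []):
                            chart[(i, j)].setdefault(parent, (parent, ltree, rtree))
    return chart[(0, n)].get("S")


print(parse(["i", "robbed", "the", "bank"]))
```

The grammar lives entirely in the two tables, so it can be changed without touching the parsing algorithm.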
Semantic Analysis
• Assigning meanings to the structures created by syntactic analysis.
• Mapping words and structures to particular domain objects in a way consistent with our knowledge of the world.
• Semantics can play an important role in selecting among competing syntactic analyses and discarding illogical analyses.
– I robbed the bank -- is bank a river bank or a financial institution?
• We have to decide which formalisms will be used in the meaning representation.
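One way to discard an illogical reading is to check each candidate sense against what the verb can plausibly take as its object (often called selectional restrictions). The sketch below is a toy illustration of that idea only: the sense inventory, the semantic type labels, and the restriction table are all assumed for the example.

```python
# Hypothetical sense inventory: noun -> senses with a coarse semantic type.
SENSES = {
    "bank": [
        {"gloss": "financial institution", "type": "organization"},
        {"gloss": "river bank", "type": "landform"},
    ],
}

# Hypothetical restrictions: verb -> semantic types allowed for its object.
OBJECT_RESTRICTIONS = {
    "rob": {"organization", "person"},
    "erode": {"landform"},
}


def plausible_senses(verb, noun):
    """Return the glosses of `noun` that fit as the object of `verb`."""
    senses = SENSES.get(noun, [])
    allowed = OBJECT_RESTRICTIONS.get(verb)
    if allowed is None:
        # Nothing is known about the verb, so keep every sense.
        return [s["gloss"] for s in senses]
    return [s["gloss"] for s in senses if s["type"] in allowed]


print(plausible_senses("rob", "bank"))
print(plausible_senses("erode", "bank"))
```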
Knowledge Representation for NLP
• Which knowledge representation will be used depends on the application -- Machine Translation, Database Query System.
• Requires the choice of a representational framework, as well as the specific meaning vocabulary (what the concepts are and the relationships between these concepts -- the ontology).
• Must be computationally effective.
• Common representational formalisms:
– first-order predicate logic
– conceptual dependency graphs
– semantic networks
– frame-based representations
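As a small illustration of one of these formalisms, a semantic network can be sketched as a set of (concept, relation, value) triples, with properties inherited along is_a links. The concepts, relation names, and helper functions below are assumptions made for this example rather than a standard ontology.

```python
# Hypothetical semantic network stored as (subject, relation, object) triples.
TRIPLES = {
    ("bank", "is_a", "financial_institution"),
    ("financial_institution", "is_a", "organization"),
    ("financial_institution", "holds", "money"),
    ("organization", "has", "employees"),
}


def with_ancestors(concept):
    """Return the concept plus everything reachable from it via is_a links."""
    seen = set()
    frontier = [concept]
    while frontier:
        node = frontier.pop()
        if node in seen:
            continue
        seen.add(node)
        frontier.extend(o for s, r, o in TRIPLES if s == node and r == "is_a")
    return seen


def holds(concept, relation, value):
    """True if the fact is stated for the concept or inherited via is_a."""
    return any((node, relation, value) in TRIPLES for node in with_ancestors(concept))


print(holds("bank", "has", "employees"))        # inherited from organization
print(holds("organization", "holds", "money"))  # not inherited downward
```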