UNIT V APPLICATIONS
Communication
Natural Language Processing (NLP) is the computer analysis of input provided in a human (natural) language and the conversion of that input into a useful form of representation. The field of NLP is primarily concerned with getting computers to perform useful and interesting tasks with human languages. It is secondarily concerned with helping us come to a better understanding of human language.
• The input/output of an NLP system can be:
– written text
– speech
The following kinds of language-related knowledge are useful in NLP:
• Phonology – concerns how words are related to the sounds that realize them.
• Morphology – concerns how words are constructed from more basic meaning units called morphemes. A morpheme is the primitive unit of meaning in a language.
• Syntax – concerns how words can be put together to form correct sentences, and determines what structural role each word plays in the sentence and which phrases are subparts of other phrases.
• Semantics – concerns what words mean and how these meanings combine in sentences to form sentence meaning. This is the study of context-independent meaning.
• Pragmatics – concerns how sentences are used in different situations and how use affects the interpretation of the sentence.
• Discourse – concerns how the immediately preceding sentences affect the interpretation of the next sentence. For example, interpreting pronouns and interpreting the temporal aspects of the information.
• World Knowledge – includes general knowledge about the world, as well as what each language user must know about the other’s beliefs and goals.
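To make the morphology level above more concrete, here is a minimal sketch of splitting a word into a stem and a suffix morpheme. The suffix table, its labels, the three-letter minimum stem length, and the function name `split_morphemes` are all assumptions made for illustration; a real morphological analyzer would rely on a lexicon and language-specific rules.

```python
# Hypothetical suffix table for illustration only.
SUFFIXES = {"ing": "PROGRESSIVE", "ed": "PAST", "s": "PLURAL_OR_3SG"}


def split_morphemes(word):
    """Return (stem, suffix_label), or (word, None) if no suffix applies."""
    # Try longer suffixes first so "ing" is checked before "s".
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        stem_length = len(word) - len(suffix)
        # Assumed heuristic: keep stems of at least three letters,
        # so short words such as "red" or "is" are left whole.
        if word.endswith(suffix) and stem_length >= 3:
            return word[:stem_length], SUFFIXES[suffix]
    return word, None


print(split_morphemes("walked"))  # ('walk', 'PAST')
print(split_morphemes("books"))   # ('book', 'PLURAL_OR_3SG')
print(split_morphemes("red"))     # ('red', None)
```

Even this toy version shows why morphology needs more than string matching: without the stem-length check, "red" would be wrongly analysed as "r" plus a past-tense suffix.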
Communication as action
• We will mostly be concerned with written text (not speech).
• To process written text, we need:
– lexical, syntactic, and semantic knowledge about the language
– discourse information and real-world knowledge
• To process spoken language, we need everything required to process written text, and we must also deal with the challenges of speech recognition and speech synthesis.
There are two components of NLP:
• Natural Language Understanding
– Mapping the given input in the natural language into a useful representation.
– Different levels of analysis are required: morphological analysis, syntactic analysis, semantic analysis, and discourse analysis.
• Natural Language Generation
– Producing output in the natural language from some internal representation.
– Different levels of synthesis are required: deep planning (what to say) and syntactic generation.
• NL Understanding is much harder than NL Generation, but both are still hard.
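The two components can be illustrated with a deliberately tiny sketch: understanding maps a sentence to an internal representation, and generation maps that representation back to text. The three-word "SUBJECT VERB OBJECT" grammar, the frame keys (`agent`, `action`, `theme`), and the function names are assumptions chosen for illustration, not a standard representation.

```python
def understand(sentence):
    """NLU: map a 'SUBJECT VERB OBJECT' sentence to a small frame (dict)."""
    words = sentence.rstrip(".").split()
    if len(words) != 3:
        # The toy grammar covers only three-word sentences.
        raise ValueError("toy grammar only handles three-word sentences")
    subject, verb, obj = words
    return {"agent": subject, "action": verb, "theme": obj}


def generate(frame):
    """NLG: produce a sentence from the internal frame."""
    return f"{frame['agent']} {frame['action']} {frame['theme']}."


frame = understand("Mary likes John.")
print(frame)            # {'agent': 'Mary', 'action': 'likes', 'theme': 'John'}
print(generate(frame))  # Mary likes John.
```

The sketch also hints at why understanding is the harder direction: `generate` always succeeds for a well-formed frame, while `understand` must reject or guess at any input that falls outside its grammar.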
The difficulty in NL understanding arises from the following facts:
• Natural language is extremely rich in form and structure, and very ambiguous. This raises two questions:
– How should meaning be represented?
– Which structures map to which meaning structures?
• One input can mean many different things. Ambiguity can arise at different levels:
– Lexical (word-level) ambiguity -- different meanings of words
– Syntactic ambiguity -- different ways to parse the sentence
– Interpreting partial information -- how to interpret pronouns
– Contextual information -- the context of a sentence may affect the meaning of that sentence
• Many inputs can mean the same thing.
• The interaction among components of the input is not clear.
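Lexical ambiguity, from the list above, compounds quickly: each ambiguous word multiplies the number of candidate readings of a sentence. The sketch below enumerates those readings for a toy example. The sense inventory `SENSES` and the function name `readings` are hypothetical; a real system would draw senses from a lexical resource and use context to rank them.

```python
from itertools import product

# Hypothetical sense inventory for illustration only.
SENSES = {
    "bank": ["financial institution", "river edge"],
    "bat": ["flying mammal", "sports club"],
}


def readings(sentence):
    """Enumerate every combination of word senses for the sentence."""
    words = sentence.lower().split()
    # Words missing from the inventory are treated as unambiguous.
    options = [SENSES.get(word, [word]) for word in words]
    return list(product(*options))


all_readings = readings("the bat near the bank")
print(len(all_readings))  # 4, i.e. 2 senses of "bat" x 2 senses of "bank"
for reading in all_readings:
    print(reading)
```

With only two ambiguous words there are already four readings; choosing among them is where syntactic, discourse, and world knowledge come in.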