Preface
“I am putting myself to the fullest possible use, which is all I think that any conscious entity can ever hope to do.”
As spoken by HAL 9000,
2001: A Space Odyssey (Kubrick, 1968)[1]
Creating computers that are as fluent in human language as people has long been a goal for scientists and the general public. Human language communication both represents and challenges an intelligence, because while languages appear to follow some unseen rules of spelling and grammar, we have never been certain about what they are. Even when one expert seems to have proof of a “universal” theory, another expert may be just as certain to have found an exception to it[2]. And yet, spoken or written language is still seen as the ideal for communicating with both people and complex devices. Systems that understand or use language, which we call “Natural Language Processing” (NLP) systems, have been created by specifying algorithms for computers based on the observable regularities of language noted by experts. There is even enough statistical regularity to language that, with enough examples of the right type, one can create highly accurate systems using methods that “learn” or “program themselves”. The time has come when it may not always be clear whether the entity we are communicating with is another person or a “bot”.
Read this book to learn the principles and methods of NLP to understand what it is, where it is useful, how to use it, and how it might be used to help people. The book includes the core topics of modern NLP, including an overview of the syntax and semantics of English, benchmark tasks for computational language modelling, and higher-level tasks and applications that analyze or generate language, using both rule-based search and machine learning approaches. It takes the perspective of a computer scientist. The primary themes are abstraction, data, algorithms, applications and impacts. It also includes some history and trends that are important for understanding why things have been done in a certain way.
The book presumes basic proficiency in programming and discusses topics at that level. To limit its overall length, it does not attempt to teach the underlying mathematics. The book does not focus or depend on any particular programming language or software library, as there are now many options for a variety of languages, including Python, Java, and C++. Examples from the most widely used tools are provided.
This book will be appropriate for anyone who understands computing with data structures and wishes to get an overview of the field of natural language processing and recently developed methods. While background in artificial intelligence, linear algebra, linguistics, probability, or statistics would be helpful, this background is not essential for using this book. To use the software that is discussed, readers should be comfortable programming with arrays, tables, trees, graphs, and graph search algorithms (e.g., breadth-first and depth-first), and with installing open-source software tools or libraries, such as Anaconda, the Natural Language Toolkit (NLTK) or spaCy. Readers with less programming background can often use online demonstration systems to learn about the capabilities of the different components of modern NLP systems. The book includes URLs to many of these systems that were active at the time the book was written, but the reader should be aware that some may disappear over time.