Using Large Corpora


Author : Susan Armstrong
language : en
Publisher: MIT Press
Release Date : 1994

Category : Business & Economics


Today, large corpora consisting of hundreds of millions or even billions of words, along with new empirical and statistical methods for organizing and analyzing these data, promise new insights into the use of language. Already, the data extracted from these large corpora reveal that language use is more flexible and complex than most rule-based systems have tried to account for, providing a basis for progress in the performance of Natural Language Processing systems. Using Large Corpora identifies these new data-oriented methods and describes the potential results that the use of large corpora offers. The research described shows that the new methods may offer solutions to key issues of acquisition (automatically identifying and coding information), coverage (accounting for all of the phenomena in a given domain), robustness (accommodating real data that may be corrupt or not accounted for in the model), and extensibility (applying the model and data to a new domain, text, or problem). There are chapters on lexical issues, issues in syntax, and translation topics, as well as discussions of the statistics-based vs. rule-based debate. ACL-MIT Series in Natural Language Processing.



Natural Language Processing Using Very Large Corpora


Author : S. Armstrong
language : en
Publisher: Springer Science & Business Media
Release Date : 2013-04-17

Category : Computers


This book is intended for researchers who want to keep abreast of current developments in corpus-based natural language processing. It is not meant as an introduction to this field; for readers who need one, several entry-level texts are available, including those of Church and Mercer (1993), Charniak (1993), and Jelinek (1997). This book captures the essence of a series of highly successful workshops held in the last few years. The response in 1993 to the initial Workshop on Very Large Corpora (Columbus, Ohio) was so enthusiastic that we were encouraged to make it an annual event. The following year, we staged the Second Workshop on Very Large Corpora in Kyoto. As a way of managing these annual workshops, we then decided to register a special interest group called SIGDAT with the Association for Computational Linguistics. The demand for international forums on corpus-based NLP has been expanding so rapidly that in 1995 SIGDAT was led to organize not only the Third Workshop on Very Large Corpora (Cambridge, Mass.) but also a complementary workshop entitled From Texts to Tags (Dublin). Obviously, the success of these workshops was in some measure a reflection of the growing popularity of corpus-based methods in the NLP community. But first and foremost, it was due to the fact that the workshops attracted so many high-quality papers.



Corpus Linguistics


Author : Tony McEnery
language : en
Publisher: Cambridge University Press
Release Date : 2011-10-06

Category : Language Arts & Disciplines


Corpus linguistics is the study of language data on a large scale - the computer-aided analysis of very extensive collections of transcribed utterances or written texts. This textbook outlines the basic methods of corpus linguistics, explains how the discipline of corpus linguistics developed and surveys the major approaches to the use of corpus data. It uses a broad range of examples to show how corpus data has led to methodological and theoretical innovation in linguistics in general. Clear and detailed explanations lay out the key issues of method and theory in contemporary corpus linguistics. A structured and coherent narrative links the historical development of the field to current topics in 'mainstream' linguistics. Practical tasks and questions for discussion at the end of each chapter encourage students to test their understanding of what they have read and an extensive glossary provides easy access to definitions of technical terms used in the text.



Corpora in Translation


Author : Tengku Sepora Tengku Mahadi
language : en
Publisher: Peter Lang
Release Date : 2010

Category : Education


Corpora are among the hottest issues in translation studies affecting both pure and applied realms of the discipline. As for pure translation studies, corpora have done their part through contributions to the studies on translational language and translation universals. Yet, their recent contribution is within the borders of applied translation studies, i.e. translator training and translation aids. The former is the major focus of the present book. The present book in fact aims at providing readers with comprehensive information about corpora in translation studies in general, and corpora in translator education in particular. It further offers researchers and practitioners a comprehensive and up-to-date survey of studies done on corpora in translator education and provides a rich source of information on pros and cons of using different types of corpora as translation aids in the context of translation classrooms.



Translation-Driven Corpora


Author : Federico Zanettin
language : en
Publisher: Routledge
Release Date : 2014-04-08

Category : Language Arts & Disciplines


Electronic texts and text analysis tools have opened up a wealth of opportunities to higher education and language service providers, but learning to use these resources continues to pose challenges to scholars and professionals alike. Translation-Driven Corpora aims to introduce readers to corpus tools and methods which may be used in translation research and practice. Each chapter focuses on specific aspects of corpus creation and use. An introduction to corpora and overview of applications of corpus linguistics methodologies to translation studies is followed by a discussion of corpus design and acquisition. Different stages and tools involved in corpus compilation and use are outlined, from corpus encoding and annotation to indexing and data retrieval, and the various methods and techniques that allow end users to make sense of corpus data are described. The volume also offers detailed guidelines for the construction and analysis of multilingual corpora. Corpus creation and use are illustrated through practical examples and case studies, with each chapter outlining a set of tasks aimed at guiding researchers, students and translators to practice some of the methods and use some of the resources discussed. These tasks are meant as hands-on activities to be carried out using the materials and links available in an accompanying DVD. Suggested further readings at the end of each chapter are complemented by an extensive bibliography at the end of the volume. Translation-Driven Corpora is designed for use by teachers and students in the classroom or by researchers and professionals for self-learning. It is an invaluable resource for anyone interested in this fast growing area of scholarly and professional activity.



Corpus-Based Studies in Language Use, Language Learning, and Language Documentation


Author : John Newman
language : en
Publisher: Rodopi
Release Date : 2011

Category : Computers


This volume consists of selected papers from the 2009 meeting of the American Association for Corpus Linguistics. The chapters cover aspects of language use (usage-based accounts of morphology/syntax of English and Tok Pisin), language learning (corpus-based learning of English, syntactic development observable in a Learner Corpus of English, “core” vocabulary items for learners of English) and language documentation (a new and innovative usage-based frequency dictionary of English, proposals to broaden the traditional understanding of a corpus in various directions, e.g., constructing a corpus of the content of Japanese manga comics). Taken together, the thirteen chapters represent a good cross-section of strands of new work in corpus linguistics, as practised by international scholars working on English and other languages.



A Practical Guide to Lexicography


Author : P. G. J. van Sterkenburg
language : en
Publisher: John Benjamins Publishing
Release Date : 2003-01-01

Category : Language Arts & Disciplines


This is a state-of-the-art Guide to the fascinating world of the lexicon and its description in various types of dictionaries. A team of experts brings together a solid Introduction to Lexicography and leads you step by step through the decision-making processes involved in compiling and designing dictionaries for general and specific purposes. The domains of lexicography are outlined and its specific terminology is explained in the Glossary. Each chapter provides ample suggestions for further reading. Naturally, electronic dictionaries, corpus analysis, and database management are central themes throughout the book. The book also introduces questions about the many types of definition, meaning, sense relations, and stylistics. And that is not all: those afraid to embark on a dictionary adventure will find out all about the pitfalls in the chapters on Design. A Practical Guide to Lexicography introduces and seduces you to learn about the achievements, unexpected possibilities, and challenges of modern-day lexicography.



Small Corpus Studies and ELT


Author : Mohsen Ghadessy
language : en
Publisher: John Benjamins Publishing
Release Date : 2001

Category : Language Arts & Disciplines


Recent developments in the field of small corpus studies, largely brought about by the personal computer, have yielded remarkable insights into the nature and use of real language. This book presents work by a number of leading researchers in the field and covers a series of topics directly related to language teaching and language research. The ultimate aim of this book is to encourage the exploitation of small corpora by the ELT profession to make language learning more effective. In addition to descriptions of the basic corpus analysis tools, chapters in the collection cover syllabus and materials design, comparisons of different genres, descriptions of local and functional grammars, compilation and use of learner corpora, and making cross-linguistic comparisons. The message of this collection is that language use is purposeful and culture specific and that small corpus analysis is an effective method of linguistic investigation. Preface by John Sinclair.



Corpora: Pragmatics and Discourse


Author :
language : en
Publisher: BRILL
Release Date : 2015-06-29

Category : Language Arts & Disciplines


This volume presents current state-of-the-art discussions in corpus-based linguistic research of the English language. The papers deal with Present-day English, worldwide varieties of English and the history of the English language. A special focus of the volume is on studies in the broad field of corpus pragmatics and corpus-based discourse analysis. The volume includes corpus-based studies of speech acts, conversational routines, referential expressions and thought styles, as well as studies on the lexis, grammar and semantics of English. It also includes several studies on technical aspects of corpus compilation, fieldwork and parsing.



Scalable and Efficient Probabilistic Topic Model Inference for Textual Data


Author : Måns Magnusson
language : en
Publisher: Linköping University Electronic Press
Release Date : 2018-04-27



Probabilistic topic models have proven to be an extremely versatile class of mixed-membership models for discovering the thematic structure of text collections. There are many possible applications, covering a broad range of areas of study: technology, natural science, social science and the humanities. In this thesis, a new efficient parallel Markov Chain Monte Carlo inference algorithm is proposed for Bayesian inference in large topic models. The proposed methods scale well with the corpus size and can be used for other probabilistic topic models and other natural language processing applications. The proposed methods are fast, efficient, scalable, and will converge to the true posterior distribution. In addition, in this thesis a supervised topic model for high-dimensional text classification is also proposed, with emphasis on interpretable document prediction using the horseshoe shrinkage prior in supervised topic models. Finally, we develop a model and inference algorithm that can model agenda and framing of political speeches over time with a priori defined topics. We apply the approach to analyze the evolution of immigration discourse in the Swedish parliament by combining theory from political science and communication science with a probabilistic topic model.