[PDF] Building And Using Comparable Corpora - eBooks Review

Building And Using Comparable Corpora


Author : Serge Sharoff
language : en
Publisher: Springer Science & Business Media
Release Date : 2013-12-13

This book was written by Serge Sharoff and published by Springer Science & Business Media on 2013-12-13 in the Computers category.


The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining) this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field. The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.



Building And Using Comparable Corpora For Multilingual Natural Language Processing


Author : Serge Sharoff
language : en
Publisher: Springer Nature
Release Date : 2023-08-23

This book was written by Serge Sharoff and published by Springer Nature on 2023-08-23 in the Computers category.


This book provides a comprehensive overview of methods for building comparable corpora and of their applications, including machine translation, cross-lingual transfer, and various kinds of multilingual natural language processing. The authors begin with a brief history of the topic, followed by a comparison with parallel resources and an explanation of why comparable corpora have become more widely used. In particular, comparable corpora provide the basis for the multilingual capabilities of pre-trained models such as BERT or GPT. The book then focuses on building comparable corpora, aligning their sentences to create a database of suitable translations, and using these sentence translations to produce dictionaries and term banks. Finally, it explains how comparable corpora can be used to build machine translation engines and to develop a wide variety of multilingual applications.



Building And Using Comparable Corpora For Multilingual Natural Language Processing


Author : Serge Sharoff
language : en
Publisher: Springer
Release Date : 2024-08-24

This book was written by Serge Sharoff and published by Springer on 2024-08-24 in the Computers category.





Using Comparable Corpora For Under-Resourced Areas Of Machine Translation


Author : Inguna Skadiņa
language : en
Publisher:
Release Date : 2019

This book was written by Inguna Skadiņa and released in 2019 in the Corpora (Linguistics) category.


This book provides an overview of how comparable corpora can be used to overcome the lack of parallel resources when building machine translation systems for under-resourced languages and domains. It presents a wealth of methods and open tools for building comparable corpora from the Web, evaluating comparability and extracting parallel data that can be used for the machine translation task. It is divided into several sections, each covering a specific task such as building, processing, and using comparable corpora, focusing particularly on under-resourced language pairs and domains. The book is intended for anyone interested in data-driven machine translation for under-resourced languages and domains, especially for developers of machine translation systems, computational linguists and language workers. It offers a valuable resource for specialists and students in natural language processing, machine translation, corpus linguistics and computer-assisted translation, and promotes the broader use of comparable corpora in natural language processing and computational linguistics.



BUCC 2009


Author :
language : en
Publisher:
Release Date : 2009

This volume was released in 2009 in the Computational linguistics category.




43rd Annual Meeting Of The Association For Computational Linguistics


Author :
language : en
Publisher:
Release Date : 2005

This volume was released in 2005 in the Computational linguistics category.




Using Comparable Corpora For Under-Resourced Areas Of Machine Translation


Author : Inguna Skadiņa
language : en
Publisher: Springer
Release Date : 2019-02-06

This book was written by Inguna Skadiņa and published by Springer on 2019-02-06 in the Computers category.





Proceedings Of The LREC 2020 13th Workshop On Building And Using Comparable Corpora


Author : Workshop on Building and Using Comparable Corpora
language : en
Publisher:
Release Date : 2020

This volume is credited to the Workshop on Building and Using Comparable Corpora and was released in 2020.




Translation-Driven Corpora


Author : Federico Zanettin
language : en
Publisher: Routledge
Release Date : 2014-04-08

This book was written by Federico Zanettin and published by Routledge on 2014-04-08 in the Language Arts & Disciplines category.


Electronic texts and text analysis tools have opened up a wealth of opportunities to higher education and language service providers, but learning to use these resources continues to pose challenges to scholars and professionals alike. Translation-Driven Corpora aims to introduce readers to corpus tools and methods which may be used in translation research and practice. Each chapter focuses on specific aspects of corpus creation and use. An introduction to corpora and an overview of applications of corpus linguistics methodologies to translation studies are followed by a discussion of corpus design and acquisition. Different stages and tools involved in corpus compilation and use are outlined, from corpus encoding and annotation to indexing and data retrieval, and the various methods and techniques that allow end users to make sense of corpus data are described. The volume also offers detailed guidelines for the construction and analysis of multilingual corpora. Corpus creation and use are illustrated through practical examples and case studies, with each chapter outlining a set of tasks aimed at guiding researchers, students and translators to practice some of the methods and use some of the resources discussed. These tasks are meant as hands-on activities to be carried out using the materials and links available on an accompanying DVD. Suggested further readings at the end of each chapter are complemented by an extensive bibliography at the end of the volume. Translation-Driven Corpora is designed for use by teachers and students in the classroom or by researchers and professionals for self-learning. It is an invaluable resource for anyone interested in this fast-growing area of scholarly and professional activity.



Web As Corpus


Author : Maristella Gatto
language : en
Publisher: A&C Black
Release Date : 2014-02-13

This book was written by Maristella Gatto and published by A&C Black on 2014-02-13 in the Language Arts & Disciplines category.


Is the internet a suitable linguistic corpus? How can we use it in corpus techniques? What are the special properties that we need to be aware of? This book answers those questions. The Web is an exponentially increasing source of language and corpus linguistics data. From gigantic static information resources to user-generated Web 2.0 content, the breadth and depth of information available is breathtaking – and bewildering. This book explores the theory and practice of the “web as corpus”. It looks at the most common tools and methods used and features a plethora of examples based on the author's own teaching experience. This book also bridges the gap between studies in computational linguistics, which emphasize technical aspects, and studies in corpus linguistics, which focus on the implications for language theory and use.