Create your own natural language training corpus for machine learning. This example-driven book walks you through the annotation cycle, from selecting an annotation task and creating the annotation specification to designing the guidelines, creating a "gold standard" corpus, and then beginning the actual data creation with the annotation process. Systems exist for analyzing existing corpora, but making a new corpus can be extremely complex. To help you build a foundation for your own machine learning goals, this easy-to-use guide includes case studies that demonstrate four different annotation tasks in detail. You'll also learn how to use a lightweight software package for annotating texts and adjudicating the annotations. This book is a perfect companion to O'Reilly's Natural Language Processing with Python, which describes how to use existing corpora with the Natural Language Toolkit.
Product details
ISBN
9781449306663
Published
2012-12-04
Publisher
O'Reilly Media
Age level
P, XV, 06, 01
Language
English
Format
Paperback
Number of pages
350
Author