Evaluation


Bokmål: Evaluation of the morphological Constraint Grammar module shows a recall (success rate) of 99% and a precision of 96%. This gives an F-measure of 97.5% when recall and precision are weighted equally.

The tagger was tested on a 30,000-word evaluation corpus of texts from newspapers, magazines, journals, government reports and novels.

When the statistical module is added to fully disambiguate the evaluation corpus, the tagger reaches an accuracy of 96.5%. This figure covers full disambiguation of both morphology and lemma.

Nynorsk: So far, evaluation has only been carried out for the original CG1 module of the Oslo-Bergen Tagger. This module had a recall (success rate) of 98.7% and a precision of 93.6%. This gives an F-measure of 96.2%.

The Nynorsk evaluation corpus also contained about 30,000 words, taken from newspapers, magazines, journals, government reports and novels.
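
For readers who want to check how an F-measure follows from precision and recall, here is a minimal sketch. It assumes the standard F-beta definition, which reduces to the harmonic mean of precision and recall when both are weighted equally (beta = 1). The function name `f_measure` is illustrative and is not part of the Oslo-Bergen Tagger.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score for precision and recall given as fractions.

    With beta = 1, precision and recall are weighted equally and the score
    is their harmonic mean. This is the textbook formula, assumed here.
    """
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Figures reported above, converted from percentages to fractions.
bokmaal = f_measure(precision=0.96, recall=0.99)
nynorsk = f_measure(precision=0.936, recall=0.987)

print(f"Bokmål  F-measure: {bokmaal * 100:.1f}%")
print(f"Nynorsk F-measure: {nynorsk * 100:.1f}%")
```

With the Bokmål figures, this reproduces the reported 97.5%. With the Nynorsk figures, the harmonic mean comes out at about 96.1%, marginally below the reported 96.2%. That small difference may come from how the original figure was computed or rounded.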