mbtg - Memory Based Tagger generator
mbtg -T <filename> -s <settings filename>
or
mbtg [options]
This program generates, from a tagged corpus, all the files needed to tag a text with mbt.
-h or --help
show help
-T <tagged training corpus file>
or
-E <enriched tagged training corpus file>
All further options have reasonable defaults, so only experienced users need to set them. See the mbt manual for more details.
-s <settingsfile>
mbtg creates this file, which can then be used to run mbt with minimal effort (for example: mbt -s settings -T somefile).
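The two-step workflow described above can be sketched as follows. The file names (train.tagged, train.settings, somefile) are hypothetical placeholders, and the commands are only executed when both tools are installed and the corpus file exists.

```shell
# Hypothetical file names; substitute your own paths.
corpus="train.tagged"
settings="train.settings"
testfile="somefile"

# Step 1: generate the tagger files and write the settings file.
gen_cmd="mbtg -T $corpus -s $settings"
# Step 2: run mbt with the generated settings file.
tag_cmd="mbt -s $settings -T $testfile"

echo "$gen_cmd"
echo "$tag_cmd"

# Run the commands only if both tools and the corpus are available.
if command -v mbtg >/dev/null 2>&1 && command -v mbt >/dev/null 2>&1 && [ -f "$corpus" ]; then
    $gen_cmd && $tag_cmd
fi
```

Keeping the settings file name in one variable ensures that mbt reads exactly the file mbtg wrote.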
-p <pattern>
the pattern for known words (default ddfa)
-P <pattern>
the pattern for unknown words (default dFapsss)
-% <number>
filter threshold for ambitag construction (default 5%)
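As an illustration, the pattern and threshold options can be given explicitly on one command line. This sketch simply spells out the documented defaults. The corpus name is hypothetical, and passing the threshold as a bare number (5 rather than 5%) is an assumption based on the "<number>" placeholder.

```shell
# Hypothetical corpus name; the option values are the documented defaults.
corpus="train.tagged"
known_pattern="ddfa"        # -p: pattern for known words
unknown_pattern="dFapsss"   # -P: pattern for unknown words
threshold=5                 # -%: ambitag filter threshold (assumed bare number)

# Build the argument list once so it can be shown and executed unchanged.
set -- mbtg -T "$corpus" -p "$known_pattern" -P "$unknown_pattern" -% "$threshold"
cmd="$*"
echo "$cmd"

# Run only if mbtg is installed and the corpus exists.
if command -v mbtg >/dev/null 2>&1 && [ -f "$corpus" ]; then
    "$@"
fi
```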
-l <lexiconfile>
-L <file with list of frequent words>
-r <ambitagfile>
-k <known words case base>
-u <unknown words case base>
-K <known words instances file>
-U <unknown words instances file>
-V or --version
show version info
-e <sentence delimiter> (default '<utt>')
-X
keep the intermediate files
-O<timbl options>
(Note: there is NO SPACE between O and the options)
<options> classifier options for both the known and unknown words instance bases
K: <options> classifier options for the known words instance base
U: <options> classifier options for the unknown words case base
valid timbl options are: a d k m q v w x -
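A sketch of how the -O argument might be assembled is shown below. The specific timbl values (-a1, -a0, -k3) are illustrative only, and combining the K: and U: parts in a single quoted argument is an assumption. Consult timbl(1) and the mbt manual for the exact accepted syntax.

```shell
# Hypothetical corpus name and illustrative timbl option values.
corpus="train.tagged"
known_opts="-a1"         # assumed options for the known words base
unknown_opts="-a0 -k3"   # assumed options for the unknown words base

# No space between -O and the options; quoting keeps it a single argument.
o_arg="-OK: $known_opts U: $unknown_opts"
echo "mbtg -T $corpus \"$o_arg\""

# Run only if mbtg is installed and the corpus exists.
if command -v mbtg >/dev/null 2>&1 && [ -f "$corpus" ]; then
    mbtg -T "$corpus" "$o_arg"
fi
```

Quoting the whole -O argument matters because the option string itself contains spaces, which the shell would otherwise split into separate arguments.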
Bugs: possibly.
Ko van der Sloot
Antal van den Bosch
timbl(1) mbt(1) mbtserver(1)