unigen-hangul - Generate Hangul syllables from a Johab 6/3/1 Unifont hex file


NAME

unigen-hangul − Generate Hangul syllables from a Johab 6/3/1 Unifont hex file

SYNOPSIS

unigen-hangul -i hangul-base.hex -o hangul-syllables.hex

DESCRIPTION

unigen-hangul generates Hangul syllables from an input Unifont .hex file encoded in Johab 6/3/1 format. By default, the output is the Unicode Hangul Syllables range, U+AC00..U+D7A3. Options allow the user to specify a starting code point for the output Unifont .hex file, and ranges in hexadecimal of the starting and ending Hangul Jamo code points:

Range        Hangul
1100-115E    Hangul Jamo initial consonants (choseong)
A960-A97C    Hangul Jamo Extended-A initial consonants (choseong)
1161-11A7    Hangul Jamo medial vowels and diphthongs (jungseong)
D7B0-D7C6    Hangul Jamo Extended-B medial vowels and diphthongs (jungseong)
11A8-11FF    Hangul Jamo final consonants (jongseong)
D7CB-D7FB    Hangul Jamo Extended-B final consonants (jongseong)

A single code point, or 0 to omit, can be specified instead of a range. A starting code point one position before the start of a valid range for a Hangul jamo series (choseong, jungseong, and/or jongseong) first uses a blank glyph for that jamo, and then cycles through the remaining valid code points for the respective choseong, jungseong, or jongseong. A range can span modern and ancient jamo, and can even extend across the Hangul Jamo, Hangul Jamo Extended-A, and Hangul Jamo Extended-B ranges.

For example,

-j3 11A7-D7FB

will first use no jongseong (because U+11A7 is one before the start of the Hangul Jamo jongseong code points), then loop through jongseong in the Hangul Jamo range of U+11A8 through U+11FF, and then loop through jongseong in the Hangul Jamo Extended-B range of U+D7CB through U+D7FB.
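The cycling behavior described above can be modeled in a few lines. This is a minimal Python sketch of the documented behavior for jongseong only, not the program's actual code; the helper name expand_jongseong and the use of None to stand for the blank glyph are assumptions made for illustration.

```python
# Assigned jongseong ranges, taken from the table in DESCRIPTION.
JONGSEONG_RANGES = [(0x11A8, 0x11FF), (0xD7CB, 0xD7FB)]

def expand_jongseong(start, end):
    """Model how a "-j3 start-end" range is described to expand.

    Hypothetical helper: None stands for the blank (no jongseong) glyph.
    """
    result = []
    # A start one position before the first valid jongseong selects
    # the blank glyph first.
    if start == JONGSEONG_RANGES[0][0] - 1:
        result.append(None)
    for low, high in JONGSEONG_RANGES:
        # Keep only code points inside both the request and a valid range.
        for cp in range(max(start, low), min(end, high) + 1):
            result.append(cp)
    return result

seq = expand_jongseong(0x11A7, 0xD7FB)
print(seq[0], hex(seq[1]), hex(seq[88]), hex(seq[89]), hex(seq[-1]), len(seq))
```

Under these assumptions the sequence is one blank entry, then the 88 code points U+11A8..U+11FF, then the 49 code points U+D7CB..U+D7FB.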

OPTIONS

Option            Function
-h, --help        Print a help message and exit.
-all              Generate all Hangul syllables, using all modern and ancient Hangul in the Unicode range U+1100..U+11FF, assigned code points in the Extended-A range of U+A960..U+A97C, and assigned code points in the Extended-B range of U+D7B0..U+D7FF. WARNING: this will generate over 1,600,000 syllables in a 115 megabyte Unifont .hex format file. The default is to only output the 11,172 modern Hangul syllables.
-c code_point     Starting code point in hexadecimal for output file.
-j1 start-end     Choseong (jamo 1) start-end range in hexadecimal.
-j2 start-end     Jungseong (jamo 2) start-end range in hexadecimal.
-j3 start-end     Jongseong (jamo 3) start-end range in hexadecimal.
-i input_file     Unifont hangul-base.hex formatted input file.
-o output_file    Unifont .hex format output file.
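The syllable counts quoted for the default output and for -all can be checked with simple arithmetic on the ranges listed in DESCRIPTION. The Python sketch below is only a back-of-the-envelope check. The helper name span is illustrative, and counting one extra "blank" jongseong for -all is an assumption about how the program enumerates combinations.

```python
def span(first, last):
    """Number of code points in an inclusive range."""
    return last - first + 1

# Modern jamo ranges, as used by the commands in EXAMPLES.
# The +1 accounts for syllables with no final consonant.
modern = span(0x1100, 0x1112) * span(0x1161, 0x1175) * (span(0x11A8, 0x11C2) + 1)

# Full ranges from the table in DESCRIPTION.
choseong = span(0x1100, 0x115E) + span(0xA960, 0xA97C)
jungseong = span(0x1161, 0x11A7) + span(0xD7B0, 0xD7C6)
# +1 for a blank jongseong: an assumption, not confirmed by the man page.
jongseong = span(0x11A8, 0x11FF) + span(0xD7CB, 0xD7FB) + 1

everything = choseong * jungseong * jongseong
print(modern, everything)
```

The first figure matches the 11,172 modern syllables stated above, and the second is consistent with the "over 1,600,000" warning for -all.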

EXAMPLES

unigen-hangul -c 1 -j3 11AB-11AB \
    -i hangul-base.hex -o nieun-only.hex

This command generates Hangul syllables using all modern choseong and jungseong, and only the jongseong nieun (Unicode code point U+11AB). The output Unifont .hex file will contain code points starting at 1. Instead of specifying "-j3 11AB-11AB", simply using "-j3 11AB" will also suffice.
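For orientation, the standard Unicode composition arithmetic shows which precomposed Hangul Syllables code points correspond to the glyphs this command produces. The Python sketch below is independent of unigen-hangul. The function name compose is illustrative, and the loop order (choseong outermost) is an assumption chosen to mirror Unicode ordering.

```python
# Constants from the Unicode Hangul syllable composition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
V_COUNT, T_COUNT = 21, 28

def compose(cho, jung, jong=T_BASE):
    """Return the precomposed syllable for modern jamo.

    Passing T_BASE (U+11A7) as jong means "no final consonant".
    """
    l_index = cho - L_BASE
    v_index = jung - V_BASE
    t_index = jong - T_BASE
    return S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index

# All modern choseong and jungseong, with jongseong nieun (U+11AB) only.
syllables = [compose(cho, jung, 0x11AB)
             for cho in range(0x1100, 0x1113)
             for jung in range(0x1161, 0x1176)]
print(len(syllables), hex(syllables[0]), chr(syllables[0]))
```

The list holds 19 × 21 = 399 syllables, so with -c 1 the output file would be expected to occupy 399 consecutive code points starting at 1.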

This next example is a series of syllable sets suggested by Ho-Seok Ee for preliminary syllable alignment checking of modern Hangul.

The first command generates all modern syllables containing no jongseong (final consonant), starting at Unifont hexadecimal glyph location 0x1000; selecting a jongseong value that is out of range (U+1160 in this case) will use a blank filler in place of the jongseong.

The second command generates all modern syllables containing jongseong Nieun (U+11AB), which has a horizontal line extending across the lower portion of a syllable, starting at Unifont hexadecimal glyph location 0x2000.

The third command generates all modern Hangul syllables containing jongseong Rieul (U+11AF), starting at Unifont hexadecimal glyph location 0x3000.

The fourth command generates all modern Hangul syllables containing choseong (initial consonant) Rieul (U+1105), starting at Unifont hexadecimal glyph location 0x4000.

Here is the command sequence:

unigen-hangul -c 1000 -j1 1100-1112 -j2 1161-1175 -j3 1160 \
    -i hangul-base.hex > hangul-prep.hex

unigen-hangul -c 2000 -j1 1100-1112 -j2 1161-1175 -j3 11AB \
    -i hangul-base.hex >> hangul-prep.hex

unigen-hangul -c 3000 -j1 1100-1112 -j2 1161-1175 -j3 11AF \
    -i hangul-base.hex >> hangul-prep.hex

unigen-hangul -c 4000 -j1 1105 -j2 1161-1175 -j3 11A8-11C2 \
    -i hangul-base.hex >> hangul-prep.hex

The resulting .hex file can then be examined with hexdraw, unihex2bmp, etc.
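Before opening the file in a viewer, a quick count of glyphs per block can catch a mistyped range. The Python sketch below assumes the usual Unifont .hex line layout (hexadecimal code point, a colon, then the bitmap as a hexadecimal string). The helper name count_by_block and the sample data are hypothetical, and the sample stands in for the real hangul-prep.hex.

```python
from collections import Counter

def count_by_block(lines, block_size=0x1000):
    """Count glyphs per block of code points in Unifont .hex lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Everything before the first colon is the hexadecimal code point.
        code_point = int(line.split(":", 1)[0], 16)
        counts[code_point // block_size * block_size] += 1
    return counts

# Synthetic stand-in; real input would be open("hangul-prep.hex").
sample = ["1000:" + "00" * 32, "1001:" + "00" * 32, "2000:" + "FF" * 32]
for start, n in sorted(count_by_block(sample).items()):
    print(f"{start:04X}: {n} glyph(s)")
```

If the four commands behave as described, blocks 1000, 2000, and 3000 should each hold 19 × 21 = 399 glyphs, and block 4000 should hold 21 × 27 = 567.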

FILES

Unifont .hex files in Johab 6/3/1 encoding. See unifont-johab631(5) for a description of the input file structure. This program uses functions contained in the file unihangul-support.c.

SEE ALSO

bdfimplode(1), hex2bdf(1), hex2otf(1), hex2sfd(1), hexbraille(1), hexdraw(1), hexkinya(1), hexmerge(1), johab2syllables(1), johab2ucs2(1), unibdf2hex(1), unibmp2hex(1), unibmpbump(1), unicoverage(1), unidup(1), unifont(5), unifont-johab631(5), unifont-viewer(1), unifont1per(1), unifontchojung(1), unifontksx(1), unifontpic(1), unigencircles(1), unigenwidth(1), unihex2bmp(1), unihex2png(1), unihexfill(1), unihexgen(1), unihexpose(1), unihexrotate(1), unijohab2html(1), unipagecount(1), unipng2hex(1)

AUTHOR

unigen-hangul was written by Paul Hardy.

LICENSE

unigen-hangul is Copyright © 2023 Paul Hardy.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

BUGS

No known bugs exist.


Updated 2024-01-29 - jenkler.se | uex.se