wordaxe: Hyphenation by decomposition of compound words

Release 1.0.1
January 2010

Description

The wordaxe library (formerly known as deco-cow) lets Python programs hyphenate words automatically. Its main algorithm, named DCWHyphenator in the code, is based on decomposing compound words into base words.
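To give a rough idea of what decomposition-based hyphenation means, here is a minimal, self-contained sketch. It is not the DCWHyphenator implementation and does not use the wordaxe API: the function names (`decompose`, `hyphenation_points`, `show`) and the tiny word base are made up for illustration, and real German compounds need more care (linking elements, inflection, hyphenation inside base words).

```python
# Toy illustration only: NOT the wordaxe / DCWHyphenator code.
# All names and the word base below are hypothetical.

def decompose(word, base_words):
    """Split `word` into known base words; return None if no split exists."""
    lowered = word.lower()
    if not lowered:
        return []
    # Try the longest prefix first and backtrack if the rest cannot be split.
    for end in range(len(lowered), 0, -1):
        prefix = lowered[:end]
        if prefix in base_words:
            rest = decompose(lowered[end:], base_words)
            if rest is not None:
                return [prefix] + rest
    return None


def hyphenation_points(word, base_words):
    """Return the character offsets where the compound may be broken."""
    parts = decompose(word, base_words)
    if parts is None:
        return []  # unknown word: offer no break points
    points, pos = [], 0
    for part in parts[:-1]:
        pos += len(part)
        points.append(pos)
    return points


def show(word, points):
    """Render the word with a hyphen at every break point."""
    pieces, start = [], 0
    for p in points:
        pieces.append(word[start:p])
        start = p
    pieces.append(word[start:])
    return "-".join(pieces)


# A deliberately tiny, assumed word base.
BASE_WORDS = {"donau", "dampf", "schiff", "fahrt"}

word = "Donaudampfschifffahrt"
print(show(word, hyphenation_points(word, BASE_WORDS)))
```

The sketch only finds break points at the boundaries between base words. A word that cannot be fully decomposed with the given word base gets no break points at all, which is why the quality of the word base matters so much for this approach.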

Currently, the DCWHyphenator supports only German. Other Germanic languages that make heavy use of compound words, such as Dutch and Danish, should profit from the algorithm as well, as soon as someone provides a word base.

Apart from the DCWHyphenator, the library supports other algorithms as well:

The library can be used as a simple add-on to the ReportLab PDF library, adding support for automatic hyphenation in the Paragraph line-breaking method.

The documentation is still far from complete. For more information, look at the text document install.txt in the distribution and the samples.

Facts

Examples

Here are two PDF files containing hyphenated German text, created with the deco-cow library and the ReportLab toolkit:

Requirements

Licensing

The wordaxe library itself is released under a dual license: either the Apache License 2.0 or the FreeBSD license.

If you want to use pyHnj or ReportLab, see the corresponding licenses.

The hyphenation dictionary files were taken from the OpenOffice distribution; they are licensed under the GNU LGPL.

Download

You can download the source from the files section.

As usual on SourceForge.net, you can also browse the Subversion repository.

Project Information

See the project page on SourceForge.net.
