History log of /dovecot/src/lib-fts/Makefile.am
Revision Date Author Comments
149299c7d5136a8fb425ef3cf8953026a1358002 11-Oct-2017 Timo Sirainen <timo.sirainen@dovecot.fi>

global: Use check-local in Makefile.am instead of overriding check directly
This helps with dependency problems. For example, running "make check" in lib-storage without running "make" first would try to compile the test programs too early and fail.

/dovecot/src/anvil/Makefile.am /dovecot/src/auth/Makefile.am /dovecot/src/director/Makefile.am /dovecot/src/doveadm/Makefile.am /dovecot/src/doveadm/dsync/Makefile.am /dovecot/src/lib-charset/Makefile.am /dovecot/src/lib-compression/Makefile.am /dovecot/src/lib-dcrypt/Makefile.am /dovecot/src/lib-dict-backend/Makefile.am /dovecot/src/lib-dict/Makefile.am /dovecot/src/lib-dns/Makefile.am /dovecot/src/lib-fs/Makefile.am Makefile.am /dovecot/src/lib-http/Makefile.am /dovecot/src/lib-imap-client/Makefile.am /dovecot/src/lib-imap/Makefile.am /dovecot/src/lib-index/Makefile.am /dovecot/src/lib-ldap/Makefile.am /dovecot/src/lib-mail/Makefile.am /dovecot/src/lib-master/Makefile.am /dovecot/src/lib-program-client/Makefile.am /dovecot/src/lib-settings/Makefile.am /dovecot/src/lib-smtp/Makefile.am /dovecot/src/lib-storage/Makefile.am /dovecot/src/lib/Makefile.am /dovecot/src/plugins/mail-crypt/Makefile.am /dovecot/src/plugins/pop3-migration/Makefile.am /dovecot/src/plugins/quota/Makefile.am /dovecot/src/plugins/var-expand-crypt/Makefile.am
f08b05d41b66e6f52daf6e8b40c1612617e84c79 09-May-2017 Josef 'Jeff' Sipek <jeff.sipek@dovecot.fi>

{lib,lib-fts}: fix builds with BSD make
Without this change, BSD make doesn't know how to make a couple of the generated files because the BUILT_SOURCES file names don't exactly match the left-hand sides of the rules. (GNU make somehow manages to match the rule even though it is not an exact match.)

58f9b440f44eef4348a9043e3cef477a9733cb10 09-May-2017 Josef 'Jeff' Sipek <jeff.sipek@dovecot.fi>

lib-fts: use full path to word-properties script
This is a step toward fixing builds where object dir != source dir.

824107247fcaa05c081f32bffd2cdecea8ec557a 09-May-2017 Josef 'Jeff' Sipek <jeff.sipek@dovecot.fi>

lib-fts: download data files into srcdir
This is a step toward fixing builds where object dir != source dir.

4e64ac91c5a3eb2a55e0b18d8da832b29ec08289 23-Mar-2017 Martti Rannanjärvi <martti.rannanjarvi@dovecot.fi>

lib: Download unicode.org files from dovecot.org

5fcd30add8dcf4d883978cce3e39f3a89184f1e5 23-Aug-2016 Teemu Huovila <teemu.huovila@dovecot.fi>

lib-fts: Cut overlong strings in lowercase filter. Added new common truncate function for filters. It also removes any partial characters that would remain from plain truncation.
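The truncation described above can be sketched roughly as follows. This is a minimal illustration, not the function added by the commit: the name utf8_truncated_len() is hypothetical, and the sketch assumes the input is already valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not Dovecot's actual API): returns how many bytes
   of `str` to keep so that the result is at most `max_len` bytes long and
   does not end in the middle of a UTF-8 sequence.
   Assumption: `str` is valid, NUL-terminated UTF-8. */
static size_t utf8_truncated_len(const char *str, size_t max_len)
{
	size_t len;

	assert(str != NULL);
	len = strlen(str);
	if (len <= max_len)
		return len; /* nothing to cut */

	len = max_len;
	/* str[len] is the first byte that gets cut off. If it is a
	   continuation byte (10xxxxxx), the cut falls inside a character,
	   so back up until the cut lands on that character's lead byte,
	   which drops the whole partial character. */
	while (len > 0 && ((unsigned char)str[len] & 0xC0) == 0x80)
		len--;
	return len;
}
```

With the two-byte character U+00E9 encoded as C3 A9, cutting "a\xC3\xA9" to 2 bytes keeps only "a" instead of leaving a dangling C3 byte.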

3f3c1b629196bc8491f146705b6f8ddadfcde1c8 02-Jun-2016 Teemu Huovila <teemu.huovila@dovecot.fi>

lib-fts: Improved stopword file reading. The reading tries to be a little bit stricter now. Only stopwords at the start of a new line are accepted now. Changed fi stopwords accordingly. Also removed superfluous stack allocation in parsing.

0605ff6f25783f7c69c1148f9f3a7bd4c34c098f 02-Jun-2016 Teemu Huovila <teemu.huovila@dovecot.fi>

lib-fts: Add stopword files for more languages.

abfc91b502618e387a5c9c87bcf658b341735947 02-Jun-2016 Teemu Huovila <teemu.huovila@dovecot.fi>

lib-fts: Move stopwords to subdirectory. All files included in dist are explicitly mentioned. The whole subdirectory 'stopwords' could also be distributed, but that is more error-prone.

00544ad37ece26b2c4f2210ed5e5295241d0db19 16-Mar-2016 Teemu Huovila <teemu.huovila@dovecot.fi>

lib-fts: Lift helper function out of generic tokenizer.

7d4c8041ab63e6a1bf17a9b2bb11dd18634971e2 15-Jan-2016 Aki Tuomi <aki.tuomi@dovecot.fi>

lib-fts: Add lib-fts to CPPFLAGS as include dir
Without this, VPATH builds fail because the includes cannot be found, as they are not in the same directory.

40bdcc2e50b6969596b10f848d1fbe23820666f9 12-Jan-2016 Teemu Huovila <teemu.huovila@dovecot.fi>

lib-fts: Create library for development packages.

6dd785e6857866657d6ef7a88af6d46ed0133801 17-Nov-2015 Teemu Huovila <teemu.huovila@dovecot.fi>

fts: Added fts_library_init() and _deinit()
Replaces calling three different functions on init and deinit.

3ec8b0d282d46d1f698b1f2aa27922cb8f26cb97 17-Nov-2015 Teemu Huovila <teemu.huovila@dovecot.fi>

lib-fts: Add Norwegian. Norwegian has two main dialects, Bokmål (nb) and Nynorsk (nn). They are detected separately by libexttextcat, but the stemmer only knows Norwegian. Thus they are treated as a single language, Norwegian (no). This might also make more sense in everyday use of mixed writing style Norwegian. Caveat: The default normalizer filter does not modify U+00F8 (Latin Small Letter O with Stroke). In some configurations it might be desirable to rewrite it to e.g. o. The same goes for the upper case version. This can be done by passing a modified "id" setting to the normalizer filter.

c5effa0f13da8f45991c89a9d8c9d2109db66039 17-Nov-2015 Teemu Huovila <teemu.huovila@dovecot.fi>

lib-fts: Add Swedish (sv) to supported languages.

440b625484f3cc9d3ec0a7ba36fe3583aa90172d 31-Aug-2015 Teemu Huovila <teemu.huovila@dovecot.fi>

lib-fts: Add prefixing contraction filter. Filters away prefixing contracted words, e.g. "l'homme" -> "homme". Tokens to be filtered must be lower case. Only supports French in this initial version.
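A minimal sketch of the idea, assuming lower-case input as the commit requires. The function name and the prefix list are illustrative assumptions and are not taken from the Dovecot source. Only the ASCII apostrophe is handled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative list of French contracted prefixes. This list is an
   assumption made for the example, not the list used by lib-fts. */
static const char *const french_prefixes[] = {
	"qu'", "c'", "d'", "j'", "l'", "m'", "n'", "s'", "t'"
};

/* Hypothetical helper: returns a pointer into `token` just past a
   contracted prefix, or `token` itself if no prefix matches.
   Assumption: `token` is already lower case. */
static const char *strip_french_contraction(const char *token)
{
	size_t i, plen;

	assert(token != NULL);
	for (i = 0; i < sizeof(french_prefixes) / sizeof(french_prefixes[0]); i++) {
		plen = strlen(french_prefixes[i]);
		if (strncmp(token, french_prefixes[i], plen) == 0)
			return token + plen;
	}
	return token;
}
```

Because the comparison is case-sensitive, an upper-case token such as "L'homme" passes through unchanged, which is why the tokens have to be lowercased before they reach this kind of filter.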

f1306b3d242963588c97b35d16973c4198bcae7e 11-Aug-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Install headers on make install.

471167b9701fcc99b66f7a8bcae07bc4ac0dbbd4 03-Jun-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Added "english-possessive" filter.

5a2910119ec0b878a0d7ca91918b97e9d40a936d 02-Jun-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Moved IS_APOSTROPHE() to fts-common.h

bf698b98d3a3a1eced66cc682c449f23bf2b67d0 16-May-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Rewrite ICU handling functions. Some of the changes:
- Use buffers instead of allocating everything from data stack.
- Optimistically attempt to write the data directly to the buffers without first calculating their size. Grow the buffer if it doesn't fit first.
- Use u_strFromUTF8Lenient() instead of u_strFromUTF8(). Our input is already supposed to be valid UTF-8, although we don't check if all code points are valid, while u_strFromUTF8() does check them and return failures. We don't really care about if code points are valid or not and u_strFromUTF8Lenient() passes through everything.
Added unit tests to make sure all the functions work as intended and all the UTF-8 input passes through them successfully.

a5563dc790a44bb58860d74479a24349f593d68f 14-May-2015 Timo Sirainen <tss@iki.fi>

Reverted d592417ec815 which added unnecessary code to Makefiles. The original problem it tried to solve was properly fixed by 46969c4cc57e. As long as the dependencies are correct, make will actually wait for processes to finish creating files before it continues to the next program that wants to access the file.

b9495c944b49d71e8235c772c2dc035fdab282cd 13-May-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Makefile compiling dependency fix

91d2e560eb95a9ab7f2c194d5bf14179aff6023b 12-May-2015 Phil Carmody <phil@dovecot.fi>

lib-fts: autogenerate C arrays using perl
The sh script had bashisms, the awk script crashed mawk, so let's try perl...
Signed-off-by: Phil Carmody <phil@dovecot.fi>

3756060476f110e7a8cb7069ea1319665815e845 11-May-2015 Timo Sirainen <tss@iki.fi>

lib-fts: .sh scripts weren't executable - changed them to be run via bash directly. Better to avoid relying on the executable bit.

5d8dad014bc0a18e79286953a92f7fae7684ee9b 11-May-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Reverted e80969ea8684 which replaced .sh scripts with awk
Bugs in older awk versions (used at least by Debian squeeze & wheezy) caused awk to crash while processing the script.

412bd45e0cabee1284a56482578eb347d626bd4d 11-May-2015 Timo Sirainen <tss@iki.fi>

Makefile: Fixed build concurrency issues with lib-fts

644e991973c99703e9994851fe365960ab1bc089 11-May-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Replaced word-boundary/break-data.sh with more portable awk scripts
Patch by Michael Grimm.

acfcf88e4dd529e4b2409f43bc9713cbc0169347 09-May-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Added "lowercase" filter. For now it handles only ASCII characters, but that's enough for our use.
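An ASCII-only lowercaser of the kind described can be sketched as below. The function name is hypothetical and this is not the filter's actual code. Bytes outside A-Z, including every byte of a multi-byte UTF-8 sequence, are left untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: lowercases ASCII letters in place.
   Non-ASCII bytes (for example UTF-8 multi-byte sequences) are not
   modified, so a non-ASCII upper-case letter stays upper case. */
static void ascii_lowercase(char *str)
{
	assert(str != NULL);
	for (; *str != '\0'; str++) {
		if (*str >= 'A' && *str <= 'Z')
			*str += 'a' - 'A';
	}
}
```

Avoiding tolower() keeps the result independent of the process locale, at the cost of ignoring non-ASCII letters entirely.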

eac88e31b791d6a099e0e497ac2a29aa041f05b2 09-May-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Removed "simple" normalizer. It translated input to titlecase, which wasn't suitable for snowball stemming that wanted lowercase input. Since that doesn't work, there's probably no good reason for it to exist (perhaps in the future it will be replaced by a Unicode-aware lowercaser).

12bc47bcae87a1f954b98420929eaf90922aa605 08-May-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Use rfc822-parser in fts-tokenizer-address instead of duplicating its code.

ec930ce90b17fb63ff035c1c87d994800de092f1 21-Apr-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Added normalizer-simple for doing normalization without libicu.

63713f16bad8b55e74c479adb6b47965b519c29b 21-Apr-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Renamed normalizer to icu-normalizer, including the source code.

cb6f6ef5044a559fb285e2f7d3fe12b4751ea708 21-Apr-2015 Timo Sirainen <tss@iki.fi>

configure: s/normalizer/libicu/ since it could be used for something else as well.

e162baa2d2ce41a009988e86636a5c77a2725477 21-Apr-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Added udhr_fra.txt to EXTRA_DIST

4bf6941ccdfb27c99e15ab32e5299e25cd2855c6 20-Apr-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Added PropList.txt to EXTRA_DIST

556c189ce6b6de3c8b4a3fc38b7c61bef800d012 20-Apr-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Fixed using FTS_NORMALIZER_CFLAGS/LIBS.

9cff78f3cc4830cce2183f630ec671a98087e4d1 20-Apr-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Added missing stopwords_fi.txt

4e07da7f29d35d1517fce9b7300c6c19f804325b 20-Apr-2015 Timo Sirainen <tss@iki.fi>

lib-fts: Fixed test-fts-language to use TEXTCAT_DATADIR
This may still make too many assumptions about what data exists where, so we may need to remove this test from "make check". But for now leave it there.

c865b0e9c65fd77f7b2ab6f8616d3def5501ecb3 20-Apr-2015 Timo Sirainen <tss@iki.fi>

Initial import for lib-fts. Parts of what this code does were already implemented internally by fts-lucene. lib-fts is intended to be usable for all the FTS backends. The APIs are still going to change a bit, but hopefully not after v2.2.17 release. Mostly written by Teemu Huovila.