
1N/A
1N/A# Time-stamp: "2004-01-11 18:35:34 AST"
1N/A
1N/A=head1 NAME
1N/A
1N/ALocale::Maketext - framework for localization
1N/A
1N/A=head1 SYNOPSIS
1N/A
1N/A  package MyProgram;
1N/A  use strict;
1N/A  use MyProgram::L10N;
1N/A   # ...which inherits from Locale::Maketext
1N/A  my $lh = MyProgram::L10N->get_handle() || die "What language?";
1N/A  ...
1N/A  # And then any messages your program emits, like:
1N/A  warn $lh->maketext( "Can't open file [_1]: [_2]\n", $f, $! );
1N/A  ...
1N/A
1N/A=head1 DESCRIPTION
1N/A
It is a common feature of applications (whether run directly,
or via the Web) for them to be "localized" -- i.e., for them
to present an English interface to an English-speaker, a German
interface to a German-speaker, and so on for all languages they're
programmed with.  Locale::Maketext
is a framework for software localization; it provides you with the
tools for organizing and accessing the bits of text and text-processing
code that you need for producing localized applications.
1N/A
1N/AIn order to make sense of Maketext and how all its
1N/Acomponents fit together, you should probably
1N/Ago read L<Locale::Maketext::TPJ13|Locale::Maketext::TPJ13>, and
1N/AI<then> read the following documentation.
1N/A
1N/AYou may also want to read over the source for C<File::Findgrep>
1N/Aand its constituent modules -- they are a complete (if small)
1N/Aexample application that uses Maketext.
1N/A
1N/A=head1 QUICK OVERVIEW
1N/A
1N/AThe basic design of Locale::Maketext is object-oriented, and
1N/ALocale::Maketext is an abstract base class, from which you
1N/Aderive a "project class".
1N/AThe project class (with a name like "TkBocciBall::Localize",
1N/Awhich you then use in your module) is in turn the base class
1N/Afor all the "language classes" for your project
1N/A(with names "TkBocciBall::Localize::it",
1N/A"TkBocciBall::Localize::en",
1N/A"TkBocciBall::Localize::fr", etc.).
1N/A
1N/AA language class is
1N/Aa class containing a lexicon of phrases as class data,
1N/Aand possibly also some methods that are of use in interpreting
1N/Aphrases in the lexicon, or otherwise dealing with text in that
1N/Alanguage.
1N/A
1N/AAn object belonging to a language class is called a "language
1N/Ahandle"; it's typically a flyweight object.
1N/A
1N/AThe normal course of action is to call:
1N/A
1N/A  use TkBocciBall::Localize;  # the localization project class
1N/A  $lh = TkBocciBall::Localize->get_handle();
1N/A   # Depending on the user's locale, etc., this will
1N/A   # make a language handle from among the classes available,
1N/A   # and any defaults that you declare.
1N/A  die "Couldn't make a language handle??" unless $lh;
1N/A
1N/AFrom then on, you use the C<maketext> function to access
1N/Aentries in whatever lexicon(s) belong to the language handle
1N/Ayou got.  So, this:
1N/A
1N/A  print $lh->maketext("You won!"), "\n";
1N/A
1N/A...emits the right text for this language.  If the object
1N/Ain C<$lh> belongs to class "TkBocciBall::Localize::fr" and
1N/A%TkBocciBall::Localize::fr::Lexicon contains C<("You won!"
1N/A=E<gt> "Tu as gagnE<eacute>!")>, then the above
1N/Acode happily tells the user "Tu as gagnE<eacute>!".
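
The overview above can be condensed into one runnable file.  This is
only a minimal sketch: the classes are declared inline instead of in
separate .pm files, and French is requested explicitly so that the
result does not depend on the environment.

```perl
use strict;
use warnings;
use Locale::Maketext;

# Project class: every language class below inherits from it.
{
    package TkBocciBall::Localize;
    our @ISA = ('Locale::Maketext');
}

# One language class per language, each with its own %Lexicon.
{
    package TkBocciBall::Localize::en;
    our @ISA     = ('TkBocciBall::Localize');
    our %Lexicon = ( 'You won!' => 'You won!' );
}
{
    package TkBocciBall::Localize::fr;
    our @ISA     = ('TkBocciBall::Localize');
    our %Lexicon = ( 'You won!' => "Tu as gagn\x{e9}!" );
}

# Ask for French explicitly rather than relying on the user's locale.
my $lh = TkBocciBall::Localize->get_handle('fr')
    or die "Couldn't make a language handle";

my $text = $lh->maketext('You won!');
binmode STDOUT, ':encoding(UTF-8)';
print "$text\n";
```

In a real project each package would normally live in its own file
under an @INC directory, so that C<get_handle> can load it on demand.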
1N/A
1N/A=head1 METHODS
1N/A
1N/ALocale::Maketext offers a variety of methods, which fall
1N/Ainto three categories:
1N/A
1N/A=over
1N/A
1N/A=item *
1N/A
1N/AMethods to do with constructing language handles.
1N/A
1N/A=item *
1N/A
1N/AC<maketext> and other methods to do with accessing %Lexicon data
1N/Afor a given language handle.
1N/A
1N/A=item *
1N/A
1N/AMethods that you may find it handy to use, from routines of
1N/Ayours that you put in %Lexicon entries.
1N/A
1N/A=back
1N/A
1N/AThese are covered in the following section.
1N/A
1N/A=head2 Construction Methods
1N/A
1N/AThese are to do with constructing a language handle:
1N/A
1N/A=over
1N/A
1N/A=item *
1N/A
1N/A$lh = YourProjClass->get_handle( ...langtags... ) || die "lg-handle?";
1N/A
This tries loading classes based on the language-tags you give (like
C<("en-US", "sk", "kon", "es-MX", "ja", "i-klingon")>), and for the first class
that succeeds, returns YourProjClass::I<language>->new().

If it runs thru the entire given list of language-tags and finds no classes
for those exact terms, it then tries "superordinate" language classes.
So if no "en-US" class (i.e., YourProjClass::en_us)
was found, nor classes for anything else in that list, we then try
its superordinate, "en" (i.e., YourProjClass::en), and so on thru
the other language-tags in the given list: "es".
(The other language-tags in our example list
happen to have no superordinates.)
1N/A
1N/AIf none of those language-tags leads to loadable classes, we then
1N/Atry classes derived from YourProjClass->fallback_languages() and
1N/Athen if nothing comes of that, we use classes named by
1N/AYourProjClass->fallback_language_classes().  Then in the (probably
1N/Aquite unlikely) event that that fails, we just return undef.
1N/A
1N/A=item *
1N/A
1N/A$lh = YourProjClass->get_handleB<()> || die "lg-handle?";
1N/A
1N/AWhen C<get_handle> is called with an empty parameter list, magic happens:
1N/A
If C<get_handle> senses that it's running in a program that was
invoked as a CGI, then it tries to get language-tags out of the
environment variable "HTTP_ACCEPT_LANGUAGE", and it pretends that
those were the languages passed as parameters to C<get_handle>.
1N/A
1N/AOtherwise (i.e., if not a CGI), this tries various OS-specific ways
1N/Ato get the language-tags for the current locale/language, and then
1N/Apretends that those were the value(s) passed to C<get_handle>.
1N/A
1N/ACurrently this OS-specific stuff consists of looking in the environment
1N/Avariables "LANG" and "LANGUAGE"; and on MSWin machines (where those
1N/Avariables are typically unused), this also tries using
1N/Athe module Win32::Locale to get a language-tag for whatever language/locale
1N/Ais currently selected in the "Regional Settings" (or "International"?)
1N/AControl Panel.  I welcome further
1N/Asuggestions for making this do the Right Thing under other operating
1N/Asystems that support localization.
1N/A
1N/AIf you're using localization in an application that keeps a configuration
1N/Afile, you might consider something like this in your project class:
1N/A
  sub get_handle_via_config {
    my $class = $_[0];
    # %Config_settings stands for wherever your program keeps its settings
    my $preferred_language = $Config_settings{'language'};
    my $lh;
    if($preferred_language) {
      $lh = $class->get_handle($preferred_language)
       || die "No language handle for \"$preferred_language\" or the like";
    } else {
      # Config file missing, maybe?
      $lh = $class->get_handle()
       || die "Can't get a language handle";
    }
    return $lh;
  }
1N/A
1N/A=item *
1N/A
1N/A$lh = YourProjClass::langname->new();
1N/A
1N/AThis constructs a language handle.  You usually B<don't> call this
1N/Adirectly, but instead let C<get_handle> find a language class to C<use>
1N/Aand to then call ->new on.
1N/A
1N/A=item *
1N/A
1N/A$lh->init();
1N/A
1N/AThis is called by ->new to initialize newly-constructed language handles.
1N/AIf you define an init method in your class, remember that it's usually
1N/Aconsidered a good idea to call $lh->SUPER::init in it (presumably at the
1N/Abeginning), so that all classes get a chance to initialize a new object
1N/Ahowever they see fit.
1N/A
1N/A=item *
1N/A
1N/AYourProjClass->fallback_languages()
1N/A
1N/AC<get_handle> appends the return value of this to the end of
1N/Awhatever list of languages you pass C<get_handle>.  Unless
1N/Ayou override this method, your project class
1N/Awill inherit Locale::Maketext's C<fallback_languages>, which
1N/Acurrently returns C<('i-default', 'en', 'en-US')>.
1N/A("i-default" is defined in RFC 2277).
1N/A
1N/AThis method (by having it return the name
1N/Aof a language-tag that has an existing language class)
1N/Acan be used for making sure that
1N/AC<get_handle> will always manage to construct a language
1N/Ahandle (assuming your language classes are in an appropriate
1N/A@INC directory).  Or you can use the next method:
1N/A
1N/A=item *
1N/A
1N/AYourProjClass->fallback_language_classes()
1N/A
1N/AC<get_handle> appends the return value of this to the end
1N/Aof the list of classes it will try using.  Unless
1N/Ayou override this method, your project class
1N/Awill inherit Locale::Maketext's C<fallback_language_classes>,
1N/Awhich currently returns an empty list, C<()>.
1N/ABy setting this to some value (namely, the name of a loadable
1N/Alanguage class), you can be sure that
1N/AC<get_handle> will always manage to construct a language
1N/Ahandle.
1N/A
1N/A=back
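
As an illustration of the construction methods above, here is a small
sketch with two made-up projects (My::L10N and Other::L10N are names
invented for this example).  The first shows C<get_handle> settling on
a superordinate class; the second overrides C<fallback_languages> so
that a request for an unsupported language still yields a handle.

```perl
use strict;
use warnings;
use Locale::Maketext;

# Hypothetical project with only an "en" class: "en-US" has no class of
# its own, so get_handle should settle on its superordinate, "en".
{
    package My::L10N;
    our @ISA = ('Locale::Maketext');
}
{
    package My::L10N::en;
    our @ISA     = ('My::L10N');
    our %Lexicon = ( 'Hello' => 'Hello' );
}

my $lh = My::L10N->get_handle('es-MX', 'en-US')
    or die "No language handle";
my $class = ref $lh;              # My::L10N::en
my $tag   = $lh->language_tag;    # en

# Second hypothetical project with only a French class.  Overriding
# fallback_languages makes "fr" the last resort for any request.
{
    package Other::L10N;
    our @ISA = ('Locale::Maketext');
    sub fallback_languages { return ('fr') }
}
{
    package Other::L10N::fr;
    our @ISA     = ('Other::L10N');
    our %Lexicon = ( 'Hello' => 'Bonjour' );
}

my $fallback_lh = Other::L10N->get_handle('ja')
    or die "No language handle";
my $greeting = $fallback_lh->maketext('Hello');   # Bonjour
```

Passing explicit language-tags keeps the sketch independent of the
environment; calling C<get_handle()> with no arguments would consult
the locale settings described above instead.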
1N/A
1N/A=head2 The "maketext" Method
1N/A
1N/AThis is the most important method in Locale::Maketext:
1N/A
1N/A$text = $lh->maketext(I<key>, ...parameters for this phrase...);
1N/A
1N/AThis looks in the %Lexicon of the language handle
1N/A$lh and all its superclasses, looking
1N/Afor an entry whose key is the string I<key>.  Assuming such
1N/Aan entry is found, various things then happen, depending on the
1N/Avalue found:
1N/A
1N/AIf the value is a scalarref, the scalar is dereferenced and returned
1N/A(and any parameters are ignored).
1N/AIf the value is a coderef, we return &$value($lh, ...parameters...).
1N/AIf the value is a string that I<doesn't> look like it's in Bracket Notation,
1N/Awe return it (after replacing it with a scalarref, in its %Lexicon).
1N/AIf the value I<does> look like it's in Bracket Notation, then we compile
1N/Ait into a sub, replace the string in the %Lexicon with the new coderef,
1N/Aand then we return &$new_sub($lh, ...parameters...).
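
The four kinds of value just described can be seen side by side in a
sketch like the following.  The project class My::L10N and the lexicon
keys are invented for the example.

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package My::L10N;
    our @ISA = ('Locale::Maketext');
}
{
    package My::L10N::en;
    our @ISA     = ('My::L10N');
    our %Lexicon = (
        # Plain string without brackets: returned as is.
        'plain'   => 'Nothing special here.',
        # Scalarref: dereferenced and returned; brackets are not interpreted.
        'literal' => \'Use [_1] for the first parameter.',
        # Coderef: called with the handle and the parameters.
        'code'    => sub {
            my ($lh, @params) = @_;
            return 'Got ' . scalar(@params) . ' parameter(s).';
        },
        # Bracket Notation: compiled into a sub on first use.
        'bracket' => 'Hello, [_1]!',
    );
}

my $lh = My::L10N->get_handle('en') or die "No language handle";

my $plain   = $lh->maketext('plain');
my $literal = $lh->maketext('literal', 'ignored');
my $code    = $lh->maketext('code', 'a', 'b');
my $bracket = $lh->maketext('bracket', 'world');
```

A scalarref is a convenient way to store text that contains literal
brackets without having to escape them.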
1N/A
1N/ABracket Notation is discussed in a later section.  Note
1N/Athat trying to compile a string into Bracket Notation can throw
1N/Aan exception if the string is not syntactically valid (say, by not
1N/Abalancing brackets right.)
1N/A
1N/AAlso, calling &$coderef($lh, ...parameters...) can throw any sort of
1N/Aexception (if, say, code in that sub tries to divide by zero).  But
1N/Aa very common exception occurs when you have Bracket
1N/ANotation text that says to call a method "foo", but there is no such
1N/Amethod.  (E.g., "You have [quaB<tn>,_1,ball]." will throw an exception
1N/Aon trying to call $lh->quaB<tn>($_[1],'ball') -- you presumably meant
1N/A"quant".)  C<maketext> catches these exceptions, but only to make the
1N/Aerror message more readable, at which point it rethrows the exception.
1N/A
1N/AAn exception I<may> be thrown if I<key> is not found in any
1N/Aof $lh's %Lexicon hashes.  What happens if a key is not found,
1N/Ais discussed in a later section, "Controlling Lookup Failure".
1N/A
1N/ANote that you might find it useful in some cases to override
1N/Athe C<maketext> method with an "after method", if you want to
1N/Atranslate encodings, or even scripts:
1N/A
    package YrProj::zh_cn; # Chinese with PRC-style glyphs
    use base ('YrProj::zh_tw');  # Taiwan-style
    sub maketext {
      my $self = shift(@_);
      # Call the inherited maketext; calling $self->maketext here
      #  would recurse forever.
      my $value = $self->SUPER::maketext(@_);
      return Chineeze::taiwan2mainland($value);
    }
1N/A
1N/AOr you may want to override it with something that traps
1N/Aany exceptions, if that's critical to your program:
1N/A
  sub maketext {
    my($lh, @stuff) = @_;
    my $out;
    eval { $out = $lh->SUPER::maketext(@stuff) };
    return $out unless $@;
    # ...otherwise deal with the exception...
  }
1N/A
1N/AOther than those two situations, I don't imagine that
1N/Ait's useful to override the C<maketext> method.  (If
1N/Ayou run into a situation where it is useful, I'd be
1N/Ainterested in hearing about it.)
1N/A
1N/A=over
1N/A
1N/A=item $lh->fail_with I<or> $lh->fail_with(I<PARAM>)
1N/A
1N/A=item $lh->failure_handler_auto
1N/A
1N/AThese two methods are discussed in the section "Controlling
1N/ALookup Failure".
1N/A
1N/A=back
1N/A
1N/A=head2 Utility Methods
1N/A
1N/AThese are methods that you may find it handy to use, generally
1N/Afrom %Lexicon routines of yours (whether expressed as
1N/ABracket Notation or not).
1N/A
1N/A=over
1N/A
1N/A=item $language->quant($number, $singular)
1N/A
1N/A=item $language->quant($number, $singular, $plural)
1N/A
1N/A=item $language->quant($number, $singular, $plural, $negative)
1N/A
1N/AThis is generally meant to be called from inside Bracket Notation
1N/A(which is discussed later), as in
1N/A
1N/A     "Your search matched [quant,_1,document]!"
1N/A
1N/AIt's for I<quantifying> a noun (i.e., saying how much of it there is,
1N/Awhile giving the correct form of it).  The behavior of this method is
1N/Ahandy for English and a few other Western European languages, and you
1N/Ashould override it for languages where it's not suitable.  You can feel
1N/Afree to read the source, but the current implementation is basically
1N/Aas this pseudocode describes:
1N/A
1N/A     if $number is 0 and there's a $negative,
1N/A        return $negative;
1N/A     elsif $number is 1,
1N/A        return "1 $singular";
1N/A     elsif there's a $plural,
1N/A        return "$number $plural";
1N/A     else
1N/A        return "$number " . $singular . "s";
1N/A     #
1N/A     # ...except that we actually call numf to
1N/A     #  stringify $number before returning it.
1N/A
1N/ASo for English (with Bracket Notation)
1N/AC<"...[quant,_1,file]..."> is fine (for 0 it returns "0 files",
1N/Afor 1 it returns "1 file", and for more it returns "2 files", etc.)
1N/A
1N/ABut for "directory", you'd want C<"[quant,_1,directory,directories]">
1N/Aso that our elementary C<quant> method doesn't think that the
1N/Aplural of "directory" is "directorys".  And you might find that the
1N/Aoutput may sound better if you specify a negative form, as in:
1N/A
1N/A     "[quant,_1,file,files,No files] matched your query.\n"
1N/A
1N/ARemember to keep in mind verb agreement (or adjectives too, in
1N/Aother languages), as in:
1N/A
1N/A     "[quant,_1,document] were matched.\n"
1N/A
1N/ABecause if _1 is one, you get "1 document B<were> matched".
1N/AAn acceptable hack here is to do something like this:
1N/A
1N/A     "[quant,_1,document was, documents were] matched.\n"
1N/A
1N/A=item $language->numf($number)
1N/A
1N/AThis returns the given number formatted nicely according to
1N/Athis language's conventions.  Maketext's default method is
1N/Amostly to just take the normal string form of the number
1N/A(applying sprintf "%G" for only very large numbers), and then
1N/Ato add commas as necessary.  (Except that
1N/Awe apply C<tr/,./.,/> if $language->{'numf_comma'} is true;
1N/Athat's a bit of a hack that's useful for languages that express
1N/Atwo million as "2.000.000" and not as "2,000,000").
1N/A
1N/AIf you want anything fancier, consider overriding this with something
1N/Athat uses L<Number::Format|Number::Format>, or does something else
1N/Aentirely.
1N/A
1N/ANote that numf is called by quant for stringifying all quantifying
1N/Anumbers.
1N/A
1N/A=item $language->sprintf($format, @items)
1N/A
1N/AThis is just a wrapper around Perl's normal C<sprintf> function.
1N/AIt's provided so that you can use "sprintf" in Bracket Notation:
1N/A
     "Couldn't access datanode [sprintf,%10s=~[%s~],_1,_2]!\n"
1N/A
1N/Areturning...
1N/A
1N/A     Couldn't access datanode      Stuff=[thangamabob]!
1N/A
1N/A=item $language->language_tag()
1N/A
1N/ACurrently this just takes the last bit of C<ref($language)>, turns
1N/Aunderscores to dashes, and returns it.  So if $language is
1N/Aan object of class Hee::HOO::Haw::en_us, $language->language_tag()
1N/Areturns "en-us".  (Yes, the usual representation for that language
1N/Atag is "en-US", but case is I<never> considered meaningful in
1N/Alanguage-tag comparison.)
1N/A
1N/AYou may override this as you like; Maketext doesn't use it for
1N/Aanything.
1N/A
1N/A=item $language->encoding()
1N/A
Currently this isn't used for anything, but it's provided
(with default value of
C<(ref($language) && $language-E<gt>{'encoding'}) or "iso-8859-1">)
as a sort of suggestion that it may be useful/necessary to
associate encodings with your language handles (whether on a
per-class or even per-handle basis).
1N/A
1N/A=back
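
The utility methods above can be tried out with a short sketch such as
this one.  It assumes a throwaway project class (My::L10N, a name made
up here) whose English lexicon is an _AUTO lexicon, so that each
Bracket Notation string serves as both key and value.

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package My::L10N;
    our @ISA = ('Locale::Maketext');
}
{
    package My::L10N::en;
    our @ISA     = ('My::L10N');
    our %Lexicon = ( _AUTO => 1 );   # each key doubles as its own value
}

my $lh = My::L10N->get_handle('en') or die "No language handle";

# quant with only a singular form appends "s" for anything but 1.
my @files = map { $lh->maketext('[quant,_1,file]', $_) } 0, 1, 2;

# An explicit plural avoids "directorys".
my $dirs = $lh->maketext('[quant,_1,directory,directories]', 3);

# The third form is used when the number is 0.
my $none = $lh->maketext('[quant,_1,file,files,No files] matched.', 0);

# numf adds thousands separators; numf_comma swaps "," and ".".
my $plain_number = $lh->numf(2000000);
$lh->{'numf_comma'} = 1;
my $swapped_number = $lh->numf(2000000);
delete $lh->{'numf_comma'};

# sprintf is reachable from Bracket Notation like any other method.
my $padded = $lh->maketext('[sprintf,%5s|%-5s|,_1,_2]', 'ab', 'cd');
```

For languages whose plural rules differ from English, the language
class would override C<quant> rather than rely on this default.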
1N/A
1N/A=head2 Language Handle Attributes and Internals
1N/A
1N/AA language handle is a flyweight object -- i.e., it doesn't (necessarily)
1N/Acarry any data of interest, other than just being a member of
1N/Awhatever class it belongs to.
1N/A
1N/AA language handle is implemented as a blessed hash.  Subclasses of yours
1N/Acan store whatever data you want in the hash.  Currently the only hash
1N/Aentry used by any crucial Maketext method is "fail", so feel free to
1N/Ause anything else as you like.
1N/A
1N/AB<Remember: Don't be afraid to read the Maketext source if there's
1N/Aany point on which this documentation is unclear.>  This documentation
1N/Ais vastly longer than the module source itself.
1N/A
1N/A
1N/A=head1 LANGUAGE CLASS HIERARCHIES
1N/A
1N/AThese are Locale::Maketext's assumptions about the class
1N/Ahierarchy formed by all your language classes:
1N/A
1N/A=over
1N/A
1N/A=item *
1N/A
1N/AYou must have a project base class, which you load, and
1N/Awhich you then use as the first argument in
1N/Athe call to YourProjClass->get_handle(...).  It should derive
1N/A(whether directly or indirectly) from Locale::Maketext.
1N/AIt B<doesn't matter> how you name this class, altho assuming this
1N/Ais the localization component of your Super Mega Program,
1N/Agood names for your project class might be
1N/ASuperMegaProgram::Localization, SuperMegaProgram::L10N,
1N/ASuperMegaProgram::I18N, SuperMegaProgram::International,
1N/Aor even SuperMegaProgram::Languages or SuperMegaProgram::Messages.
1N/A
1N/A=item *
1N/A
Language classes are what YourProjClass->get_handle will try to load.
It will look for them by taking each language-tag (B<skipping> it
if it doesn't look like a language-tag or locale-tag!), turning it to
all lowercase, turning any dashes to underscores, and appending it
to YourProjClass . "::".  So this:
1N/A
1N/A  $lh = YourProjClass->get_handle(
1N/A    'en-US', 'fr', 'kon', 'i-klingon', 'i-klingon-romanized'
1N/A  );
1N/A
1N/Awill try loading the classes
1N/AYourProjClass::en_us (note lowercase!), YourProjClass::fr,
1N/AYourProjClass::kon,
1N/AYourProjClass::i_klingon
1N/Aand YourProjClass::i_klingon_romanized.  (And it'll stop at the
1N/Afirst one that actually loads.)
1N/A
1N/A=item *
1N/A
1N/AI assume that each language class derives (directly or indirectly)
1N/Afrom your project class, and also defines its @ISA, its %Lexicon,
1N/Aor both.  But I anticipate no dire consequences if these assumptions
1N/Ado not hold.
1N/A
1N/A=item *
1N/A
Language classes may derive from other language classes (altho they
should have "use I<Thatclassname>" or "use base qw(I<...classes...>)").
They may derive from the project
class.  They may derive from some other class altogether.  Or via
multiple inheritance, they may derive from any mixture of these.
1N/A
1N/A=item *
1N/A
1N/AI foresee no problems with having multiple inheritance in
1N/Ayour hierarchy of language classes.  (As usual, however, Perl will
1N/Acomplain bitterly if you have a cycle in the hierarchy: i.e., if
1N/Aany class is its own ancestor.)
1N/A
1N/A=back
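
The mapping from language-tags to class names described in the list
above can be sketched as follows.  C<class_for_tag> is a hypothetical
helper written for this illustration, not a Locale::Maketext method,
and it leaves out the step that skips things that don't look like
language-tags.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Locale::Maketext: it only mirrors the
# documented rule (lowercase, dashes to underscores, append to the
# project class name plus "::").
sub class_for_tag {
    my ($project_class, $tag) = @_;
    (my $suffix = lc $tag) =~ tr/-/_/;
    return $project_class . '::' . $suffix;
}

my @classes = map { class_for_tag('YourProjClass', $_) }
    'en-US', 'fr', 'kon', 'i-klingon', 'i-klingon-romanized';
print "$_\n" for @classes;
```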
1N/A
1N/A=head1 ENTRIES IN EACH LEXICON
1N/A
A typical %Lexicon entry is meant to signify a phrase,
taking some number (0 or more) of parameters.  An entry
is meant to be accessed via
a string I<key> in $lh->maketext(I<key>, ...parameters...),
which should return a string that is generally meant to
be used for "output" to the user -- regardless of whether
this actually means printing to STDOUT, writing to a file,
or putting into a GUI widget.
1N/A
1N/AWhile the key must be a string value (since that's a basic
1N/Arestriction that Perl places on hash keys), the value in
1N/Athe lexicon can currently be of several types:
1N/Aa defined scalar, scalarref, or coderef.  The use of these is
1N/Aexplained above, in the section 'The "maketext" Method', and
1N/ABracket Notation for strings is discussed in the next section.
1N/A
While you can use arbitrary unique IDs for lexicon keys
(like "_min_larger_max_error"), it is often
useful if an entry's key is itself a valid value, like
this example error message:
1N/A
1N/A  "Minimum ([_1]) is larger than maximum ([_2])!\n",
1N/A
1N/ACompare this code that uses an arbitrary ID...
1N/A
1N/A  die $lh->maketext( "_min_larger_max_error", $min, $max )
1N/A   if $min > $max;
1N/A
1N/A...to this code that uses a key-as-value:
1N/A
1N/A  die $lh->maketext(
1N/A   "Minimum ([_1]) is larger than maximum ([_2])!\n",
1N/A   $min, $max
1N/A  ) if $min > $max;
1N/A
The second is, in short, more readable.  In particular, it's obvious
that the number of parameters you're feeding to that phrase (two) is
the number of parameters that it I<wants> to be fed.  (Since you see
a _1 and a _2 being used in the key there.)
1N/A
Also, once a project is otherwise
complete and you start to localize it, you can scrape together
all the various keys you use, and pass them to a translator; and then
the translator's work will go faster if what they're presented with is this:
1N/A
1N/A "Minimum ([_1]) is larger than maximum ([_2])!\n",
1N/A  => "",   # fill in something here, Jacques!
1N/A
1N/Arather than this more cryptic mess:
1N/A
1N/A "_min_larger_max_error"
1N/A  => "",   # fill in something here, Jacques
1N/A
I think that using keys as lexicon values makes the completed lexicon
entries more readable:
1N/A
1N/A "Minimum ([_1]) is larger than maximum ([_2])!\n",
1N/A  => "Le minimum ([_1]) est plus grand que le maximum ([_2])!\n",
1N/A
1N/AAlso, having valid values as keys becomes very useful if you set
1N/Aup an _AUTO lexicon.  _AUTO lexicons are discussed in a later
1N/Asection.
1N/A
I almost always use keys that are themselves
valid lexicon values.  One notable exception is when the value is
quite long.  For example, to get the screenful of data that
a command-line program might return when given an unknown switch,
I often just use a key "_USAGE_MESSAGE".  At that point I then go
and immediately define that lexicon entry in the
ProjectClass::L10N::en lexicon (since English is always my "project
language"):
1N/A
1N/A  '_USAGE_MESSAGE' => <<'EOSTUFF',
1N/A  ...long long message...
1N/A  EOSTUFF
1N/A
1N/Aand then I can use it as:
1N/A
1N/A  getopt('oDI', \%opts) or die $lh->maketext('_USAGE_MESSAGE');
1N/A
1N/AIncidentally,
1N/Anote that each class's C<%Lexicon> inherits-and-extends
1N/Athe lexicons in its superclasses.  This is not because these are
1N/Aspecial hashes I<per se>, but because you access them via the
1N/AC<maketext> method, which looks for entries across all the
1N/AC<%Lexicon>'s in a language class I<and> all its ancestor classes.
(This is because the idea of "class data" isn't directly implemented
in Perl, but is instead left to individual class-systems to implement
as they see fit.)
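
Here is a small sketch of that inherit-and-extend behavior.  The class
names and lexicon entries are invented for the example; the British
English class overrides one entry and inherits the other.

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package My::L10N;
    our @ISA = ('Locale::Maketext');
}
{
    package My::L10N::en;
    our @ISA     = ('My::L10N');
    our %Lexicon = (
        'Color' => 'Color',
        'Open'  => 'Open',
    );
}
{
    # Derives from the "en" class and overrides a single entry.
    package My::L10N::en_gb;
    our @ISA     = ('My::L10N::en');
    our %Lexicon = ( 'Color' => 'Colour' );
}

my $lh = My::L10N->get_handle('en-GB') or die "No language handle";

my $overridden = $lh->maketext('Color');   # found in en_gb's %Lexicon
my $inherited  = $lh->maketext('Open');    # found in en's %Lexicon
```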
1N/A
Note that you may have things stored in a lexicon
besides just phrases for output:  for example, if your program
takes input from the keyboard, asking a "(Y/N)" question,
you probably need to know what the equivalent of "Y[es]/N[o]" is
in whatever language.  You probably also need to know what
the equivalents of the answers "y" and "n" are.  You can
1N/Astore that information in the lexicon (say, under the keys
1N/A"~answer_y" and "~answer_n", and the long forms as
1N/A"~answer_yes" and "~answer_no", where "~" is just an ad-hoc
1N/Acharacter meant to indicate to programmers/translators that
1N/Athese are not phrases for output).
1N/A
Or instead of storing this in the language class's lexicon,
you can (and, in some cases, really should) represent the same bit
of knowledge as code in a method in the language class.  (That
leaves a tidy distinction between the lexicon as the things we
know how to I<say>, and the rest of the things in the language class
as things that we know how to I<do>.)  Consider
1N/Athis example of a processor for responses to French "oui/non"
1N/Aquestions:
1N/A
1N/A  sub y_or_n {
1N/A    return undef unless defined $_[1] and length $_[1];
1N/A    my $answer = lc $_[1];  # smash case
1N/A    return 1 if $answer eq 'o' or $answer eq 'oui';
1N/A    return 0 if $answer eq 'n' or $answer eq 'non';
1N/A    return undef;
1N/A  }
1N/A
1N/A...which you'd then call in a construct like this:
1N/A
1N/A  my $response;
1N/A  until(defined $response) {
1N/A    print $lh->maketext("Open the pod bay door (y/n)? ");
1N/A    $response = $lh->y_or_n( get_input_from_keyboard_somehow() );
1N/A  }
1N/A  if($response) { $pod_bay_door->open()         }
1N/A  else          { $pod_bay_door->leave_closed() }
1N/A
Other data worth storing in a lexicon might be things like
filenames for language-targeted resources:
1N/A
1N/A  ...
1N/A  "_main_splash_png"
1N/A    => "/styles/en_us/main_splash.png",
1N/A  "_main_splash_imagemap"
1N/A    => "/styles/en_us/main_splash.incl",
1N/A  "_general_graphics_path"
1N/A    => "/styles/en_us/",
1N/A  "_alert_sound"
1N/A    => "/styles/en_us/hey_there.wav",
1N/A  "_forward_icon"
1N/A   => "left_arrow.png",
1N/A  "_backward_icon"
1N/A   => "right_arrow.png",
1N/A  # In some other languages, left equals
1N/A  #  BACKwards, and right is FOREwards.
1N/A  ...
1N/A
1N/AYou might want to do the same thing for expressing key bindings
1N/Aor the like (since hardwiring "q" as the binding for the function
1N/Athat quits a screen/menu/program is useful only if your language
1N/Ahappens to associate "q" with "quit"!)
1N/A
1N/A=head1 BRACKET NOTATION
1N/A
1N/ABracket Notation is a crucial feature of Locale::Maketext.  I mean
1N/ABracket Notation to provide a replacement for sprintf formatting.
1N/AEverything you do with Bracket Notation could be done with a sub block,
1N/Abut bracket notation is meant to be much more concise.
1N/A
Bracket Notation is like a miniature "template" system (in the sense
of L<Text::Template|Text::Template>, not in the sense of C++ templates),
where normal text is passed thru basically as is, but text in special
regions is specially interpreted.  In Bracket Notation, you use brackets
("[...]" -- not "{...}"!) to note sections that are specially interpreted.
1N/A
1N/AFor example, here all the areas that are taken literally are underlined with
1N/Aa "^", and all the in-bracket special regions are underlined with an X:
1N/A
1N/A  "Minimum ([_1]) is larger than maximum ([_2])!\n",
1N/A   ^^^^^^^^^ XX ^^^^^^^^^^^^^^^^^^^^^^^^^^ XX ^^^^
1N/A
1N/AWhen that string is compiled from bracket notation into a real Perl sub,
1N/Ait's basically turned into:
1N/A
1N/A  sub {
1N/A    my $lh = $_[0];
1N/A    my @params = @_;
1N/A    return join '',
1N/A      "Minimum (",
1N/A      ...some code here...
1N/A      ") is larger than maximum (",
1N/A      ...some code here...
1N/A      ")!\n",
1N/A  }
1N/A  # to be called by $lh->maketext(KEY, params...)
1N/A
1N/AIn other words, text outside bracket groups is turned into string
1N/Aliterals.  Text in brackets is rather more complex, and currently follows
1N/Athese rules:
1N/A
1N/A=over
1N/A
1N/A=item *
1N/A
Bracket groups that are empty, or which consist only of whitespace,
are ignored.  (Examples: "[]", "[    ]", or a [ and a ] with returns
and/or tabs and/or spaces between them.)
1N/A
1N/AOtherwise, each group is taken to be a comma-separated group of items,
1N/Aand each item is interpreted as follows:
1N/A
1N/A=item *
1N/A
An item that is "_I<digits>" or "_-I<digits>" is interpreted as
$_[I<value>].  I.e., "_1" is interpreted as $_[1], and "_-3" is interpreted
as $_[-3] (in which case @_ should have at least three elements in it).
Note that $_[0] is the language handle, and is typically not named
directly.
1N/A
1N/A=item *
1N/A
1N/AAn item "_*" is interpreted to mean "all of @_ except $_[0]".
1N/AI.e., C<@_[1..$#_]>.  Note that this is an empty list in the case
1N/Aof calls like $lh->maketext(I<key>) where there are no
1N/Aparameters (except $_[0], the language handle).
1N/A
1N/A=item *
1N/A
1N/AOtherwise, each item is interpreted as a string literal.
1N/A
1N/A=back
1N/A
1N/AThe group as a whole is interpreted as follows:
1N/A
1N/A=over
1N/A
1N/A=item *
1N/A
1N/AIf the first item in a bracket group looks like a method name,
1N/Athen that group is interpreted like this:
1N/A
1N/A  $lh->that_method_name(
1N/A    ...rest of items in this group...
1N/A  ),
1N/A
1N/A=item *
1N/A
If the first item in a bracket group is "*", it's taken as shorthand
for the commonly called "quant" method.  Similarly, if the first
item in a bracket group is "#", it's taken to be shorthand for
"numf".
1N/A
1N/A=item *
1N/A
1N/AIf the first item in a bracket group is empty-string, or "_*"
1N/Aor "_I<digits>" or "_-I<digits>", then that group is interpreted
1N/Aas just the interpolation of all its items:
1N/A
1N/A  join('',
1N/A    ...rest of items in this group...
1N/A  ),
1N/A
1N/AExamples:  "[_1]" and "[,_1]", which are synonymous; and
1N/A"C<[,ID-(,_4,-,_2,)]>", which compiles as
1N/AC<join "", "ID-(", $_[4], "-", $_[2], ")">.
1N/A
1N/A=item *
1N/A
Otherwise this bracket group is invalid.  For example, in the group
"[!@#,whatever]", the first item C<"!@#"> is neither empty-string,
"_I<number>", "_-I<number>", "_*", nor a valid method name; and so
Locale::Maketext will throw an exception if you try compiling an
expression containing this bracket group.
1N/A
1N/A=back
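
The item and group rules listed above can be exercised with a sketch
like this one.  It assumes a throwaway project class (My::L10N, a
made-up name) with an _AUTO lexicon, so each Bracket Notation string
is used as both key and value.

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package My::L10N;
    our @ISA = ('Locale::Maketext');
}
{
    package My::L10N::en;
    our @ISA     = ('My::L10N');
    our %Lexicon = ( _AUTO => 1 );
}

my $lh = My::L10N->get_handle('en') or die "No language handle";

# "_digits" and "_-digits" pick individual parameters.
my $first_two = $lh->maketext('[_1] and [_2]', 'a', 'b');         # a and b
my $last      = $lh->maketext('Last: [_-1]', 'a', 'b', 'c');      # Last: c

# "_*" stands for all parameters, joined with nothing between them.
my $all       = $lh->maketext('All: [_*]', 'a', 'b', 'c');        # All: abc

# An empty first item means "just interpolate the remaining items".
my $joined    = $lh->maketext('[,ID-(,_4,-,_2,)]', 1, 2, 3, 4);   # ID-(4-2)

# "*" is shorthand for quant, "#" for numf.
my $counted   = $lh->maketext('[*,_1,file]', 3);                  # 3 files
my $grouped   = $lh->maketext('[#,_1]', 1234567);                 # 1,234,567
```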
1N/A
1N/ANote, incidentally, that items in each group are comma-separated,
1N/Anot C</\s*,\s*/>-separated.  That is, you might expect that this
1N/Abracket group:
1N/A
1N/A  "Hoohah [foo, _1 , bar ,baz]!"
1N/A
1N/Awould compile to this:
1N/A
1N/A  sub {
1N/A    my $lh = $_[0];
1N/A    return join '',
1N/A      "Hoohah ",
1N/A      $lh->foo( $_[1], "bar", "baz"),
1N/A      "!",
1N/A  }
1N/A
1N/ABut it actually compiles as this:
1N/A
1N/A  sub {
1N/A    my $lh = $_[0];
1N/A    return join '',
1N/A      "Hoohah ",
1N/A      $lh->foo(" _1 ", " bar ", "baz"),  #!!!
1N/A      "!",
1N/A  }
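
The effect of stray whitespace is easy to observe with the C<sprintf>
method, which simply echoes its arguments.  This is a sketch under the
same assumptions as before: a made-up project class with an _AUTO
lexicon.

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package My::L10N;
    our @ISA = ('Locale::Maketext');
}
{
    package My::L10N::en;
    our @ISA     = ('My::L10N');
    our %Lexicon = ( _AUTO => 1 );
}

my $lh = My::L10N->get_handle('en') or die "No language handle";

# No whitespace around the items: _1 and _2 are parameter references.
my $tight = $lh->maketext('[sprintf,%s|%s,_1,_2]', 'a', 'b');

# Whitespace around "_1" turns it into the literal string " _1 ".
my $loose = $lh->maketext('[sprintf,%s|%s, _1 ,_2]', 'a', 'b');

print "tight: <$tight>\n";
print "loose: <$loose>\n";
```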
1N/A
1N/AIn the notation discussed so far, the characters "[" and "]" are given
1N/Aspecial meaning, for opening and closing bracket groups, and "," has
1N/Aa special meaning inside bracket groups, where it separates items in the
group.  This raises the question of how you'd express a literal "[" or
1N/A"]" in a Bracket Notation string, and how you'd express a literal
1N/Acomma inside a bracket group.  For this purpose I've adopted "~" (tilde)
1N/Aas an escape character:  "~[" means a literal '[' character anywhere
1N/Ain Bracket Notation (i.e., regardless of whether you're in a bracket
1N/Agroup or not), and ditto for "~]" meaning a literal ']', and "~," meaning
1N/Aa literal comma.  (Altho "," means a literal comma outside of
1N/Abracket groups -- it's only inside bracket groups that commas are special.)
1N/A
1N/AAnd on the off chance you need a literal tilde in a bracket expression,
1N/Ayou get it with "~~".
1N/A
1N/ACurrently, an unescaped "~" before a character
1N/Aother than a bracket or a comma is taken to mean just a "~" and that
1N/Acharacter.  I.e., "~X" means the same as "~~X" -- i.e., one literal tilde,
1N/Aand then one literal "X".  However, by using "~X", you are assuming that
1N/Ano future version of Maketext will use "~X" as a magic escape sequence.
1N/AIn practice this is not a great problem, since first off you can just
1N/Awrite "~~X" and not worry about it; second off, I doubt I'll add lots
1N/Aof new magic characters to bracket notation; and third off, you
1N/Aaren't likely to want literal "~" characters in your messages anyway,
1N/Asince it's not a character with wide use in natural language text.
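
A short sketch of the tilde escapes in action, again assuming a
made-up project class with an _AUTO lexicon:

```perl
use strict;
use warnings;
use Locale::Maketext;

{
    package My::L10N;
    our @ISA = ('Locale::Maketext');
}
{
    package My::L10N::en;
    our @ISA     = ('My::L10N');
    our %Lexicon = ( _AUTO => 1 );
}

my $lh = My::L10N->get_handle('en') or die "No language handle";

# "~[" and "~]" are literal brackets; "~~" is a literal tilde.
my $brackets = $lh->maketext('Item ~[[_1]~] costs ~~[_2]', 3, 5);

# "~," is a literal comma inside a bracket group, so the format string
# handed to sprintf here is "%s, %s".
my $comma = $lh->maketext('[sprintf,%s~, %s,_1,_2]', 'a', 'b');

print "$brackets\n";
print "$comma\n";
```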
1N/A
1N/ABrackets must be balanced -- every openbracket must have
1N/Aone matching closebracket, and vice versa.  So these are all B<invalid>:
1N/A
1N/A  "I ate [quant,_1,rhubarb pie."
1N/A  "I ate [quant,_1,rhubarb pie[."
1N/A  "I ate quant,_1,rhubarb pie]."
1N/A  "I ate quant,_1,rhubarb pie[."
1N/A
1N/ACurrently, bracket groups do not nest.  That is, you B<cannot> say:
1N/A
1N/A  "Foo [bar,baz,[quux,quuux]]\n";
1N/A
1N/AIf you need a notation that's that powerful, use normal Perl:
1N/A
1N/A  %Lexicon = (
1N/A    ...
1N/A    "some_key" => sub {
1N/A      my $lh = $_[0];
1N/A      join '',
1N/A        "Foo ",
1N/A        $lh->bar('baz', $lh->quux('quuux')),
1N/A        "\n",
1N/A    },
1N/A    ...
1N/A  );
1N/A
1N/AOr write the "bar" method so you don't need to pass it the
1N/Aoutput from calling quux.
1N/A
1N/AI do not anticipate that you will need (or particularly want)
1N/Ato nest bracket groups, but you are welcome to email me with
1N/Aconvincing (real-life) arguments to the contrary.
1N/A
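For what it's worth, here is a runnable sketch of the "use normal
Perl" approach shown above.  The C<bar> and C<quux> methods are
placeholders whose behavior I have simply invented (joining and
uppercasing), and the NestDemo class names are hypothetical too.

```perl
use strict;
use warnings;

{
    package NestDemo::L10N;             # hypothetical project class
    use base 'Locale::Maketext';

    # Invented helper methods, standing in for "bar" and "quux":
    sub bar  { my ($lh, @items) = @_; return join ' and ', @items }
    sub quux { my ($lh, $word)  = @_; return uc $word }
}
{
    package NestDemo::L10N::en;         # hypothetical language class
    our @ISA     = ('NestDemo::L10N');
    our %Lexicon = (
        # A lexicon value may be a coderef; it is called with the
        # language handle first, then any maketext parameters.
        'some_key' => sub {
            my $lh = $_[0];
            return join '',
                'Foo ',
                $lh->bar('baz', $lh->quux('quuux')),
                "\n";
        },
    );
}

my $lh  = NestDemo::L10N->get_handle('en') or die "What language?";
my $out = $lh->maketext('some_key');
print $out;
```
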
1N/A=head1 AUTO LEXICONS
1N/A
1N/AIf maketext goes to look in an individual %Lexicon for an entry
1N/Afor I<key> (where I<key> does not start with an underscore), and
1N/Asees none, B<but does see> an entry of "_AUTO" => I<some_true_value>,
1N/Athen we actually define $Lexicon{I<key>} = I<key> right then and there,
1N/Aand then use that value as if it had been there all
1N/Aalong.  This happens before we even look in any superclass %Lexicons!
1N/A
1N/A(This is meant to be somewhat like the AUTOLOAD mechanism in
1N/APerl's function call system -- or, looked at another way,
1N/Alike the L<AutoLoader|AutoLoader> module.)
1N/A
1N/AI can picture all sorts of circumstances where you just
1N/Ado not want lookup to be able to fail (since failing
1N/Anormally means that maketext throws a C<die>, altho
1N/Asee the next section for greater control over that).  But
1N/Ahere's one circumstance where _AUTO lexicons are meant to
1N/Abe I<especially> useful:
1N/A
1N/AAs you're writing an application, you decide as you go what messages
1N/Ayou need to emit.  Normally you'd go to write this:
1N/A
1N/A  if(-e $filename) {
1N/A    go_process_file($filename)
1N/A  } else {
1N/A    print "Couldn't find file \"$filename\"!\n";
1N/A  }
1N/A
1N/Abut since you anticipate localizing this, you write:
1N/A
1N/A  use ThisProject::I18N;
1N/A  my $lh = ThisProject::I18N->get_handle();
1N/A   # For the moment, assume that things are set up so
1N/A   # that we load class ThisProject::I18N::en
1N/A   # and that that's the class that $lh belongs to.
1N/A  ...
1N/A  if(-e $filename) {
1N/A    go_process_file($filename)
1N/A  } else {
1N/A    print $lh->maketext(
1N/A      "Couldn't find file \"[_1]\"!\n", $filename
1N/A    );
1N/A  }
1N/A
1N/ANow, right after you've just written the above lines, you'd
1N/Anormally have to go open the file
1N/AThisProject/I18N/en.pm, and immediately add an entry:
1N/A
1N/A  "Couldn't find file \"[_1]\"!\n"
1N/A  => "Couldn't find file \"[_1]\"!\n",
1N/A
1N/ABut I consider that somewhat of a distraction from the work
1N/Aof getting the main code working -- to say nothing of the fact
1N/Athat I often have to play with the program a few times before
1N/AI can decide exactly what wording I want in the messages (which
1N/Ain this case would require me to go changing three lines of code:
1N/Athe call to maketext with that key, and then the two lines in
1N/AThisProject/I18N/en.pm).
1N/A
However, if you set "_AUTO => 1" in the %Lexicon in
ThisProject/I18N/en.pm (assuming that English (en) is
the language that all your programmers will be using for this
project's internal message keys), then you don't ever have to
go adding lines like this
1N/A
1N/A  "Couldn't find file \"[_1]\"!\n"
1N/A  => "Couldn't find file \"[_1]\"!\n",
1N/A
1N/Ato ThisProject/I18N/en.pm, because if _AUTO is true there,
1N/Athen just looking for an entry with the key "Couldn't find
1N/Afile \"[_1]\"!\n" in that lexicon will cause it to be added,
1N/Awith that value!
1N/A
1N/ANote that the reason that keys that start with "_"
1N/Aare immune to _AUTO isn't anything generally magical about
1N/Athe underscore character -- I just wanted a way to have most
1N/Alexicon keys be autoable, except for possibly a few, and I
1N/Aarbitrarily decided to use a leading underscore as a signal
1N/Ato distinguish those few.
1N/A
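The behavior described in this section can be sketched in one
self-contained script.  The AutoDemo class names are hypothetical, and
the packages are declared inline only to keep the sketch in one file;
a real project would keep them in their own .pm files.

```perl
use strict;
use warnings;

{
    package AutoDemo::I18N;             # hypothetical project class
    use base 'Locale::Maketext';
}
{
    package AutoDemo::I18N::en;         # hypothetical language class
    our @ISA     = ('AutoDemo::I18N');
    our %Lexicon = ( '_AUTO' => 1 );    # nothing else is defined
}

my $lh = AutoDemo::I18N->get_handle('en') or die "What language?";

# There is no entry for this key, but _AUTO lets the key act as
# its own value.
my $msg = $lh->maketext(qq{Couldn't find file "[_1]"!\n}, 'notes.txt');
print $msg;

# Keys starting with an underscore are never auto-made, so this
# lookup fails and maketext dies.
my $ok = eval { $lh->maketext('_private_key'); 1 };
print "Lookup of '_private_key' failed, as expected\n" unless $ok;
```
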
1N/A=head1 CONTROLLING LOOKUP FAILURE
1N/A
If you call $lh->maketext(I<key>, ...parameters...),
and there's no entry I<key> in $lh's class's %Lexicon, nor
in the superclass %Lexicon hash, I<and> if we can't auto-make
I<key> (because either it starts with a "_", or because none
of its lexicons have C<_AUTO =E<gt> 1>), then we have
failed to find a normal way to maketext I<key>.  What then
happens in these failure conditions depends on the $lh object's
"fail" attribute.
1N/A
If the language handle has no "fail" attribute, maketext
will simply throw an exception (i.e., it calls C<die>, mentioning
the I<key> whose lookup failed, and naming the line number where
the calling $lh->maketext(I<key>,...) was).
1N/A
1N/AIf the language handle has a "fail" attribute whose value is a
1N/Acoderef, then $lh->maketext(I<key>,...params...) gives up and calls:
1N/A
1N/A  return &{$that_subref}($lh, $key, @params);
1N/A
1N/AOtherwise, the "fail" attribute's value should be a string denoting
1N/Aa method name, so that $lh->maketext(I<key>,...params...) can
1N/Agive up with:
1N/A
  return $lh->$that_method_name($key, @params);
1N/A
1N/AThe "fail" attribute can be accessed with the C<fail_with> method:
1N/A
1N/A  # Set to a coderef:
1N/A  $lh->fail_with( \&failure_handler );
1N/A
1N/A  # Set to a method name:
1N/A  $lh->fail_with( 'failure_method' );
1N/A
1N/A  # Set to nothing (i.e., so failure throws a plain exception)
1N/A  $lh->fail_with( undef );
1N/A
1N/A  # Simply read:
1N/A  $handler = $lh->fail_with();
1N/A
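Here is a small sketch of a coderef handler in use.  The FailDemo
class names and the "?? ... ??" marker format are assumptions made up
for this example, not anything Maketext prescribes.

```perl
use strict;
use warnings;

{
    package FailDemo::L10N;             # hypothetical project class
    use base 'Locale::Maketext';
}
{
    package FailDemo::L10N::en;         # hypothetical language class
    our @ISA     = ('FailDemo::L10N');
    our %Lexicon = ( 'Hello!' => 'Hello!' );    # no _AUTO here
}

my $lh = FailDemo::L10N->get_handle('en') or die "What language?";

# Remember every key that fails, and return a visible marker
# instead of dying.
my @missing;
$lh->fail_with( sub {
    my ($failing_lh, $key, @params) = @_;
    push @missing, $key;
    return "?? $key ??";
} );

my $known   = $lh->maketext('Hello!');      # found in the lexicon
my $unknown = $lh->maketext('Goodbye!');    # falls through to the handler

print "$known\n$unknown\n";
print "Missing keys: @missing\n";
```
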
1N/ANow, as to what you may want to do with these handlers:  Maybe you'd
1N/Awant to log what key failed for what class, and then die.  Maybe
1N/Ayou don't like C<die> and instead you want to send the error message
1N/Ato STDOUT (or wherever) and then merely C<exit()>.
1N/A
1N/AOr maybe you don't want to C<die> at all!  Maybe you could use a
1N/Ahandler like this:
1N/A
  # Make all lookups fall back onto an English value,
  #  but after we log it for later fingerpointing.
  my $lh_backup = ThisProject->get_handle('en');
  open(LEX_FAIL_LOG, ">>wherever/lex.log") || die "GNAARGH $!";
  sub lex_fail {
    my($failing_lh, $key, @params) = @_;
    print LEX_FAIL_LOG scalar(localtime), "\t",
       ref($failing_lh), "\t", $key, "\n";
    return $lh_backup->maketext($key,@params);
  }
1N/A
Some users have said that they think this whole mechanism of
having a "fail" attribute at all seems a rather pointless complication.
But I want Locale::Maketext to be usable for software projects of I<any>
scale and type; and different software projects have different ideas
of what the right thing is to do in failure conditions.  I could simply
say that failure always throws an exception, and that if you want to be
careful, you'll just have to wrap every call to $lh->maketext in an
S<eval { }>.  However, I want programmers to reserve the right (via
the "fail" attribute) to treat lookup failure as something other than
an exception of the same level of severity as a config file being
unreadable, or some essential resource being inaccessible.
1N/A
1N/AOne possibly useful value for the "fail" attribute is the method name
1N/A"failure_handler_auto".  This is a method defined in class
1N/ALocale::Maketext itself.  You set it with:
1N/A
1N/A  $lh->fail_with('failure_handler_auto');
1N/A
1N/AThen when you call $lh->maketext(I<key>, ...parameters...) and
1N/Athere's no I<key> in any of those lexicons, maketext gives up with
1N/A
1N/A  return $lh->failure_handler_auto($key, @params);
1N/A
But failure_handler_auto, instead of dying or anything, compiles
$key, caching it in $lh->{'failure_lex'}{$key} = $compiled,
and then calls the compiled value, and returns that.  (I.e., if
$key looks like bracket notation, $compiled is a sub, and we return
&{$compiled}(@params); but if $key is just a plain string, we just
return that.)
1N/A
The effect of using "failure_handler_auto"
is like an AUTO lexicon, except that 1) it compiles $key even if
it starts with "_", and 2) you have a record in the new hashref
$lh->{'failure_lex'} of all the keys that have failed for
this object.  This should avoid your program dying -- as long
as your keys aren't actually invalid as bracket code, and as
long as they don't try calling methods that don't exist.
1N/A
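A short sketch of that handler at work follows.  The AutoFail class
names are hypothetical; the only Maketext features it relies on are
C<fail_with>, C<failure_handler_auto>, and the 'failure_lex' hashref
described above.

```perl
use strict;
use warnings;

{
    package AutoFail::L10N;             # hypothetical project class
    use base 'Locale::Maketext';
}
{
    package AutoFail::L10N::en;         # hypothetical language class
    our @ISA     = ('AutoFail::L10N');
    our %Lexicon = ( 'Hello!' => 'Hello!' );    # note: no _AUTO
}

my $lh = AutoFail::L10N->get_handle('en') or die "What language?";
$lh->fail_with('failure_handler_auto');

# This key is in no lexicon, so the handler compiles it as bracket
# notation and runs the result with our parameter.
my $text = $lh->maketext('You have [quant,_1,new message].', 3);
print "$text\n";

# Every key that failed is recorded on the handle itself.
my @failed_keys = sort keys %{ $lh->{'failure_lex'} || {} };
print "Failed key: $_\n" for @failed_keys;
```
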
"failure_handler_auto" may not be exactly what you want, but I
hope it at least shows you that maketext failure can be mitigated
in any number of very flexible ways.  If you can formalize exactly
what you want, you should be able to express that as a failure
handler.  You can even make it default for every object of a given
class, by setting it in that class's init:
1N/A
  sub init {
    my $lh = $_[0];  # a newborn handle
    $lh->SUPER::init();
    $lh->fail_with('my_clever_failure_handler');
    return;
  }
  sub my_clever_failure_handler {
    # ...your clever things here...
  }
1N/A
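To make that concrete, here is one possible filling-in of the
skeleton above.  The InitDemo class names, the method name
C<mark_untranslated>, and its "<<...>>" output format are all
invented for this sketch.

```perl
use strict;
use warnings;

{
    package InitDemo::L10N;             # hypothetical project class
    use base 'Locale::Maketext';

    # Every handle made from this class (or its language classes)
    # gets the same failure handler from birth.
    sub init {
        my $lh = $_[0];
        $lh->SUPER::init();
        $lh->fail_with('mark_untranslated');
        return;
    }

    # Invented handler: flag the key visibly instead of dying.
    sub mark_untranslated {
        my ($lh, $key, @params) = @_;
        return "<<$key>>";
    }
}
{
    package InitDemo::L10N::en;         # hypothetical language class
    our @ISA     = ('InitDemo::L10N');
    our %Lexicon = ( 'Checkout' => 'Checkout' );
}

my $lh = InitDemo::L10N->get_handle('en') or die "What language?";

my $handler = $lh->fail_with();         # already set by init
my $flagged = $lh->maketext('Empty cart');
print "$handler\n$flagged\n";
```
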
1N/A=head1 HOW TO USE MAKETEXT
1N/A
1N/AHere is a brief checklist on how to use Maketext to localize
1N/Aapplications:
1N/A
1N/A=over
1N/A
1N/A=item *
1N/A
1N/ADecide what system you'll use for lexicon keys.  If you insist,
1N/Ayou can use opaque IDs (if you're nostalgic for C<catgets>),
1N/Abut I have better suggestions in the
1N/Asection "Entries in Each Lexicon", above.  Assuming you opt for
1N/Ameaningful keys that double as values (like "Minimum ([_1]) is
1N/Alarger than maximum ([_2])!\n"), you'll have to settle on what
1N/Alanguage those should be in.  For the sake of argument, I'll
1N/Acall this English, specifically American English, "en-US".
1N/A
1N/A=item *
1N/A
1N/ACreate a class for your localization project.  This is
1N/Athe name of the class that you'll use in the idiom:
1N/A
1N/A  use Projname::L10N;
1N/A  my $lh = Projname::L10N->get_handle(...) || die "Language?";
1N/A
Assuming you call your class Projname::L10N, create a class
consisting minimally of:
1N/A
  package Projname::L10N;
  use base qw(Locale::Maketext);
  ...any methods you might want all your languages to share...

  # And, assuming you want the base class to be an _AUTO lexicon,
  # as is discussed a few sections up:
  %Lexicon = ('_AUTO' => 1);

  1;
1N/A
1N/A=item *
1N/A
1N/ACreate a class for the language your internal keys are in.  Name
1N/Athe class after the language-tag for that language, in lowercase,
1N/Awith dashes changed to underscores.  Assuming your project's first
1N/Alanguage is US English, you should call this Projname::L10N::en_us.
1N/AIt should consist minimally of:
1N/A
1N/A  package Projname::L10N::en_us;
1N/A  use base qw(Projname::L10N);
1N/A  %Lexicon = (
1N/A    '_AUTO' => 1,
1N/A  );
1N/A  1;
1N/A
(For the rest of this section, I'll assume that this "first
language class" of Projname::L10N::en_us has an
_AUTO lexicon.)
1N/A
1N/A=item *
1N/A
1N/AGo and write your program.  Everywhere in your program where
1N/Ayou would say:
1N/A
1N/A  print "Foobar $thing stuff\n";
1N/A
1N/Ainstead do it thru maketext, using no variable interpolation in
1N/Athe key:
1N/A
1N/A  print $lh->maketext("Foobar [_1] stuff\n", $thing);
1N/A
1N/AIf you get tired of constantly saying C<print $lh-E<gt>maketext>,
1N/Aconsider making a functional wrapper for it, like so:
1N/A
  use Projname::L10N;
  use vars qw($lh);
  $lh = Projname::L10N->get_handle(...) || die "Language?";
  sub pmt (@) { print( $lh->maketext(@_)) }
   # "pmt" is short for "Print MakeText"
  $Carp::Verbose = 1;
   # so if maketext fails, we see what made the call to pmt
1N/ABesides whole phrases meant for output, anything language-dependent
1N/Ashould be put into the class Projname::L10N::en_us,
1N/Awhether as methods, or as lexicon entries -- this is discussed
1N/Ain the section "Entries in Each Lexicon", above.
1N/A
1N/A=item *
1N/A
1N/AOnce the program is otherwise done, and once its localization for
1N/Athe first language works right (via the data and methods in
1N/AProjname::L10N::en_us), you can get together the data for translation.
1N/AIf your first language lexicon isn't an _AUTO lexicon, then you already
1N/Ahave all the messages explicitly in the lexicon (or else you'd be
1N/Agetting exceptions thrown when you call $lh->maketext to get
1N/Amessages that aren't in there).  But if you were (advisedly) lazy and are
1N/Ausing an _AUTO lexicon, then you've got to make a list of all the phrases
1N/Athat you've so far been letting _AUTO generate for you.  There are very
1N/Amany ways to assemble such a list.  The most straightforward is to simply
1N/Agrep the source for every occurrence of "maketext" (or calls
1N/Ato wrappers around it, like the above C<pmt> function), and to log the
1N/Afollowing phrase.
1N/A
1N/A=item *
1N/A
You may at this point want to consider whether your base class
(Projname::L10N) that all lexicons inherit from (Projname::L10N::en,
Projname::L10N::es, etc.) should be an _AUTO lexicon.  It may be true
that in theory, all needed messages will be in each language class;
but in the presumably unlikely or "impossible" case of lookup failure,
you should consider whether your program should throw an exception,
emit text in English (or whatever your project's first language is),
or use some more complex solution as described in the section
"Controlling Lookup Failure", above.
1N/A
1N/A=item *
1N/A
1N/ASubmit all messages/phrases/etc. to translators.
1N/A
(You may, in fact, want to start with localizing to I<one> other language
at first, if you're not sure that you've properly abstracted the
language-dependent parts of your code.)
1N/A
1N/ATranslators may request clarification of the situation in which a
1N/Aparticular phrase is found.  For example, in English we are entirely happy
1N/Asaying "I<n> files found", regardless of whether we mean "I looked for files,
1N/Aand found I<n> of them" or the rather distinct situation of "I looked for
1N/Asomething else (like lines in files), and along the way I saw I<n>
1N/Afiles."  This may involve rethinking things that you thought quite clear:
1N/Ashould "Edit" on a toolbar be a noun ("editing") or a verb ("to edit")?  Is
1N/Athere already a conventionalized way to express that menu option, separate
1N/Afrom the target language's normal word for "to edit"?
1N/A
In all cases where the very common phenomenon of quantification
(saying "I<N> files", for B<any> value of N)
is involved, each translator should make clear what dependencies the
number causes in the sentence.  In many cases, dependency is
limited to words adjacent to the number, in places where you might
expect them ("I found the-?PLURAL I<N>
empty-?PLURAL directory-?PLURAL"), but in some cases there are
unexpected dependencies ("I found-?PLURAL ..."!) as well as long-distance
dependencies ("The I<N> directory-?PLURAL could not be deleted-?PLURAL"!).
1N/A
1N/ARemind the translators to consider the case where N is 0:
1N/A"0 files found" isn't exactly natural-sounding in any language, but it
1N/Amay be unacceptable in many -- or it may condition special
1N/Akinds of agreement (similar to English "I didN'T find ANY files").
1N/A
1N/ARemember to ask your translators about numeral formatting in their
1N/Alanguage, so that you can override the C<numf> method as
1N/Aappropriate.  Typical variables in number formatting are:  what to
1N/Ause as a decimal point (comma? period?); what to use as a thousands
1N/Aseparator (space? nonbreaking space? comma? period? small
1N/Amiddot? prime? apostrophe?); and even whether the so-called "thousands
1N/Aseparator" is actually for every third digit -- I've heard reports of
1N/Atwo hundred thousand being expressible as "2,00,000" for some Indian
1N/A(Subcontinental) languages, besides the less surprising "S<200 000>",
1N/A"200.000", "200,000", and "200'000".  Also, using a set of numeral
1N/Aglyphs other than the usual ASCII "0"-"9" might be appreciated, as via
1N/AC<tr/0-9/\x{0966}-\x{096F}/> for getting digits in Devanagari script
1N/A(for Hindi, Konkani, others).
1N/A
1N/AThe basic C<quant> method that Locale::Maketext provides should be
1N/Agood for many languages.  For some languages, it might be useful
1N/Ato modify it (or its constituent C<numerate> method)
1N/Ato take a plural form in the two-argument call to C<quant>
1N/A(as in "[quant,_1,files]") if
1N/Ait's all-around easier to infer the singular form from the plural, than
1N/Ato infer the plural form from the singular.
1N/A
But for other languages (as is discussed at length
in L<Locale::Maketext::TPJ13|Locale::Maketext::TPJ13>), simple
C<quant>/C<numerate> is not enough.  For the particularly problematic
Slavic languages, what you may need is a method which you provide
with the number, the citation form of the noun to quantify, and
the case and gender that the sentence's syntax projects onto that
noun slot.  The method would then be responsible for determining
what grammatical number that numeral projects onto its noun phrase,
and what case and gender it may override the normal case and gender
with; and then it would look up the noun in a lexicon providing
all needed inflected forms.
1N/A
1N/A=item *
1N/A
You may also wish to discuss with the translators the question of
how to relate different subforms of the same language tag,
considering how this interacts with C<get_handle>'s treatment of
these.  For example, if a user accepts interfaces in "en, fr", and
you have interfaces available in "en-US" and "fr", what should
they get?  You may wish to resolve this by establishing that "en"
and "en-US" are effectively synonymous, by having one class
zero-derive from the other.
1N/A
1N/AFor some languages this issue may never come up (Danish is rarely
1N/Aexpressed as "da-DK", but instead is just "da").  And for other
1N/Alanguages, the whole concept of a "generic" form may verge on
1N/Abeing uselessly vague, particularly for interfaces involving voice
1N/Amedia in forms of Arabic or Chinese.
1N/A
1N/A=item *
1N/A
1N/AOnce you've localized your program/site/etc. for all desired
1N/Alanguages, be sure to show the result (whether live, or via
1N/Ascreenshots) to the translators.  Once they approve, make every
1N/Aeffort to have it then checked by at least one other speaker of
1N/Athat language.  This holds true even when (or especially when) the
1N/Atranslation is done by one of your own programmers.  Some
1N/Akinds of systems may be harder to find testers for than others,
1N/Adepending on the amount of domain-specific jargon and concepts
1N/Ainvolved -- it's easier to find people who can tell you whether
1N/Athey approve of your translation for "delete this message" in an
1N/Aemail-via-Web interface, than to find people who can give you
1N/Aan informed opinion on your translation for "attribute value"
1N/Ain an XML query tool's interface.
1N/A
1N/A=back
1N/A
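To illustrate the earlier point in the checklist about overriding
C<numf> per language, here is a minimal sketch.  The NumfDemo class
names are hypothetical, the German phrasing is only an example, and
the override assumes plain integer input and a period as the
thousands separator; a real one would follow your translators' advice.

```perl
use strict;
use warnings;

{
    package NumfDemo::L10N;             # hypothetical project class
    use base 'Locale::Maketext';
}
{
    package NumfDemo::L10N::de;         # hypothetical language class
    our @ISA     = ('NumfDemo::L10N');
    our %Lexicon = (
        'Found [quant,_1,file].' => '[quant,_1,Datei,Dateien] gefunden.',
    );

    # Override numf: group digits in threes with "." rather than ",".
    # Assumes $num is a plain integer.
    sub numf {
        my ($lh, $num) = @_;
        my $text = "$num";
        1 while $text =~ s/^([-+]?\d+)(\d{3})/$1.$2/;
        return $text;
    }
}

my $lh = NumfDemo::L10N->get_handle('de') or die "What language?";

# quant formats the number via numf, so the override takes effect
# without touching the lexicon entries.
my $many = $lh->maketext('Found [quant,_1,file].', 1234567);
my $one  = $lh->maketext('Found [quant,_1,file].', 1);
print "$many\n$one\n";
```
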
1N/A=head1 SEE ALSO
1N/A
1N/AI recommend reading all of these:
1N/A
L<Locale::Maketext::TPJ13|Locale::Maketext::TPJ13> -- my I<The Perl
Journal> article about Maketext.  It explains many important concepts
underlying Locale::Maketext's design, and gives some insight into why
Maketext is better than the plain old approach of just having
message catalogs that are just databases of sprintf formats.
1N/A
1N/AL<File::Findgrep|File::Findgrep> is a sample application/module
1N/Athat uses Locale::Maketext to localize its messages.  For a larger
1N/Ainternationalized system, see also L<Apache::MP3>.
1N/A
1N/AL<I18N::LangTags|I18N::LangTags>.
1N/A
1N/AL<Win32::Locale|Win32::Locale>.
1N/A
RFC 3066, I<Tags for the Identification of Languages>,
is at http://sunsite.dk/RFC/rfc/rfc3066.html
1N/A
1N/ARFC 2277, I<IETF Policy on Character Sets and Languages>
1N/Ais at http://sunsite.dk/RFC/rfc/rfc2277.html -- much of it is
1N/Ajust things of interest to protocol designers, but it explains
1N/Asome basic concepts, like the distinction between locales and
1N/Alanguage-tags.
1N/A
The manual for GNU C<gettext>.  The gettext dist is available in
C<ftp://prep.ai.mit.edu/pub/gnu/> -- get
a recent gettext tarball and look in its "doc/" directory; there's
an easily browsable HTML version in there.  The
gettext documentation asks lots of questions worth thinking
about, even if some of their answers are sometimes wonky,
particularly where they start talking about pluralization.
1N/A
The Locale/Maketext.pm source.  Observe that the module is much
shorter than its documentation!
1N/A
1N/A=head1 COPYRIGHT AND DISCLAIMER
1N/A
1N/ACopyright (c) 1999-2004 Sean M. Burke.  All rights reserved.
1N/A
1N/AThis library is free software; you can redistribute it and/or modify
1N/Ait under the same terms as Perl itself.
1N/A
1N/AThis program is distributed in the hope that it will be useful, but
1N/Awithout any warranty; without even the implied warranty of
1N/Amerchantability or fitness for a particular purpose.
1N/A
1N/A=head1 AUTHOR
1N/A
1N/ASean M. Burke C<sburke@cpan.org>
1N/A
1N/A=cut