/*
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License, Version 1.0 only
 * (the "License").  You may not use this file except in compliance
 * with the License.
 *
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 */
/*
 * Copyright (c) 1999 by Sun Microsystems, Inc.
 * All rights reserved.
 */

#pragma ident "%Z%%M% %I% %E% SMI"

/*
 * UTF-8 encoded Unicode parsing routines. For efficiency, we convert
 * to wide chars only when absolutely needed. The following interfaces
 * are exported to libslp:
 *
 * slp_utf_strchr:	same semantics as strchr, but handles UTF-8 strings
 * slp_fold_space:	folds white space around and in between words;
 *			handles UTF-8 strings
 * slp_strcasecmp:	same semantics as strcasecmp, but also folds white
 *			space and attempts locale-specific
 *			case-insensitive comparisons.
 */

/*
 * Same semantics as strchr.
 * Assumes that we start on a char boundary, and that c is a 7-bit
 * ASCII char.
 */
	for (p = (char *)s; *p; p += len) {
/*
 * Folds white space around and in between words.
 * "  aa   bb  " becomes "aa bb".
 * Returns NULL if it couldn't allocate memory. The caller must free
 * the result when done.
 */
	/* step 1: skip white space */
	/* if we are in between words, keep one space */
	/* step 2: copy into folded until we hit more white space */

/*
 * Performs like strcasecmp, but also folds white space before comparing,
 * and will handle UTF-8 comparisons (including case). Note that the
 * application's locale must have been set to a UTF-8 locale for this
 * to work.
 */
	/* optimization: try simple case first */
	/* fold white space, and try again */
	/*
	 * try converting to wide char -- we must be in a locale which
	 * supports the UTF-8 codeset for this to work.
	 */
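/*
 * Illustrative sketch only, not the libslp implementation. It shows one
 * way the character search and the two-step white space folding described
 * in the comments above could be written. The names sketch_utf8_strchr,
 * sketch_fold_space, utf8_seq_len and is_ascii_space are hypothetical.
 * Assumptions: the input is well-formed UTF-8, only ASCII white space is
 * folded, and the character length is taken from the lead byte rather
 * than from the locale.
 */

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: matches ASCII white space only, so bytes of a
 * multi-byte UTF-8 sequence (all >= 0x80) are never treated as space.
 */
static int
is_ascii_space(unsigned char ch)
{
	return (ch == ' ' || ch == '\t' || ch == '\n' ||
	    ch == '\v' || ch == '\f' || ch == '\r');
}

/*
 * Hypothetical helper: byte length of the UTF-8 sequence introduced by
 * lead. Invalid lead bytes count as 1 so that a scan always advances.
 */
static size_t
utf8_seq_len(unsigned char lead)
{
	if (lead < 0x80)
		return (1);
	if ((lead & 0xE0) == 0xC0)
		return (2);
	if ((lead & 0xF0) == 0xE0)
		return (3);
	if ((lead & 0xF8) == 0xF0)
		return (4);
	return (1);
}

/* Like strchr for a 7-bit c, stepping one UTF-8 character at a time. */
char *
sketch_utf8_strchr(const char *s, char c)
{
	const char *p;
	size_t len, i;

	for (p = s; *p; p += len) {
		len = utf8_seq_len((unsigned char)*p);
		/* never step past the terminator of a truncated sequence */
		for (i = 1; i < len; i++) {
			if (p[i] == '\0') {
				len = i;
				break;
			}
		}
		if (len == 1 && *p == c)
			return ((char *)p);
	}
	/* strchr also finds the terminating NUL when c is '\0' */
	return (c == '\0' ? (char *)p : NULL);
}

/* Returns a malloc'ed folded copy of s, or NULL if allocation fails. */
char *
sketch_fold_space(const char *s)
{
	const char *p = s;
	char *folded, *out;

	/* the folded string is never longer than the input */
	folded = malloc(strlen(s) + 1);
	if (folded == NULL)
		return (NULL);
	out = folded;

	for (;;) {
		/* step 1: skip white space */
		while (is_ascii_space((unsigned char)*p))
			p++;
		if (*p == '\0')
			break;
		/* if we are in between words, keep one space */
		if (out != folded)
			*out++ = ' ';
		/* step 2: copy into folded until we hit more white space */
		while (*p != '\0' && !is_ascii_space((unsigned char)*p))
			*out++ = *p++;
	}
	*out = '\0';
	return (folded);
}
```

/*
 * Usage: sketch_fold_space("  aa   bb  ") yields a newly allocated
 * "aa bb" that the caller must free. Copying bytes unchanged is enough
 * for folding because ASCII white space never occurs inside a multi-byte
 * UTF-8 sequence.
 */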