2362N/A * Copyright (c) 2009, Oracle and/or its affiliates. All rights reserved. 1280N/A * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 1280N/A * This code is free software; you can redistribute it and/or modify it 1280N/A * under the terms of the GNU General Public License version 2 only, as 2362N/A * published by the Free Software Foundation. Oracle designates this 1280N/A * particular file as subject to the "Classpath" exception as provided 2362N/A * by Oracle in the LICENSE file that accompanied this code. 1280N/A * This code is distributed in the hope that it will be useful, but WITHOUT 1280N/A * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 1280N/A * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License 1280N/A * version 2 for more details (a copy is included in the LICENSE file that 1280N/A * You should have received a copy of the GNU General Public License version 1280N/A * 2 along with this work; if not, write to the Free Software Foundation, 1280N/A * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 2362N/A * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA 2362N/A * or visit www.oracle.com if you need additional information or have any 1280N/A ******************************************************************************* 1280N/A * (C) Copyright IBM Corp. and others, 1996-2009 - All Rights Reserved * 1280N/A * The original version of this source code and documentation is copyrighted * 1280N/A * and owned by IBM, These materials are provided under terms of a License * 1280N/A * Agreement between IBM and Sun. This technology is protected by multiple * 1280N/A * US and International patents. This notice and attribution to IBM may not * 1280N/A ******************************************************************************* 1280N/A/* Written by Simon Montagu, Matitiahu Allouche 1280N/A * (ported from C code written by Markus W. Scherer) 1280N/A * General remarks about the functions in this file: 1280N/A * These functions deal with the aspects of potentially mixed-directional 1280N/A * text in a single paragraph or in a line of a single paragraph 1280N/A * which has already been processed according to 1280N/A * the Unicode 3.0 Bidi algorithm as defined in 1280N/A * also described in The Unicode Standard, Version 4.0.1 . 1280N/A * This means that there is a Bidi object with a levels 1280N/A * paraLevel and direction are also set. 1280N/A * Only if the length of the text is zero, then levels==dirProps==NULL. 1280N/A * The overall directionality of the paragraph 1280N/A * or line is used to bypass the reordering steps if possible. 1280N/A * Even purely RTL text does not need reordering there because 1280N/A * index on the fly in such a case. 1280N/A * The implementation of the access to same-level-runs and of the reordering 1280N/A * do attempt to provide better performance and less memory usage compared to 1280N/A * a direct implementation of especially rule (L2) with an array of 1280N/A * one (32-bit) integer per text character. 1280N/A * Here, the levels array is scanned as soon as necessary, and a vector of 1280N/A * same-level-runs is created. Reordering then is done on this vector. 1280N/A * For each run of text positions that were resolved to the same level, 1280N/A * only 8 bytes are stored: the first text position of the run and the visual 1280N/A * position behind the run after reordering. 1280N/A * One sign bit is used to hold the directionality of the run. 1280N/A * This is inefficient if there are many very short runs. If the average run 1280N/A * length is <2, then this uses more memory. 1280N/A * In a further attempt to save memory, the levels array is never changed 1280N/A * after all the resolution rules (Xn, Wn, Nn, In). 1280N/A * Many methods have to consider the field trailingWSStart: 1280N/A * if it is less than length, then there is an implicit trailing run 1280N/A * which is not reflected in the levels array. 1280N/A * This allows a line Bidi object to use the same levels array as 1280N/A * its paragraph parent object. 1280N/A * When a Bidi object is created for a line of a paragraph, then the 1280N/A * paragraph's levels and dirProps arrays are reused by way of setting 1280N/A * a pointer into them, not by copying. This again saves memory and forbids to 1280N/A * change the now shared levels for (L1). 1280N/A /* handle trailing WS (L1) -------------------------------------------------- */ 1280N/A * setTrailingWSStart() sets the start index for a trailing 1280N/A * run of WS in the line. This is necessary because we do not modify 1280N/A * the paragraph's levels array that we just point into. 1280N/A * Using trailingWSStart is another form of performing (L1). 1280N/A * To make subsequent operations easier, we also include the run 1280N/A * before the WS if it is at the paraLevel - we merge the two here. 1280N/A * This method is called only from setLine(), so paraLevel is 1280N/A * set correctly for the line even when contextual multiple paragraphs. 1280N/A /* If the line is terminated by a block separator, all preceding WS etc... 1280N/A are already set to paragraph level. 1280N/A Setting trailingWSStart to pBidi->length will avoid changing the 1280N/A level of B chars from 0 to paraLevel in getLevels when 1280N/A /* go backwards across all WS, BN, explicit codes */ 1280N/A /* if the WS run can be merged with the previous run then do so here */ 1280N/A /* set the values in lineBidi from its paraBidi parent */ 1280N/A /* class members are already initialized to 0 */ 1280N/A // lineBidi.paraBidi = null; /* mark unfinished setLine */ 1280N/A // lineBidi.controlCount = 0; 1280N/A /* copy proper subset of DirProps */ 1280N/A /* copy proper subset of Levels */ 1280N/A /* the parent is already trivial */ 1280N/A * The parent's levels are all either 1280N/A * implicitly or explicitly ==paraLevel; 1280N/A /* recalculate lineBidi.direction */ 1280N/A /* all levels are at paraLevel */ 1280N/A /* get the level of the first character */ 1280N/A /* if there is anything of a different level, then the line 1280N/A /* the trailing WS is at paraLevel, which differs from 1280N/A /* see if levels[1..trailingWSStart-1] have the same 1280N/A direction as levels[0] and paraLevel */ 1280N/A /* the direction values match those in level */ 1280N/A /* make sure paraLevel is even */ 1280N/A /* all levels are implicitly at paraLevel (important for 1280N/A /* make sure paraLevel is odd */ 1280N/A /* all levels are implicitly at paraLevel (important for 1280N/A /* return paraLevel if in the trailing WS run, otherwise the real level */ 1280N/A /* the current levels array does not reflect the WS run */ 1280N/A * After the previous if(), we know that the levels array 1280N/A * has an implicit trailing WS run and therefore does not fully 1280N/A * reflect itself all the levels. 1280N/A * This must be a Bidi object for a line, and 1280N/A * we need to create a new levels array. 1280N/A /* bidiBase.paraLevel is ok even if contextual multiple paragraphs, 1280N/A since bidiBase is a line object */ 1280N/A /* this new levels array is set for the line and reflects the WS run */ 1280N/A /* this is done based on runs rather than on levels since levels have 1280N/A a special interpretation when REORDER_RUNS_ONLY 1280N/A /* in trivial cases there is only one trivial run; called by getRuns() */ 1280N/A /* simple, single-run case */ 1280N/A /* fill and reorder the single run */ 1280N/A /* reorder the runs array (L2) ---------------------------------------------- */ 1280N/A * Reorder the same-level runs in the runs array. 1280N/A * Here, runCount>1 and maxLevel>=minLevel>=paraLevel. 1280N/A * All the visualStart fields=logical start before reordering. 1280N/A * The "odd" bits are not set yet. 1280N/A * Reordering with this data structure lends itself to some handy shortcuts: 1280N/A * Since each run is moved but not modified, and since at the initial maxLevel 1280N/A * each sequence of same-level runs consists of only one run each, we 1280N/A * don't need to do anything there and can predecrement maxLevel. 1280N/A * In many simple cases, the reordering is thus done entirely in the 1280N/A * Also, reordering occurs only down to the lowest odd level that occurs, 1280N/A * which is minLevel|1. However, if the lowest level itself is odd, then 1280N/A * in the last reordering the sequence of the runs at this level or higher 1280N/A * will be all runs, and we don't need the elaborate loop to search for them. 1280N/A * This is covered by ++minLevel instead of minLevel|=1 followed 1280N/A * by an extra reorder-all after the reorder-some loop. 1280N/A * Such a run would need special treatment because its level is not 1280N/A * reflected in levels[] if this is not a paragraph object. 1280N/A * Instead, all characters from trailingWSStart on are implicitly at 1280N/A * However, for all maxLevel>paraLevel, this run will never be reordered 1280N/A * and does not need to be taken into account. maxLevel==paraLevel is only reordered 1280N/A * if minLevel==paraLevel is odd, which is done in the extra segment. 1280N/A * This means that for the main reordering loop we don't need to consider 1280N/A * this run and can --runCount. If it is later part of the all-runs 1280N/A * reordering, then runCount is adjusted accordingly. 1280N/A * Reorder only down to the lowest odd level 1280N/A * and reorder at an odd minLevel in a separate, simpler loop. 1280N/A * See comments above for why minLevel is always incremented. 1280N/A /* do not include the WS run at paraLevel<=old minLevel except in the simple loop */ 1280N/A /* loop for all sequences of runs */ 1280N/A /* look for a sequence of runs that are all at >=maxLevel */ 1280N/A /* look for the first run of such a sequence */ 1280N/A break;
/* no more such runs */ 1280N/A /* look for the limit run of such a sequence (the run behind it) */ 1280N/A /* Swap the entire sequence of runs from firstRun to limitRun-1. */ 1280N/A break;
/* no more such runs */ 1280N/A /* now do maxLevel==old minLevel (==odd!), see above */ 1280N/A /* include the trailing WS run in this complete reordering */ 1280N/A /* Swap the entire sequence of all runs. (endRun==runCount) */ 1280N/A /* compute the runs array --------------------------------------------------- */ 1280N/A /* we should never get here */ 1280N/A * Compute the runs array from the levels array. 1280N/A * After getRuns() returns true, runCount is guaranteed to be >0 1280N/A * and the runs are reordered. 1280N/A * Odd-level runs have visualStart on their visual right edge and 1280N/A * they progress visually to the left. 1280N/A * If option OPTION_INSERT_MARKS is set, insertRemove will contain the 1280N/A * If option OPTION_REMOVE_CONTROLS is set, insertRemove will contain the 1280N/A * negative number of BiDi control characters within this run. 1280N/A * This method returns immediately if the runs are already set. This 1280N/A * includes the case of length==0 (handled in setPara).. 1280N/A /* simple, single-run case - this covers length==0 */ 1280N/A /* bidiBase.paraLevel is ok even for contextual multiple paragraphs */ 1280N/A }
else /* BidiBase.MIXED, length>0 */ {
1280N/A * If there are WS characters at the end of the line 1280N/A * and the run preceding them has a level different from 1280N/A * paraLevel, then they will form their own run at paraLevel (L1). 1280N/A * We need some special treatment for this in order to not 1280N/A * modify the levels array which a line Bidi object shares 1280N/A * with its paragraph parent and its other line siblings. 1280N/A * In other words, for the trailing WS, it may be 1280N/A * levels[]!=paraLevel but we have to treat it like it were so. 1280N/A /* count the runs, there is at least one non-WS run, and limit>0 */ 1280N/A /* increment runCount at the start of each run */ 1280N/A * We don't need to see if the last run can be merged with a trailing 1280N/A * WS run because setTrailingWSStart() would have done that. 1280N/A /* There is only one non-WS run and no trailing WS-run. */ 1280N/A }
else /* runCount>1 || limit<length */ {
1280N/A /* allocate and set the runs */ 1280N/A /* now, count a (non-mergeable) WS run */ 1280N/A /* FOOD FOR THOUGHT: this could be optimized, e.g.: 1280N/A * 464->444, 484->444, 575->555, 595->555 1280N/A * However, that would take longer. Check also how it would 1280N/A * interact with BiDi control removal and inserting Marks. 1280N/A /* search for the run limits and initialize visualLimit values with the run lengths */ 1280N/A /* look for the run limit */ 1280N/A /* i is another run limit */ 1280N/A /* there is a separate WS run */ 1280N/A /* For the trailing WS run, bidiBase.paraLevel is ok even 1280N/A if contextual multiple paragraphs. */ 1280N/A /* set the object fields */ 1280N/A /* now add the direction flags and adjust the visualLimit's to be just that */ 1280N/A /* this loop will also handle the trailing WS run */ 1280N/A /* Set the embedding level for the trailing WS run. */ 1280N/A /* For a RTL paragraph, it will be the *first* run in visual order. */ 1280N/A /* For the trailing WS run, bidiBase.paraLevel is ok even if 1280N/A contextual multiple paragraphs. */ 1280N/A /* handle remove BiDi control characters */ 1280N/A /* determine minLevel and maxLevel */ 1280N/A /* initialize the index map */ 1280N/A /* reorder only down to the lowest odd level */ 1280N/A /* loop maxLevel..minLevel */ 1280N/A /* loop for all sequences of levels to reorder at the current maxLevel */ 1280N/A /* look for a sequence of levels that are all at >=maxLevel */ 1280N/A /* look for the first index of such a sequence */ 1280N/A break;
/* no more such runs */ 1280N/A /* look for the limit of such a sequence (the index behind it) */ 1280N/A * Swap the entire interval of indexes from start to limit-1. 1280N/A * We don't need to swap the levels for the purpose of this 1280N/A * algorithm: the sequence of levels that we look at does not 1280N/A break;
/* no more such sequences */ 1280N/A /* fill a visual-to-logical index map using the runs[] */ 1280N/A /* visualStart==visualLimit; */ 1280N/A /* count all inserted marks */ 1280N/A /* move back indexes by number of preceding marks */ 1280N/A /* move forward indexes by number of preceding controls */ 1280N/A /* if no control found yet, nothing to do in this run */ 1280N/A /* if no control in this run */