bibledit-general

Re: [be] unicode hyphen


From: Birch Champeon
Subject: Re: [be] unicode hyphen
Date: Fri, 6 Nov 2009 04:32:55 -0500

Thanks for all the info, guys.  I'll ask a question on the Pango list
and then post a bug if needed.

On Fri, Nov 6, 2009 at 3:52 AM, Teus Benschop <address@hidden> wrote:
> The spelling checker library that is being used is enchant. However, the
> word boundaries are determined using the functions
> "gtk_text_iter_forward_word_end" and "gtk_text_iter_backward_word_start"
> as provided in the Gtk library. The information on one of these
> functions says: "Word breaks are determined by Pango and should be
> correct for nearly any language (if not, the correct fix would be to the
> Pango word break algorithms).". Since Pango determines the word breaks,
> what might help is to report this to the Pango project. It is at
> http://www.pango.org/. Hope that this helps. Teus.
>
>
> On Thu, 2009-11-05 at 23:35 -0500, Doug Glidden wrote:
>> Hmm, that's a tough one.  As far as I can tell, Unicode does not
>> specify any hyphen character that must not act as a word boundary
>> (except for the soft hyphen, but that would not fulfill your needs
>> because it is not visible except when it is followed by a line break);
>> pretty much any of the hyphen characters may be tailored not to act as
>> a word boundary, though.  I would say that should be something that
>> the localization of the spell checker should take care of, so if none
>> of the characters works (see the list below—in particular notice that
>> the actual hyphen Unicode character is not what you get when you type
>> a hyphen on your keyboard; that is instead the hyphen-minus
>> character), you may want to submit this as a bug/feature request with
>> the project for whatever spell checking library is used by Bibledit
>> (I'm assuming BE uses one of the many open-source spelling libraries
>> and not a home-grown one).  In particular, I would say that a person
>> who uses a "non-breaking hyphen" probably typically expects its
>> "non-breaking" aspect to apply to words as well as lines (although in
>> reality the Unicode standard requires only that a non-breaking hyphen
>> prevent line breaks).
>>
>> Doug
>>
>> P.S.  The main general-purpose hyphen characters in Unicode are as follows:
>>
>> Hyphen or minus sign (hyphen-minus or hyphus) - U+002D
>> Soft (or discretionary) hyphen - U+00AD
>> Hyphen - U+2010
>> Non-breaking hyphen - U+2011
>> Hyphen bullet - U+2043
>>
>> See also the set of dash characters (U+2012 through U+2015), but note
>> that these are not the same size as a standard hyphen.
>>
>> On Thu, Nov 5, 2009 at 9:55 PM, Birch Champeon
>> <address@hidden> wrote:
>>         I'm working with a bunch of languages that use the hyphen within
>>         their words.  The hyphen is seen as a word break in BE.  Does
>>         anyone know if there is a Unicode character that looks like a
>>         hyphen, but will be viewed as a regular character?  I've tried a
>>         non-breaking hyphen and the spellchecker still sees it as a word
>>         break.
>>
>>         Thanks
>>         Birch
>>
>>
>>
>
>
>
>
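To illustrate the "tailoring" Doug mentions in the quoted message, here is a minimal Python sketch of a word splitter that treats a hyphen between two letters as part of the word. It is not how Bibledit, GTK, or Pango split words; the names HYPHENS, WORD_RE and split_words are made up for this example, and the letter pattern is a simplification (it does not handle combining marks, for instance).

```python
import re

# Hyphen characters taken from the list quoted in this thread.
HYPHENS = "\u002D\u00AD\u2010\u2011\u2043"

# [^\W\d_] matches roughly "any Unicode letter" in Python's re module.
# A word is a run of letters, optionally joined by single hyphens.
WORD_RE = re.compile(
    r"[^\W\d_]+(?:[" + re.escape(HYPHENS) + r"][^\W\d_]+)*"
)


def split_words(text):
    """Return the words in text, keeping word-internal hyphens."""
    return WORD_RE.findall(text)


if __name__ == "__main__":
    # Hypothetical sample text with a hyphen-minus and a non-breaking hyphen.
    print(split_words("ka-na ma\u2011lo - end"))
```

A hyphen that stands alone or starts a word is still dropped, so only hyphens with a letter on both sides are kept inside a word.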

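Since the hyphen typed on a keyboard is the hyphen-minus rather than U+2010, it can help to check which character actually ended up in the text. The sketch below uses Python's standard unicodedata module to print the official name and general category of each code point from Doug's list; HYPHEN_CODE_POINTS and describe are names chosen only for this example.

```python
import unicodedata

# Code points from the hyphen list quoted in this thread.
HYPHEN_CODE_POINTS = [0x002D, 0x00AD, 0x2010, 0x2011, 0x2043]


def describe(cp):
    """Format a code point with its Unicode name and general category."""
    ch = chr(cp)
    return "U+%04X %s (%s)" % (cp, unicodedata.name(ch), unicodedata.category(ch))


if __name__ == "__main__":
    for cp in HYPHEN_CODE_POINTS:
        print(describe(cp))
```

Calling describe(ord(c)) on a character copied out of the text shows which of these hyphens it really is.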


