bug-global

gtags bug report: issue with S-JIS encoding files


From: Johnny Cheng
Subject: gtags bug report: issue with S-JIS encoding files
Date: Fri, 17 Nov 2023 11:29:44 +0900

Hi,

I found that if a file contains a specific sequence of CJK characters, the parser seems to fail to continue parsing the file.

See the following example source file, say `test.c`, encoded in Shift-JIS (cp932).

extern void printf(char * msg, ...);

void Foo() {
    char msg[] = "機能";
    printf(msg);
}

void Hello() {
    return;
}

(In case the Kanji are garbled by encoding issues, screenshots are also provided below.)

  • What happened? (as is)

If you run `gtags` in the same folder, followed by `global -f test.c`, you get only one tag, `Foo`, although `Hello` should also be found.

  • What did you expect from it?

Both `Foo` and `Hello` should be listed. Indeed, if I modify the source slightly, the tag `Hello` is found. See the variations I tried in the table below.
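
A plausible explanation (an assumption, not confirmed against the gtags source): in cp932 the character 能 is encoded as the bytes 0x94 0x5C, and 0x5C is also the ASCII backslash. A scanner that walks a string literal byte by byte would then read the end of "機能" as an escaped quote and never see the literal close, so the rest of the file, including `Hello`, is swallowed. In UTF-8 the same text contains no 0x5C byte. The sketch below illustrates this with a hypothetical scanner, `naive_string_end`; it is not the actual gtags code.

```c
#include <assert.h>
#include <stddef.h>

/* The literal "機能"; as raw bytes.
 * cp932: 機 = 0x8B 0x40, 能 = 0x94 0x5C (0x5C is also ASCII '\'). */
const unsigned char sjis_literal[] = {
    '"', 0x8B, 0x40, 0x94, 0x5C, '"', ';'
};
/* UTF-8: 機 = 0xE6 0xA9 0x9F, 能 = 0xE8 0x83 0xBD (no 0x5C byte). */
const unsigned char utf8_literal[] = {
    '"', 0xE6, 0xA9, 0x9F, 0xE8, 0x83, 0xBD, '"', ';'
};

/* Hypothetical byte-oriented scanner (illustration only).
 * `start` is the index of the opening quote. Returns the index of the
 * closing quote, or `len` if the literal is never terminated. */
size_t naive_string_end(const unsigned char *buf, size_t len, size_t start)
{
    assert(buf != NULL && start < len && buf[start] == '"');
    for (size_t i = start + 1; i < len; i++) {
        if (buf[i] == '\\') {
            i++;            /* treat the next byte as escaped and skip it */
        } else if (buf[i] == '"') {
            return i;       /* closing quote found */
        }
    }
    return len;             /* ran off the end: literal never closed */
}
```

For the cp932 bytes the function returns the buffer length (no closing quote found), while for the UTF-8 bytes it returns index 7, which matches the bad and good cases described here.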


Cases Table

Cases       Source Code Screenshot                    global -f test.c
----------  ----------------------------------------  ------------------------------------------------
Bad Case    image001.png (cp932, or shift-jis)        Foo                 4 test.cpp         void Foo() {

Good Cases  <image001.png> (utf8)                     Foo                 4 test.cpp         void Foo() {
            image002.png (cp932, or shift-jis)        Hello               9 test.cpp         void Hello() {
            image003.png (cp932, or shift-jis)

My environment

OS: Windows 11 Enterprise 22H2 64bit Build 22621.2428

gtags --version:

    gtags (Global) 6.6.9
    Powered by Berkeley DB 1.85.
    Copyright (c) 1996-2022 Tama Communications Corporation
    License GPLv3+: GNU GPL version 3 or later http://www.gnu.org/licenses/gpl.html
    This is free software; you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.


Possible Solutions

  • Add a command-line option for specifying the file encoding, so that the file is read properly.
  • Find out why such a file cannot be fully parsed, ignore that particular error, and continue parsing.

Also, when this happens, at least print an error message to inform the user that some files were not fully parsed.
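
As a rough illustration of the second option, here is a minimal sketch of a string-literal scanner that is aware of Shift-JIS double-byte characters. It assumes the failure comes from a trail byte equal to 0x5C (ASCII backslash, as in 能 = 0x94 0x5C) being read as an escape character, which has not been confirmed against the gtags source. The names `is_sjis_lead_byte` and `sjis_string_end` are hypothetical and not part of gtags.

```c
#include <assert.h>
#include <stddef.h>

/* Shift-JIS (cp932) lead bytes of a double-byte character. */
int is_sjis_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical Shift-JIS-aware scanner (illustration only).
 * `start` is the index of the opening quote. Returns the index of the
 * closing quote, or `len` if the literal is never terminated, so the
 * caller can print a warning instead of silently dropping tags. */
size_t sjis_string_end(const unsigned char *buf, size_t len, size_t start)
{
    assert(buf != NULL && start < len && buf[start] == '"');
    for (size_t i = start + 1; i < len; i++) {
        if (is_sjis_lead_byte(buf[i])) {
            i++;            /* skip the trail byte, even if it is 0x5C */
        } else if (buf[i] == '\\') {
            i++;            /* genuine escape: skip the escaped byte */
        } else if (buf[i] == '"') {
            return i;       /* closing quote found */
        }
    }
    return len;             /* unterminated literal */
}

/* The literal "機能"; in cp932: 機 = 0x8B 0x40, 能 = 0x94 0x5C. */
const unsigned char sjis_sample[] = {
    '"', 0x8B, 0x40, 0x94, 0x5C, '"', ';'
};
```

Called on `sjis_sample` with `start` 0, this returns index 5, the real closing quote. A return value equal to `len` is the point where a message such as "file not fully parsed" could be printed.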

 

 

Johnny Cheng

