bug-make

Re: [PATCH] Use UTF-8 active code page for Windows host.


From: Costas Argyris
Subject: Re: [PATCH] Use UTF-8 active code page for Windows host.
Date: Sun, 19 Mar 2023 16:34:54 +0000

> OK, but how is the make.exe you produced built?

I actually did what you suggested but was somewhat confused by the
result. Usually I do this with 'ldd', but both msvcrt.dll and ucrtbase.dll
show up in the 'ldd make.exe' output, and I wasn't sure what to make of it.

However, your approach with objdump gives fewer results and only
lists msvcrt.dll, not ucrtbase.dll:

C:\Users\cargyris\temp>objdump -p make.exe | grep "DLL Name:"
        DLL Name: ADVAPI32.dll
        DLL Name: KERNEL32.dll
        DLL Name: msvcrt.dll
        DLL Name: USER32.dll

So I guess MSVCRT is enough, i.e. no need for UCRT.
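
For anyone who wants to script this check, here is a minimal sketch in Python. It only parses the text that 'objdump -p' prints; the helper name classify_crt and its return labels are made up for illustration. It assumes that UCRT imports appear either as ucrtbase.dll or as api-ms-win-crt-*.dll entries.

```python
import re


def classify_crt(objdump_output: str) -> str:
    """Classify the C runtime from the 'DLL Name:' lines of `objdump -p`.

    Hypothetical helper: the name and the returned labels are illustrative.
    """
    dlls = {
        match.group(1).lower()
        for match in re.finditer(r"DLL Name:\s*(\S+)", objdump_output)
    }
    uses_msvcrt = "msvcrt.dll" in dlls
    # Assumption: UCRT is imported as ucrtbase.dll or api-ms-win-crt-*.dll.
    uses_ucrt = any(
        name == "ucrtbase.dll" or name.startswith("api-ms-win-crt-")
        for name in dlls
    )
    if uses_msvcrt and uses_ucrt:
        return "both"
    if uses_ucrt:
        return "ucrt"
    if uses_msvcrt:
        return "msvcrt"
    return "unknown"


# The direct imports reported for make.exe above.
sample = """\
        DLL Name: ADVAPI32.dll
        DLL Name: KERNEL32.dll
        DLL Name: msvcrt.dll
        DLL Name: USER32.dll
"""
print(classify_crt(sample))
```

This looks only at the import table of the .exe itself, which is presumably why it reports fewer DLLs than 'ldd'.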

> If you try using in a Makefile file names with non-ASCII
> characters outside of the current ANSI codepage, does Make succeed in
> recognizing files mentioned in the Makefile whose letter-case is
> different from what is seen in the file system?


I think it does. Here is the experiment:

C:\Users\cargyris\temp>ls ❎
 src.c

There is only src.c in that folder.

The Makefile utf8.mk is UTF-8 encoded, and its content checks for
the existence of:

❎\src.c
❎\src.C
❎\src.cs

where ❎ is outside the ANSI codepage (1252).
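
A quick way to confirm that a character really is outside a given codepage is to try encoding it. The following is a minimal sketch in Python; fits_codepage is a hypothetical helper name, and Python's "cp1252" codec is used here as a stand-in for the Windows ANSI codepage 1252.

```python
def fits_codepage(text: str, codepage: str = "cp1252") -> bool:
    """Return True if every character of text can be encoded in codepage."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


# U+00A9 (copyright sign) is part of codepage 1252.
print(fits_codepage("\u00a9"))
# U+274E (the character used as the directory name) is not.
print(fits_codepage("\u274e"))
```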

If I understand this correctly, both src.c and src.C should be found,
but not src.cs (just to show a negative case as well).

hello :
	@gcc ©\src.c -o ©\src.exe

ifneq ("$(wildcard ❎\src.c)","")
	@echo ❎\src.c exists
else
	@echo ❎\src.c does NOT exist
endif

ifneq ("$(wildcard ❎\src.C)","")
	@echo ❎\src.C exists
else
	@echo ❎\src.C does NOT exist
endif

ifneq ("$(wildcard ❎\src.cs)","")
	@echo ❎\src.cs exists
else
	@echo ❎\src.cs does NOT exist
endif

Here is the result of running the UTF-8-patched Make on it:

C:\Users\cargyris\temp>make.exe -f utf8.mk
❎\src.c exists
❎\src.C exists
❎\src.cs does NOT exist

I don't know if that was a good way to test your point; feel free to suggest
a different one if it was not. It seems to be doing the right thing, finding
the .C file as well.

> Indeed.  But build_w32.bat is a very simple batch file, so I don't
> think modifying it will present any difficulty.  Let us know if you
> need help in that matter.


Sure, thanks.

> Btw, there's one aspect where Make on MS-Windows will probably fall
> short of modern Posix systems: the display of non-ASCII characters on
> the screen.


Indeed, some thoughts on that:

1) As you know, this only affects the visual aspect of the logs, not the
inner workings of Make. It could confuse users because they would
see "errors" on the screen without there being any real errors.
Perhaps a mention in the doc or release notes could remedy that.

2) To some extent (maybe even completely, I don't know) this can be
mitigated by using PowerShell instead of the classic Command Prompt.
That seems to work in this case at least:

Command Prompt:

C:\Users\cargyris\temp>make.exe -f utf8.mk
echo â?Z\src.c exists

PowerShell:

PS C:\Users\cargyris\temp> make.exe -f utf8.mk
echo ❎\src.c exists

If anything, it could be worth a mention in the doc.
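
The garbled Command Prompt output looks like the UTF-8 bytes of the directory name being read as single-byte characters. Here is a minimal sketch in Python of that effect. It assumes the bytes are interpreted as codepage 1252; the actual console path (OEM codepage, best-fit mapping to the console font) is not modelled, so the exact glyphs can differ from what the console shows.

```python
# UTF-8 encoding of U+274E, the character used as the directory name.
utf8_bytes = "\u274e".encode("utf-8")

# Misread the same bytes as codepage 1252. Byte 0x9D has no mapping in
# Python's cp1252 codec, so it becomes the replacement character U+FFFD.
misread = utf8_bytes.decode("cp1252", errors="replace")

print(utf8_bytes.hex(" "))
# Print code points rather than glyphs so the output does not depend on
# the terminal's own encoding.
print([f"U+{ord(ch):04X}" for ch in misread])
```

Three bytes turn into three unrelated characters (U+00E2, an unmappable byte, and U+017D), which is at least consistent with the three-character garbage seen in the Command Prompt.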

On Sun, 19 Mar 2023 at 14:38, Eli Zaretskii <eliz@gnu.org> wrote:
> From: Costas Argyris <costas.argyris@gmail.com>
> Date: Sun, 19 Mar 2023 13:42:52 +0000
> Cc: bug-make@gnu.org
>
> Does this support require Make to be linked against the UCRT
> run-time library, or does it also work with the older MSVCRT?
>
> I haven't found anything explicitly mentioned about this in the official
> doc:
>
> https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

OK, but how is the make.exe you produced built?  Is it using UCRT or
MSVCRT when it runs?  You can check that by examining the dependencies
of the .exe file with, e.g., the Dependency Walker program
(https://www.dependencywalker.com/) or similar.  Or just use objdump
from GNU Binutils:

  objdump -p make.exe | fgrep "DLL Name:"

and see if this shows MSVCRT.DLL or the UCRT one.

> Does using UTF-8 as the active page in Make mean that locale-dependent
> C library functions will behave as expected?
>
> I think so.  Here is the relevant doc I found:
>
> https://learn.microsoft.com/en-us/cpp/text/locales-and-code-pages?view=msvc-170

This is not enough.  If locale-dependent C library functions still
support only the characters expressible with the ANSI codepage, then a
program using the UTF-8 active codepage will be unable to process the
non-ASCII characters outside of the ANSI codepage correctly.  For
example, downcasing such characters or comparing them in
case-insensitive manner will not work.  This is because for this to
work those functions need to have access to tables of character
properties for the entire Unicode range, not just for the current
locale.  If you try using in a Makefile file names with non-ASCII
characters outside of the current ANSI codepage, does Make succeed in
recognizing files mentioned in the Makefile whose letter-case is
different from what is seen in the file system?
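
To illustrate the concern, here is a minimal sketch in Python that contrasts a case-insensitive comparison that only knows ASCII with one backed by full Unicode tables. It models the idea only and says nothing about what MSVCRT or UCRT actually do. The function names and the Greek file names are made up for the example.

```python
def equal_ignoring_case_ascii(a: str, b: str) -> bool:
    """Compare UTF-8 bytes, folding case only for ASCII letters.

    bytes.lower() leaves every byte >= 0x80 untouched, which mimics a
    routine that has no case tables beyond ASCII.
    """
    return a.encode("utf-8").lower() == b.encode("utf-8").lower()


def equal_ignoring_case_unicode(a: str, b: str) -> bool:
    """Compare using Unicode case folding (tables for the whole range)."""
    return a.casefold() == b.casefold()


# Hypothetical file names: Greek capital/small sigma plus ASCII letters.
# Greek letters are outside codepage 1252.
name_in_makefile = "\u03a3RC.C"
name_on_disk = "\u03c3rc.c"

print(equal_ignoring_case_ascii(name_in_makefile, name_on_disk))
print(equal_ignoring_case_unicode(name_in_makefile, name_on_disk))
```

The first comparison fails because the sigma is never folded, while the second succeeds. Whether the C runtime behaves like the first or the second function under the UTF-8 codepage is exactly what the question above is asking.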

> Also, since the above experiments seem to suggest that we are not
> dropping existing support for non-ASCII characters in programs
> called by Make, it seems like a clear step forwards in terms of
> Unicode support on Windows.

I agree.

> I cross-compiled Make for Windows using gcc (mingw-w64) and the
> autoconf + automake + configure + make approach, so it clearly worked
> for me, but I didn't imagine that this wasn't the standard way to build for
> Windows host.

Make is a basic utility used to build others, so we don't require a
full suite of build tools for building Make itself.

> Does this mean that all builds of Make found in the various build
> distributions of the GNU toolchain for Windows (like
> mingw32-make.exe in the examples above) were necessarily built using
> build_w32.bat?

I don't know.  I can tell you that the precompiled binaries I make
available here:

  https://sourceforge.net/projects/ezwinports/files/

are produced by running that batch file.

> Since build_w32.bat is a Windows-specific batch file, does this rule out
> cross-compilation as a canonical way to build Make for Windows?

No, it doesn't rule that out.  But using cross-compilation is not very
important these days, since one can have a fully functional MinGW
build environment quite easily.

> Assuming all questions are answered first, would it be OK to work on the
> build_w32.bat changes in a second separate patch, and keep the first one
> focused only on the Unix-like build process?

Yes.  But my point is that without also changing build_w32.bat the
change is incomplete.
