[Savannah-register-public] [task #5082] Submission of Hindawi Vernacular


From: Abhishek Choudhary
Subject: [Savannah-register-public] [task #5082] Submission of Hindawi Vernacular Programming System
Date: Wed, 11 Jan 2006 14:03:01 +0530
User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; digit_may2002)

Follow-up Comment #11, task #5082 (project administration):

Hello Sylvain,

Nice to hear from you. I am open to all questions related to the Hindawi
project. Please feel free to question, advise or criticise, as that gives me
an idea about how others perceive my work. I want this project to be based on
as solid a foundation as possible from the very beginning. I am working
towards bridging a digital divide (deeper than the Grand Canyon), carved by
the rivers where unknown languages flow, and I want this bridge to be as
stable and as secure as possible, so that it does not break midway. Besides,
if it does not seem secure enough, who would risk travelling by it? What I
mean in simple words is that as and when there are vernacular developers (who
write programs in programming languages in their mother tongue), they should
not find themselves disadvantaged by the lack of a solid design in the
system they work on, or up against a glass ceiling because there is no upward
technological support. I intend to provide them a tool to carve out a living
in this age of money and ICT synonymy. 

Now about your queries.

You wrote:
> It would be enlightening for us if you could explain the 
> difference between your approach and Unicode support (your 
> mention of 8-bit charset is a bit surprising).

First, one needs to understand that Indic scripts include scripts used in the
Indian subcontinent and not just India alone. Prior to the advent of UNICODE
the only available encoding "standard" for these scripts was the Indian
Script Code for Information Interchange (ISCII), which replicated 7-bit ASCII
and coded the Indic scripts in the extended (8-bit) half. I have quoted the
word standard owing to the fact that there were numerous proprietary coding
methods developed, but ISCII is the only identifiable standard which came
into common use. Reference to 8-bit *support* (ref. in last msg - "We can
certainly write a lexer with flex which accepts 8-bit Indic code") implies
being able to use a lexical analysis tool which can analyse (tokenise) ISCII
based source code, where even keywords and variable names are in ISCII, and
not just string literals or comments (as in a system developed by IIT-Madras,
where a C program may carry ISCII comments but the keywords and variable
names remain in Latin script).
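
To make the idea of a lexer that "accepts 8-bit Indic code" concrete, here is
a minimal sketch in Python rather than flex. It assumes only what is said
above: ASCII occupies the lower half and Indic characters the upper half
(0x80-0xFF) of each byte. The token classes, the name tokenize and the sample
byte values are illustrative placeholders, not real ISCII code points and not
the Hindawi lexer.

```python
import re

# An identifier may start with an ASCII letter, an underscore, or any
# high-half byte (0x80-0xFF), and may continue with those or ASCII digits.
TOKEN_RE = re.compile(
    rb"(?P<ident>[A-Za-z_\x80-\xff][A-Za-z0-9_\x80-\xff]*)"
    rb"|(?P<number>[0-9]+)"
    rb"|(?P<space>[ \t\r\n]+)"
    rb"|(?P<punct>.)",
    re.DOTALL,
)


def tokenize(source: bytes):
    """Split 8-bit source text into (kind, bytes) pairs, skipping whitespace."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind != "space":
            tokens.append((kind, match.group()))
    return tokens


# The high-half bytes below are arbitrary placeholders for Indic characters.
sample = b"\xa4\xb3\xda = \xc5\xb1 + 42;"
for kind, text in tokenize(sample):
    print(kind, text)
```

Because the scanner works on bytes, a name made of high-half bytes is
recognised as an identifier in exactly the same way as an ASCII name.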

These vernacular programming languages that the Hindawi project has developed
are a *new entity* altogether, if I might say so. They are not merely
libraries or functions providing I/O support for Indic scripts. I call them
Hindi C or Hindi assembly or Bangla C or Bangla assembly etc. only for the
sake of common understanding. They even have different names; such as, the
equivalent programming language of C in Indian languages is Shaili Guru,
vernacular C++ is Shaili Shraeni, vernacular yacc is Shaili Vyaakaran and so
on. However, these languages are syntax-compatible with their traditional
English counterparts and can, therefore, utilise existing libraries such as
glibc, in much the same way that C++ is syntax-compatible with C and hence
can use most of the C libraries.

Nevertheless, the significance of 8-bit coding standard also needs to be
understood from a very practical viewpoint, keeping in mind that the Hindawi
project aims at *localising* programming languages and not just at providing
Indic support for I/O alone. If the concern was to provide a method by which
Indic scripts could be read (input) or written (output), then the UNICODE
support, possibly using wchar_t data type and related functions could
suffice. The goal or target is to create *compilers* for *programming
languages* in Indian vernaculars. The wchar_t data type and related functions
use a 32-bit or 16-bit representation (platform dependent). In my last message
I pointed out the problems that already arise in compiler design when the
keywords and variables, or in general the tokens, of a programming language
are based on an 8-bit encoding. Obviously, using a 32-bit or 16-bit encoding
is not feasible.
UNICODE support is ok as long as we wish to do Indic I/O in the regular
English derived compilers, where only string literals are in UNICODE, but
UNICODE based programming languages would, as of now, be inefficient. This is
because, besides the lexer overload and possible breakdown, they would also
suffer from the fact that the major existing libraries have their symbols
(language tokens, keywords, variables) encoded in 7-bit ASCII and not
UNICODE; not even 8-bit ASCII since all alphabets and numerals are encoded
between codes 0 and 127. Hope you appreciate this fact.
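
The size difference mentioned above can be checked with a small Python
sketch. It compares the byte length of a library symbol and of a Devanagari
word under 16-bit and 32-bit code units, taking UTF-16 and UTF-32 as
stand-ins for the two common wchar_t widths. The function name wide_sizes and
the sample word are illustrative. Python ships no ISCII codec, so the 8-bit
figure is not computed here; an 8-bit code needs roughly one byte per
character.

```python
def wide_sizes(text):
    """Return the length of text in characters and in 16/32-bit encodings."""
    return {
        "characters": len(text),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


symbol = "printf"
word = "\u0928\u092e\u0938\u094d\u0924\u0947"  # a six-character Devanagari word

print(wide_sizes(symbol))
print(wide_sizes(word))

# A library symbol stored as ASCII bytes does not match its 16-bit form.
print(symbol.encode("ascii") == symbol.encode("utf-16-le"))
```

Both strings have six characters and take 12 bytes with 16-bit units and 24
bytes with 32-bit units, and the last comparison prints False.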

Another point at hand is that the Hindawi project aims at having Indic
support at all levels of a computer system, right from the BIOS/POST to
programming languages. The internal representation of text needs to be 8-bit,
unless we have UNICODE hardware support. The Hindawi project has made it
possible to even have BIOS/POST in Indic scripts. Along with this, the
assembly language used for coding the BIOS can also be a vernacular equivalent
(Hindawi Shaili Yantrik - the Hindi equivalent of traditional assembly
language; currently supporting IA-32 (x86) processor assembly).




You wrote:
> we would like to better see the goal of your project and 
> understand where support for other languages is added

Now if you are wondering whether such vernacular support is required then
consider this scenario, which elucidates the goal, though only to an extent:

Presume that you are educated in your mother tongue which is not English and
you know nothing about English other than yes, no and a few greetings. Also
presume that you live in a developing nation, and have meagre family
resources. There are families, with graduate members remaining unemployed,
which barely manage a *single* meal a day, if at all. Presume that you are
such a vernacular-literate (with schooling and/or graduation in a vernacular
medium), and yet unemployed. Someone suggests "why don't you join IT, there
is a lot of money in this field". Nice advice, but what next? Well you go
ahead and join a course in programming, where the programming language is
based on English, so first you *struggle* to pick up the chords of English
and then try to acquire programming skills. Presuming you get through this
phase and want to delve into deeper topics, again English is a hindrance as
all advanced technology requires English. On the other hand if you join IT
without learning English, you are restricted to only some document creation
software or applications. You cannot decide your own destiny and are
dependent upon people who know English and will produce programs for you with
a vernacular interface. The vernacular applications being produced by such
people are very good indeed, and I often compare them to a BMW or Mercedes;
but without a steering wheel! They will only take you along a fixed road. How
do you then choose your own destination and your own destiny? Are we looking
at another colonial future? Remember we are talking FreeSoftware - free in
its very essence.

The solution lies in developing vernacular programming languages, where the
lack of knowledge of English will not prove to be a hurdle. The day you
start learning programming, you will start off with a topic in data
structures, or algorithms etc., instead of a dictionary! Hindawi is such a
solution, such an empowering tool.

I have already answered, in the previous message, possible questions about
why Hindawi is a separate project and not a fork of some existing one. Please
understand that these are separate programming languages and not a separate
I/O library for an existing language. Hindawi compilers are "currently"
implemented as *front-end*
compilers to existing compilers for traditional English based programming
languages. This implies that a program written in Hindawi is "compiled" into
a program for another traditional programming language, and then compiled
using an existing compiler for that language. This is the standard first step
in "bootstrapping" the compiler for almost any programming language. This also
has the benefit that Hindawi programs become internationally marketable. This
is explained by the following topic discussed in another email conversation.

-----------------------Pasted from another email conversation---------
Topic 1 : Bridging the gap between vernacular and English developers

Vivek wrote:
>I have followed your project for sometime now.  Your project stands
>between two things.  The already ready english based programming 
>community and the possibly new and developing vernacular developers.
>This gap might again be challenge to bridge.

Point-> Hindawi(Hindi/Bangla C, C++ etc.) does 'not' create any further
divide or gap between vernacular developers and traditional (English) 
developers. Hindi / Bangla C, C++, Java, assembly etc. get 'readily' 
translated into their English equivalent. There is also provision for 
translation of variable names, which certainly is not a trivial task and
formed the basis of my interaction with Swami Sarvottamanand, Dean of 
Research and HOD Comp. Sci. at Belur Math college, who was one of the 
distinguished judges during the YIPTA event. The reverse, that is, 
translation of English programming languages and variable names to 
vernacular programming languages and variable names also takes 
place 'readily'; (here readily implies without any extra programming 
effort). So there isn't any new gap that develops between traditional and
vernacular developers; only the old ones are bridged. I am working on a
system for complete machine translation of documentation as well (a limited
scale version should be ready in a month or two.)

Explanation-> This has been a very basic consideration during the course
of development of Hindawi. This also relates to another concern - that of
the utility of vernacular programming languages. Vernacular BASIC and LOGO
are fine as paedagogical aids (teaching tools), supposed to be used in the
classroom setting or by hobby programmers, (though I must point out that
Hindi/Bangla BASIC allows EXE files to be created and C code to be 
embedded). However, the moment one talks of vernacular C, C++ , lex, yacc
or say vernacular Java, one has to consider the fact that learning these
languages shall involve a substantial effort, though comparatively less
for a vernacular-literate person than learning their English
manifestations. This effort shall prove meaningless if the skills acquired
cannot be used professionally, that is, if they are not marketable enough.
And "enough" certainly would include international markets. As mentioned
earlier, the ready conversion of programs written in vernacular languages
to their English counterparts, and vice-versa, solves this problem. For
instance, let us consider a possible scenario set in the not-so-distant
future.

        We are in 2012AD (six years from now), the first mile-stone year 
in achieving the target of making India an ICT superpower. (By then I'm
sure ICT shall stand for Indian Cottage Technology, for indeed, that is
what I envision - to make Information and Communication Technology a 
literal 'cottage' industry in India which shall cater solutions globally,
and provide at least some major relief to the unemployment situation. That
would be akin to the electronics industry in China today.) A person, say,
in the USA needs a piece of software for his new startup. He logs onto
the Indian "Software Exchange" (***a new social concept***) website,
posts his requirement in the standard format available there, and pays the
prescribed fees online (say an advance, with balance to be cleared as and
when the final settlements are done). This requirements' document is in a
restricted-grammar format and can be translated into a vernacular even with
2005 technology. The vernacular requirements' document is then provided to a
vernacular developer as per turn, who may accept or refuse the task; in
the latter case, it is handed over to the next developer in queue. The 
vernacular developer then proceeds with the standard software engineering
steps of analysis, coding, testing, etc. He certainly codes in vernacular
programming languages and also does the analysis and testing in vernacular.
Where input from the end user is needed in the development process, the
developer provides the English source code, which as I have pointed out
is readily generated by the Hindawi compiler system (which by then shall 
certainly have improved a lot). As for the text messages contained in the
program, they are converted back into English by the
translation process at the Software Exchange. (*** Machine translation
is the piece of technology I am focussing my development efforts on now, but
for restricted-grammar this has been achieved ***.) The end product is
finally provided to the person in the USA, with complete source, variable
names, documentation etc. in English. Further work may be handled by either
a traditional (English) or vernacular developer. As an aside, I have only
skimmed through the description of this scenario. I already have worked
on aspects such as what changes to the currently practised software metrics
may be needed. Besides, a few new positions may need to be created, and
this person-centric scenario should be viewed as a team-centric one.

Summary-> There can be complete synergy between vernacular and traditional
(English) developer communities.
--------------------End of pasted block---------------------------------




You wrote:
> understand where support for other languages is added:
> is it in keyboard input methods, in programs input functions, in program 
> identifiers, in programs output functions. If I understand well, your C 
> programs won't be compilable with a classical gcc package, which seems a 
> bit surprising as well

These are *new* programming languages, so how could they be compiled
directly with a classical GCC package? And if they could be, what would be
the need for project Hindawi? OK, now understand this - Hindawi is a *suite* of
*programming languages* (new ones!), i.e. a compiler collection just as gcc
is, for programming languages which use Indian languages as a basis, instead
of being based on English. Currently I'm using gcc (and yes, *classical* gcc)
as the back-end compiler, but any other back end could be used, that is to say
that Hindawi and all its sub-projects (shaili guru etc.) could use any other
compiler system instead of gcc as a backend. Hindawi is based on the
*completely new* and independently developed technologies of Romenagri and
APCISR, which I could have patented, but decided to make them FreeSoftware.
But again what do you mean by "won't be compilable with a classical gcc
package"? Can a Fortran program be compiled with a C compiler, or a C program
with a Fortran compiler? Hindawi is quite like P2C, the Pascal to C converter,
for now. Remember I told you about how C++ was initially made available as the
Cfront compiler, which you must be aware of. Keyboard input, console output,
and everything else is supported, but more than that, its uniqueness lies in
the ability to compile source code with keywords and variable names in Indic
scripts while maintaining compatibility with existing libraries. Hope it's
clear now.

The capabilities you mention are provided by related projects Romenagri and
APCISR.
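
The front-end arrangement described above (vernacular source rewritten into
C, which gcc then compiles) can be illustrated with a minimal Python sketch.
Everything named here is an assumption made for illustration: the romanised
keywords in KEYWORDS are invented and are not the actual Shaili Guru
vocabulary, and translate is not the Hindawi implementation, which works on
Indic-script source.

```python
import re

# Hypothetical romanised keywords, invented for this sketch only.
KEYWORDS = {
    "yadi": "if",
    "anyatha": "else",
    "jabtak": "while",
    "ank": "int",
    "lautao": "return",
}

# A token is a double-quoted string literal, an identifier, or any single
# other character. String literals are matched whole so that their contents
# are never translated.
TOKEN_RE = re.compile(r'"(?:\\.|[^"\\])*"|[A-Za-z_][A-Za-z0-9_]*|.', re.DOTALL)


def translate(source, table):
    """Rewrite every token found in the table; copy all other text as is."""
    return "".join(table.get(token, token) for token in TOKEN_RE.findall(source))


vernacular = 'ank sign(ank x) { yadi (x) lautao 1; anyatha lautao 0; }'
print(translate(vernacular, KEYWORDS))
```

The output is ordinary C text that could be handed to any C compiler as the
back end. Calls to existing library functions pass through unchanged because
their names are not in the table.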


You wrote:
> explaining what's the goal/principle of "porting Linux kernel sources to 
> Hindi/Bangla C, asm etc." would be interesting too

Again I have to draw your attention to the glass ceiling factor. What about a
developer who has mastered Shaili Guru (Hindi C) and wishes to study the
internals of an OS like GNU/Linux? Porting Linux kernel sources helps with
this. Besides, just as Hindawi compilers can convert programs written in
vernacular programming languages to traditional (English) programming
languages, Hindawi also consists of a set of reverse compilers which do the
opposite, i.e. convert programs written in traditional (English) programming
languages to vernacular programming languages. These reverse compilers can be
used by a person to convert the Linux kernel sources to the desired vernacular
language. That is the principle; making Linux kernel sources accessible to
vernacular programming language based developers is the goal.
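
A reverse translation of this kind is only reliable if the keyword table is
one-to-one, so that it can be inverted without ambiguity. The Python sketch
below shows that check and a round trip. The names FORWARD, invert and
rewrite and the romanised keywords are hypothetical, invented for this
illustration. Unlike a real reverse compiler, the sketch does not protect
string literals or comments.

```python
import re

# Hypothetical romanised keywords, invented for this sketch only.
FORWARD = {"yadi": "if", "anyatha": "else", "lautao": "return", "ank": "int"}

WORD_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def invert(table):
    """Build the reverse table, refusing tables that are not one-to-one."""
    reverse = {}
    for source_word, target_word in table.items():
        if target_word in reverse:
            raise ValueError(f"{target_word!r} has two source words")
        reverse[target_word] = source_word
    return reverse


def rewrite(source, table):
    """Replace every whole identifier found in the table; keep the rest."""
    return WORD_RE.sub(lambda m: table.get(m.group(), m.group()), source)


english = "int sign(int x) { if (x) return 1; else return 0; }"
vernacular = rewrite(english, invert(FORWARD))
print(vernacular)
print(rewrite(vernacular, FORWARD) == english)
```

The round trip holds here because no identifier in the English source
collides with a word on the vernacular side of the table.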

Suitable filters and shell scripts shall be submitted to the Linux kernel
maintainers when developed, for inclusion with Linux kernel sources. However,
to prevent further confusion let me emphasize that we are not aiming at
modifying the Linux kernel sources in any manner under the project Hindawi. I
must admit here that the target is to make Hindawi capable of handling a
program as complex as the Linux kernel; the intention is not to have a
parallel kernel written in vernacular languages. Besides, that becomes
trivial as explained already, since the necessary translators will exist. (I
do not rule out such distributions by people interested in paedagogical
purposes, though. Linux kernel sources do not have comments in Indian
vernaculars, but that shall not be done under project Hindawi.)



You wrote:
> We also need to check that your project can be run on a free operating
> system.
> Your project appears to rely on FreeDOS, but I also see screenshots taken 
> using MS Windows in your website

Certainly Hindawi runs on *vanilla* FreeDOS, and runs under Windows only
under DOS mode or in a text-mode DOS box. Hindawi is intended to use Linux
and FreeDOS as its primary basis. Why don't you try out the floppy
distribution of Swaadheen DOS, which is a bootable floppy (FreeDOS based) and
has the Hindawi shell (aadesh) as a replacement for command.com, resulting in
a vernacular DOS? If you wish to try out the vernacular languages, follow the
instructions below.

A) For Hindawi / BangaBhasha (suite of programming languages) under FreeDOS
1) Download the necessary zip files from http://indicybers.com
For Hindi 
http://www.indicybers.com/HindawiR2CD.zip
For Bangla 
http://www.indicybers.com/BangaBhashaR1CD.zip
2) Unzip the files to a directory on your computer
3) The package has technical write-ups as PDF files and a voice narrated
presentation (PPT) file
4) At the DOS prompt switch to the directory where you unzipped the
downloaded zip file
5) Run setup.exe
6) Follow on screen instructions (in English and vernaculars)
7) I would suggest you accept the default locations, unless you have some
constraints. The total installed foot-print is around 50 MB
8) Setup creates a batch file "hindawi.bat" with the necessary startup
script to set up the required environment variables.
9) After installation is completed, start Hindawi by running "hindawi.bat"
from the directory of installation.
10) This will take you to Aadesh -  the Hindi command shell.
11) To type in Hindi / Bangla turn Scroll Lock on (this is indicated in the
lower right hand corner of the screen)
12) Hindawi uses INSCRIPT keyboard layout. JPEG file is available online
http://www.indicybers.com/devanagari.jpg
http://www.indicybers.com/bangla.jpg
13) To start the IDE type "Lekhak" in English or vernacular
14) Once lekhak has started up, you are taken to the welcome page of the
online help system
15) To close the help-box press Esc
16) To activate a menu press Alt+Red_lettered_key or use a function key
shortcut 
17) Press F3 to open the load file dialog
18) Press Tab to go to the file-list section
19) Navigate down with the arrow keys till "samples/" is highlighted and
press Enter.
20) Similarly select the directory for the language for which you wish to
see a program (later you may try writing your own vernacular programs)
robot - logo
prathmik - BASIC
guru - C
shraeni - C++
shabda - lex
vyaaka - yacc
kritrim - Java
yantrik - x86 assembly
21) Load the vernacular named files by highlighting their names as above
and pressing Enter
22) Use the following shortcuts to compile and run
F5 - Compile + execute
F6 - execute a previously compiled program (only in Hindi)
F7 - prepare a compiled program for deployment
F9 - compile only
NB: Please navigate down the source files by pressing PageDown, as there is
a lot of copyright information at the beginning of each
23) To exit lekhak press Alt-X or choose nikaas (exit) from the khaata (file)
menu
24) To exit Aadesh type exit or nikaas (in vernacular)



B) For Bangla / Hindi DOS
i) Download and unzip the Hindi/Bangla DOS zip file 
http://www.indicybers.com/HINDIDOS.ZIP
or 
http://www.indicybers.com/BANG_DOS.ZIP
ii) Run mkdisk.bat from the DOS prompt
iii) Follow on-screen prompts (in English) to create a boot disk for 
Hindi / Bangla DOS
iv) Boot up your computer with this.
v) The DOS disk has the necessary files for hard-disk installation, but you
must be familiar with fdisk etc. Automated Hindi/Bangla DOS setup is not yet
ready.



You wrote:
> Can all parts of your project be run using a stock version of FreeDOS,
> or may it require proprietary components?


Yes, Hindawi runs on *stock* versions of FreeDOS and does not require any
proprietary components. This is perhaps the *only* Indic software which does
so, let alone any Indic programming language. (There are *no* other
*successful* implementations of Indic programming languages besides Hindawi,
as yet!)


Finally, I would like to bring it to your notice that Hindawi has also been
selected for LinuxAsia 2006.


Regards,
Abhishek Choudhary

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/task/?func=detailitem&item_id=5082>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/




