bug-apl

Re: Improvements to gnuapl


From: Elias Mårtenson
Subject: Re: Improvements to gnuapl
Date: Tue, 23 Feb 2021 21:09:53 +0800

Perhaps this library can help: https://github.com/google/gumbo-parser

It should be reasonably easy to call it from GNU APL. 

On Tue, 23 Feb 2021 at 20:09, Blake McBride <blake1024@gmail.com> wrote:
If I were parsing HTML, I would have an exception list that contained the few tags that don't have closing tags.  I wouldn't expect a closing tag for those.  If I did get one, I'd ignore it.

There is a very small and fixed number of these exceptional tags.  Custom tags should have closing tags.

--blake
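
The exception-list approach described above can be sketched in Python with the standard library's `html.parser`. This is only an illustration under stated assumptions: the names `VOID_TAGS`, `TagBalancer` and `unbalanced` are hypothetical, and the tag list is the set of void elements as I understand the current HTML standard, so it should be checked against the specification before being relied on.

```python
from html.parser import HTMLParser

# Assumed exception list: HTML void elements, which never take a closing tag.
VOID_TAGS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
})


class TagBalancer(HTMLParser):
    """Track open tags, skipping the fixed list of void elements."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open non-void tags
        self.errors = []  # mismatches found so far

    def handle_starttag(self, tag, attrs):
        # Void elements are never pushed, so no closing tag is expected.
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return  # stray closing tag such as </br>: ignore it
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def unbalanced(html):
    """Return a list of problems; an empty list means the tags balance."""
    parser = TagBalancer()
    parser.feed(html)
    parser.close()
    return parser.errors + [f"unclosed <{t}>" for t in parser.stack]


print(unbalanced("<p>one<br>two</p>"))
print(unbalanced("<p>one<br></br>two</p>"))
print(unbalanced("<my-tag><p>x</p>"))
```

Custom tags such as `<my-tag>` are not in the exception list, so they are required to have a closing tag, while a stray `</br>` is silently dropped.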


On Tue, Feb 23, 2021 at 6:01 AM Dr. Jürgen Sauermann <mail@jürgen-sauermann.de> wrote:
Hi Blake,

You're correct. Another problem in HTML is unquoted attribute values in
HTML tags.

I should have said "One can use ⎕XML for decoding HTML pages and the
like as long as
they obey the fundamental XML encoding rules".

I believe it would be possible to make ⎕XML tolerate some of these HTML
quirks,
but I wonder if it is worth the effort.

Best Regards,
Jürgen
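
As a rough illustration of the two HTML quirks mentioned above (unquoted attribute values and missing end tags), the sketch below feeds small fragments to a strict XML parser. It uses Python's `xml.etree.ElementTree` as a stand-in; it says nothing about how ⎕XML itself behaves, and the helper name `is_well_formed_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text):
    """Return True if a strict XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Fragments that differ only in the quirks discussed in the thread.
samples = {
    "quoted attribute, closed tags": '<p class="a">one<br/>two</p>',
    "unquoted attribute value": "<p class=a>one</p>",
    "missing end tag for br": "<p>one<br>two</p>",
}

for label, text in samples.items():
    print(label, "->", is_well_formed_xml(text))
```

Only the first fragment obeys the XML encoding rules; the other two are ordinary HTML but are rejected by a strict XML parser.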



On 2/22/21 10:11 PM, Blake McBride wrote:
> Some of those "optional" end tags are not optional at all.  It's not
> HTML if it's there.  For example:
>
> <br></br>    is not HTML.
>
> --blake
>
>
>
> On Mon, Feb 22, 2021 at 1:21 PM Dr. Jürgen Sauermann
> <mail@jürgen-sauermann.de> wrote:
>
>     Hi,
>
>     as far as I understand it, HTML has almost the same format as XML
>     (the main difference being optional end tags in
>     HTML which are mandatory in XML). I would assume that ⎕XML can do
>     the decoding of common web interfaces
>     like the REST API or other XML-based queries quite well. Fetching
>     of the data can be done with ⎕FIO[32 ff.] so
>     the combination of them should almost do the job.
>
>     Best Regards,
>     Jürgen
>
>
>     On 2/22/21 4:13 PM, Elias Mårtenson wrote:
>>     This could be quite useful when collecting data from a web site.
>>     For example, pull in a table of numbers from a Wikipedia page.
>>     Google Docs has this feature already and it can be quite useful.
>>
>>     Regards,
>>     Elias
>>
>>     On Mon, 22 Feb 2021 at 22:26, Chris Moller <moller@mollerware.com>
>>     wrote:
>>
>>         Sounds like another native function!  :-)
>>
>>         Maybe after I finish my current project...
>>
>>         On 2/22/21 5:26 AM, Hans-Peter Sorge wrote:
>>>         Hi,
>>>
>>>         I would modify the data model and/or process graph or use an
>>>         adequate programming language.
>>>         In my opinion, having to rely on data content to control
>>>         program flow is 'costly'.
>>>         (Maybe that is one reason, too, that APL has no
>>>         language-specific regular expressions).
>>>
>>>         My highest priority for APL would be the mapping between an
>>>         apl name and a file,
>>>         directory, a db-table, a spread sheet or an editor instance.
>>>
>>>         APL was designed to contain code and data in a 'closed'
>>>         workspace.
>>>         In those days data entry was done by humans, directly into
>>>         the workspace.
>>>         Nowadays I get the data very likely from somewhere outside
>>>         of the workspace.
>>>         ⍎ ')host' and piping are already a big help here.
>>>
>>>         But analyzing a web page, for example, is done
>>>         faster in Python.
>>>         Having a proper infrastructure in APL, like
>>>             page ← ⎕curl '...url...'
>>>             page['head';'link']
>>>         could return all link tags. - just dreaming:-)
>>>
>>>         However - please no if/then/else
>>>
>>>         Best Regards
>>>         Hans-Peter
>>>
>>>
>>>         On 20.02.21 at 19:59, Christian Robert wrote:
>>>>         well I saw the new trends aka Quad-XML, Quad-JSON,
>>>>         Quad-FFT and so on
>>>>
>>>>         but I think those will never be used in real life or quite
>>>>         seldom.
>>>>
>>>>         I really think that Juergen should be looking at
>>>>
>>>>         :if/:elseif/:else/:endif
>>>>
>>>>         :for var :in array
>>>>           loop
>>>>         :endfor
>>>>
>>>>         :while condition:
>>>>           loop
>>>>         :endwhile
>>>>
>>>>         :do
>>>>           loop
>>>>         :until condition
>>>>
>>>>         this will ease newcomers into the language.
>>>>
>>>>         I know that APL's goal is to do a whole "program" in one or
>>>>         two lines of code...
>>>>         but the language must accommodate newcomers.
>>>>
>>>>         I asked for that several years ago (maybe 8 or 10 years)
>>>>
>>>>         Juergen answered at that time "this can be done" but I
>>>>         won't yet
>>>>
>>>>         well my principal next improvements wish list is
>>>>         if/for/while/do_until
>>>>
>>>>         my real thoughts,
>>>>
>>>>         Xtian.
>>>>
>>>
>>
>

