samizdat-devel
Re: automatic URLs for plain text format?


From: Dmitry Borodaenko
Subject: Re: automatic URLs for plain text format?
Date: Mon, 4 Dec 2006 13:37:14 +0000

On 12/4/06, boud <address@hidden> wrote:
There remains a bug for URLs of the type:

http://a.b:80/c
http://a.b/c:d

i.e. if a ":" is anywhere in the URL, but i think this probably happens
elsewhere when further "cleaning" up the content. The resulting html is

  <a>http://a.b:80/c</a>
  <a>http://a.b/c:d</a>

i'm not sure if this is a bug or a feature, since URLs with port
numbers are not very common any more and are probably not recommended
for usability, and i don't know whether the colon is considered to be
a standard character to be allowed in URLs.

It is a bug in Samizdat::Sanitize. This fragment of xhtml.yaml is to blame:

&uri !ruby/regexp /\A(http:|https:|ftp:|mailto:)?[^:]+\z/i

This needs to remain more restrictive than the regexps in uri/common.rb
(to make sure no JavaScript invocation can creep in), but a colon is
certainly allowed within URLs, so the [^:]+ part should be replaced. Any
suggestions?
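
One possible direction, as a rough sketch rather than a tested patch: keep
the scheme whitelist, and reject a colon only where it could be read as a
scheme delimiter, i.e. before the first "/", "?" or "#". The constant name
SAFE_URI is made up for illustration and is not something that exists in
Samizdat.

```ruby
# Hypothetical replacement for the &uri pattern (SAFE_URI is an
# illustrative name only).
#
# Accepts either:
#   - a whitelisted scheme followed by at least one character, or
#   - a relative reference with no colon before the first "/", "?" or "#",
#     so nothing in it can be parsed as a scheme such as "javascript:".
# The (?!\z) lookahead rejects the empty string, as the original did.
SAFE_URI = %r{\A(?!\z)(?:(?:https?|ftp|mailto):.+|[^:/?#]*(?:[/?#].*)?)\z}i

# Both URLs from the bug report are accepted.
puts SAFE_URI.match?('http://a.b:80/c')
puts SAFE_URI.match?('http://a.b/c:d')

# Non-whitelisted schemes are still rejected, whatever the letter case.
puts SAFE_URI.match?('JavaScript:alert(1)')
```

The trade-off is that a relative URL whose first path segment contains a
colon, such as c:d, is still refused, because it cannot be told apart from
an unknown scheme. Writing it as ./c:d gets it through.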

BTW: Quite a big discussion on looking at cms'es for the "IMC Alternatives"
collective/website is going on at:
(...)

I know that spam filtering is a major concern for Chuck, so I think we
should get anti-abuse measures working in Samizdat before approaching
him.

My guess is the actual imc-cms discussion process is de facto
suspended since looking for servers is a huge priority issue right
now. On the other hand, people wanting a new cms are not going to wait
for the imc-cms group to come up with a formal, structured decision,
and IMHO they're not going to try samizdat unless someone "techie"
helps them.  i'm unlikely to have time, but i thought i'd mention it
anyway....

Setting up another Samizdat site on a machine that already has the
software installed is really trivial; I can do that in 15-30 minutes.
The hard part that I haven't fully automated yet is the backup/mirroring
scripts: that needs some arrangement on the receiving end of the
backups. I also wonder how Samizdat will cope with real-world
high-traffic sites: synthetic tests can only get you to a certain
point, the real world is always different...

BTW(2): i made my very first samizdat cvs commit - on the file
about.html .  Maybe cvs is not so complicated anyway, it's just

info cvs

which gives a huge amount of info and warnings. :)

CVS is not complicated at all: until you get into file conflicts and
branching, you only need to care about the 'cvs up' and 'cvs ci' commands.

--
Dmitry Borodaenko
