bug#51733: 27.1; Detect impossible email addresses better
From: Eli Zaretskii
Subject: bug#51733: 27.1; Detect impossible email addresses better
Date: Wed, 19 Jan 2022 18:58:54 +0200
> From: Lars Ingebrigtsen <larsi@gnus.org>
> Cc: 51733@debbugs.gnu.org, jidanni@jidanni.org
> Date: Wed, 19 Jan 2022 16:45:29 +0100
>
> Eli Zaretskii <eliz@gnu.org> writes:
>
> > OK, but why do you think "Сгсе.ru" is confusable? The SLD part is
> > entirely made of single-script characters, and UTS#39 explicitly
> > allows that:
> >
> > [...] it can be perfectly legitimate to have scripts in a SLD
> > (second level domain) not be the same as scripts in a TLD (top-level
> > domain), such as:
> >
> > Cyrillic labels in a domain name with a TLD of .ru or .рф
> >
> > That's your case, isn't it?
>
> Yes, indeed. But:
>
> ---
> For some applications, it is useful to determine if a given input string has
> any whole-script confusable. For example, the identifier "ѕсоре" using
> Cyrillic characters would pass the single-script test described in Section
> 5.2, Restriction-Level Detection, even though it is likely to be a spoof
> attempt.
> ---
>
> So "Сгсе.ru" is suspicious in most contexts.
Right, but the functions we had back then didn't yet support that
part.
> > Regardless of what they are saying, I don't think the above is
> > suitable for production. I think it should be enough to see whether
> > there could be confusion with the corresponding ASCII characters from
> > confusables.txt.
>
> Yes, so that's what I've done now, but... I'd feel slightly better if I
> knew what they were actually getting at. I think they're saying that if
> "foo" is confusable with anything in any other scripts, then it's
> suspicious?
Yes, that's what they meant.
> But that sounds unworkable. For instance, "circle.ru" is
> confusable with "СігсӀе.ru", and perhaps it's suspicious to a Russian,
> but I don't see how to make a workable function from that.
They've left that to the implementation...
Anyway, I think confusable to ASCII is good enough for Emacs for now.
> So perhaps what I've implemented now is sufficient for domains.
I think it is, yes. It definitely covers a very large chunk of the
problem.
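
To make the "confusable to ASCII" idea above concrete, here is a minimal sketch in Python. It is not the Emacs implementation discussed in this thread. The table ASCII_LOOKALIKES and the function names are hypothetical, and the table is a small hand-picked stand-in for the mapping in confusables.txt, not the real data. The sketch flags a domain label when it contains non-ASCII characters yet turns into a pure-ASCII string once every known lookalike is replaced.

```python
# Hand-picked illustrative subset (an assumption for this sketch): a few
# Cyrillic letters and the ASCII letters they resemble. A real check would
# load the full mapping from Unicode's confusables.txt instead.
ASCII_LOOKALIKES = {
    "\u0421": "C",  # CYRILLIC CAPITAL LETTER ES
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0433": "r",  # CYRILLIC SMALL LETTER GHE
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "\u04c0": "l",  # CYRILLIC LETTER PALOCHKA
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0455": "s",  # CYRILLIC SMALL LETTER DZE
}


def ascii_skeleton(label):
    """Return label with each known lookalike replaced by its ASCII twin."""
    return "".join(ASCII_LOOKALIKES.get(ch, ch) for ch in label)


def confusable_with_ascii(label):
    """True if label has non-ASCII characters but maps to an all-ASCII string."""
    if label.isascii():
        return False  # plain ASCII is what it looks like
    return ascii_skeleton(label).isascii()


def suspicious_domain(domain):
    """True if any dot-separated label of domain is confusable with ASCII."""
    return any(confusable_with_ascii(label) for label in domain.split("."))
```

With this table, a label such as "Сгсе" is flagged because every one of its characters has an ASCII lookalike, while a label containing at least one character with no ASCII twin is left alone. That only tests confusion with ASCII, which is the narrower check agreed on above, and not the general whole-script confusable test from UTS#39.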