gnunet-developers

Re: [GNUnet-developers] Designing a gnunet directory app


From: Igor Wronsky
Subject: Re: [GNUnet-developers] Designing a gnunet directory app
Date: Fri, 14 Jun 2002 13:40:52 +0300 (EEST)

On Thu, 13 Jun 2002, Christian Grothoff wrote:

> > There are several problems. First, a more general issue, how
> > can we get rid of obsolete directories? Suppose I insert one
> > with a keyword "directory" and include a timestamp. Could people
> > querying for "directory" get that same hit even a year afterwards,
> > as hitting it will keep it at least in some sense popular?
> > Does gnunet have mechanism to get rid of files that are
> > often 'hit' but rarely if ever downloaded?
> No, it can't because search-results and downloads look identical in our 
> encoding. For anonymity reasons, you don't want an adversary to be able to 
> correlate hits. But it should not be a problem -- if the directory 
> description contains some timestamp or other versioning information, a clever 
> search-tool could see which are more/less recent directories. Note that any 
> real possibility to 'update and destroy the old copy' is an open door for 
> censorship (update with 'nothing' and destroy) and thus not acceptable. What 
> could also be used is a signed directory that contains the key(word) for the 
> update. Of course, everybody could publish another directory under the 
> update-keyword, but not everybody could sign it with the same public key. 

I was thinking more along the lines of 'unnecessary content timing out'
than deletion or replacement. Imagine someone inserts plain
crap under a popular keyword. Couldn't at least the base entry stay there
indefinitely, as the queries keep it alive? I don't know a solution
to this one.

For content which can go stale (lists and such), would it ultimately
be a bad idea to let the original sender include an optional expiration
date in the data? That way a (benevolent) inserter
could supply a stamp like "this file is valid until 20.jun.2002",
after which any node encountering such data would
just coldly delete it. Or would that be too much overhead? At least the
current system has a pretty small block size and, if I understood
correctly, the date would have to be in every block ... Perhaps it would be
enough to include the date only in the root block and delete
that; the rest would disappear naturally as the blocks are no longer
asked for.
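
A minimal sketch of the root-block expiry idea, assuming a node keeps its
root blocks in a simple in-memory map. The RootBlock type, its fields and
prune_expired are hypothetical names for illustration, not anything gnunet
provides. A block stays valid through its "valid until" day and is dropped
the day after.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass
class RootBlock:
    """Hypothetical root block; 'expires' is the optional inserter-supplied date."""
    key: str
    payload: bytes
    expires: Optional[date] = None  # None means the inserter set no expiry


def prune_expired(store: Dict[str, RootBlock], today: date) -> int:
    """Delete root blocks whose 'valid until' date has passed; return the count."""
    stale = [k for k, block in store.items()
             if block.expires is not None and block.expires < today]
    for k in stale:
        del store[k]
    return len(stale)
```

Only root blocks carry the date here; the remaining blocks would have to
age out through the normal replacement policy once nobody requests them.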

> I would rather add an option to 'gnunet-insert-multi' (gnunet-insert *has* 
> all the information that gnunet-search prints) to create a directory when the 
> files are inserted. 

Ah. I see what you are aiming at. Instead of some people collecting
keys posted by others, the original posters would insert directory
entries at the same time. That is a nice idea, but then it's clear that
the directory system needs some (sub)standardized hierarchy so
that people can choose between lists of different styles of content
without having to browse through all of them. This is because anonymous
systems encourage people to publish all kinds of vile stuff. Without
stopping to theorize on the tolerance of different people, let us just
assume that the majority doesn't want to sort through hundreds of lists
daily which might contain mostly offensive or uninteresting content.

I think it would be enough to tag the inserted directory
with some general group classification and have a "global directory"
and "specific directories". Let's suppose someone posts 102
pictures of sportscars. This should result in a directory list
indexed with two keywords, "directory-global-[date]" and
"directory-[group]-[date]". A search could then return something like

---
# bin/gnunet-search directory-global-14.jun.2002
872487347AA324329434ABC81247124823482342 1248248231 2045
=> pictures.cars : 102 entries : 14.jun.2002 <= (filename: ???, mimetype: unknown)
38923589329AABCB49294BBC3939423943943332 482832122 1386
=> pictures.elephants : 52 entries : 14.jun.2002 <= (filename: ???, mimetype: unknown)
35929238492384923489238492384182DBD72347 2348238 534
=> punk : 12 entries : 14.jun.2002 <= (filename: ???, mimetype: unknown)
---

and in specific context just

---
# bin/gnunet-search directory-pictures.cars-14.jun.2002
872487347AA324329434ABC81247124823482342 1248248231 2045
=> pictures.cars : 102 entries : 14.jun.2002 <= (filename: ???, mimetype: unknown)
---

The point here would be that the global directory would/could
mainly be used to know what groups are active right now, and 
people with specialized interests could limit their query to the
directory of a particular group. If there was a more advanced 
gui, it could get the list of currently active groups by
doing a query, and when inserting files the suitable group 
could be selected from a list. With a command line tool 
it could just be something like 

# gnunet-insert-multi -g [group] <files>

What groups actually form could be left to evolution. ;)
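
The two-keyword scheme can be sketched as below. This is only an
illustration of the naming convention proposed above; directory_keywords
and format_date are hypothetical helpers, and writing the day without zero
padding is an assumption (the examples only show two-digit days).

```python
from datetime import date
from typing import List

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def format_date(d: date) -> str:
    """Render a date in the '14.jun.2002' style used in the examples."""
    return f"{d.day}.{MONTHS[d.month - 1]}.{d.year}"


def directory_keywords(group: str, d: date) -> List[str]:
    """Keywords a directory for 'group' would be indexed under on day 'd'."""
    stamp = format_date(d)
    return [f"directory-global-{stamp}", f"directory-{group}-{stamp}"]
```

For the sportscar example, directory_keywords("pictures.cars",
date(2002, 6, 14)) gives the global keyword and the group keyword used in
the two searches above.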

The only point of the dates is to get recent listings. If
optional content timeout was implemented, the queries wouldn't
need to use dates. But how to deal with adversarial timeout
dates then? ... Alternatively, the listings found could be
filtered by the application, though eventually the number
of results to parse would grow prohibitively large. And how would
old but popular content stay listed in directories then? Perhaps
gnunet-download could generate and insert reports of what
was downloaded successfully. ;)
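
Application-side filtering could look roughly like this. The sketch
assumes the description lines have exactly the "=> group : N entries :
date <=" shape shown in the search examples above; Listing,
parse_description and recent are hypothetical names, and the 3-day window
is an arbitrary default.

```python
import re
from datetime import date, timedelta
from typing import List, NamedTuple, Optional

MONTHS = {m: i + 1 for i, m in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"])}

# Matches e.g. "=> pictures.cars : 102 entries : 14.jun.2002 <="
DESC = re.compile(
    r"=>\s*(\S+)\s*:\s*(\d+) entries\s*:\s*(\d{1,2})\.([a-z]{3})\.(\d{4})\s*<=")


class Listing(NamedTuple):
    group: str
    entries: int
    stamp: date


def parse_description(line: str) -> Optional[Listing]:
    """Parse one description line; return None if it does not fit the format."""
    m = DESC.search(line)
    if m is None or m.group(4) not in MONTHS:
        return None
    group, count, day, mon, year = m.groups()
    try:
        return Listing(group, int(count), date(int(year), MONTHS[mon], int(day)))
    except ValueError:  # impossible calendar date such as 31.feb
        return None


def recent(listings: List[Listing], today: date, max_age_days: int = 3) -> List[Listing]:
    """Keep listings dated within the last max_age_days and not in the future."""
    cutoff = today - timedelta(days=max_age_days)
    return [l for l in listings if cutoff <= l.stamp <= today]
```

Note that the date inside the description is supplied by the inserter, so
this filter is only as trustworthy as the people posting the listings.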

All this is of course speculation, or designing. I hope some scheme
can be considered sufficiently good to be worth implementing. :D

> > Should compression be used on the directories?
> I would make it an option to the user. Some directories will be too small to 
> yield any big gains - compression was already thought as an option for 
> gnunet-insert, but so far nobody did it and I still believe that the users 
> can do it manually up-front if they really want to. The problem with 
> compression for gnunet-insert is also that it conflicts with the on-demand 
> encoding (indexing vs. insertion!), which would not apply for directories. 
> Anyway, 'tar' can't be wrong, and there, it's an option :-)

In software like gnunet, where the point is not to reinvent
the wheel (or so I suppose ;) ), on-the-fly directory
compression/decompression could probably be achieved quite simply with zlib.
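
As a rough illustration of how little code zlib needs, here is a sketch
using Python's standard zlib module. It assumes the directory is a plain
text listing; pack_directory and unpack_directory are hypothetical names.

```python
import zlib


def pack_directory(listing: str) -> bytes:
    """Compress a text directory listing at the highest zlib level."""
    return zlib.compress(listing.encode("utf-8"), 9)


def unpack_directory(blob: bytes) -> str:
    """Inverse of pack_directory."""
    return zlib.decompress(blob).decode("utf-8")
```

As noted in the quoted text, very small directories may gain nothing from
this; zlib adds a few bytes of header and checksum to every stream.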

> > Naturally the directory listings should also be machine (eg gui)
> > readable. The app should be able to retrieve newest up-to-date lists.
> > Perhaps lists could be signed by sender, using a handle perhaps.
> Right, we should have some standard format that allows signing with a 
> pseudonym.

This calls for a public/private key pair. Where could the public
key be reliably published? Would it be ok to include a hash
of the public key in the signed content? The actual key could
then be retrieved by downloading it via that hash, after which the
signature could be checked. Of course, someone could still claim to
be someone else by just putting his own key hash in the message,
but then it would differ from the hash in previous insertions by
the original author.
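
The two checks described above (does the downloaded key match the hash in
the message, and does that hash match what the same handle used before)
might be sketched as follows. SHA-256 is used here only as a stand-in hash,
and key_id, key_matches_id and HandlePins are hypothetical names. The
actual signature verification is deliberately left out.

```python
import hashlib
from typing import Dict


def key_id(public_key: bytes) -> str:
    """Hash of the public key, as it would be embedded in the signed content."""
    return hashlib.sha256(public_key).hexdigest()


def key_matches_id(claimed_id: str, fetched_key: bytes) -> bool:
    """Check that the key downloaded via the hash really hashes to it."""
    return key_id(fetched_key) == claimed_id


class HandlePins:
    """Remembers the first key hash seen for each handle."""

    def __init__(self) -> None:
        self._pins: Dict[str, str] = {}

    def consistent(self, handle: str, claimed_id: str) -> bool:
        """True if this handle has always used the same key hash so far."""
        return self._pins.setdefault(handle, claimed_id) == claimed_id
```

This only detects an impostor who appears after the original author; whoever
uses a handle first is the one that gets remembered.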

> > Of course trivially done that would not be proof against attacks,
> > but if most people are benevolent, it'd enable 'fan communities'
> > to follow directories created by some famous persons. ;)
> I don't see why it should not be done in a safe way - gnunet uses pretty
> much the strongest practical ciphers for signing certain messages, why
> shouldn't we do the same here? The time it takes to sign should be
> insignificant anyway.

I think the question is how complicated the directory system 
should be or needs to be. Is it important enough to the overall
network? In a way, the ability to search by keywords makes
the need for directories smaller, but then again it may be 
difficult to get users to index their content intelligently. 
Perhaps the problem is roughly equivalent to (or a bit harder
than) getting users to post their stuff to a suitable group. ;)

> I'm not sure what you mean with messaging capabilities, but looking into 
> existing designs is definitely the right approach. I don't really know frost, 
> any references? 

Frost is an example of software that was written with an "I know best"
approach. When it displayed bad properties, the author claimed
that he had additional ideas and would code new software with
the problems addressed. He wasn't interested in discussing the
solutions this time either and disappeared underground. I don't
think there's any good documentation of frost. :(

Basically, frost implements messaging and file sharing on
freenet by creating two kinds of files: message files and
index files. A header is added to the message and it's
inserted into freenet with a key like

news.[group].[date].[daily_msg_counter]

and frost polls for these keys to read messages posted
by other users, incrementing daily_msg_counter until
no more messages can be found. The problem is that
if two users both think daily_msg_counter is e.g. 5, both
will insert at 6, and there is a good chance that
the two insertions never meet. So different
people might see different messages and some messages
can be entirely lost.
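
The collision can be illustrated with a toy model. This is not frost's
code, just a sketch of the polling rule described above, with a dict
standing in for what each user can currently see on the network; the group
name "gnunet" is made up.

```python
from typing import Dict


def message_key(group: str, day: str, counter: int) -> str:
    """Key layout news.[group].[date].[daily_msg_counter]."""
    return f"news.{group}.{day}.{counter}"


def next_free_counter(visible: Dict[str, str], group: str, day: str) -> int:
    """Poll counters 0, 1, 2, ... until a key is missing; that slot looks free."""
    counter = 0
    while message_key(group, day, counter) in visible:
        counter += 1
    return counter


# Counters 0..5 are already taken when both users poll.
network = {message_key("gnunet", "14.jun.2002", n): f"message {n}" for n in range(6)}
slot_alice = next_free_counter(network, "gnunet", "14.jun.2002")
slot_bob = next_free_counter(network, "gnunet", "14.jun.2002")

# Neither insert is visible to the other yet, so both choose the same key.
colliding_key = message_key("gnunet", "14.jun.2002", slot_alice)
```

Both users end up with slot 6, so two different messages compete for one
key and a reader will find at most one of them there.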

The idea of index (directory) files is similar, 

idx.[group].[date].[daily_idx_counter]

They contain group-specific keylists. When one or more files
are inserted by frost, it creates such an index file.
  
> Polling? As in repeatedly query or what?

Yes. The above should explain it. If a user wants indexes
or messages, he polls for them, typically at most 3 days into the
past, always starting at daily_idx 0 and increasing until
no more can be found. If such a system was implemented for
gnunet, at least the daily idx counters could be dropped,
because in gnunet keywords are not unique. The collision
problem would not appear either. The load on the system would
probably be quite similar. In freenet it has caused trouble
(fn developers have implemented a mechanism which puts keys
on hold for a time if the content wasn't found on the last query,
in order to address the load generated by repeatedly
polling for nonexistent material).
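
A counter-free polling scheme with a "hold" on recently missed keywords
might look like the sketch below. Everything here is hypothetical:
poll_keywords reuses the "directory-[group]-[date]" naming from earlier in
this mail, "3 days into the past" is read as today plus the three preceding
days, and MissCache is only a guess at the general shape of the freenet
mechanism mentioned above, not its implementation.

```python
from datetime import date, timedelta
from typing import Dict, List

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def stamp(d: date) -> str:
    """Render a date in the '14.jun.2002' style."""
    return f"{d.day}.{MONTHS[d.month - 1]}.{d.year}"


def poll_keywords(group: str, today: date, days_back: int = 3) -> List[str]:
    """One keyword per day, newest first; no per-day counter is needed."""
    return [f"directory-{group}-{stamp(today - timedelta(days=n))}"
            for n in range(days_back + 1)]


class MissCache:
    """Suppresses repeat queries for a keyword that recently returned nothing."""

    def __init__(self, hold_seconds: float) -> None:
        self.hold_seconds = hold_seconds
        self._last_miss: Dict[str, float] = {}

    def record_miss(self, keyword: str, now: float) -> None:
        self._last_miss[keyword] = now

    def may_query(self, keyword: str, now: float) -> bool:
        last = self._last_miss.get(keyword)
        return last is None or now - last >= self.hold_seconds
```

The caller passes the current time explicitly (for instance from
time.monotonic()), which keeps the cache easy to test.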

> > The thing is though that if some similar app
> > isn't done intelligently for gnunet, it will be eventually made
> > brainlessly by a third party and bad things will happen (on freenet
> > there's now at least four ways to transmit and find files: private,
> > not announced anywhere, "freesites", "frost" and "fmb". this
> > has unnecessarily split the available resources between
> > incompatible methods of announcement/retrieval/etc) :(
> Well, I'm definitely for trying to establish standards :-)

Good. Here's something to specify. ;)

- The hierarchy-or-flat issue (and what should the result look like?)
- The keyword format for locating listings (w/ dates or no dates?)
- The actual listing format 
    - hash, crc, size, what else? same stuff as given by gnunet-search?
    - signatures?
    - compression?
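
To make the listing-format question concrete, here is one possible
line-based shape, purely as a starting point for discussion. It assumes the
three fields printed by gnunet-search in the examples above are hash, crc
and size in that order, followed by a free-text description; DirectoryEntry
and its methods are hypothetical. Signatures and compression are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryEntry:
    file_hash: str    # hex string as printed by the search tool
    crc: int
    size: int
    description: str  # free text, may contain spaces

    def to_line(self) -> str:
        """Serialize as 'hash crc size description' on one line."""
        return f"{self.file_hash} {self.crc} {self.size} {self.description}"

    @classmethod
    def from_line(cls, line: str) -> "DirectoryEntry":
        """Parse a line written by to_line; raises ValueError on bad input."""
        file_hash, crc, size, description = line.rstrip("\n").split(" ", 3)
        return cls(file_hash, int(crc), int(size), description)
```

A text format like this stays readable for both humans and a GUI, at the
cost of being larger than a binary encoding.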

> I would see it as an extension to the gnunet-filesharing library that can 
> then be used via options in the textui-clients or GUIs.

That sounds good to me.


Igor





