
Re: [GNUnet-developers] Idea for file storage in GNUnet


From: LRN
Subject: Re: [GNUnet-developers] Idea for file storage in GNUnet
Date: Fri, 07 Dec 2012 11:34:44 +0400
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:20.0) Gecko/20.0 Thunderbird/20.0a1


Answer: Because it disturbs the logical flow of communication.
Question: Why do so many people think that top-posting is bad?

On 07.12.2012 7:28, hypothesys wrote:
> I understand (or at least believe I do) your point about the
> available storage not being a priority, as it is not the limiting
> factor; however, could this increased storage not make for
> potentially lower latency in the network? After all, if there is
> more storage, the same data could be made available from a greater
> number of nodes. As such, the number of network hops a single data
> block has to travel between the "asking node" and the "data
> provider node" diminishes, and so does latency, at least in darknet
> mode.
> 
> I do not know if this would be impossible due to the way GNUnet
> routes data, or a misconception in my reasoning, but I assume
> GNUnet must use some implementation of key-based routing and a DHT.
> Perhaps this lower latency could pose problems from the anonymity
> point of view, and I cannot predict the security implications;
> still, it would not mean that the shortest route had to be taken,
> only that the minimum number of hops necessary to transfer data
> diminishes. Probably it would mean small-world networks would
> become smaller in proportion to the total amount of available
> storage at each node.
You'd have to wait for Grothoff's reply to get his opinion on this. My
highly uneducated guess is that data variety is greater than the
available space. Thus, no matter how large the datastores of GNUnet
nodes are, the probability of just finding a block of data in a random
node that is not sharing it on purpose is relatively small. Compare
that to the ways a node shares on purpose: sharing files deliberately
(which likely means using indexing, so datastore size is not
relevant), sharing blocks that are also being downloaded by the node
(i.e. seeding what you're downloading; unless you're stingy, your
datastore would be large enough to fit all of them anyway), or sharing
"hot", often-requested blocks (which is the best case for an increased
datastore size).



Now, what I find compelling in this idea is the space management.
When I think of what I would tell people about setting up a GNUnet
node, and specifically about choosing the datastore size, the answer
I come up with is this:

"Think of the largest number of files you'd be downloading
simultaneously, then add their sizes together - that's the minimum
space you should allocate for datastore."
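
To make that rule concrete, here is a toy sketch (the numbers and the
whole program are made up for illustration; this is not GNUnet code)
that computes the minimum quota for a node expecting three
simultaneous downloads:

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

int main (void)
{
  const uint64_t GIB = 1024ULL * 1024 * 1024;
  /* Assumed example: three simultaneous downloads of 8, 4 and 2 GiB. */
  const uint64_t downloads[] = { 8 * GIB, 4 * GIB, 2 * GIB };
  uint64_t minimum = 0;
  for (size_t i = 0; i < sizeof downloads / sizeof downloads[0]; i++)
    minimum += downloads[i];
  /* Anything on top of this is a donation to the network (see below). */
  printf ("minimum datastore quota: %llu GiB\n",
          (unsigned long long) (minimum / GIB));
  return 0;
}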

However, if that worst-case scenario actually happens, your node will
not run optimally - your entire datastore will be filled with the
blocks you're downloading right now, all migrated blocks will be
removed, and (depending on your earlier priority choices) inserted
blocks might go away as well. I wouldn't want this to happen too
often. So the right answer is the above recommendation _PLUS_ however
many gigabytes of storage you can spare.

And "spare" is the problem. I can easily spare 20 or 40 gigabytes, but
100 or 200 is somewhat trickier. I might have that kind of space now,
and be willing to give it to GNUnet, but i might want that space back
at some point. Not sure what GNUnet will do right now, if i shut down
my node, reduce the datastore size, then start the node up again.
Probably discard lowest-priority blocks until datastore shrinks to the
new limit?
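
If that guess is right, the shrinking might look something like this
sketch (the type and the helper functions are invented for
illustration; this is not the actual datastore code):

#include <stddef.h>
#include <stdint.h>

/* Invented type and helpers, standing in for whatever the real
 * datastore backend provides. */
struct Block
{
  uint64_t size;     /* size of the block on disk */
  uint32_t priority; /* content priority; higher means "keep longer" */
};

extern uint64_t datastore_size (void);
extern struct Block *find_lowest_priority_block (void);
extern void delete_block (struct Block *b);

/* Shrink the datastore to a new, smaller quota by repeatedly
 * discarding the lowest-priority block until everything fits. */
void
shrink_datastore (uint64_t new_quota)
{
  while (datastore_size () > new_quota)
  {
    struct Block *victim = find_lowest_priority_block ();
    if (NULL == victim)
      break; /* nothing left to evict */
    delete_block (victim);
  }
}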

Now, having a minimum space allocated to the datastore, and then also
using N% of the remaining free disk space for the datastore while it's
available - that really makes the decision easier. If GNUnet were then
taught to use the pre-allocated datastore for important blocks (files
being downloaded or published; what are the privacy issues here?),
that would mean that your node serves _your_ interests first, and uses
the available free space to serve the network as best it can.

It should maintain either F% of disk space free, or G gigabytes,
whichever is larger. Obviously, F and G are configurable (I'd say
default F to 20 and G to 20; unless the GNUnet daemon that reclaims
free space is a slowpoke, 20 gigabytes should give it enough time to
react). It should also be completely disabled for SSDs, IMO, because
they are small to begin with, _and_ because their performance degrades
greatly as they fill up with data.
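
A sketch of what computing such a dynamic limit could look like on
POSIX, using statvfs(); the function name and parameters are my
assumptions for illustration, not a GNUnet API:

#include <stdint.h>
#include <sys/statvfs.h>

/* Illustrative only: how much space the datastore may use right now,
 * keeping either keep_free_percent (F) of the disk or keep_free_bytes
 * (G) free, whichever is larger. */
uint64_t
dynamic_datastore_limit (const char *path,
                         uint64_t preallocated,
                         unsigned int keep_free_percent, /* F, e.g. 20 */
                         uint64_t keep_free_bytes)       /* G, e.g. 20 GiB */
{
  struct statvfs fs;
  if (0 != statvfs (path, &fs))
    return preallocated; /* probing failed; fall back to the minimum */
  uint64_t total = (uint64_t) fs.f_blocks * fs.f_frsize;
  uint64_t avail = (uint64_t) fs.f_bavail * fs.f_frsize;
  uint64_t reserve = total / 100 * keep_free_percent;
  if (reserve < keep_free_bytes)
    reserve = keep_free_bytes; /* keep max(F%, G) free */
  uint64_t extra = (avail > reserve) ? (avail - reserve) : 0;
  return preallocated + extra; /* never below the pre-allocated minimum */
}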



Thus the idea is the same as with CPU resources - you set up low and
high thresholds for the CPU load that GNUnet may cause. It will go as
high as the high threshold when uncontested, and will drop to the low
threshold when other processes compete with GNUnet for CPU. The same
goes for storage - use a large portion of the available free space for
the datastore (primarily for migrated and cached blocks), but be ready
to discard all of that and go as low as the size of the pre-allocated
datastore.
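
Tying the sketches above together, the adjustment could be a simple
periodic loop (again, an assumed design, not how GNUnet actually
behaves):

#include <stdint.h>
#include <unistd.h>

#define GIB (1024ULL * 1024 * 1024)

/* The invented helpers from the earlier sketches. */
extern uint64_t dynamic_datastore_limit (const char *path,
                                         uint64_t preallocated,
                                         unsigned int keep_free_percent,
                                         uint64_t keep_free_bytes);
extern uint64_t datastore_size (void);
extern void shrink_datastore (uint64_t new_quota);

/* Grow opportunistically up to the dynamic limit; shrink back toward
 * the pre-allocated minimum when other applications claim the disk -
 * analogous to the low/high CPU load thresholds. */
void
space_manager_loop (const char *path, uint64_t preallocated)
{
  for (;;)
  {
    uint64_t limit = dynamic_datastore_limit (path, preallocated,
                                              20, 20 * GIB);
    if (datastore_size () > limit)
      shrink_datastore (limit); /* evict cached/migrated blocks first */
    sleep (60); /* re-check once a minute */
  }
}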



