[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...
From: |
Hermanni Hyytiälä |
Subject: |
[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert... |
Date: |
Tue, 25 Mar 2003 04:19:25 -0500 |
CVSROOT: /cvsroot/gzz
Module name: gzz
Changes by: Hermanni Hyytiälä <address@hidden> 03/03/25 04:19:25
Modified files:
Documentation/misc/hemppah-progradu: masterthesis.tex
Log message:
huh...do I need to say more ? Tuning...
CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/hemppah-progradu/masterthesis.tex.diff?tr1=1.181&tr2=1.182&r1=text&r2=text
Patches:
Index: gzz/Documentation/misc/hemppah-progradu/masterthesis.tex
diff -u gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.181
gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.182
--- gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.181 Tue Mar
25 02:51:50 2003
+++ gzz/Documentation/misc/hemppah-progradu/masterthesis.tex Tue Mar 25
04:19:25 2003
@@ -491,6 +491,16 @@
overlays, i.e., $O(\log{n})$ space required for maintaining information about
other peers in
the system and $O(\log{n})$ data lookup efficiency.
+Additionally, authors argue in \cite{balakrishnan03semanticfree} that tightly
structured systems
+are suitable for next generation Reference Resolutions Services (RRS)\footnote{
+Domain Name System (DNS) \cite{rfc1101} is a widely used RRS system in the
Internet.}. They present
+two requirements about the nature of reference resolution. First, there should
be a general-purpose
+and application-indepedent substrate for reference resolution. Second, the
references themselves
+should be unstructured and semantic-free. Authors emphasize that tightly
structured systems (specifically
+DHT-based) provide an elegant platform for RRS. In this context, we define
unstructured reference
+as a reference that it doesn't expose the target in any way and semantic-free
reference as a reference
+that there are no directives in the reference itself which would expose how
the reference should be processed.
+
\section{Differences}
@@ -1286,8 +1296,14 @@
Also, query and routing hot spots may be an issue in tightly structured
overlays \cite{ratnasamy02routing}.
Hot spots happen, when a specific key is being requested extremely often in
tightly structured overlays. Recent study
by Freedman et al. tries to reduce hot spots in the system by performing
\emph{sloppy} hashing
-\cite{sloppy:iptps03}. Another key feature of their work is that peers
self-organize into clusters,
-therefore enabling peers to find nearby data without looking up data from
distant peers.
+\cite{sloppy:iptps03}. Authors' technique is especially suitable for the DOLR
abstraction of tightly structured overlays.
+With Sloppy hashing, we are able to reduce the generation of query hot spots.
Sloppy hashing enables to
+locate nearby data without looking up data from distant peers. Moreover,
authors'
+proposal for self-organizing clusters using network diameters may be useful,
+especially within small groups of working people. Thus, with Sloppy hashing
+we can provide locality properties the system.
+
+
As mentioned before, an implicit assumption of almost every tightly structured
system is that there is a random, uniform
distribution of peer and key identifiers. Even if participating peers are
extremely heterogeneous, e.g., in
@@ -1761,10 +1777,15 @@
one ''virtual file'' may need obtaining several Storm blocks, which are
distributed
randomly throughout the overlay. If not efficient, construction of the
''virtual file''
may take reasonable amount of time while rendering system very unusable.
Third, Peer-to-Peer
-infrastructure has to be scalable and fault tolerant against hostile attacks.
+infrastructure has to be scalable, fault tolerant against hostile attacks and
resilience in
+adverse conditions (e.g., a network partition).
\section{Evaluation of Peer-to-Peer approaches with regard to Fenfire}
+In this section we focus on locating the Storm blocks in Peer-to-Peer
environment. We don't
+respond to fetching of Storm blocks as fetching of Storm block can be
performed easily once
+Storm block is located.
+
In chapter 2, we discussed main differences between the loosely and the
tightly structured
approach. As stated, the most significant difference is that the tightly
structured
approach has logarithmical properties in all internal operations, while the
loosely
@@ -1780,31 +1801,11 @@
structured systems and Fenfire use similar methods for identifying data in the
system, i.e., globally unique identifiers.
Another key feature of tightly structured overlays is that they are able
-to provide general purpose \emph{interface} for Reference Resolution Services
(RRS)\footnote{
-Domain Name System (DNS) \cite{rfc1101} is a widely used RRS system in the
Internet.}
+to provide general purpose \emph{interface} for Reference Resolution Services
(RRS)
\cite{balakrishnan03semanticfree}. Authors argue that next generation RRS
must be
application-independent and references itself should be \emph{unstructured}
and
-\emph{semantically free}. Finally, as said, with tightly structured systems it
is feasible to
-perform \emph{global} data lookups in the overlay. To summarize, these aspects
may be the most important features
-of Peer-to-Peer infrastructure with regard to Fenfire as a distributed,
location transparent hypermedia system.
-Thus, we see the tightly structured approach as the best alternative to locate
data in Peer-to-Peer
-environment.
-
-Once located, we can use regular TCP/IP-protocols, such as Hypertext Transfer
protocol (HTTP)
-\cite{rfc2068} for \emph{fetching} Storm blocks from the overlay. However,
HTTP-protocol may
-not be a optimal solution when obtaining large amounts of data from a
Peer-to-Peer network (e.g.,
-videos, images or music). In this case, multisource downloads can be very
useful
-for better efficiency \cite{maymounkov03ratelesscodes, bittorrenturl}.
Furthermore,
-multisource downloads can be used for decreasing load of a certain peer, thus
avoiding query
-hot spots in the system \cite{ratnasamy02routing}. Current implementation of
Fenfire uses
-standard single source downloads (HTTP) and SHA-1 \cite{fips-sha-1}
cryptographic content
-hash for verifying the integrity of data by recomputing the content hash
-for a scroll block. In face of multisource downloads, Fenfire must support
-tree-based hashes\footnote{With multisource downloads, tree-based hash
functions can be used
-to verify fixed length segments of data. If hash value of data segment is
incorrect,
-we need only to fetch \emph{segment} of data (instead of whole data) from
-an other source.}, such as \cite{merkle87hashtree, mohr02thex} for reliable
and efficient
-data validation.
+\emph{semantically free}. Thus, we see the tightly structured approach as the
best alternative to
+locate data in Peer-to-Peer environment.
Again, there are research challenges with tightly structured systems which
have to be
addressed, as described in chapter 3. The main concerns include decreased
performance and fault
@@ -1813,7 +1814,7 @@
Additionally, there is only little real world experiments yet with tightly
structured systems
(e.g., \cite{overneturl, edonkey2kurl}). Therefore, we can't say for sure, how
well these
systems would perform in real Peer-to-Peer environment. However, we believe
that these issues are
-solved, since there is a strong and wide research community towards to the
tightly structured
+solved in near future, since there is a strong and wide research community
towards to the tightly structured
overlays \cite{projectirisurl}.
@@ -1825,19 +1826,19 @@
\subsection{System proposal}
-Currently, we see Kademlia \cite{maymounkov02kademlia} as the best algorithm
for
+We emphasize that we prefer \emph{abstraction}
+level analysis as very recently better and better tightly structured
algorithms have been proposed.
+Thus, we don't want to bind our system proposal to a specific algorithm
definitively as we expect
+that this development continues. Currently, we see Kademlia
\cite{maymounkov02kademlia} as the best algorithm for
locating data efficiently in the Peer-to-Peer overlay. There are two
reasons for this. First, Kademlia's XOR-based distance function is superior
-over distance functions of other systems (see section 2.4). Second, there
exist already
-deployed real-life systems using Kademlia (e.g., \cite{overneturl,
edonkey2kurl, kashmirurl,
-kato02gisp}), which means that Kademlia's algorithm is simple and easy to
implement.
+over distance functions of other systems (see section 2.3.2). Secondly,
Kademlia
+is one of the only tightly structured systems that has been deployed in real
life
+(e.g., \cite{overneturl, edonkey2kurl, kashmirurl,kato02gisp}), which means
that
+Kademlia's algorithm is simple and easy to implement.
On top of Kademlia, we propose the usage of Sloppy hashing
\cite{sloppy:iptps03} which
-is optimized for the DOLR abstraction of tightly structured overlays. With the
Sloppy hashing,
-we are able to reduce the generation of query hot spots. Sloppy hashing
enables to
-locate nearby data without looking up data from distant peers. Moreover,
authors'
-proposal for self-organizing clusters using network diameters may be useful,
-especially within small groups of working people. Thus, with Sloppy hashing
+is optimized for the DOLR abstraction of tightly structured overlays. With
Sloppy hashing
we can provide locality properties for the Fenfire system.
For better fault tolerance and self-monitoring for Fenfire, we propose
techniques
@@ -1846,31 +1847,28 @@
as sudden network partition, or highly dynamic and heterogeneous environment.
Finally, for more efficient data transfer, we can use variable techniques for
this purpose.
-For small amounts of data, HTTP can be used \cite{rfc2068}. For big amounts of
data, we can use
+For small amounts of data, HTTP can be used \cite{rfc2068}. For large amounts
of data, we can use
multisource downloads for better efficiency and reliability. Specifically, the
technology based
on rateless erasure codes \cite{maymounkov03ratelesscodes} seems very
promising.
+Furthermore, multisource downloads can be used for decreasing load of a
certain peer, thus avoiding query
+hot spots in the system \cite{ratnasamy02routing}. Current client-server
implementation of Fenfire uses
+standard single source downloads (HTTP) and SHA-1 \cite{fips-sha-1}
cryptographic content
+hash for verifying the integrity of data by recomputing the content hash
+for block. In face of multisource downloads, Fenfire must support
+tree-based hashes\footnote{With multisource downloads, tree-based hash
functions can be used
+to verify fixed length segments of data. If hash value of data segment is
incorrect,
+we need only to fetch \emph{segment} of data (instead of whole data) from
+an other source.}, such as \cite{merkle87hashtree, mohr02thex} for reliable
and efficient
+data validation.
\subsection{Methods}
-We use the DOLR abstraction of the tightly structured approach, i.e., each
participating peer hosts
-the data and the overlay maintains only the \emph{pointers} to the data. We
decided to use the DOLR
-abstraction in our model, since DOLR systems locate data without specifying a
storage policy explicitly \cite{rhea03benchmarks}.
+We use the DOLR abstraction of the tightly structured approach since DOLR
systems locate data without
+specifying a storage policy explicitly \cite{rhea03benchmarks}, i.e., each
participating peer hosts
+the data they are offering and the overlay maintains only the \emph{pointers}
to the data.
DHT-based storage systems, such as CFS \cite{dabek01widearea} and PAST
\cite{rowstron01storage}, may have
-severe problems with load balancing in a highly heterogeneous environment. The
problem is caused by peers
-which may not be able to store relatively large amount of data with a
key-value pair, assigned randomly by
-the mapping function of the overlay. These systems waste both storage and
bandwidth, and
-are sensitive to certain attacks (e.g., the DDoS attack). Additionally, we
emphasize that we prefer \emph{abstraction}
-level analysis as very recently better and better tightly structured
algorithms have been proposed.
-Thus, we don't want to bind our system proposal to a specific algorithm
definitively as we expect
-that this development continues. In this model, we use Kademlia's
\cite{maymounkov02kademlia} algorithm for
-locating data in the overlay.
-
-In the following subsections we assume that we know the structure of
-the enfilade before hand, i.e., when assembling the ''virtual file'' we know
all the Storm
-blocks, which are required to complete the enfilade. Also, we don't
-respond to the security issues related to Peer-to-Peer systems, since there is
no working solution
-available yet; we either assume that Fenfire has a reliable technique for
identifying individual entities, or
-there are no hostile entities among participating peers.
+severe problems with load balancing in a highly heterogeneous environment
\cite{rao03loadbalancing}. The problem is caused by peers
+which may not be able to store relatively large blocks, assigned randomly by
the mapping function of the overlay.
In our method, each peer maintains the following data structures for local
operations: a data structure for listing all
key-value pairs which peer maintains; a data structure for listing all
key-value pairs in the
@@ -1879,6 +1877,13 @@
(pointer blocks), or a hash of block's content (scroll blocks) as a key. The
value is always a reference to a hosting
peer (e.g., IP address). Finally, we assume that all local operations can be
done in a constant time.
+We assume that we have resolved the construction of the ''virtual file''
before locating any Storm blocks, i.e.,
+when assembling the ''virtual file'' we know all the Storm blocks, which are
required to complete the ''virtual file''.
+Also, we don't respond to the security issues related to Peer-to-Peer systems,
since there is no working solution
+available yet. Thus, we either assume that Fenfire has a reliable technique
for identifying individual entities, or
+there are no hostile entities among participating peers, i.e., Storm blocks
can be identified correctly (e.g., when
+performing searches). In the next section, we discuss security problems in
more detail.
+
\begin{itemize}
\item Data lookup with a given identifier of Storm scroll block.
@@ -1919,7 +1924,7 @@
in a tightly structured overlay using the DOLR abstraction, where the pointer
random string is given.
Each of these algorithms can locate Fenfire blocks in $O(\log{n})$ time at
application level overlay:
-$O(\log{n})$ time for query routing to pointer peer and constant time for
+$O(\log{n})$ time for query routing to pointer peer (Kademlia
\cite{maymounkov02kademlia}) and constant time for
locating hosting peer with a given reference link.
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., (continued)
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/21
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/21
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/24
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/24
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/24
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/24
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/24
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/24
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/24
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/25
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...,
Hermanni Hyytiälä <=
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/25
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/25
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/25
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/25
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/25
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/25
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/25
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/25
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/25
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/26