gzz-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...


From: Hermanni Hyytiälä
Subject: [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...
Date: Tue, 25 Mar 2003 04:19:25 -0500

CVSROOT:        /cvsroot/gzz
Module name:    gzz
Changes by:     Hermanni Hyytiälä <address@hidden>      03/03/25 04:19:25

Modified files:
        Documentation/misc/hemppah-progradu: masterthesis.tex 

Log message:
        huh...do I need to say more ? Tuning...

CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/hemppah-progradu/masterthesis.tex.diff?tr1=1.181&tr2=1.182&r1=text&r2=text

Patches:
Index: gzz/Documentation/misc/hemppah-progradu/masterthesis.tex
diff -u gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.181 
gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.182
--- gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.181      Tue Mar 
25 02:51:50 2003
+++ gzz/Documentation/misc/hemppah-progradu/masterthesis.tex    Tue Mar 25 
04:19:25 2003
@@ -491,6 +491,16 @@
 overlays, i.e., $O(\log{n})$ space required for maintaining information about 
other peers in 
 the system and $O(\log{n})$ data lookup efficiency.
 
+Additionally, authors argue in \cite{balakrishnan03semanticfree} that tightly 
structured systems
+are suitable for next generation Reference Resolutions Services (RRS)\footnote{
+Domain Name System (DNS) \cite{rfc1101} is a widely used RRS system in the 
Internet.}. They present
+two requirements about the nature of reference resolution. First, there should 
be a general-purpose
+and application-indepedent substrate for reference resolution. Second, the 
references themselves
+should be unstructured and semantic-free. Authors emphasize that tightly 
structured systems (specifically
+DHT-based) provide an elegant platform for RRS. In this context, we define 
unstructured reference
+as a reference that it doesn't expose the target in any way and semantic-free 
reference as a reference 
+that there are no directives in the reference itself which would expose how 
the reference should be processed. 
+
 
 \section{Differences}
 
@@ -1286,8 +1296,14 @@
 Also, query and routing hot spots may be an issue in tightly structured 
overlays \cite{ratnasamy02routing}. 
 Hot spots happen, when a specific key is being requested extremely often in 
tightly structured overlays. Recent study 
 by Freedman et al. tries to reduce hot spots in the system by performing 
\emph{sloppy} hashing 
-\cite{sloppy:iptps03}. Another key feature of their work is that peers 
self-organize into clusters, 
-therefore enabling peers to find nearby data without looking up data from 
distant peers.
+\cite{sloppy:iptps03}. Authors' technique is especially suitable for the DOLR 
abstraction of tightly structured overlays. 
+With Sloppy hashing, we are able to reduce the generation of query hot spots. 
Sloppy hashing enables to 
+locate nearby data without looking up data from distant peers. Moreover, 
authors' 
+proposal for self-organizing clusters using network diameters may be useful, 
+especially within small groups of working people. Thus, with Sloppy hashing
+we can provide locality properties the system.
+
+
 
 As mentioned before, an implicit assumption of almost every tightly structured 
system is that there is a random, uniform 
 distribution of peer and key identifiers. Even if participating peers are 
extremely heterogeneous, e.g., in
@@ -1761,10 +1777,15 @@
 one ''virtual file'' may need obtaining several Storm blocks, which are 
distributed 
 randomly throughout the overlay. If not efficient, construction of the 
''virtual file''
 may take reasonable amount of time while rendering system very unusable. 
Third, Peer-to-Peer 
-infrastructure has to be scalable and fault tolerant against hostile attacks.
+infrastructure has to be scalable, fault tolerant against hostile attacks and 
resilience in 
+adverse conditions (e.g., a network partition).
 
 \section{Evaluation of Peer-to-Peer approaches with regard to Fenfire}
 
+In this section we focus on locating the Storm blocks in Peer-to-Peer 
environment. We don't
+respond to fetching of Storm blocks as fetching of Storm block can be 
performed easily once
+Storm block is located. 
+
 In chapter 2, we discussed main differences between the loosely and the 
tightly structured 
 approach. As stated, the most significant difference is that the tightly 
structured 
 approach has logarithmical properties in all internal operations, while the 
loosely 
@@ -1780,31 +1801,11 @@
 structured systems and Fenfire use similar methods for identifying data in the
 system, i.e., globally unique identifiers.
 Another key feature of tightly structured overlays is that they are able
-to provide general purpose \emph{interface} for Reference Resolution Services 
(RRS)\footnote{
-Domain Name System (DNS) \cite{rfc1101} is a widely used RRS system in the 
Internet.}
+to provide general purpose \emph{interface} for Reference Resolution Services 
(RRS)
  \cite{balakrishnan03semanticfree}. Authors argue that next generation RRS 
must be 
 application-independent and references itself should be \emph{unstructured} 
and 
-\emph{semantically free}. Finally, as said, with tightly structured systems it 
is feasible to 
-perform \emph{global} data lookups in the overlay. To summarize, these aspects 
may be the most important features 
-of Peer-to-Peer infrastructure with regard to Fenfire as a distributed, 
location transparent hypermedia system.
-Thus, we see the tightly structured approach as the best alternative to locate 
data in Peer-to-Peer
-environment.
-
-Once located, we can use regular TCP/IP-protocols, such as Hypertext Transfer 
protocol (HTTP) 
-\cite{rfc2068} for \emph{fetching} Storm blocks from the overlay. However, 
HTTP-protocol may 
-not be a optimal solution when obtaining large amounts of data from a 
Peer-to-Peer network (e.g., 
-videos, images or music). In this case, multisource downloads can be very 
useful 
-for better efficiency \cite{maymounkov03ratelesscodes, bittorrenturl}. 
Furthermore, 
-multisource downloads can be used for decreasing load of a certain peer, thus 
avoiding query 
-hot spots in the system \cite{ratnasamy02routing}. Current implementation of 
Fenfire uses 
-standard single source downloads (HTTP) and SHA-1 \cite{fips-sha-1} 
cryptographic content 
-hash for verifying the integrity of data by recomputing the content hash
-for a scroll block. In face of multisource downloads, Fenfire must support
-tree-based hashes\footnote{With multisource downloads, tree-based hash 
functions can be used 
-to verify fixed length segments of data. If hash value of data segment is 
incorrect, 
-we need only to fetch \emph{segment} of data (instead of whole data) from 
-an other source.}, such as \cite{merkle87hashtree, mohr02thex} for reliable 
and efficient 
-data validation.
+\emph{semantically free}.  Thus, we see the tightly structured approach as the 
best alternative to 
+locate data in Peer-to-Peer environment.
 
 Again, there are research challenges with tightly structured systems which 
have to be
 addressed, as described in chapter 3. The main concerns include decreased 
performance and fault 
@@ -1813,7 +1814,7 @@
 Additionally, there is only little real world experiments yet with tightly 
structured systems
 (e.g., \cite{overneturl, edonkey2kurl}). Therefore, we can't say for sure, how 
well these 
 systems would perform in real Peer-to-Peer environment. However, we believe 
that these issues are 
-solved, since there is a strong and wide research community towards to the 
tightly structured 
+solved in near future, since there is a strong and wide research community 
towards to the tightly structured 
 overlays \cite{projectirisurl}.
 
       
@@ -1825,19 +1826,19 @@
  
 \subsection{System proposal}
 
-Currently, we see Kademlia \cite{maymounkov02kademlia} as the best algorithm 
for
+We emphasize that we prefer \emph{abstraction}
+level analysis as very recently better and better tightly structured 
algorithms have been proposed. 
+Thus, we don't want to bind our system proposal to a specific algorithm 
definitively as we expect 
+that this development continues. Currently, we see Kademlia 
\cite{maymounkov02kademlia} as the best algorithm for
 locating data efficiently in the Peer-to-Peer overlay. There are two
 reasons for this. First, Kademlia's XOR-based distance function is superior
-over distance functions of other systems (see section 2.4). Second, there 
exist already  
-deployed real-life systems using Kademlia (e.g., \cite{overneturl, 
edonkey2kurl, kashmirurl,
-kato02gisp}), which means that Kademlia's algorithm is simple and easy to 
implement.
+over distance functions of other systems (see section 2.3.2). Secondly, 
Kademlia
+is one of the only tightly structured systems that has been deployed in real 
life
+(e.g., \cite{overneturl, edonkey2kurl, kashmirurl,kato02gisp}), which means 
that 
+Kademlia's algorithm is simple and easy to implement.
 
 On top of Kademlia, we propose the usage of Sloppy hashing 
\cite{sloppy:iptps03} which
-is optimized for the DOLR abstraction of tightly structured overlays. With the 
Sloppy hashing, 
-we are able to reduce the generation of query hot spots. Sloppy hashing 
enables to 
-locate nearby data without looking up data from distant peers. Moreover, 
authors' 
-proposal for self-organizing clusters using network diameters may be useful, 
-especially within small groups of working people. Thus, with Sloppy hashing
+is optimized for the DOLR abstraction of tightly structured overlays. With 
Sloppy hashing
 we can provide locality properties for the Fenfire system.
 
 For better fault tolerance and self-monitoring for Fenfire, we propose 
techniques
@@ -1846,31 +1847,28 @@
 as sudden network partition, or highly dynamic and heterogeneous environment.
 
 Finally, for more efficient data transfer, we can use variable techniques for 
this purpose.
-For small amounts of data, HTTP can be used \cite{rfc2068}. For big amounts of 
data, we can use
+For small amounts of data, HTTP can be used \cite{rfc2068}. For large amounts 
of data, we can use
 multisource downloads for better efficiency and reliability. Specifically, the 
technology based
 on rateless erasure codes \cite{maymounkov03ratelesscodes} seems very 
promising.
+Furthermore, multisource downloads can be used for decreasing load of a 
certain peer, thus avoiding query 
+hot spots in the system \cite{ratnasamy02routing}. Current client-server 
implementation of Fenfire uses 
+standard single source downloads (HTTP) and SHA-1 \cite{fips-sha-1} 
cryptographic content 
+hash for verifying the integrity of data by recomputing the content hash
+for block. In face of multisource downloads, Fenfire must support
+tree-based hashes\footnote{With multisource downloads, tree-based hash 
functions can be used 
+to verify fixed length segments of data. If hash value of data segment is 
incorrect, 
+we need only to fetch \emph{segment} of data (instead of whole data) from 
+an other source.}, such as \cite{merkle87hashtree, mohr02thex} for reliable 
and efficient 
+data validation.
 
 \subsection{Methods}
 
-We use the DOLR abstraction of the tightly structured approach, i.e., each 
participating peer hosts 
-the data and the overlay maintains only the \emph{pointers} to the data. We 
decided to use the DOLR 
-abstraction in our model, since DOLR systems locate data without specifying a 
storage policy explicitly \cite{rhea03benchmarks}. 
+We use the DOLR abstraction of the tightly structured approach since DOLR 
systems locate data without 
+specifying a storage policy explicitly \cite{rhea03benchmarks}, i.e., each 
participating peer hosts 
+the data they are offering and the overlay maintains only the \emph{pointers} 
to the data. 
 DHT-based storage systems, such as CFS \cite{dabek01widearea} and PAST 
\cite{rowstron01storage}, may have
-severe problems with load balancing in a highly heterogeneous environment. The 
problem is caused by peers 
-which may not be able to store relatively large amount of data with a 
key-value pair, assigned randomly by 
-the mapping function of the overlay. These systems waste both storage and 
bandwidth, and
-are sensitive to certain attacks (e.g., the DDoS attack). Additionally, we 
emphasize that we prefer \emph{abstraction}
-level analysis as very recently better and better tightly structured 
algorithms have been proposed. 
-Thus, we don't want to bind our system proposal to a specific algorithm 
definitively as we expect 
-that this development continues. In this model, we use Kademlia's 
\cite{maymounkov02kademlia} algorithm for 
-locating data in the overlay.
-
-In the following subsections we assume that we know the structure of 
-the enfilade before hand, i.e., when assembling the ''virtual file'' we know 
all the Storm 
-blocks, which are required to complete the enfilade. Also, we don't 
-respond to the security issues related to Peer-to-Peer systems, since there is 
no working solution
-available yet; we either assume that Fenfire has a reliable technique for 
identifying individual entities, or
-there are no hostile entities among participating peers. 
+severe problems with load balancing in a highly heterogeneous environment 
\cite{rao03loadbalancing}. The problem is caused by peers 
+which may not be able to store relatively large blocks, assigned randomly by 
the mapping function of the overlay. 
 
 In our method, each peer maintains the following data structures for local 
operations: a data structure for listing all 
 key-value pairs which peer maintains; a data structure for listing all 
key-value pairs in the 
@@ -1879,6 +1877,13 @@
 (pointer blocks), or a hash of block's content (scroll blocks) as a key. The 
value is always a reference to a hosting 
 peer (e.g., IP address). Finally, we assume that all local operations can be 
done in a constant time.
 
+We assume that we have resolved the construction of the ''virtual file'' 
before locating any Storm blocks, i.e., 
+when assembling the ''virtual file'' we know all the Storm blocks, which are 
required to complete the ''virtual file''. 
+Also, we don't respond to the security issues related to Peer-to-Peer systems, 
since there is no working solution
+available yet. Thus, we either assume that Fenfire has a reliable technique 
for identifying individual entities, or
+there are no hostile entities among participating peers, i.e.,  Storm blocks 
can be identified correctly (e.g., when
+performing searches). In the next section, we discuss security problems in 
more detail. 
+
 
 \begin{itemize}
 \item Data lookup with a given identifier of Storm scroll block.
@@ -1919,7 +1924,7 @@
 in a tightly structured overlay using the DOLR abstraction, where the pointer 
random string is given.
 
 Each of these algorithms can locate Fenfire blocks in $O(\log{n})$ time at 
application level overlay:
-$O(\log{n})$ time for query routing to pointer peer and constant time for 
+$O(\log{n})$ time for query routing to pointer peer (Kademlia 
\cite{maymounkov02kademlia}) and constant time for 
 locating hosting peer with a given reference link.
 
 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]