[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...
From: |
Hermanni Hyytiälä |
Subject: |
[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert... |
Date: |
Wed, 12 Mar 2003 02:47:26 -0500 |
CVSROOT: /cvsroot/gzz
Module name: gzz
Changes by: Hermanni Hyytiälä <address@hidden> 03/03/12 02:47:26
Modified files:
Documentation/misc/hemppah-progradu: masterthesis.tex
Log message:
First ispell run
CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/hemppah-progradu/masterthesis.tex.diff?tr1=1.128&tr2=1.129&r1=text&r2=text
Patches:
Index: gzz/Documentation/misc/hemppah-progradu/masterthesis.tex
diff -u gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.128
gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.129
--- gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.128 Fri Mar
7 08:23:37 2003
+++ gzz/Documentation/misc/hemppah-progradu/masterthesis.tex Wed Mar 12
02:47:26 2003
@@ -94,17 +94,17 @@
''The user's machine is a client and a server'' describes best Peer-to-Peer
systems. To summarize, Peer-to-Peer systems can be characterized as
distributed
systems in which all communication is symmetric and all participants have
identical
-capabilities and responsabilities. Each \emph{peer} may contribute data or
+capabilities and responsibilities. Each \emph{peer} may contribute data or
computing resources (e.g., unused storage) to the overall system and the
welfare
of the community can scale with the number of participants. Thus, each
participant
rely on one another's services and resources, rather than solely relying on
dedicated
-and centralized infracstructure.
+and centralized infrastructure.
One of the most important properties of any distributed computing system are
efficient
data lookup and security. In this thesis, we\footnote{Use of the plural is
customary even if research paper is authored solely.}
focus on these aspects in Peer-to-Peer domain.
Specifically, we review existing Peer-to-Peer approaches, algorithms and their
key properties. We observe
-that despite of greate amount of proposed Peer-to-Peer systems, all systems
fall either
+that despite of create amount of proposed Peer-to-Peer systems, all systems
fall either
loosely structured approach or tightly structured approach. We also discuss
open problems in
Peer-to-Peer systems and divide problems into three sub-categories: security
related problems,
performance related problems and miscellaneous problems. In the end, we
summarize all
@@ -116,7 +116,7 @@
choose the best alternative to Fenfire's needs. We discover that Fenfire,
xanalogical model and
tightly structured Peer-to-Peer approach all have similar method to deal with
data,
i.e., globally unique identifiers. Finally, we propose system model for
Fenfire in Peer-to-Peer
-environment and present yet simple but efficient algortihms to be used for
data lookups in
+environment and present yet simple but efficient algorithms to be used for
data lookups in
Peer-to-Peer environment.
To our knowledge, this thesis is the most comprehensive work with regard to
summarizing
@@ -178,14 +178,14 @@
\label{fig:application_level}
\end{figure}
-Compared to ARPANET's Peer-to-Peer functionality, modern Peer-to-Peer systems
+Compared to ARPA Net's Peer-to-Peer functionality, modern Peer-to-Peer systems
are ad-hoc, i.e., peers join and leave the system constantly in a dynamic
manner. This
fact constitutes challenging requirements for efficient construction and
maintenance
of the overlay network. Even more demanding tasks are how to perform efficient
data
lookup and maintain security in a varying distributed environment. The most
popular
form of modern Peer-to-Peer computing is file-sharing. In this scenario,
participants
of Peer-to-Peer network share their resources to other participants while
obtaining
-more resources from others. This can been seen as a variant of distributed
filesystem
+more resources from others. This can been seen as a variant of distributed
file system
(e.g., \cite{levy90distributedfilesystems}).
In a development of modern Peer-to-Peer systems, lot of influences has been
attained from
@@ -193,7 +193,7 @@
to ad-hoc nature of complex networks \cite{albert-02-statistical},
\cite{albert-00-tolerance}, \cite{watts00dynamics}.
It's interesting to realize that chemical properties of cells, the Internet,
ad-hoc
Peer-to-Peer systems, have all in common that they self-organize based on same
-principles. Furthermore, the assocation between social connections among
people
+principles. Furthermore, the association between social connections among
people
and Peer-to-Peer overlay topology has been studied recently
\cite{watts00dynamics},
\cite{kleinberg99small}, \cite{nips02-Kleinberg}. This insight is motivated
by Milgram, how noticed that people are very effective to locate other people
in a wide scale
@@ -206,7 +206,7 @@
systems fall: loosely structured approach and tightly structured approach. In
loosely
structured approach the construction and the maintenance of the overlay-- as
the name
suggests- is controlled loosely. This approach gives freedom for participating
peers
-to perform certains tasks in Peer-to-Peer network. On the other hand, tightly
structured
+to perform certain tasks in Peer-to-Peer network. On the other hand, tightly
structured
approach has some rules, which all participating peers have to obey. In the
following
sections, we discuss in more detail both approaches, they disadvantages and
advantages, and
key differences.
@@ -244,11 +244,11 @@
In Gnutella, each participating peer maintains local index of its own shared
content. Also,
each peer has a few connections to other peer, i.e., peer's \emph{neighbors}.
Basic gnutella
-data lookup works as follows: peer broadcasts a query request to its neighors,
which in turn
+data lookup works as follows: peer broadcasts a query request to its
neighbors, which in turn
forwards the query to their neighbors. This leads in the situation where
number of messages
in the network can grow with $O(n^{2})$, where $n$ is the number of
participating peers in the
Gnutella network. To limit the amount of network traffic, Gnutella uses
Time-To-Live-limited
-(TTL) flooding to distributed queries. Gnutella uses a Breadt-First-Search
(BFS) with depth limit
+(TTL) flooding to distributed queries. Gnutella uses a Breadth-First-Search
(BFS) with depth limit
$T$ (e.g., 7), where $T$ is the system-wide maximum TTL of a message in hops.
Therefore, only peers that
are TTL hops away from the query originator will forward the query or respond
to the query.
In Gnutella network, search results are fast, because BFS sends queries to
@@ -263,7 +263,7 @@
\end{figure}
According to \cite{lv02searchreplication}, Gnutella's way to perform data
lookups, \emph{flooding}, has
-following limitations. First, choosing the approriate TTL in practice is not
easy. If the
+following limitations. First, choosing the appropriate TTL in practice is not
easy. If the
TTL is too high, query originator may unnecessarily strain the network. If the
TTL is too
low, the query originator might not find the desired data even it's available
somewhere
in the network. Second, there are many duplicate messages generated by
flooding, especially
@@ -277,7 +277,7 @@
networks\footnote{In power-law networks only a few peers have high number of
neighbor
links and major of peers have low number of neighbor links.} and have found
that by
instructing peers forwarding data lookups to select high degree peers, the
performance of data lookup
-increases signficantly. As a result, some of the most recent loosely
+increases significantly. As a result, some of the most recent loosely
structured Peer-to-Peer systems have adopted this method with some
modifications
\cite{gnutella2url}, \cite{shareazaurl}, \cite{fasttrackurl},
\cite{morpheusurl},
\cite{kazaaurl}, \cite{jxtaurl}, \cite{jxtaoverview},
\cite{botros01jxtasearch},
@@ -311,7 +311,7 @@
\subsection{Sketch of formal definition}
-In this subsection we formalize loosely strucured overlay's main components.
This
+In this subsection we formalize loosely structured overlay's main components.
This
model is based on original Gnutella overlay network with power-law
improvements.
Let $S$ be the aggregate of all services $s$ in system. Let $P$ be the
aggregate of
@@ -319,7 +319,7 @@
expressed as $p = provider(s)$. Every $p$ has neighbor(s), named as
$neighbor$, which
is $P$ = \{$p \in P: \exists neighbor$, which is randomly chosen from $P$\}.
\emph{Super peer} is a peer, which hosts the indices of other peers, $sp =
summaryindex(provider(s))$.
-Moreover, $\forall$ reqular peer $p$, there is super peer, which has has a
index of regular
+Moreover, $\forall$ regular peer $p$, there is super peer, which has has a
index of regular
peer's content, specifically $ps$, $P$ = \{$p \in P: \exists ps$,
where $ps$ = $summaryindex(provider(s)) \wedge (p = provider(s))$\}
@@ -337,7 +337,7 @@
\cite{zhao01tapestry}, Viceroy \cite{malkhi02viceroy} and others
\cite{freedman02trie}.
The biggest difference compared to loosely structured approach is that with
tightly structured systems,
it is now feasible to perform \emph{global} data lookups in the overlay.
-While there are significat differences among proposed systems, they all have
in common
+While there are significant differences among proposed systems, they all have
in common
that \emph{peer identifiers} is assigned to participating peers from
a large \emph{identifier space} by the overlay. Furthermore,
application-specific
data items are also assigned globally unique identifiers, \emph{keys},
@@ -346,7 +346,7 @@
is most widely used. For instance, Chord \cite{stoica01chord}, Koorde
\cite{kaashoek03koorde},
Pastry \cite{rowston01pastry}, SWAN \cite{bonsma02swan}, Tapestry
\cite{zhao01tapestry}
and Viceroy \cite{malkhi02viceroy} use circular identifier space of $n$-bit
integers modulo $2^{n}$. The
-value of $n$ varies among systems. Again, CAN \cite{ratnasamy01can} uses a
$d$-dimensional cartesian
+value of $n$ varies among systems. Again, CAN \cite{ratnasamy01can} uses a
$d$-dimensional Cartesian
model to implement identifier space.
To store data into tightly structured overlay, each application-specific
@@ -356,7 +356,7 @@
Also, each peer in tightly structured overlay maintains a \emph{routing
table}, which
consists of identifiers and IP addresses of other peers in the overlay.
Entries of routing
table are peer's neighbors in the overlay network. Figure
\ref{fig:structured_hashing} illustrates the
-process of data to key mapping in tightly strucuted overlays.
+process of data to key mapping in tightly structured overlays.
\begin{figure}
\centering
@@ -378,7 +378,7 @@
distance from $p_j$ to $p_i$) \cite{maymounkov02kademlia}. On the other
hand, Chord's \cite{stoica01chord} distance function does have the property
of unidirection, but doesn't have symmetry. Pastry's \cite{rowston01pastry}
distance
-function supports symmetry, but doesn't support unidirection. As a
concequence,
+function supports symmetry, but doesn't support unidirection. As a
consequence,
Kademlia's \cite{maymounkov02kademlia} XOR-based metric doesn't need
stabilization (like in Chord \cite{stoica01chord}) and backup links
(like in Pastry \cite{rowston01pastry}) \cite{balakrishanarticle03lookupp2p}.
@@ -407,19 +407,19 @@
addressed in order to perform efficient data lookups in tightly structured
overlays.
First, mapping of keys to peers must be done in a load-balanced
way. Second, the overlay must be able to forward a lookup for a
-specific key to an approriate peer. Third, overlay must have a
+specific key to an appropriate peer. Third, overlay must have a
support for a distance function. Finally, routing tables for each peer
must be constructed and maintained adaptively.
Currently, all proposed tightly structured overlays provide at least
-poly--logaritmical data lookup operations. However, there are some key
+poly--logarithmical data lookup operations. However, there are some key
differences in the data structure that they use as a routing table. For
example, Chord
\cite{stoica01chord}, Skip graphs \cite{AspnesS2003} and Skipnet
\cite{harvey03skipnet2} maintain a local
data structure which resembles Skip lists \cite{78977}.
In figure \ref{fig:structured_query}, we present overview of Chord's lookup
process.
On the left side of Chord's lookup process, we show the same data lookup
process
as binary-tree abstraction. We can notice, that in each step, the distance
between
-locarithmic efficiency.
+logarithmic efficiency.
Kademlia \cite{maymounkov02kademlia}, Pastry \cite{rowston01pastry} and
Tapestry
\cite{zhao01tapestry} uses balanced $k$-trees as routing table's data
structure. Figure
@@ -447,12 +447,12 @@
There are three higher level abstractions which tightly structured overlays
provide
-\cite{zhao03api}. Each of these abstractions fulfil a storage layer in an
overlay, but
+\cite{zhao03api}. Each of these abstractions fulfill a storage layer in an
overlay, but
they have semantical differences in the \emph{usage} of overlay. First,
Distributed Hash
Table (DHT) (see e.g., \cite{dabek01widearea}, \cite{rowstron01storage}),
implements three operations: \texttt{lookup(key)}, \texttt{remove(key)} and
\texttt{insert(key)}. As the name suggests, DHT implements the same
functionality
-as a regular hashtable, by storing the mapping between a key and a value.
DHT's
+as a regular hash table, by storing the mapping between a key and a value.
DHT's
\emph{interface} is generic; values can be any size and type. Figure
\ref{fig:Structured_lookup_using_DHT_model}
shows the DHT abstraction of tightly structured overlay. Second, Decentralized
Object Location (DOLR) (see e.g., \cite{kubiatowicz00oceanstore},
\cite{iyer02squirrel}) is distributed
@@ -462,11 +462,11 @@
difference between DHT and DOLR abstraction is that DOLR routes overlay's
messages
to nearest available peer, hosting a specific data item. This form of locality
is not supported by DHT. Finally, tightly structured overlay can be used for
-scalable group multicast/anycast operations (CAST) (see e.g.,
\cite{zhuang01bayeux}).
+scalable group multicast/any cast operations (CAST) (see e.g.,
\cite{zhuang01bayeux}).
The basic operations are \texttt{join(groupIdentifier)},
\texttt{leave(groupIdentifier)},
\texttt{multicast(message, groupIdentifier)}, \texttt{anycast(message,
groupIdentifier)}.
Participating peers may join and leave the group and send multicast messages to
-the group, or anycast message to a specific member of the group. DOLR and CAST
abstraction
+the group, or any-cast message to a specific member of the group. DOLR and
CAST abstraction
have much in common. For instance, they both use network proximity techniques
to optimize their operation in the overlay. Figure
\ref{fig:Strucutred_lookup_using_DOLR_model}
presents basic operation of DOLR abstraction.
@@ -482,14 +482,14 @@
\begin{figure}
\centering
\includegraphics[width=10cm, height=8cm]{DOLR_lookup.eps}
-\caption{Decentralized Object Lcation (DOLR) abstraction of tightly structured
overlay.}
+\caption{Decentralized Object Location (DOLR) abstraction of tightly
structured overlay.}
\label{fig:Strucutred_lookup_using_DOLR_model}
\end{figure}
\subsection{Sketch of formal definition}
-In this subsection we formalize tightly strucured overlay's main features. The
model
+In this subsection we formalize tightly structured overlay's main features.
The model
describes basic features of tightly structured overlay, i.e., identifiers,
identifier
space and mapping function.
@@ -535,7 +535,7 @@
(such as mapping of data items).
To end user, biggest difference between these systems is how data lookups are
performed. Loosely
-structured systems provide much more richier and user friendly way of
searching data as they
+structured systems provide much more richer and user friendly way of searching
data as they
have support for keyword search and fuzzy search. On the other hand, tightly
structured systems support
only exact key lookups as each data item is identified by globally unique keys.
@@ -620,7 +620,7 @@
\parbox{100pt}{Partial}
\\ \hline
-\parbox{90pt}{Possibility for routing hotspots} &
+\parbox{90pt}{Possibility for routing hot spots} &
\parbox{100pt}{No} &
\parbox{100pt}{Yes}
\\ \hline
@@ -636,7 +636,7 @@
\\ \hline
-\caption{Comparison of loosely structured and tighly structured approaches}
+\caption{Comparison of loosely structured and tightly structured approaches}
\label{table_comparison_approach}
@@ -720,7 +720,7 @@
\parbox{37pt}{$O(1)$} &
\parbox{37pt}{$O(n)$} &
\parbox{85pt}{Typical configuration e.g., {4--150}} &
-\parbox{85pt}{Average lookup performance is $O(\log{n})$ with tens of
thousands concurrent users, beyond that, the performace is $O(n)$}
+\parbox{85pt}{Average lookup performance is $O(\log{n})$ with tens of
thousands concurrent users, beyond that, the performance is $O(n)$}
\\ \hline
@@ -763,7 +763,7 @@
\parbox{37pt}{$O(\log{n})$} &
\parbox{37pt}{$O(\log{n})$} &
\parbox{85pt}{$2(\log{n})$} &
-\parbox{85pt}{There are two lookup algorithms. The other is $O(\log{n})$,
which is robus under random deletion. The second is $O(\log^2{n})$, which is
also robust under spam generating model}
+\parbox{85pt}{There are two lookup algorithms. The other is $O(\log{n})$,
which is robust under random deletion. The second is $O(\log^2{n})$, which is
also robust under spam generating model}
\\ \hline
@@ -789,7 +789,7 @@
\parbox{37pt}{$O(\log{n})$} &
\parbox{37pt}{$O(\log{n})$} &
\parbox{85pt}{$O(\log{n})$} &
-\parbox{85pt}{Plaxton's algortihm is designed to operate in static environment
(e.g., web cache)}
+\parbox{85pt}{Plaxton's algorithm is designed to operate in static environment
(e.g., web cache)}
\\ \hline
\parbox{37pt}{Skip Graphs \cite{AspnesS2003}} &
@@ -797,7 +797,7 @@
\parbox{37pt}{$O(\log{n})$} &
\parbox{37pt}{$O(\log{n})$} &
\parbox{85pt}{$4r(\log{n}) + (\log{n})$, where r=number of resources
provided)} &
-\parbox{85pt}{In this approach, node is treated as 'named resource'; in this
approach, \emph{resources} self-organise (opposite to DHTs)}
+\parbox{85pt}{In this approach, node is treated as 'named resource'; in this
approach, \emph{resources} self-organize (opposite to DHTs)}
\\ \hline
\parbox{37pt}{SkipNet \cite{harvey03skipnet2}} &
@@ -812,7 +812,7 @@
\parbox{37pt}{$O(1)$} &
\parbox{37pt}{$O(1)$} &
\parbox{37pt}{$O(n)$} &
-\parbox{85pt}{Can be 1-10000 connections (aka social connections, connections
are permament)} &
+\parbox{85pt}{Can be 1-10000 connections (aka social connections, connections
are permanent)} &
\parbox{85pt}{Connection number depends on node's memory/network capabilities}
\\ \hline
@@ -829,7 +829,7 @@
\parbox{37pt}{$O(1)$} &
\parbox{37pt}{$O(\log^2{n})$} &
\parbox{85pt}{$r(2b+2s+2l)$ (where r=number of resources provided, b=boot
connections, s=short range connections, l=long range connections), typical
connection configuration: 2*(6+7+8)=36} &
-\parbox{85pt}{In this approach, node is treated as 'named resource'; in this
approach, \emph{resources} self-organise (opposite to DHTs)}
+\parbox{85pt}{In this approach, node is treated as 'named resource'; in this
approach, \emph{resources} self-organize (opposite to DHTs)}
\\ \hline
@@ -883,7 +883,7 @@
approach. However, people often misunderstand the scalability problem of
loosely structured
approach; \emph{network} of loosely structured systems is scalable, but the
\emph{data lookup model} is not.
The main concern of tightly structured system is to make overlay's data lookup
-routing more flexible againts hostile attacks. Another key problems in tightly
structured
+routing more flexible against hostile attacks. Another key problems in tightly
structured
systems are the lack of keyword searches, support for heterogeneous peers and
load balancing
\cite{balakrishanarticle03lookupp2p}.
@@ -898,38 +898,38 @@
\subsection{Attacks}
-There are five well known attack models againts Peer-to-Peer systems: Sybil
attack \cite{douceur02sybil},
+There are five well known attack models against Peer-to-Peer systems: Sybil
attack \cite{douceur02sybil},
Fail-stop attack, Spam attack \cite{naor03simpledht}, Byzantine problem
\cite{357176} and \cite{296824}, and
-general Distrubuted Denial of Service attack.
+general Distributed Denial of Service attack.
In Sybil attack model, hostile entity presents multiple
entities. Therefore, one hostile entity can control a large fraction of the
Peer-to-Peer system. Optimal
possible solution to Sybil attack would be that system could \emph{distinct}
entities of the system reliably. Unfortunately,
-currently there no realizable techiques for this task. Partial solutions for
Sybil attack is to replicate
+currently there no realizable techniques for this task. Partial solutions for
Sybil attack is to replicate
and fragment data randomly among several participating peer. However, both
suggestions assume that two different
remote entities are actually different; Sybil attacks are still possible and
therefore, would need centralized
-authority for reliable authentication. As author arques in
\cite{douceur02sybil}, without centralized authority,
+authority for reliable authentication. As author argues in
\cite{douceur02sybil}, without centralized authority,
Sybil attacks are always possible in Peer-to-Peer system except under extreme
and unrealistic assumptions of
resource parity and coordination among entities.
In random fail-stop model, cited in \cite{naor03simpledht}, faulty peer is
deleted from the Peer-to-Peer system.
-The reason for faultyness of peer can be a software failure, a hostile attack,
or external threat such as virus or
-troijan. Closely related to fail-stop model is the Byzantine attack model
-\cite{357176}. Byzantine model can been seen more seveve than fail-stop model
as there are no restrictions over
-the behaviour of faulty peers. Practical, but partial solution for byzantine
failures has been proposed by Castro et
+The reason for faultiness of peer can be a software failure, a hostile attack,
or external threat such as virus or
+Trojan. Closely related to fail-stop model is the Byzantine attack model
+\cite{357176}. Byzantine model can been seen more severe than fail-stop model
as there are no restrictions over
+the behavior of faulty peers. Practical, but partial solution for Byzantine
failures has been proposed by Castro et
al \cite{296824}.
-Spam generating attack is another known attack model againts Peer-to-Peer
system. In Spam
+Spam generating attack is another known attack model against Peer-to-Peer
system. In Spam
attack, hostile or faulty peer may produce false information of the data, or
refuses/is not able to reply to requests.
-Possible solution againts this attack is that peer should not trust to single
entity. Instead, peer should get
+Possible solution against this attack is that peer should not trust to single
entity. Instead, peer should get
information from multiple entities and trust on majority's opinion. This
methods requires more messages to be
sent to network while increasing the load of system. However, if Spam attack
is combined with Sybil attack, obviously
previously mentioned solution doesn't work. Again, more research is required
to solve this attack model
-safely. Naor et al. \cite{naor03simpledht} has proposed a partial solution
againts Spam attack with
+safely. Naor et al. \cite{naor03simpledht} has proposed a partial solution
against Spam attack with
\emph{faulty} peers (not hostile).
-Traditional overload of targeted peers is best known form of distrubuted
Denial of Service attack (DDoS). For example,
-hostile entity can attempt to burden targetted peers with garbage network
packets. As a implication, peers may act
+Traditional overload of targeted peers is best known form of distributed
Denial of Service attack (DDoS). For example,
+hostile entity can attempt to burden targeted peers with garbage network
packets. As a implication, peers may act
incorrectly or stop working. DDoS attack may be very severe, especially if
rate of replication and caching
in Peer-to-Peer system is low. This may lead to data loss in the Peer-to-Peer
system. Daswani et al.
\cite{daswani02queryflooddos} has done research regarding to this subject.
Authors suggest efficient load balancing
@@ -938,13 +938,13 @@
and replicas should be located physically to different locations.
As stated in \cite{naor03simpledht}, an important aspect is that when it comes
to general security aspects and
-byzantine faults in any Peer-to-Peer system, there should be a clear
distinction between attacks on the
-algorihms assuming the construction of overlay is correct, and attacks on the
construction itself. Clearly, Sybil
+Byzantine faults in any Peer-to-Peer system, there should be a clear
distinction between attacks on the
+algorithms assuming the construction of overlay is correct, and attacks on the
construction itself. Clearly, Sybil
and Spam attack belongs to the first category, and rest of the attacks to the
latter category.
\subsection{Trust, data authenticity and integrity}
-Trust in Peer-to-Peer systems is based on \emph{reputation}. Proposed
repuation methods focus either
+Trust in Peer-to-Peer systems is based on \emph{reputation}. Proposed
reputation methods focus either
on the semantic properties, or data management properties of the trust model.
Some research has been
done on reputation models in Peer-to-Peer systems, such as
\cite{aberer01trust}, \cite{cornelli02reputableservents}.
Implementations include Advogato \cite{advogatourl}. None of the current
proposals or implementations
@@ -956,7 +956,7 @@
data in rather \emph{static} computing systems, such as in the Internet.
However, in Peer-to-Peer
network, the problem of key based security mechanism is the maintenance of the
keys as participating
peer constantly join and leave the system. Specifically, the distribution of
key changes comes an essential
-problem in ad hoc enviroments. These include revokation of keys and new key
distribution in hostile
+problem in ad hoc environments. These include revocation of keys and new key
distribution in hostile
environment.
ConChord \cite{ajmani02conchord} is the first Peer-to-Peer system which has a
support for PKI based
@@ -1001,7 +1001,7 @@
Freenet \cite{clarke00freenet}, Publius \cite{pub00}, Free haven
\cite{dingledine00free}, Crowds \cite{reiter98crowds},
Tangler \cite{502002} and upcoming Mnet \cite{mneturl}. Forwarding proxies are
used in Freenet, Crowds and
Free Haven in order to provide various types of anonymity. Tangler and Publius
uses cryptographic
-sharing methods to split a data into data fragments \cite{Shamir1979a}.
Mixmailer networks, such as
+sharing methods to split a data into data fragments \cite{Shamir1979a}. Mix
mailer networks, such as
\cite{mixminionurl}, are commonly used in distributed systems, which are able
to provide some level
of anonymity
@@ -1015,8 +1015,8 @@
Any distributed computing system must support different levels of access
control. For instance, in Peer-to-Peer
system, we may want to restrict the accessibility of data to only limited
amount of participating peers. Yet, Peer-to-Peer
systems doesn't have working and distributed access control scheme. Moreover,
-there has been a lot of violation of copyright laws by users of Peer-to-Peer
filesharing systems. As a
-consequence, some lawsuits has been created againts the companies how have
build popular file-sharing programs.
+there has been a lot of violation of copyright laws by users of Peer-to-Peer
file sharing systems. As a
+consequence, some lawsuits has been created against the companies how have
build popular file-sharing programs.
To our knowledge, Nejdl et al. \cite{nejdl03accesscontrol} have proposed very
recently first practical solution to access
control problem in Peer-to-Peer systems. They use RDF-based schema policies to
restrict access to certain
@@ -1049,7 +1049,7 @@
\cite{castro02securitystructured} and \cite{castro02securerouting}, authors
suggests the usage
of constrained routing tables and diverse routes, and detection of faults
during query routing.
Additionally, authors present a important aspect of tightly structured
approach with regard
-to fault-tolerant query routing: the probability of routing succesfully
between to arbitrary,
+to fault-tolerant query routing: the probability of routing successfully
between to arbitrary,
correct peers, when a fraction $f$ of the other peers are faulty or hostile,
is only $(1-f)^{h-1}$.
Sit and Morris \cite{sit02securitycons} discuss the possibility of allowing
query originator
@@ -1061,7 +1061,7 @@
maintenance, but their solution seems to have to major problems
\cite{castro02securitystructured}. First,
the solution is very expensive even without faulty or hostile entities.
Second, each group of replicas
in their solution must have less than 1/3 of its peer faulty. Thus, this
feature results in a low
-probability of succesful routing.
+probability of successful routing.
Aspnes et al. in \cite{aspnes02faultrouting} and Kaashoek et al. in
\cite{kaashoek03koorde} formally
prove the lower and upper bounds for space requirements of locating a specific
data item in
@@ -1081,10 +1081,10 @@
\subsection{Other security threats}
-Ross Lee graham lists several external threats againts Peer-to-Peer networks
\cite{grahamp2psecurity}. Most important,
-the list includes viruses and trojans. Currently, there are not even partial
solutions
+Ross Lee graham lists several external threats against Peer-to-Peer networks
\cite{grahamp2psecurity}. Most important,
+the list includes viruses and Trojan. Currently, there are not even partial
solutions
to the problems mentioned above. General robustness properties of Peer-to-Peer
system is able to
-deal with software failures and hostile attack, but fault tolerance againts
external threats is unknown.
+deal with software failures and hostile attack, but fault tolerance against
external threats is unknown.
The reason for this is that there are no experiences on these kinds of
attacks. Possible solution
would be distributed anti-virus software, but much more intensive research is
required until
this kind of solution would be applicable.
@@ -1094,7 +1094,7 @@
\section{Performance and usability problems in Peer-to-Peer}
-In this section, we discuss performance related issues regardin Peer-to-Peer
systems.
+In this section, we discuss performance related issues regarding Peer-to-Peer
systems.
\subsection{Efficient data lookup}
@@ -1104,12 +1104,12 @@
In iterative deepening
\cite{yang02improvingsearch}, multiple BFS searches are initiated
with successively larger TTL depth limits, until either the query is
satisfied,
-or the maximumum depth $D$ has been reached. To perform a data lookup, query
-originator starts a data lookup with small TTL value. If the search is not
succesful,
+or the maximum depth $D$ has been reached. To perform a data lookup, query
+originator starts a data lookup with small TTL value. If the search is not
successful,
the query originator increases the TTL value and performs another data lookup.
This
-process is repeated until the desired data is found or maximumum depth $D$
+process is repeated until the desired data is found or maximum depth $D$
has been reached. Expanding ring, proposed by Shenker et al.,
\cite{lv02searchreplication},
-is similar to iterative deepening techique. With these techniques, search
+is similar to iterative deepening technique. With these techniques, search
may not be fast when desired data item requires many consecutive flooding
rounds.
Directed BFS \cite{yang02improvingsearch} optimizes the original
@@ -1123,7 +1123,7 @@
$h$ hops of itself, where $h$ is a system-wide variable, called radius of the
index\footnote{In normal BFS case, the value of $h$ is 0, as peer only has
index
over its local content.}. Mutual index caching architecture, as proposed in
-\cite{osokine02distnetworks}, is one variation of local indices techique.
+\cite{osokine02distnetworks}, is one variation of local indices technique.
In random walk approach \cite{lv02searchreplication}, peer forwards query to
randomly selected neighbor. The basic random walk approach decreases the
@@ -1133,7 +1133,7 @@
multiple ''walkers''. Freenet \cite{clarke00freenet} Peer-to-Peer system uses
random walk searches in query lookups. Indeed, Freenet's query resembles
Depth-First-Search (DFS) and peers' routing tables are dynamically built
-using caching. This is an outcome of Freenet's main design priciples,
+using caching. This is an outcome of Freenet's main design principles,
i.e., anonymity. Another property of Freenet's data lookup model is that
it adapts well with varying usage patterns. Improvements to Freenet's data
lookup using
''small-world phenomenon'' has been proposed by Zhang et al..
\cite{zhang02using}.
@@ -1141,7 +1141,7 @@
Since tightly structured systems have efficient data lookup at the application
level overlay,
current research efforts are focused on proximity based data lookup. In
proximity based data lookup,
-peers try to choose routing-tables entries refering to other peers that are
\emph{nearby} in the
+peers try to choose routing-tables entries referring to other peers that are
\emph{nearby} in the
underlying network. In this way, tightly structured systems are able to
decrease actual
lookup \emph{latency}. CAN \cite{ratnasamy01can}, Kademlia
\cite{maymounkov02kademlia},
Pastry \cite{rowston01pastry} and Tapestry \cite{zhao01tapestry} have advanced
heuristics for
@@ -1164,7 +1164,7 @@
structured systems are able carry out this requirement. Unfortunately, as
discussed in this text,
the data lookup model of loosely structured approach is not scalable. Thus,
research efforts have
been focused on tightly structured approach.
-The main problem with tightly structured approach is the fact that tightly
structured algorihms
+The main problem with tightly structured approach is the fact that tightly
structured algorithms
performs data lookups based on a globally unique identifier (key). Quite
recent study has been focused
on the feasibility of Peer-to-Peer Web-like indexing and searching
\cite{li03feasibility} on top of
tightly structured overlays. Authors argue, that it is possible to implement
Peer-to-Peer Web-like search with certain radical compromises.
@@ -1183,7 +1183,7 @@
distributions\footnote{Zipf distribution is a variant of power-law function.
Zipf-distribution can be used in observation of frequency of occurrence event
$E$, as a function of the rank
$i$ when the rank is determined by the frequency of occurrence, $E_i \sim
\frac{1}{i^{a}}$, where the exponent
-$a$ is close to unity.} (e.g., \cite{breslau98implications}). Therefore,
caching and precomputation
+$a$ is close to unity.} (e.g., \cite{breslau98implications}). Therefore,
caching and pre-computation
can be done for optimizing search indices \cite{li03feasibility}. Regular
compression algorithms,
Bloom filters \cite{362692}, vector space models
\cite{CuencaAcuna2002DSIWorkshop} and view
trees \cite{Bhattacharjee03resultcache} can be used for even better
optimizations. Authors
@@ -1218,7 +1218,7 @@
requires for half of the living nodes at time $t$ to leave the system. The
half-life from
time $t$ is smaller of the properties stated above. The half-life of the
entire system is the
minimum half-life over all times $t$. Concept of half-time can be used as
basic for developing
-more efficient analytical tools for modelling complex Peer-to-Peer system.
+more efficient analytical tools for modeling complex Peer-to-Peer system.
Some research has been done with regard to load balancing properties of
tightly structured
overlays. Byers et al. suggest "power of two choices" whereby data item is
stored at the less loaded
@@ -1226,8 +1226,8 @@
to control load balance in Peer-to-Peer systems \cite{rao03loadbalancing}.
Their work rests on
idea which was originally introduced by Chord \cite{stoica01chord} system.
-Also, query and routing hotspots may be an issue in tightly structured
overlays \cite{ratnasamy02routing}.
-Hotspots happen, when specific key is being requested extremely often in
tightly structured overlays. Recent study
+Also, query and routing hot spots may be an issue in tightly structured
overlays \cite{ratnasamy02routing}.
+Hot spots happen, when specific key is being requested extremely often in
tightly structured overlays. Recent study
by Freedman et al. tries to reduce hot spots in the system by performing
\emph{sloppy} hashing
\cite{sloppy:iptps03}. Another key feature of their work is that peers
self-organize into clusters,
therefore enabling peers to find nearby data without looking up data from
distant peers.
@@ -1238,22 +1238,22 @@
a serious problem of tightly structured overlays in face of performance and
load balancing. Measurement study
by Saroiu et al. shows that there is extreme heterogeneity among participating
peers in already deployed Peer-to-Peer
systems \cite{saroiu02measurementstudyp2p}. Symphony seems to be the first
tightly structured overlay system
-which support hetergeneity. Zhao et al. have proposed a secondary layer a top
of structured overlay
-to support hetergeneity better \cite{zhao02brocade}.
+which support heterogeneity. Zhao et al. have proposed a secondary layer a top
of structured overlay
+to support heterogeneity better \cite{zhao02brocade}.
Research has been done on self-organization. Ledlie et al. propose techniques
for forming and maintaining
groups in highly dynamic environment \cite{ledlie02selfp2p}. Unfortunately
their work relies on idea that
participating peers would create multiple hierarchical groups; it's not clear
whether this approach
is fault-tolerant and suitable to Peer-to-Peer environment. More promising
work has been done by Rowston et al.
-in \cite{rowston03controlloingreliability}. Authors propose techiques for
self-tuning, dealing with
-uncommon conditions (e.g., network partition and high failure rates).
Moreover, authors arque that
+in \cite{rowston03controlloingreliability}. Authors propose techniques for
self-tuning, dealing with
+uncommon conditions (e.g., network partition and high failure rates).
Moreover, authors argue that
with these techniques, the concerns over the tightly structured overlay
maintenance costs are no more
an open issue.
Finally, little research has been done regarding self-monitoring and data
availability. Zhang et al.
describe a arbitrary data structure on top of tightly structured overlay
\cite{zhang03somo}. They
call their proposal as \emph{data overlay}, since it supports several
fundamental data structures.
-Authors use this data overlay to build Self-Organized Metadata Overlay (SOMO),
which can be used
+Authors use this data overlay to build Self-Organized Meta data Overlay
(SOMO), which can be used
for monitoring health of tightly structured overlay. Fault tolerance of SOMO
itself is currently
unknown.
@@ -1272,7 +1272,7 @@
guidelines. This list includes \cite{zhao03api}, \cite{frise02p2pframework},
\cite{babaoglu02anthill}.
Early experiments with Peer-to-Peer benchmarking include
\cite{ratnasamy02routing} and \cite{rhea03benchmarks}.
-\subsection{Social behaviour}
+\subsection{Social behavior}
Frequent assumption in Peer-to-Peer systems is that peers are willing to
cooperate. Another belief
is that all peers would behave equally, i.e., all peers both consume resources
and contributes resources.
@@ -1281,7 +1281,7 @@
\cite{hearn02mojonation}.
Somewhat surprisingly little research has been in this area, especially when
considering
-the possible impact of this \emph{unwanted socical behaviour} to performance
of Peer-to-Peer
+the possible impact of this \emph{unwanted social behavior} to performance of
Peer-to-Peer
system. Problem is addressed by Golle et al. \cite{golle01incentivesp2p}. Some
research has been focused on semantic properties of the overlay in order to
increase
cooperation among participating peers \cite{crespo02semanticoverlay}.
Ramanathan et al.
@@ -1295,7 +1295,7 @@
Very little research has been done on simulating the \emph{global}
Peer-to-Peer system. Presumably, this
is due to complex nature of Peer-to-Peer system, which makes comprehensive
simulations very
-diffucult. Floyd et al. has been studying the simulation of the Internet in
\cite{504642}. Authors
+difficult. Floyd et al. has been studying the simulation of the Internet in
\cite{504642}. Authors
state that simulating the Internet is very challenging task, because of
Internet's heterogeneity
and rapid change. Obviously, these factors exist also in Peer-to-Peer system
even with higher
rates.
@@ -1350,7 +1350,7 @@
\parbox{90pt}{DoS attack \cite{sit02securitycons},
\cite{saia02dynamicfaultcontentnetwork}, \cite{datar02butterflies},
\cite{daswani02queryflooddos}, \cite{juels99clientpuzzles}} &
-\parbox{110pt}{Distributed, controlled burden againts specific computer(s)} &
+\parbox{110pt}{Distributed, controlled burden against specific computer(s)} &
\parbox{110pt}{Client puzzles, load balancing, traffic measurements, traffic
models, replication} &
\parbox{110pt}{Only partial solutions, traffic models most effective}
\\ \hline
@@ -1400,8 +1400,8 @@
\parbox{90pt}{Malicious nodes \cite{sit02securitycons},
\cite{castro02securerouting}} &
\parbox{110pt}{How to identify malicious nodes in the system} &
-\parbox{110pt}{Create invariants for node behaviour, verify invariants,
self-certifying data} &
-\parbox{110pt}{Partial solutions, self-certifying data most realiable}
+\parbox{110pt}{Create invariants for node behavior, verify invariants,
self-certifying data} &
+\parbox{110pt}{Partial solutions, self-certifying data most reliable}
\\ \hline
@@ -1412,7 +1412,7 @@
\\ \hline
-\parbox{90pt}{Inconsistent behaviour \cite{sit02securitycons}} &
+\parbox{90pt}{Inconsistent behavior \cite{sit02securitycons}} &
\parbox{110pt}{Hostile node could act correctly with its neighbors, but
incorrectly with others} &
\parbox{110pt}{Public keys, digital signatures} &
\parbox{110pt}{Not practical approach/working proposal created yet}
@@ -1422,13 +1422,13 @@
\parbox{90pt}{Hostile groups \cite{castro02securerouting}} &
\parbox{110pt}{Joining node may join parallel network, formed a group of
hostile nodes, hostile node(s) controls the construction of the network} &
\parbox{110pt}{Use trusted nodes, based on history information, Cryptography,
key infrastructure} &
-\parbox{110pt}{Not 100\% sure if Centreal Authority (CA) is missing, not
practical approach/working proposal created yet}
+\parbox{110pt}{Not 100\% sure if Central Authority (CA) is missing, not
practical approach/working proposal created yet}
\\ \hline
\parbox{90pt}{External security threats} &
-\parbox{110pt}{Viruses, trojans, sniffers} &
-\parbox{110pt}{Data integrity/authenticity, distributed antivirus software} &
+\parbox{110pt}{Viruses, Trojan, sniffers} &
+\parbox{110pt}{Data integrity/authenticity, distributed anti virus software} &
\parbox{110pt}{Not much research has been done on this}
\\ \hline
@@ -1472,7 +1472,7 @@
\parbox{90pt}{Efficient and scalable data discovery
\cite{lv02searchreplication}, \cite{osokine02distnetworks},
\cite{yang02improvingsearch}, \cite{lv02gnutellascalable},
\cite{ganesan02yappers}, \cite{adamic02localsearch},
\cite{adamic01powerlawsearch}, \cite{ripeanu02mappinggnutella},
\cite{milgram67smallworld}, \cite{adamic99small}, \cite{sterling95beowulf},
\cite{ramanathan02goodpeers}, \cite{kleinberg99small}, \cite{nips02-Kleinberg},
\cite{zhang02using}, \cite{watts00dynamics}} &
\parbox{110pt}{Find resources efficiently, if resource exists (loosely
structured)} &
-\parbox{110pt}{Super nodes, node clusters, caching techiques} &
+\parbox{110pt}{Super nodes, node clusters, caching techniques} &
\parbox{110pt}{More efficient, less network traffic, not comparable to DHT's
efficiency}
\\ \hline
@@ -1498,7 +1498,7 @@
\\ \hline
-\parbox{90pt}{Data availability/persistency \cite{bhagwan03availability}} &
+\parbox{90pt}{Data availability/persistence \cite{bhagwan03availability}} &
\parbox{110pt}{Data might be temporarily unavailable, or lost permanently} &
\parbox{110pt}{Data caching, data replication} &
\parbox{110pt}{Working solutions, but creates more traffic and overhead per
node}
@@ -1507,7 +1507,7 @@
\parbox{90pt}{Network proximity \cite{pias03lighthouse},
\cite{ng02predicting}, \cite{ratnasamy02ght}, \cite{eriksson03peernet},
\cite{castro02networkproximity}} &
\parbox{110pt}{Can we take account the underlying network's properties better
when forming overlay network (network-awareness for performance) ?} &
-\parbox{110pt}{Global Network Positioning, Lighthouse technique, trianqulated
heuristics} &
+\parbox{110pt}{Global Network Positioning, Lighthouse technique, triangulated
heuristics} &
\parbox{110pt}{Increases system complexity, no real world experience in a wide
scale, proposed solutions are susceptible to single point of failure}
\\ \hline
@@ -1519,15 +1519,15 @@
\\ \hline
-\parbox{90pt}{Hotspots \cite{258660}, \cite{sloppy:iptps03},
\cite{maymounkov03ratelesscodes}} &
+\parbox{90pt}{Hot spots \cite{258660}, \cite{sloppy:iptps03},
\cite{maymounkov03ratelesscodes}} &
\parbox{110pt}{What will happen if some resource is extremely popular and only
one node is hosting it ?} &
-\parbox{110pt}{Caching, multisource downloads, replication, load balancing,
sloppy hashing} &
-\parbox{110pt}{For query hotspots, caching and multisource downloads
efficiently reduces hotspots, for routing hotspots, benefits are smaller}
+\parbox{110pt}{Caching, multi source downloads, replication, load balancing,
sloppy hashing} &
+\parbox{110pt}{For query hot spots, caching and multi source downloads
efficiently reduces hot spots, for routing hot spots, benefits are smaller}
\\ \hline
\parbox{90pt}{Load balancing \cite{rao03loadbalancing},
\cite{ledlie02selfp2p}, \cite{byers03dhtbalancing}} &
-\parbox{110pt}{Random (but uniformly distributed) identifier selection could
cause systen imbalance among participants with different capabilities} &
+\parbox{110pt}{Random (but uniformly distributed) identifier selection could
cause system imbalance among participants with different capabilities} &
\parbox{110pt}{Caching, virtual server transfers} &
\parbox{110pt}{Effective, more research required in fully dynamic environment}
\\ \hline
@@ -1535,12 +1535,12 @@
\parbox{90pt}{System in flux \cite{libennowell01observations}, \cite{571863},
\cite{ledlie02selfp2p}, \cite{albert-02-statistical}} &
\parbox{110pt}{Nodes join and leave system constantly. What about load
balancing and performance ?} &
\parbox{110pt}{Half-life phenomenon (for analysis), simple overlay maintenance
and construction algorithm} &
-\parbox{110pt}{Initial theoretical analysis have been created, but not
comprehensive model for analysing different system states and its variations
(e.g. complex usage patterns)}
+\parbox{110pt}{Initial theoretical analysis have been created, but not
comprehensive model for analyzing different system states and its variations
(e.g. complex usage patterns)}
\\ \hline
\parbox{90pt}{Sudden network partition \cite{harvey03skipnet1},
\cite{harvey03skipnet2}, \cite{rowston03controlloingreliability}} &
\parbox{110pt}{Sub network is isolated from other network because of network
disconnection} &
-\parbox{110pt}{Self-tuning, environment observatorion, localized network
connection for minimun latency (backup connections)} &
+\parbox{110pt}{Self-tuning, environment observatorion, localized network
connection for minimum latency (backup connections)} &
\parbox{110pt}{Creates more overhead/space requirements per node}
\\ \hline
@@ -1554,7 +1554,7 @@
\parbox{90pt}{Byzantine faults \cite{296824}} &
\parbox{110pt}{Faulty nodes may behave arbitrarily} &
\parbox{110pt}{Byzantine replication algorithms -> get information from
multiple entities, trust majority's opinion} &
-\parbox{110pt}{Much research has been done on this field, practical solutions,
decreases the performance of system slighly}
+\parbox{110pt}{Much research has been done on this field, practical solutions,
decreases the performance of system slightly}
\\ \hline
\caption{Performance and usability problems in Peer-to-Peer.}
@@ -1619,14 +1619,14 @@
\parbox{90pt}{Comprehensive simulations/analysis of Peer-to-Peer network} &
\parbox{110pt}{Ability to simulate whole Peer-to-Peer network's usage
patterns, network traffics, flux state etc} &
-\parbox{110pt}{Use same techniques as simulating/analysing the Internet} &
-\parbox{110pt}{Only small subset of Peer-to-Peer networks has been able to
analyse, because of ad hoc properties of network, more poweful solutions needed}
+\parbox{110pt}{Use same techniques as simulating/analyzing the Internet} &
+\parbox{110pt}{Only small subset of Peer-to-Peer networks has been able to
22, because of ad hoc properties of network, more powerful solutions needed}
\\ \hline
\parbox{90pt}{Overlay management and health monitoring \cite{zhang03somo}} &
\parbox{110pt}{System is self-capable to monitor it's status and health for
better performance} &
-\parbox{110pt}{Build a metadata overlay atop of structured overlay (such as
SOMO for structured overlays), make local decisions about overlay
(unstrucured)} &
+\parbox{110pt}{Build a meta data overlay atop of structured overlay (such as
SOMO for structured overlays), make local decisions about overlay
(unstructured)} &
\parbox{110pt}{For structured overlays, efficient and simple to implement,
fault-tolerance unknowns, for unstructured, not necessarily efficient because
decisions are based on local knowledge}
\\ \hline
@@ -1653,8 +1653,8 @@
\section{Overview}
Fenfire project \cite{fenfireurl} is an effort to build a distributed,
hyperstructured user
-interface system. Fenfire is free software and it is licenced under GNU L-GPL.
Fenfire's main goal
-is to implement xanalogical storage model \cite{ted-xu-model}. Fenfire was
formely also a implementation
+interface system. Fenfire is free software and it is licensed under GNU L-GPL.
Fenfire's main goal
+is to implement xanalogical storage model \cite{ted-xu-model}. Fenfire was
formerly also a implementation
of the ZigZag\texttrademark --structure, which was originally invented
by Ted Nelson. Now, however, Fenfire uses Resource Description Framework (RDF)
\cite{w3rdfurl}
for representing internal data structures and their relationships.
@@ -1684,7 +1684,7 @@
scenario: ''the character 'D' typed by Janne Kujala on 10/8/97 8:37:18''. In
this
example, when character 'D' is is first typed in, xanalogical storage model
acquires a permanent identifier for that character and retains it when
character
-is copied to different document. Thus, the identifier distinguishes
chararacter from
+is copied to different document. Thus, the identifier distinguishes character
from
all similar characters typed in independently\footnote{Xanalogical storage
model
is not limited to text. It can support arbitrary data, e.g., pixels of picture
or
frames of video.}. The connectivity in xanalogical storage model between data
content
@@ -1725,7 +1725,7 @@
are immutable byte sequences. SHA-1\footnote{SHA-1 is considered a collision
free
hash function. Therefore, it is very unlikely that two different Storm data
blocks
would have same identifier.} cryptographic content hash \cite{fips-sha-1} is
used
-for creating locatiotion-independent, globally unique identifiers for blocks.
Additionally,
+for creating location-independent, globally unique identifiers for blocks.
Additionally,
SHA-1 \cite{fips-sha-1} is used for verifying the integrity of Storm data
blocks. Storm
blocks have much in common with regular files, except Storm blocks are
\emph{immutable} as
any change to the byte sequence would the change block's hash value, i.e.,
unique
@@ -1755,7 +1755,7 @@
pointer blocks, i.e., when a new version of scroll block is created, it
supersedes
one older version which has been created in the past. The most current pointer
block will 'obsolete' the pointer block targeting the superseded version. Next
-time, when the pointer is used for refering to a specific scroll block, only
+time, when the pointer is used for referring to a specific scroll block, only
the most recent pointer's block target is loaded.
\begin{figure}
@@ -1796,7 +1796,7 @@
\emph{direct} scroll block obtaining using globally unique identifier of Storm
scroll block,
we also must support \emph{indirect} obtaining of Storm scroll block using
pointer blocks.
-Obviously, our objectives are yet simple but hard to fulfil. First, as a
prerequisite
+Obviously, our objectives are yet simple but hard to fulfill. First, as a
prerequisite
to implementing xanalogical storage model in Peer-to-Peer environment, system
supporting data lookups must be able to perform \emph{global} scale lookups.
Thus,
we must able to locate and fetch Storm scroll/pointer block, if it exists in
the
@@ -1804,7 +1804,7 @@
one ''virtual file'' may need obtaining several data items, which are
distributed
randomly throughout the overlay; if not efficient, construction of ''virtual
file''
may take reasonable amount time while rendering system very unusable. Third,
Peer-to-Peer
-infrasctructure has to be scalable and robust againts hostile attacks.
+infrastructure has to be scalable and robust against hostile attacks.
Some research regarding to these problem has been made by Lukka et al.
\cite{lukka02freenetguids}. Authors' work is mainly based on insight of
implementing
@@ -1824,8 +1824,8 @@
\section{Evaluation of Peer-to-Peer approaches with regard to Fenfire}
In chapter 2, we discussed main differences between loosely and tightly
structured
-approaches. As stated, the most significant difference is that tighly
structured
-approach has logarithmical properties in all interal operations, while loosely
+approaches. As stated, the most significant difference is that tightly
structured
+approach has logarithmical properties in all internal operations, while
loosely
structured approach doesn't have always even linear properties. Furthermore,
the
data lookup model of tightly structured overlay scales much better than
loosely
structured overlays; tightly structured overlay supports global data lookups
@@ -1843,23 +1843,23 @@
Domain Name System (DNS) \cite{rfc1101} is widely used RRS system in the
Internet.}
\cite{balakrishnan03semanticfree}. Authors argue that next generation RRS
must be
application-independent and references itself should be \emph{unstructured}
and
-\emph{semantic free}. Finally, as said, with tightly stuctured systems, it is
feasible to
+\emph{semantic free}. Finally, as said, with tightly structured systems, it is
feasible to
perform \emph{global} data lookups in the overlay. To summarize, these aspects
may be the most important features
of Peer-to-Peer infrastructure with regard to Fenfire as a \emph{distributed}
hypermedia system.
Thus, we see the tightly structured approach the best alternative to locate
data in Peer-to-Peer
environment.
-Once located, for \emph{fetching} Fenfire related data from the overlay, we
can use reqular
+Once located, for \emph{fetching} Fenfire related data from the overlay, we
can use regular
TCP/IP-protocols, such as Hypertext Transfer protocol (HTTP) \cite{rfc2068}.
However, HTTP-protocol may
not be optimal, when obtaining large amounts of data from the Peer-to-Peer
overlay, for
-instance videos, images or music. In this case, multisource downloads can be
very useful
+instance videos, images or music. In this case, multi source downloads can be
very useful
for better efficiency \cite{maymounkov03ratelesscodes}, \cite{bittorrenturl}.
Furthermore,
-multisource downloads can be used for decreasing load of certain peer, thus
avoiding query
-hotspots in the system \cite{ratnasamy02routing}. Current mplementation of
Fenfire uses
+multi source downloads can be used for decreasing load of certain peer, thus
avoiding query
+hot spots in the system \cite{ratnasamy02routing}. Current implementation of
Fenfire uses
standard single source downloads (HTTP) and SHA-1 \cite{fips-sha-1}
cryptographic content
hash for verifying the integrity of data by recomputing the content hash
-for a scroll block. In face of multisource downloads, Fenfire must support
-tree-based hash\footnote{With multisource downloads, tree based hash functions
can be used
+for a scroll block. In face of multi source downloads, Fenfire must support
+tree-based hash\footnote{With multi source downloads, tree based hash
functions can be used
to verify fixed length segments of data. If hash value of data segment is
incorrect,
we need only to fetch \emph{segment} of data (instead of whole data, e.g., a
file) from
other source.}, such as \cite{merkle87hashtree} and \cite{mohr02thex} for
reliable and efficient
@@ -1876,24 +1876,24 @@
overlays \cite{projectirisurl}.
-\section{Fenfire system model in Peer-to-Peer enviroment}
+\section{Fenfire system model in Peer-to-Peer environment}
In this section present a proposal of Fenfire Peer-to-Peer system, which
consists
-of several techologies presented in this thesis. Then, we introduce yet
simple but
+of several technologies presented in this thesis. Then, we introduce yet
simple but
effective algorithms for obtaining Fenfire data from Peer-to-Peer environment.
\subsection{System proposal}
We see Kademlia \cite{maymounkov02kademlia} as the best algorithm for
locating data efficiently in the Peer-to-Peer overlay. There are two main
-reasons for this. First, Kamdelia's XOR-based distance function is superior
+reasons for this. First, Kademlia's XOR-based distance function is superior
over the distance functions of other systems. Second, there are already some
real-life systems (e.g., \cite{overneturl}, \cite{edonkey2kurl},
\cite{kashmirurl},
\cite{kato02gisp}), which means that Kademlia's algorithm is simple and easy
to implement.
On top of Kademlia, we propose the usage of Sloppy hashing
\cite{sloppy:iptps03} which
optimized for DOLR abstraction of tightly structured overlays. With Sloppy
hashing,
-we are able reduce of generation of query hotspots. Sloppy hashing enables to
+we are able reduce of generation of query hot spots. Sloppy hashing enables to
locate nearby data without looking up data from distant nodes. Moreover,
authors'
proposal for self-organizing clusters using network diameters may be useful,
especially within small groups of working people. Thus, with Sloppy hashing
@@ -1906,14 +1906,14 @@
Finally, for more efficient data transfer, we can use variable techniques for
this purpose.
For small amounts of data, HTTP can be used \cite{rfc2068}. For big downloads,
we can use
-multisource downloads for better efficiency and reliability. Specifically,
techology based
-on rateless erasure codes \cite{maymounkov03ratelesscodes} seems very
promising.
+multi source downloads for better efficiency and reliability. Specifically,
technology based
+on rate less erasure codes \cite{maymounkov03ratelesscodes} seems very
promising.
\subsection{Algorithms}
We use DOLR abstraction of tightly of structured approach, i.e., each
participating peer hosts
-the data and overlay maintains only the \emph{pointers} to the data. We
descided to use DOLR in our
-model, since DOLR systems locate data without specifiying a storage policy
explicity \cite{rhea03benchmarks}.
+the data and overlay maintains only the \emph{pointers} to the data. We
decided to use DOLR in our
+model, since DOLR systems locate data without specifying a storage policy
explicitly \cite{rhea03benchmarks}.
DHT based storage systems, such as CFS \cite{dabek01widearea} and PAST
\cite{rowstron01storage}, may have
critical problems with load balancing in highly heterogeneous environment.
This problem is caused by peers
which may not able to store relative great amount of data with key/value pair,
assigned randomly by
@@ -1924,7 +1924,7 @@
''virtual file'' before hand, i.e., when assembling a ''virtual file'', we
know all Storm
scroll/pointer blocks, which are required when building the ''virtual file''.
Also, we don't
respond to security issues related to Peer-to-Peer systems, since there is no
working solution
-available yet; we either assume that Fenfire has a reliable techique for
identifying invidual entities, or
+available yet; we either assume that Fenfire has a reliable technique for
identifying individual entities, or
there are no hostile entities among participating peers.
In our model, each peer maintains following data structures for local
operations: data structure for listing all
@@ -1975,7 +1975,7 @@
Figure \ref{fig:storm_query_urn5} illustrates how Storm scroll block is
located
in a tightly structured overlay using DOLR method, where pointer random string
is known.
-Each of these algortihms can locate Fenfire related data in $O(\log{n})$ time:
+Each of these algorithms can locate Fenfire related data in $O(\log{n})$ time:
$O(\log{n})$ time for query routing to pointer peer and constant time for
locating hosting peer with a given reference link. Time required for
transferring
the data is not included.
@@ -1999,7 +1999,7 @@
\subsection{Problems}
Perhaps the most biggest issue in Peer-to-Peer systems is non-maturity of
-security techologies. For instance, online entities cannot be identified
+security technologies. For instance, online entities cannot be identified
safely (e.g., the Sybil attack \cite{douceur02sybil}). For Fenfire, one
security related problem occurs when user wants to perform global data lookup
with a given
pointer random string; how user is able to verify the correctness
@@ -2009,7 +2009,7 @@
from the system. How do we are able to know if this was a spam attack, or the
data really doesn't exist in the system ? Another problem related to Fenfire's
security is that if a user downloads data from the network to local computer
-and after network disconnetcion, user wants to verify \emph{offline} the
+and after network disconnection, user wants to verify \emph{off line} the
authenticity of data. Obviously, optimal solution to all security issues would
be that digital signatures are included to every message sent to the system.
However, these problems are not only limited to Fenfire, it concerns all
@@ -2052,13 +2052,13 @@
links in a Peer-to-Peer network. Specifically, we want to find transclusions
or xanalogical links in a global scale. Preliminary analysis have showed
that these questions are rather different than locating scroll or pointer
-blocks \emph{directly} from the network. Techiques used in distributed
+blocks \emph{directly} from the network. Techniques used in distributed
database systems may prove to be useful. Some fundamental results
regarding Peer-to-Peer and database systems has already been
presented \cite{gribble01p2pdatabase}.
-As security techologies comes more mature, we wish to apply these
-techologies with Fenfire, if applicable.
+As security technologies comes more mature, we wish to apply these
+technologies with Fenfire, if applicable.
In the following months, we will implement a Fenfire Peer-to-Peer
prototype.
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., (continued)
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/05
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/06
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/07
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/07
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/07
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...,
Hermanni Hyytiälä <=
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/12
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13
- [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert..., Hermanni Hyytiälä, 2003/03/13