gzz-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...


From: Hermanni Hyytiälä
Subject: [Gzz-commits] gzz/Documentation/misc/hemppah-progradu mastert...
Date: Wed, 05 Mar 2003 05:25:37 -0500

CVSROOT:        /cvsroot/gzz
Module name:    gzz
Changes by:     Hermanni Hyytiälä <address@hidden>      03/03/05 05:25:36

Modified files:
        Documentation/misc/hemppah-progradu: masterthesis.tex 

Log message:
        Problem overview

CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/gzz/Documentation/misc/hemppah-progradu/masterthesis.tex.diff?tr1=1.112&tr2=1.113&r1=text&r2=text

Patches:
Index: gzz/Documentation/misc/hemppah-progradu/masterthesis.tex
diff -u gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.112 
gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.113
--- gzz/Documentation/misc/hemppah-progradu/masterthesis.tex:1.112      Wed Mar 
 5 02:45:33 2003
+++ gzz/Documentation/misc/hemppah-progradu/masterthesis.tex    Wed Mar  5 
05:25:36 2003
@@ -1,5 +1,5 @@
 %***********************
-%   Käytetään gradu2-tyyliluokkaa
+%   K?ytet??n gradu2-tyyliluokkaa
 %***********************
 \documentclass[a4paper,12pt, english]{gradu2}
 
@@ -15,13 +15,13 @@
 \usepackage{verbatim}
 
 %***********************
-%   Tyyliluokan pakolliset määritykset
+%   Tyyliluokan pakolliset m??ritykset
 %***********************
 \title{Fenfire in Peer-to-Peer Enviroment}
 
-\translatedtitle{Fenfire vertaisverkko ympäristössä}
+\translatedtitle{Fenfire vertaisverkko ymp?rist?ss?}
 
-\author{Hermanni Hyytiälä}
+\author{Hermanni Hyyti?l?}
 
 \linja{Software Engineering}
 
@@ -29,13 +29,13 @@
 
 \keywords{Peer-to-Peer, P2P, security, Distributed systems, Hypermedia systems}
 
-\avainsanat{Vertaisverkot, P2P, tietoturva, hajautetut järjestelmät, 
hypermedia-järjestelmät}
+\avainsanat{Vertaisverkot, P2P, tietoturva, hajautetut j?rjestelm?t, 
hypermedia-j?rjestelm?t}
 
 \contactinformation{\\
-Hermanni Hyytiälä\\
+Hermanni Hyyti?l?\\
 Huhtalammentie 5 as. 17\\
-40640 JYVÄSKYLÄ\\
-sähköposti: address@hidden
+40640 JYV?SKYL?\\
+s?hk?posti: address@hidden
 
 
 \abstract{
@@ -51,23 +51,23 @@
 related data from Peer-to-Peer network.
 }
 \tiivistelma{
-Tässä opinnäytetyössä arvioimme olemassaolevia vertaisverkkoja, protokollia ja
+T?ss? opinn?ytety?ss? arvioimme olemassaolevia vertaisverkkoja, protokollia ja
 niiden erityisominaisuuksia. Teemme yhteenvedon olemassa olevista ongelmista
-vertaisverkoissa ja jaamme ongelmat kolmeen alakategoriaan. Havaitsemme, että
+vertaisverkoissa ja jaamme ongelmat kolmeen alakategoriaan. Havaitsemme, ett?
 on olemassa useita ongelmia, joihin ei ole ratkaisua lainkaan, tai on ehdotelma
-ongelman ratkaisemiseksi, mikä käytännössä on kuitenkin mahdotonta toteuttaa.
+ongelman ratkaisemiseksi, mik? k?yt?nn?ss? on kuitenkin mahdotonta toteuttaa.
 
-Tämän jälkeen annamme yleiskuvan Fenfire-järjestelmästä.
-Arvioimme olemassaolevia vertaisverkkoarkkitehtuureja-- löyhästi ja tiukasti
-rakennettuja päällysverkkoja-- Fenfiren vaatimusten valossa. Lopuksi ehdotamme
-yksin-kertaisia algoritmeja, joiden avulla voidaan tehokkaasti löytää
+T?m?n j?lkeen annamme yleiskuvan Fenfire-j?rjestelm?st?.
+Arvioimme olemassaolevia vertaisverkkoarkkitehtuureja-- l?yh?sti ja tiukasti
+rakennettuja p??llysverkkoja-- Fenfiren vaatimusten valossa. Lopuksi ehdotamme
+yksin-kertaisia algoritmeja, joiden avulla voidaan tehokkaasti l?yt??
 vertaisverkosta Fenfire:n kannalta olennaista tietoa
 }
 
 \begin{document}
 
 %***********************
-%   Sisällysluettelo
+%   Sis?llysluettelo
 %***********************
 
 \mainmatter
@@ -338,6 +338,8 @@
 Skip Graphs \cite{AspnesS2003}, SkipNet \cite{harvey03skipnet2}, 
 Symphony \cite{gurmeet03symphony}, SWAN \cite{bonsma02swan}, Tapestry 
 \cite{zhao01tapestry}, Viceroy \cite{malkhi02viceroy} and others 
\cite{freedman02trie}. 
+The biggest difference compared to loosely structured approach is that with 
tightly structured systems,
+it is now feasible to perform \emph{global} data lookups in overlay.
 While there are significat differences among proposed systems, they all have 
in common
 that participating peers are assigned \emph{peer identifiers} from
 a large \emph{identifier space}. Furthermore, application-specific
@@ -496,7 +498,7 @@
 \exists s \in S$, $ip = map(identifier(s)) \wedge (provider(s) = p)$\}., 
 which means that resources which peer provides into the system are not kept 
locally.
 Every $p$ has neighbor(s), named as $neighbor$, which are $P$ = \{$p \in P: 
\exists neighbor$, 
-where $difference(p,p_neighbor)= ''close''$, where  $''close''$ is small 
difference $d$ in $(IS,d)$\}.
+where $difference(p,p_neighbor)= ''close''$, where ?$''close''$ is small 
difference $d$ in $(IS,d)$\}.
 
 
 \section{Summary}
@@ -1751,57 +1753,37 @@
 
 \chapter{Evaluation of Peer-to-Peer for Fenfire}
 
-In this chapter we evaluate Fenfire in Peer-to-Peer enviroment.
-We start by giving an overview of Fenfire Peer-to-Peer system. Then, 
-we define our objectives and special needs and consider possible
-benefits over existing Peer-to-Peer filesharing systems. Finally, we
-evaluate different peer-to-peer systems with regard to Fenfire, make 
-initial analsysis and discuss possible problems. In the following
-sections, we don't respond security issues. In fact, we assume 
-that system has a working distributed Public Key Infrastructure
-for identifying invidual entities.
-
-
-Fething videos, pictures, music and other media:
-
--BitTorrent is not a lookup system-- it's a tool for *distributing* data
--MIT License, written in Python
--resembles Overnet's/ed2k's MFTP
--supports resumed downloads
--supports upload-download ratios (if this is a built in feature, this might be 
a problem for us (if used as out-of-box))
--however, upload-download ratios reduces DoS attacks: one cannot publish a 
large amount hokum data in the network if ratios are used
--reduces query hotspots (DHTs), however routing hotspots remain
--when more than one person is downloading from a server/node/computer, 
BitTorrent sends blocks of the files to downloaders
--this reserves the server's/node's/computer's bandwidth
--supports simultaenous downloads
--since each new downloader introduces a new upload capacity, eventually the 
upload overhead for the original server/node/computer bandwidth is reasonable 
small
--uses SHA-1 for checking data integrity (suits for us very well!)
--we don't have to create 'mini-blocks' for Fenfire p2p, since bitTorrent 
itself partitions data into several blocks for us
-
-5.1 The Merkle Hash Tree
-
-The Merkle Hash Tree, invented by Ralph Merkle, is a hash construct that has 
very nice properties for 
-verifying the integrity of files and file subranges in an incremental or 
out-of-order fashion. This 
-document describes a binary serialization format for hash trees that is 
compact and optimized for both 
-sequential and random access. This memo has two goals:
-
-   1. To describe Merkle Hash Trees and how they are used for file integrity 
verification.
-   2. To describe a serialization format for storage and transmission of hash 
trees.
-   
-   
-   
-   
-   \cite{overneturl}
-   \cite{edonkey2kurl}
-   
-
-
-\section{Overview}
-
-As we mentioned in previous chapter (ref), xanalogical document is a virtual
-document, in where parts of the document (i.e. scroll blocks) are fetched from 
a 
-global, large scale data pool. To be practical and useful, system implementing
-xanalogical model have to be distributed. 
+In this chapter we evaluate Fenfire in Peer-to-Peer environment.
+We start by giving a problem overview when considering Fenfire in Peer-to-Peer
+environment. We define Fenfire's objectives and special needs. Finally, we
+evaluate different peer-to-peer approaches with regard to Fenfire, and propose 
+initial algorihms for obtaining Fenfire specific data from Peer-to-Peer overlay
+network.  
+
+In the following sections, we don't respond to security issues. We assume 
+that either system has a reliable techique for identifying invidual entities, 
or
+there are no hostile entities in the system.
+
+
+\section{Problem overview}
+
+As already mentioned in chapter 4, xanalogical document is a ''virtual
+file'', in which parts of the document (i.e. scroll blocks) are fetched from a 
+\emph{global} data repository. Thus, system implementing xanalogical model 
\emph{must}
+support global data lookups efficiently in order to assemble a ''virtual file''
+from fragments of data. 
+
+In the xanalogical storage model, each fragment of data is identified by a 
globally 
+unique identifier. As we discussed already in chapter 4, Fenfire's Storm 
design 
+uses SHA-1 \footnote{SHA-1 is considered a collision free hash function. 
Therefore, it is 
+very unlikely that two different Storm scroll blocks would have same 
identifier.} 
+\cite{fips-sha-1} hash over the contents of a block for creating globally 
unique 
+identifiers.  In our scenario, fragments of data is distributed throughout the 
Peer-to-Peer
+overlay. Moreover, our task is to locate and fetch (i.e. obtain) \emph{all}
+Storm scroll blocks, associated to a specific ''virtual file'', from 
Peer-to-Peer
+overlay as efficiently as possible. In addition to \emph{direct} scroll block 
obtaining using
+globally unique identifier of Storm block, we also must support 
\emph{indirect} obtaining of 
+Storm scroll block using pointer blocks.
 
 We have decided to use Peer-to-Peer network as a Fenfire's communication 
layer. For
 motivations and discussion, see \cite{lukka02freenetguids, 
fallenstein03storm}. 
@@ -1812,10 +1794,7 @@
 data lookup to locate and fetch scroll blocks by using 
\emph{location-independent} 
 identifiers.
 
-Storm uses SHA-1 \cite{fips-sha-1} hash over the contents of a block for 
creating globally 
-unique identifiers. SHA-1 is considered a collision free hash function, so 
it's very unlikely 
-that two different scroll blocks would have same identifiers. Obviously, Storm 
has to 
-efficiently find a specific scroll block if it is available in the network.
+
 
 \cite{thompson01hypermedia}
 \cite{wiil02p2phypertext}
@@ -1824,53 +1803,8 @@
 
 \section{Objectives}
 
-We use
 
-2.1  "Searching for a specific Storm block"
-
-2.1.1 Facts
--specific Storm block can be identified with an ID, generated by SHA-1 
algorithm
--block ID is globally *unique*, e.g. "Front page of New York Times newspaper 
on 10.10.2002"
--each Storm node can host multiple blocks
--in my thesis, we can assume that there is working PKI
-
-2.1.2. Objectives
--if block exists in the network, algorithm *will* find it
--algorithm will find the specific Storm block as quicly as possible (and 
return it to the user) based
-on the the block's ID
--simple pseudo thinking: "based on block ID, find the specific block fast and 
return it."
-
-2.2. "Searching for most recent Storm block associated with specific urn-5 
name, where
-     the block has been signed with a given key"
-
-2.2.1. Facts     
--urn-5 is random "unique keyword", e.g. "Front page of New York Times 
newspaper"
--urn-5 can't be udated, but we can associate a new block with the urn-5 name
--urn-5 name is saved in the header of Storm block (|block urn-5: "Front page 
of New York Times newspaper"|)
--urn-5 names can be created before Storm blocks
--key signings are saved in separate Storm blocks
--finding key blocks for data block can be performed locally ("fast enough")
--node has an internal index for key blocks and associated data blocks
--for every urn-5 name, there is zero, one or more Storm blocks associated with 
it
-%-in every block's header, there is a timestamp for date&time
-
-2.2.2. Objectives
-
--Find the specific (and the most recent) Storm block as quicly as possible 
(and return it to the user) based on the given urn-5 with a block's ID
--simple pseudo thinking: "find all Storm blocks from the network, which uses 
specific urn-5 name. Compare blocks and 
-return the most recent block, if the signing key is "valid"."
-
-2.3.1. Facts
--same as for 2.2.1.
-
-2.3.2. Objectives
--Find the specific (most recent) Storm block as quicly as possible (and return 
it to the user) based on the given urn-5 with a block's ID
-*and* a given date (range ?)
--simple pseudo thinking: "find all Storm blocks from the network, which uses 
specific urn-5 name. Compare blocks and 
-return the most recent block, if the signing key and block's timestamp are 
"valid"."
-
-
-\section{Special needs}
+\section{Requirements}
 
 Storm (and therefore Fenfire) has several unique features which postulates 
 different kind of requirements for Peer-to-Peer system. First, Storm stores 
data
@@ -1929,7 +1863,7 @@
 \emph{unstructured} and \emph{semantic free}. Indeed, these cases are one of 
the 
 objectives of our Storm design.
 
-On the other hand, however, tightly structured approach doesn't have large 
scale,
+On the other hand, however, tightly structured approach doesn't have large 
scale
 real world experiments yet. Therefore, we can't say for sure, how well tightly
 structured overlay would perform in every day use. The main concerns are 
decreased
 performance, support for heterogeneity and load balancing in system in flux (as
@@ -1954,90 +1888,6 @@
 which are essential to xanalogical model to be usable in distributed 
environment.
 Table \ref{table_comparison_approach} lists the key feature of both approaches.
 
-
-
-
-
-
-
-
-
-
-
-2.2.3. Existing approaches and this specific research problem
-
--First, *all* blocks (with the specific urn-5 name) have to checked and 
analysed 
--In this case (urn-5), there is no benefit from the block's ID at all (DHT, 
SWN)
-
-2.2.4. Open questions:
--in this case, is it sensible to maintain two different key-value mappings 
key-value based systems (DHT, SWN) ?l
-       -one fore block IDs
-       -one for urn-5 names, which are associated with block IDs
-       -is this approach too difficult to maintain ?
-
--is there possibility, in the specific urn-5 name, to maintain information 
about most recent block's ID for better search performance (or moreover, 
-tree/list based structure for all blocks for specific urn-5 name) ?
--How CAs' data should be saved ?
-
-2.2.5. Answers to research problem
--compared to previous research problem, the "obvious" benefits of DHTs and 
SWNs in this research problem are smaller
--we don't beforehand know which block is the most recent associated with a 
specfic urn-5 name
--in theory, though, the most efficient algorithm is O(log n), *assuming* that 
we already know the most recent block's ID (see previous research problem)
--the most important question is, how we can efficiently associate urn-5 names 
with the most recent block / to all blocks which are associated with it
--for DHTs and SWNs:
-
-Please notice: In this approach, DHT doesn't store the actual block, only the 
values for locating the data from the system
-
-       Req. 1:
-       -each node maintains a local hash-table based data structure (urn-5 
name -> most recent local block ID) for every urn-5 names
-       which node hosts. The most recent block is topmost --> we don't have to 
check all blocks and their urn-5 associations to get the most recent
-       Req. 2:
-       -all urn-5 name mappings are stored as <key, value[ ]>, where the key 
is urn-5 name's hash and value is a record containing
-       block ID and timestamp of that block. So, when we store a block in our 
system first time, we have to create a new key-value: 
-       %<hash_of_urn_5_name, [block id, block timestamp]> and route this 
mapping to node which is "closest" to a hash value. Now when we want to find 
the most 
-       recent block associated with a specific urn-5 name, we do:
-       
-               1. locally compute a hash for given urn-5 string
-               2. route the query to a node which hosts the given hash of 
urn-5 ("closest" node)
-               3. get most recent block's ID using the idea from previuos idea
-               4. use ID to get the specific block using normal DHT/SWN 
operation
-               
-       In this approach, we don't have to perform additional searching and 
sorting of mappings. And of course, we know that for given urn-5, only one node
-       hosts *all* the block information ("block history") for the urn-5, 
since mappings are mapped to a single node, closest to urn-5 hash value.
-       Again, this should work fine under existing DHTs (and SWTs ?).
-       
-       Some simple analysis: 
-       -there are more key-value pairs in the system for additional urn-5 --> 
block associations
-       -however, I don't think this is an issue, since data's size is small
-       -efficiency: find node which hosts urn-5 names + find node which hosts 
blocks associated with urn-5 name: logn + logn = 2logn (logarithmical)
-       
--for FBS and others:
-       -there is no very efficient (simple) methods for finding urn-5 name 
associated with the most recent block
-       -one simple proposal is that when we visit to each node (first idea 
above), get only the most recent one and compare them (or greedy approach: 
dismiss currently 
-       most recent block as we visit to nodes, if newer block have been found)
-
-       
-       2.3.3. Existing approaches and this specific research problem
-
--First, *all* blocks (with the specific urn-5 name) have to checked and 
analysed 
--In this case (urn-5), there is no benefit from the block's ID at all (DHT, 
SWN)
-
-2.3.4 Open questions:
--in this case, is it sensible to maintain two different key-value mappings 
key-value based systems ?
-       -one mapping fore block IDs
-       -one mapping for urn-5 names, which are associated with block IDs
-       -is this approach too difficult to maintain ?
-       
--is there possibility, in the specific urn-5 name, to maintain information 
about most recent block's ID for better search performance (or moreover, 
-tree based structure for all blocks for specific urn-5 name) ?
--How CAs' data should be saved ?
-
-2.3.5. Answers to research problem
-This is quite similar to previous research problem's answer. There are slight 
differences, though.
-
--for DHTs and SWNs:
-
-
       
 \section{Analysis}
 
@@ -2118,7 +1968,7 @@
 \label{fig:storm_query_urn5}
 \end{figure}
 
-\section{Open issues and future work}
+\chapter{Open issues and future work}
 
 One of the most important issues in Peer-to-Peer networks is the fact that
 online entities cannot be identified and verifyied safely. For our purposes
@@ -2131,6 +1981,13 @@
 based computer systems. However, we believe this problem is solved in a near 
 future as Peer-to-Peer networks will come more and more important in 
 computing world.
+
+-'get me block XYZ'
+-there is no reply, how do we are able to know if this was a spam attack, or 
the
+data really no exist in the system ?
+
+-digital signature on messages (no spam attacks) ?
+-lack of working PKI architecture
 
 Our future work includes support for searching transclusions and xanalogical
 links in a Peer-to-Peer network. Specifically, we want to find transclusions




reply via email to

[Prev in Thread] Current Thread [Next in Thread]