[Gzz-commits] manuscripts/storm short-paper.rst


From: Benja Fallenstein
Subject: [Gzz-commits] manuscripts/storm short-paper.rst
Date: Mon, 26 May 2003 21:52:38 -0400

CVSROOT:        /cvsroot/gzz
Module name:    manuscripts
Changes by:     Benja Fallenstein <address@hidden>      03/05/26 21:52:38

Modified files:
        storm          : short-paper.rst 

Log message:
        restart short paper after realizing what the main point needs to be

CVSWeb URLs:
http://savannah.gnu.org/cgi-bin/viewcvs/gzz/manuscripts/storm/short-paper.rst.diff?tr1=1.5&tr2=1.6&r1=text&r2=text

Patches:
Index: manuscripts/storm/short-paper.rst
diff -u manuscripts/storm/short-paper.rst:1.5 manuscripts/storm/short-paper.rst:1.6
--- manuscripts/storm/short-paper.rst:1.5       Sun May 25 07:40:49 2003
+++ manuscripts/storm/short-paper.rst   Mon May 26 21:52:38 2003
@@ -1,57 +1,61 @@
-========================================================================
-Storm: Supporting data mobility through location-independent identifiers
-========================================================================
-
-.. Main point of this paper:
-   Location-independent identifiers support data mobility;
-   DHT allows location-independent identifiers
+====================================================
+Storm: Using P2P to make the desktop part of the Web
+====================================================
 
 Abstract
 ========
 
-- data mobility
-- problems
-- location-independent identifiers such as hashes
-- resolvable through DHT
-- our implementation (Storm) is beginning to be deployed
-
-.. In this paper, we define data mobility as a collective term for the
-   movement of documents between computers, different locations 
-   on one computer and movement of content between documents.
-   We identify dangling links and alternative versions as major
-   obstacles for the free movement of data. This paper presents the Storm 
-   (STORage Module) design as one possible solution to these problems.
-   Storm uses location-independent globally unique 
-   identifiers, append-and-delete-only storage and peer-to-peer networking to 
-   resolve problems raised by data mobility. Moreover, we discuss some 
-   specific use scenarios related to ad hoc networks, unreliable network 
-   connections and mobile computing, in which the need for data mobility 
-   is obvious. Our current prototype implementation works on a single system;
-   peer-to-peer networking is in an early prototype stage.
-
-.. raw:: latex
-
-   \category{H.5.4}{Information Interfaces and Presentation}{Hy\-per\-text/Hy\-permedia}[architectures]
-   \category{H.3.4}{Information Storage and Re\-trie\-val}{Systems and Software}[distributed systems, information networks]
-
-   \terms{Design, Reliability, Performance}
-
-   \keywords{versioned hypermedia, dangling links, 
-   peer-to-peer,
-   location-independent identifiers}
+Linking personal documents like we link Web pages is inconvenient
+enough that users rarely ever do it. A major reason is that
+links break when documents are moved around or sent by mail.
+We argue that while non-breaking links would be a convenience 
+on the Web, they are a necessity for making Web-like hyperlinks
+useful on the local desktop.
+
+We propose Storm, a storage system identifying documents by
+cryptographic hashes and signatures, independently of their
+location. Our system automatically finds link targets wherever
+they are, on the local system or on the network.
+On the network, our identifiers are resolved 
+through a peer-to-peer distributed hashtable.
+Thus, links continue to work unchanged when documents are emailed
+or published on the network.
+
+Our system uses URIs to integrate with the Web. We have so far
+extended KDE and Netscape Communicator 4 to understand
+our experimental URN namespace. Most other systems can use Storm 
+through an HTTP gateway.
 
 
 Introduction
 ============
 
+.. documents can be linked like web pages,
+   which would make them part of the web:
+
+Documents written with OpenOffice or Microsoft Word
+can nowadays be hyperlinked just like web pages--
+but nobody does it. XXX
+
+.. links needed that don't break when documents are moved:
+
+.. using location-independent identifiers for
+   non-breaking links:
+
+.. non-breaking links seem not globally resolvable:
+
 Several hypermedia systems assume that identifiers either have to
-include location information or cannot be resolved globally.
+say where a document can be found on the network, or they
+cannot be resolved globally.
 URLs, location-dependent identifiers, break when documents are
 moved. Link services often query only a select set of link
 servers, not the whole network [hill94extending-andalso-carr95dls]_.
 
-Berners-Lee [name-myth]_ argues that unique random identifiers 
-are not globally feasible for this reason.
+Berners-Lee [name-myth]_ argues that for this reason,
+using unique, random-looking numbers to identify documents
+is not possible on a global scale.
+
+.. but DHTs can do it:
 
 However, recent developments in peer-to-peer systems have
 rendered this assumption obsolete. Structured overlay networks
@@ -62,158 +66,22 @@
 This, we believe, may be the most important result of peer-to-peer 
 research with regard to hypermedia.
 
-- location-dependent identifiers cause broken links
-
-- alternative versions on independent systems hard to synchronize
+.. Freenet's cryptographical identifiers:
 
-- creating a location-independent namespace, resolve through DHT
+.. structure of this paper:
 
 
-Storm block storage
-===================
-
-.. as blocks, independent of network location:
-
-In Storm, all data is stored
-as *blocks*, immutable byte sequences identified by a SHA-1 
-cryptographic content hash [fips-sha1]_. 
-Purely a function of a block's content, block ids
-are completely independent of network location.
-
-.. similar to files, but immutable:
-
-Blocks are similar to files, but they cannot be modified.
-Any change in the data would cause the identifier to change too.
-
-.. identifiers self-certifying:
-
-Storing data in immutable blocks
-has a number of advantages. Firstly, it makes identifiers
-self-certifying. 
-
-After downloading a block, we are can check whether the data
-matches the cryptographic hash in the identifier. 
-Therefore, we can safely download blocks from an untrusted peer.
-
-.. link targets cannot be changed on us:
-
-When we make a reference to a block, we can be sure
-that even the original author of the target will not be able 
-to change it (unlike with e.g. digital signatures).
-For example, if a newspaper refers to a letter
-to the editor this way, the letter's sender won't be able to change 
-the reference into an advertisement for a pornographic web page.
-
-.. caching trivial:
-
-Secondly, caching becomes trivial, since it is
-never necessary to check for new versions of blocks.
-
-.. flash crowds alleviated:
-
-If peers make the blocks in their caches available on the network,
-the flash crowd problem could be alleviated: The more users
-request a block, the more locations there are to download it from.
-This resembles e.g. the Squirrel
-web cache [iyer02squirrel]_; however, downloads can be
-from *any* peer since the source does not need to be trusted.
-On the other hand, there are privacy 
-concerns with exposing one's cache to the outside world.
-
-.. replication easy:
-
-To replicate all data from computer A
-on computer B, it suffices to copy all blocks from A to B that B
-does not already store. This can be done through a simple 'copy'
-command. Different versions of a single document
-can coexist on the same system without naming conflicts, since
-each version will be stored in its own block with its own id.
-
-.. web links resolvable to local copies:
-
-The same namespace is used for local data and data
-retrieved from the network. When an online document has been
-permanently downloaded to the local harddisk, it can be found
-by a browser just as data from the network. This is convenient 
-for offline browsing, for example in mobile environments:
-After a block has been downloaded, references to it will *never*
-cease to work, online or offline.
-
-.. append-only, bugs don't lose old data:
-
-Thirdly, immutable blocks increase *reliability*. 
-When saving a document, an application will only *add* blocks,
-never overwrite existing data. When a bug causes an application
-to write malformed data, only the changes from one session
-will be lost; the previous version of the data will still
-be accessible. This makes Storm well suited as a basis
-for implementing experimental projects (such as ours, Gzz).
-Even production systems occasionally corrupt existing data
-when an overwriting save operation goes awry; for example,
-one of the authors has had this problem with
-Microsoft Word many times.
-
-.. (was footnote) Unfortunately, efficient versioned storage (Section 6)
-   makes matters a little more complicated; still,
-   the basic assertion holds.
-
-.. mirrors trivial:
-
-Links to a block will work as long as *any* peer
-holds a copy. Thus, providing mirrors is trivial.
-Even after failure of all dedicated mirrors,
-a document may still be available from peers that have
-downloaded it. An archive of published blocks, in the spirit
-of the Web archive [waybackmachine]_, would only be yet another backup:
-normal links to a block would work as long as the archive
-holds a copy.
-
-.. more durable:
-
-Finally, because blocks are easy to move from system
-to system, we hope that block storage will be more *durable* than files.
-When users own multiple systems, or buy new systems
-to replace old ones, files are often on one harddisk
-and not the other, or moved to a floppy disk but not back
-to the harddisk. How many files you created in the 80s
-do you still keep around on your harddisk today? With block storage,
-each time a user buys a new computer, they could
-transfer all blocks from their existing systems to the new one,
-and blocks from old floppies could be copied to the harddisk
-without thinking about issues like which directory 
-to keep them in. By making it easy to collect
-blocks produced on a diverse number of systems, it would be easier
-to keep old data around.
-
-.. persistency commitment:
-
-Of course, to meet this goal it is necessary that the block
-system remains backwards compatible at all times. We have therefore
-decided to enter a *persistency commitment* when we finalize
-the Storm design before the next release of Gzz: Any future version
-of the Storm specification thereafter will be able
-to handle any block created according to this version of the spec.
-This means that no matter how much we'll regret our current choices
-in the future, we commit to providing backward compatibility for them.
-
-.. incompatibility with existing systems:
-
-The advantages we have outlined are bought by an utter incompatibility with
-the dominant paradigms of file names and URLs. We hope that
-it would be possible to port existing applications to use Storm
-without too much effort, but we have not investigated
-the issue closely. This is because Storm was developed
-for the experimental Gzz system, a platform explicitly developed
-to overcome the limitations of traditional file-based applications.
+Storm
+=====
 
-- versioning, pointers
+.. a general storage system, using Freenet-like identifiers:
 
+.. part of the web -- URN scheme (so far experimental, 
+   targetting registration):
 
 Web integration
 ===============
 
-- URN scheme (so far experimental, 
-  targetting registration)
 - HTTP gateway
 - Konqueror and Netscape 4 understand Storm URNs
 - KDE programs can load from and save to Storm URNs
@@ -222,34 +90,7 @@
 Conclusions
 ===========
 
-We have introduced the Storm design to address two important issues 
-raised by data mobility, dangling links and keeping track 
-of alternative versions. In Storm, all data is stored as immutable blocks
-identified a SHA-1 hash. Application-specific indices of these blocks can be kept.
-
-Storm is not limited to network publishing;
-it can be also used for private document repository. Our present implementation 
-does not support peer-to-peer distribution yet, but the Gzz project has used it 
-for local storage and server-based collaboration for one and a half years.
-Currently, we are working on a GISP-based peer-to-peer
-implementation.
-
-We have written an HTTP gateway and plan integration with KDE.
-
-Work is also needed on user interfaces for Storm.
-
-.. Besides these issues with the backend, we are facing user interface issues
-   as well -- for example the conventions for listing, moving and deleting
-   blocks. Also conventions for which zones e.g. new blocks should be stored in
-   must be resolved. Often they will be private, but when making changes to
-   documents that are shared with a project group, the changes should be
-   visible to others.
-
-We see Storm as a case study in
-the potentials of a system that does not use
-location-dependent identifiers at all. We hope to raise awareness
-for the prospects of location-independent systems based
-on structured overlay networks such as DHTs.
+..
 
 
 Acknowledgements



