[Magellan-users] Initial clarifications


From: Tryggvi Björgvinsson
Subject: [Magellan-users] Initial clarifications
Date: Tue, 18 Mar 2008 17:50:16 -1000
User-agent: Thunderbird 2.0.0.12 (X11/20080227)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi everybody,

(Warning! Long email, take your time)

I just wanted to inform the list members that the first two versions of the source code have been uploaded to the source code repository. So magellan has now officially become free and open source software, meant for global collaboration. Magellan has been released under the GNU General Public License version 3 or later.

Many modifications are of course needed before it can be considered stable, working software, but it has basic functionality at the moment. The tasks that must be completed before it can be used without much tweaking have been identified and are listed in a TODO file.

Before continuing, some clarifications are perhaps necessary. For those who are not familiar with common free and open source software terminology, here are some explanations.

Free Software
- -------------

Free software is software which gives its users the four following freedoms:
0. The freedom to use the software for any purpose
1. The freedom to study the software and adapt it to one's needs
2. The freedom to distribute the software, to help one's neighbours
3. The freedom to improve the software and distribute the modifications so that the whole community benefits

For software to become free software, it must be released under a suitable license. The best known, used by around 70% of free software projects, is the GNU General Public License, or simply the GPL.


Open Source Software
- --------------------

Open source software builds upon free software. The difference is that open source software does not focus on the four user freedoms but on the pragmatic value of the software, through an effective development method. This method builds upon the above freedoms and expands on the idea that users should be able to modify the software and distribute the modifications. One could say that open source software defines the public as the user: the source code is released on a public site, accessible to everybody, giving users the possibility to help with the development.

Since there is a slight difference between the two types of software (one ideological, the other pragmatic), different licenses can apply to each. Most often, however, a free software license is also an open source software license; the GPL, for example, is both. Software released under such a license is generally referred to as free and open source software, abbreviated FOSS.


Source repository and version control
- -------------------------------------

These terms are not specific to free and open source software; they apply to software development in general. However, source repositories and version control are central to FOSS development. A source repository is a place where the source code of the software is kept and can be accessed by developers. The source code of free and open source software is kept in public repositories accessible to everybody.

Version control goes hand in hand with source repositories and the management of source code. Version control systems keep track of all updates (often called "patches") to the source code. That way, it is easy to revert to a working version of the software when a patch accidentally renders the software (or a part of it) unusable. Version control has many other benefits which are not important at this moment, but it is perhaps good to know of the two main types of version control systems: central and distributed.

Central version control manages updates to a software repository via a central location. Each developer must "check out" the code from the central repository to obtain the most recent version of the software. All updates are then submitted (or committed) to that central repository. Distributed version control is another school of thought. Developers still check out the latest version, but instead of being only a copy of the code from which changes are committed back to a central repository, the checked-out version is a repository of its own. Different versions of the software are thus distributed over many repositories, which can then be merged back into the main repository.
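
The distributed model can be sketched as a toy in Python. This is only an illustration under strong simplifications: the Repository class and its methods are made up here, a "patch" is just a text label, and real systems such as CVS or git work very differently internally.

```python
# Toy model only: a repository is an ordered list of patch descriptions.
class Repository:
    def __init__(self, history=None):
        self.history = list(history or [])

    def commit(self, patch):
        """Record a new patch on top of the current history."""
        self.history.append(patch)

    def revert_last(self):
        """Drop the most recent patch to get back to the previous version."""
        return self.history.pop()

    def clone(self):
        """Distributed model: the copy is a full repository of its own."""
        return Repository(self.history)

    def merge(self, other):
        """Pull in every patch from 'other' that this repository lacks."""
        for patch in other.history:
            if patch not in self.history:
                self.history.append(patch)


main = Repository(["initial import"])

# A developer clones the main repository and works on the copy.
developer_copy = main.clone()
developer_copy.commit("add new feature")
developer_copy.commit("broken experiment")
developer_copy.revert_last()       # the broken patch is undone locally

# Meanwhile the main repository moves on independently.
main.commit("fix plotting bug")

# Later the developer's work is merged back into the main repository.
main.merge(developer_copy)
print(main.history)
```

In a central model there would be no clone(): every commit would go straight into the one shared history.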


Magellan
- --------

So, having explained some of the underlying concepts, we can (finally) start discussing magellan.

Magellan is free and open source software released, as stated above, under the GNU General Public License version 3 or later. By this we intend magellan to be a collaborative software development project where users are invited to submit patches to the source code and help with the development of magellan.

It is important to understand that it takes a lot of work and time before a project gains a significant core of users who are willing to help with development; for a project like magellan, it might take years. However, the development of magellan will from the beginning put effort into making it easy for everybody, especially scientists, to start improving magellan. This means that every decision must be carefully thought through, ranging from technical decisions such as the programming language to non-technical aspects such as documentation. The rest of this e-mail will go through the major decisions made for magellan.

Programming language
- --------------------

The programming language chosen for magellan is Python. It can be argued that this is not the optimal language to use, since it is not normally taught (especially to scientists) and is not as fast as some other languages. However, the arguments for Python carry more weight for a project like magellan. There are many reasons why Python was chosen, but the major influences on the decision are:

* It is an easy to learn and use programming language, so everybody should be able to quickly learn how to program in Python.

* It forces the developer to program "beautifully", that is, to structure the code in a human-readable way and avoid obfuscation. Readable syntax and correct indentation are important so that users will be quick to read the code and start working on it.

* It is dynamically typed, so one does not have to declare the type of every variable (integer, float, double, etc.) before using it.

* It is often said that Python comes with batteries included. This means that the standard library of functions (i.e. functions built in to Python and ready to be called by programs) is quite large and extensive. Having such an extensive library makes the coding easier and more understandable.

* It might not be the fastest programming language around but it is fast enough. If for some reason one wants to optimize a specific feature for performance, Python provides a mechanism for extensions written in C or C++, giving the possibility of rewriting certain parts for performance.

* It is multiplatform, meaning that it can run on almost any operating system (one must of course take some care in the programming phase). This means that Python programs like magellan will be able to run on GNU/Linux, Windows, Mac OS X, and many other operating systems.
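
Two of the points above, dynamic typing and the "batteries included" standard library, can be seen in a few lines. This is just a small sketch assuming a current Python 3 interpreter; the variable names and numbers are arbitrary and have nothing to do with magellan itself.

```python
import math    # one of the many modules in Python's standard library

# Dynamic typing: names are not declared with a type, and the same name
# can later refer to a value of another type.
sample_count = 3          # an integer
sample_count = 3.0        # now a float, with no declaration needed

# "Batteries included": a ready-made function from the standard library.
# The numbers are arbitrary illustration values.
distance = math.hypot(3.0, 4.0)

print(type(sample_count).__name__)   # float
print(distance)                      # 5.0
```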


Project management
- ------------------

The project is currently managed through Savannah, which is "a central point for development, distribution and maintenance of Free Software that runs on free operating systems." The project site on Savannah is:

http://savannah.nongnu.org/projects/magellan

There you can find a description of magellan, the mailing list (currently only magellan-users, to which this mail is posted), the bug tracker (for tracking defects in magellan, because all software contains defects), repositories (for both the website and the source code), and other things.

The source code repository uses a distributed version control system called "git". The reason for choosing distributed version control is to allow different research institutes to create their own in-house repositories, which they can adapt to specific in-house research that might not follow the direction of magellan. Their modifications could then relatively easily be merged with the official magellan repository if there is general interest among magellan users.

The website (documentation) repository is centrally controlled using a version control system called "CVS". There is less reason for different versions of the documentation to be kept under distributed version control.


Program structure
- -----------------

Currently the magellan directory, which one can check out from the source code repository, is structured in the following way.

The main directory, magellan, includes the following:

AUTHORS
~  File containing the names and emails of all of magellan's authors
HACKING
~  File describing how to work on magellan. In FOSS communities the term
~  hacking refers to the joyful act of creating wonderful programs while
~  cracking is used to describe the act of breaking into computers
INSTALL
~  File explaining how to install magellan
LICENSE
~  File containing the GNU General Public License used for magellan
MAINTAINERS
~  File containing the names of the active maintainers of magellan
MANIFEST
~  File used when packaging different versions of magellan
MANIFEST.in
~  File used when packaging different versions of magellan
README
~  File containing basic descriptions
setup.py
~  File used for building, packaging and installing magellan
src
~  Directory containing the source code of magellan
TODO
~  File containing the list of tasks to do


The magellan/src directory contains one file and one folder:

magellan
~  The main program file, the one used to execute magellan
Magellan
~  The directory containing the module files used by magellan.
~  This name will be changed in upcoming versions since it will cause
~  problems on operating systems with a case-insensitive file system,
~  e.g. Microsoft Windows. The name will have to be descriptive so perhaps
~  python-magellan will do?
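
The clash can be illustrated with a few lines of Python. This is only a sketch; the helper case_collisions is made up here and is not part of magellan.

```python
def case_collisions(names):
    """Return groups of names that differ only by letter case."""
    groups = {}
    for name in names:
        groups.setdefault(name.lower(), []).append(name)
    return [group for group in groups.values() if len(group) > 1]


# The two entries of magellan/src as described above.
print(case_collisions(["magellan", "Magellan"]))

# With the suggested new directory name the clash disappears.
print(case_collisions(["magellan", "python-magellan"]))
```

One caveat with the suggested name: a directory called python-magellan cannot be imported with a plain import statement, because a hyphen is not allowed in a Python module name.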


The magellan/src/Magellan module directory contains a few files and one directory. Every file in the directory ending with .py has a corresponding file with the .pyc suffix. The .pyc files are just pre-compiled versions of the .py files which make the program load faster, so one shouldn't worry about them. The files and directory of interest in magellan/src/Magellan are:

calc.py
~  The module which performs all core calculations for magellan. This is the
~  heavyweight module where most of the work will take place. Uses NumPy for
~  the computations. NumPy is a Python module which is very similar to MATLAB
~  and can be used for heavy computations
data
~  Directory containing data used by magellan. Currently it only contains one
~  file called 'candekent.dat' which contains the Cande & Kent (1995) reversed
~  time-scale.
data.py
~  The module which gathers data from parameter files or configuration files.
__init__.py
~  A file which Python requires to be in the Magellan directory, so that it
~  will be seen as a module package.
plot.py
~  The module which plots the data graphically and presents it to the user.
~  Uses matplotlib for plotting. Matplotlib is a Python module which gives
~  Python developers syntax similar to MATLAB to plot different graphs.


This is all there is to the magellan directory structure.

Flow of magellan
- ----------------

The flow of magellan is very simple. Calling the main program file, magellan, with specific parameters causes it to read data using the Magellan/data.py module. The output of the data gathering is sent to Magellan/calc.py, which performs some computations and returns plottable data structures. These data structures are then sent to Magellan/plot.py, which plots them in a readable form. So basically the flow is:

magellan ->
~                     Magellan/data.py
~                <-

~                ->
~                     Magellan/calc.py
~                <-

~                ->
~                     Magellan/plot.py
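
In code, the same flow could look roughly like the sketch below. The function names (read_data, compute, plot, main) and the trivial computation are hypothetical stand-ins, chosen only to show how the main program hands data from one module to the next; the real modules have their own functions, and plotting is replaced by plain text here.

```python
# Hypothetical stand-ins for Magellan/data.py, Magellan/calc.py and
# Magellan/plot.py; the real modules have their own functions.

def read_data(parameters):
    """Stand-in for data.py: gather input values from the parameters."""
    return [float(value) for value in parameters]


def compute(values):
    """Stand-in for calc.py: turn input values into plottable (x, y) pairs."""
    return [(index, value * 2.0) for index, value in enumerate(values)]


def plot(points):
    """Stand-in for plot.py: present the result (here as plain text)."""
    return "\n".join("%d: %.1f" % point for point in points)


def main(parameters):
    """Stand-in for the main program file: call each step in turn."""
    values = read_data(parameters)
    points = compute(values)
    return plot(points)


print(main(["1.5", "2.5"]))
```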


Development of magellan
- -----------------------

Every computation in Magellan/calc.py must be backed up by theoretical computations, described in documentation which will be available through the Magellan website. The theoretical computations should be followed by a computer algorithm (described using an easy to understand semi-programming language called pseudo-code). In the source code each computation will point to the corresponding file or chapter in the documentation which explains the theoretical computations.

Providing users with the theory, algorithm explanation, and a working implementation should make it easier for users to start contributing. Scientists who are perhaps not good programmers or do not know Python can still contribute by submitting theory, which eager algorithm designers or programmers can then follow and implement. Therefore, a lack of programming skills is not a valid excuse for not contributing to magellan.

Furthermore, each and every file in magellan must be commented and easy to read. Commenting the code means that hard to understand code is explained so that new users will never have a hard time understanding what is being done. Variable names must be descriptive, so variable names like a or b should be avoided. In the current version of magellan there are variable names like Jx and P which are used in the corresponding theoretical computations, but they must be replaced to make the code easier for others to read.
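
As a small illustration of these rules, compare the two functions below. Both the computation and the documentation chapter mentioned in the comment are made up for this example; they are not taken from magellan.

```python
# Hard to follow: the names say nothing about what is computed.
def f(a, b):
    return a * b


# Easier to follow. The function and the documentation chapter it points to
# are made up for this illustration; they are not part of magellan.
def distance_travelled(speed_km_per_hour, duration_hours):
    """Return the distance in kilometres covered at a constant speed.

    Theory and pseudo-code: see the (hypothetical) documentation chapter
    "Distance at constant speed".
    """
    return speed_km_per_hour * duration_hours


print(distance_travelled(60.0, 1.5))
```

Both functions compute the same product, but only the second tells the reader what the numbers mean and where to find the theory behind them.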


Well, this long email should be a good start at describing the intentions and structure of the magellan project. The above text will of course be put into the magellan documentation where it will be more accessible than on a mailing list, but until then this email will have to do.

/Tryggvi

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFH4I14TfUwC3N5Fj0RAtWXAJ4qkbKr4SmY9U16nKQK+cbdMt4zGACdFX/b
lzlxvobD0GTrYKrweMjMJpQ=
=eNN0
-----END PGP SIGNATURE-----




