Re: [GNUnet-developers] Proposal: Make GNUnet Great Again?


From: Schanzenbach, Martin
Subject: Re: [GNUnet-developers] Proposal: Make GNUnet Great Again?
Date: Sun, 10 Feb 2019 10:06:48 +0100

Maybe let me wrap this up for now because I do not see a point in arguing 
further and there does not seem to be consensus:

From a GNUnet app/service developer perspective (i.e. not GNUnet core services) 
my experience has been that proper tooling and separation benefit the overall 
experience for both developer and user.
If you are interested in what I mean, you can look at 
https://gitlab.com/reclaimid.

The thing is that the core of this application still resides within the GNUnet 
monolith (src/reclaim).
This is a pain for the following reasons:
- No working CI/Testing
- If CI/Testing is working, it takes too f**ng long if I only want to test my 
changes / work on this component
- I do not need the other sources especially fs and social and they only bloat 
the final result (the GNUnet binaries are around 70MB btw)
- Development of other things in GNUnet (e.g. transport update, fs) should not 
have an immediate effect on my development flow. But it does.
- For deployments, I need a stable GNUnet base (this is why there is a 
gnunet-docker in there that builds a lean GNUnet image ~70MB)

As such I will, as proposed, simply try to improve this for _myself_.
If the desire to move to proper infrastructure (which has been opposed) and 
proper separation is not shared, I do not really have a problem with moving the 
code I am working on out of gnunet.git and into the gitlab repo above.
Since it is, like fs or social, a component that does not have any horizontal 
or vertical (upwards) dependencies, this will not affect the status quo in any 
way.
Especially since the current "CI" doesn't even build and test most of this 
(=reclaim).

Doing this with social/fs/etc as well would have the benefit of allowing me to 
create images which are even smaller and do not contain functionality that 
reclaim does not need, but this does not seem to have consensus so I have no 
solution for this atm.

BR

> On 10. Feb 2019, at 09:25, Schanzenbach, Martin <address@hidden> wrote:
> 
> Signed PGP part
> 
> 
>> On 10. Feb 2019, at 08:46, Florian Dold <address@hidden> wrote:
>> 
>> From my experience with working on GNU Taler, I tend to agree with the
>> arguments *against* multirepos, especially for GNUnet.
>> 
>> Multirepos tend to work well if either:
>> 
>> a) The language you're using has support for project-local package
>> dependencies.  This is the case for JavaScript (with npm/yarn), and IMHO
>> a reason why many small repos *can* work well for JS/TypeScript projects
>> and other languages supporting this approach.  But you do *not* want
>> developers to build multirepos from source, that's just very painful.
>> If we are heading towards multirepos with GNUnet, extra custom tooling
>> will IMHO be inevitable to avoid too much manual building, and I don't
>> think we want to maintain these custom tools.  An example for such
>> tooling would be Google's Repo tool
>> (https://source.android.com/setup/develop /
>> https://source.android.com/setup/develop/repo).
> 
> 
> Actually, Google is an example of a proponent of monorepos. So your point 
> is moot here.
> They need all this tooling _because_ they use a single repo.
> 
>> 
>> b) The sub-projects are loosely coupled with a relatively stable API.
>> This is (mostly) the case for GNU Taler, where we have (semi-)stable
>> HTTP APIs, and there are no real source-level dependencies.  There are
>> exceptions for this (the Taler merchant depends on the headers of the
>> Taler exchange), but they are so few that it's still manageable.
>> 
>> c) There is some stable set of base-components, and the "ecosystem"
>> around it is compiled against the stable release version of this
>> component.  Maybe it can be argued that gnunet-gtk targets the more
>> stable features of GNUnet, and thus shall be separate?
>> 
>> None of these conditions is fulfilled for GNUnet.  I don't really
>> see what benefit we get from separating out components that have
>> "distinct use cases" into separate repos.  It doesn't make CI faster
>> (with BuildBot you can just trigger test cases also based on what files
>> changed).  Also if you look at FreeBSD, that's an example of a project
>> where the whole OS (including user space tools) is in one repo.  Why
>> shouldn't this work for GNUnet?
> 
> Did you even read my last comment? Do you really consider all of the 
> applications as one "GNUnet" that every
> user (and developer!) actually cares about?
> I can tell you the number of times I used / developed something for fs / 
> social: 0.
> And, of course smaller repos make CI faster. It will result in smaller builds 
> (and, more importantly, builds which actually build things that have changed).
> And please no arguments for stateful builds/runners. I hope we can at least 
> agree that tests and builds should be done in clean environments every time.
> Else you will not catch a lot of stuff that can go wrong (I experienced this 
> myself when I set up my docker builds for GNUnet which, unlike BB, actually 
> build from scratch).
> 
>> 
>> I am sure with appropriate tooling, both mono- and multirepos can be a
>> pleasure to work with.  But as it stands now, moving to multirepos would
>> negatively impact productivity more than just keeping the monorepo.
>> 
>> Just another random aspect:  Configuration hell.  What if I (as a
>> developer) want to build my GNUnet component with a certain ./configure
>> feature/compiler flag enabled/disabled?  Then I have to do this in all
>> GNUnet dependencies manually, unless there's special tooling to do just
>> that.
>> 
> 
> How would that happen? Can you give a _concrete_ example in fs/social/reclaim 
> where this is true?
> It is exactly the point that it is completely unclear what effect a 
> configuration switch has on which components.
> If we separate this, we might get _some_ overhead in the configure.ac's but 
> from my experience I expect this to be very little.
> And our configure.ac is not something I would consider "developer friendly" 
> through its sheer size and complexity.
> 
>> We should also keep in mind that for the end user, it doesn't matter how
>> the repo(s) are/is structured, they can always get the same set of
>> packages in their distro.  Multiple source repos can produce a single
>> package, or a single repo can produce multiple distro-level packages.
> 
> As discussed in this thread multiple times already we need to distinguish 
> between the two.
> Of course you can create multiple packages from a single repo source build. 
> The maintainer of the packages will be very "happy" about this I can tell you.
> But that is not the point. The point is that development does not make sense 
> in this way.
> Regarding "you can teach buildbot to only run tests on the files that have 
> changed":
> As I said in a mail before, OF COURSE YOU CAN DO THIS.
> But it means that not only will we have an incredibly complex build 
> (configure.ac) but we will have an equally complex and error prone test!
> Which, btw, cannot even be modified by devs atm because the buildbot 
> configuration and architecture is silly.
> 
> And think about it this way: If a new developer decides to write a new 
> service / application on top of GNUnet, what will he be faced with?
> Imagine this app needs only GNS and maybe CADET.
> The dev will need to integrate and understand the full build and test in 
> order to properly set up this project.
> If we had a few separate examples of how this can be done in a separate repo, 
> this would go a long way.
> 
> 
>> 
>> - Florian
>> 
>> On 2/10/19 1:02 AM, Amirouche Boubekki wrote:
>>> I think splitting the codebase will be a pain for gnunet.
>>> 
>>> The only *good* reasons for manyrepos are social or ego politics "this
>>> is my lawn" or legal. The only one that applies to gnunet is legal
>>> because one needs to fill a gnu form to be able to contribute.
>>> 
>>> I am biased toward monorepo by experience dealing with big projects
>>> (100k+ SLOC), and the only time it made sense to split a project into
>>> many repositories was when there were different teams / workflows (social)
>>> and different legal terms for the various services/daemons; at previous
>>> $WORK, they had to fork gentoo to make it work.
>>> 
>>> Otherwise, each time I saw another repository it was a source of pain:
>>> 
>>> - Need to manage several versions
>>> - git submodule workflow is not good enough, it doesn't track branches; I
>>> personally never remember how to find the branch of a commit, plus it
>>> requires some more git-fu to bump the submodule.
>>> - refactoring anyone?
>>> - generally speaking manyrepos at small scale is more work
>>> 
>>> And again, it somehow requires tracking down every version (what works
>>> with what) and you end up with another repository (or distribution) with
>>> another build system that puts everything together. Continuous
>>> Integration can do that? Where is the code of the CI? Another repo? More
>>> versions, more git clone more grep across repositories / directories not
>>> even in sync.
>>> 
>>> Popularity arguments:
>>> 
>>> a) Ok, everybody knows GAFAM love monorepos and that is also a source
>>> of pain (dedicated team and software). That said, gnunet is not the size
>>> of any GAFAM, hence it will not suffer from monorepo pain points.
>>> 
>>> b) Github and Javascript made the manyrepos popular for various ego
>>> reasons and because JavaScript is not good. I won't take inspiration
>>> from that part of the JavaScript noosphere. gnunet-leftpad anyone?
>>> 
>>> c) Now, there is GNOME. GNOME is famous for its bazaar model of
>>> development and also famous for the adoption of meson (maybe even its
>>> inception) or its previous incarnation jhbuild. Anyway, even if the
>>> success of GNOME and GNU (which is also a bazaar) is appealing, gnunet is
>>> not GNU or GNOME. From my point of view the bazaar development model scales
>>> better / more easily in a socially distributed setting. Also, why is Linux
>>> still a single repository?
>>> 
>>> Le sam. 9 févr. 2019 à 18:16, Schanzenbach, Martin
>>> <address@hidden <mailto:address@hidden>> a écrit :
>>> 
>>> 
>>> 
>>>> On 9. Feb 2019, at 17:13, Christian Grothoff <address@hidden
>>>   <mailto:address@hidden>> wrote:
>>>> 
>>>> On 2/9/19 5:04 PM, Schanzenbach, Martin wrote:
>>>>> I have some inline comments as well below, but let us bring this
>>>   discussion down to a more practical consensus maybe.
>>>>> I think we are arguing too much in the extremes and that is not
>>>   helpful. I am not saying we should compartmentalise
>>>>> GNUnet into the tiniest possible components.
>>>>> It's just that I think it is becoming a bit bloated.
>>>>> 
>>>>> That being said, _most_ of what is in GNUnet today is perfectly
>>>   fine in a single repo and package.
>>>>> For now, at least let us not add another one (gtk) as well?
>>>>> 
>>>>> Then, we remain with
>>>>> 
>>>>> - reclaim (+the things reclaim needs wrt libraries)
>>>>> - conversation (+X)
>>>>> - secushare (+X)
>>>>> - fs (+X)
>>>>> 
>>>>> as components/services on my personal "list".
>>>>> I suggest that _if_ I find the time, I could extract reclaim into
>>>   a separate repo as soon as we have a CI and I can
>>>>> test how it works and we can learn from the experience.
>>>>> Then, we can discuss if we want to do the same with other
>>>   components, one at a time, if there is consensus and a person that
>>>>> would be willing to take ownership (I am pretty sure we talked
>>>   about this concept last summer as well).
>>>> 
>>>> Maybe you could start with extracting the SecuShare components? That
>>>> should do for a first "experience", and be a bit more effective at
>>>> reducing bloat as well ;-).
>>> 
>>>   Well, I could, but our secushare people are quite active so maybe
>>>   there are volunteers (if they agree with the proposal at all).
>>>   Regarding "bloat": if we want to effectively eliminate bloat, then
>>>   let's look at numbers:
>>> 
>>>   File Sharing:
>>>   src/fs: 36918 (!) LOC in .c files
>>>   src/datastore/cache: ~15k LOC in .c files
>>> 
>>>   Conversation:
>>>   src/conversation: 10538 LOC in .c files
>>> 
>>>   SecuShare:
>>>   src/psyc* : ~17000 LOC in .c files (although I am not sure about this
>>>   because theoretically psyc is a general use protocol, no?)
>>>   src/social: 9447 LOC in .c files
>>>   src/multicast: 5633 LOC in .c files
>>> 
>>>   Reclaim:
>>>   src/reclaim* : ~6500 LOC in .c files
>>> 
>>>   Now, considering that fs is practically always built for everybody
>>>   and SecuShare and reclaim are experimental, it hurts the most for
>>>   devs that actually compile from source.
>>>   Everything combined is 110000+ LOC, which is 22% of the codebase
>>>   (~500k, oO). Considering that there is a significant redundancy in
>>>   transport/ (75k) at the moment, this number is probably closer to 25%.
>>>   Granted, this is a lot less than I expected ;), but maybe
>>>   illustrates the dimensions.
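
As a quick cross-check of the figures listed above, here is a minimal Python
sketch that tallies only the itemized per-directory counts. The dictionary
keys and the `tally` helper are illustrative names, not part of GNUnet, and
the "~" values are taken at face value. On those assumptions the itemized
directories sum to roughly 101k LOC, about 20% of ~500k, so the quoted
110000+ / 22% presumably rounds up or includes code that is not itemized.

```python
# Per-directory counts as quoted in the message above ("~" values are approximate).
loc = {
    "src/fs": 36918,
    "src/datastore/cache": 15000,  # quoted as "~15k"
    "src/conversation": 10538,
    "src/psyc*": 17000,            # quoted as "~17000"
    "src/social": 9447,
    "src/multicast": 5633,
    "src/reclaim*": 6500,          # quoted as "~6500"
}
CODEBASE_LOC = 500_000  # quoted as "~500k" for the whole tree


def tally(counts, total):
    """Return (sum of the itemized counts, their share of total in percent)."""
    subtotal = sum(counts.values())
    return subtotal, 100.0 * subtotal / total


subtotal, share = tally(loc, CODEBASE_LOC)
print(f"{subtotal} LOC itemized, {share:.1f}% of ~{CODEBASE_LOC}")
```
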
>>> 
>>> 
>>>> 
>>>> That said, splitting off reclaim seems also much less problematic than
>>>> fs/conversation, and if you then integrate reclaim with the libgabe
>>>> tree, the overall number of downloads/installation for reclaim
>>>   wouldn't
>>>> go up, so that would certainly kill my argument of making the
>>>> installation more complex (might indeed simplify it, as one
>>>   doesn't have
>>>> to remember to install libgabe before GNUnet to get reclaim).
>>> 
>>>   Could do, but libgabe has some nasty additional deps (libpbc and
>>>   gmp) which we _might_ eventually get rid of completely by
>>>   implementing GNS-based encryption.
>>> 
>>> 
>>>   _______________________________________________
>>>   GNUnet-developers mailing list
>>>   address@hidden <mailto:address@hidden>
>>>   https://lists.gnu.org/mailman/listinfo/gnunet-developers
>>> 
