bug-global

Re: Added support for file list in single file update


From: dhruva
Subject: Re: Added support for file list in single file update
Date: Wed, 25 Jun 2014 09:57:18 -0700

On Tue, Jun 24, 2014 at 10:31 PM, Shigio YAMAGUCHI <address@hidden> wrote:
>> Having the prefix is not mandated. It allows you to define
>> adding/removing tags for files without depending on stat(). We have
>
> It means that files with no prefix need stat()?

Yes

>> ~61500 files in the repository around 1000 files change per day (rough
>> estimate). We know exactly the changes to the file system
>> (added/modified/deleted) and hence can generate a file with the
>> prefixes. This will avoid calling stat() on those ~1000 files.
>>
>> Since the build daemon builds for different branches, the effect
>> multiplies. This was one of the features requested by the team that
>> owns the build to help reduce the overall build. Hence, I decided to
>> look at the GNU global code and hack. I always wanted to hack on this
>> code, I like using it and now want to enhance it.
>
> By the prefixes, what percent average does the execution time of
> gtags decrease? If possible, would you please show me the data?
>

Running over NFSv3

Input files (4):
[1142]$ wc -l prefix.files
4 prefix.files

[1143]$ wc -l no-prefix.files
4 no-prefix.files

Original without prefix:
[1138]$ time for ii in `cat no-prefix.files` ; do /usr/software/bin/gtags --single-update $ii; done

real    0m27.290s
user    0m5.768s
sys     0m4.357s

Modified without prefix:
[1139]$ time for ii in `cat no-prefix.files` ; do gtags --single-update $ii; done

real    0m40.861s
user    0m6.149s
sys     0m4.705s

=> Slower than the original because each argument is now checked to
see whether it is a source file (issourcefile) or a list of files.
This can be fixed by adding a separate flag for file lists.

Modified batch operation without prefix:
[1140]$ time gtags --single-update no-prefix.files

real    0m7.145s
user    0m1.438s
sys     0m1.229s

Modified batch with prefix:
[1141]$ time gtags --single-update prefix.files

real    0m7.081s
user    0m1.496s
sys     0m1.129s

=> Slight reduction in sys time from avoiding stat() (not significant,
though, because of file system caching)

=> There is a visible benefit in batch processing of files
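
To put numbers on that, here is a minimal sketch that works out the
ratios from the "real" times above. The helper names (speedup,
percent_saved) and the macro names are made up for illustration; only
the timings come from the runs in this mail. With only 4 input files
over NFS, treat the results as rough indications rather than a
benchmark.

```c
#include <assert.h>

/* Wall-clock ("real") times in seconds, copied from the runs above. */
#define T_ORIG_LOOP       27.290  /* original gtags, one call per file  */
#define T_MODIFIED_LOOP   40.861  /* modified gtags, one call per file  */
#define T_BATCH_NOPREFIX   7.145  /* modified gtags, batch, no prefix   */
#define T_BATCH_PREFIX     7.081  /* modified gtags, batch, with prefix */

/* Hypothetical helper: how many times faster "after" is than "before". */
double speedup(double before, double after)
{
    assert(after > 0.0);
    return before / after;
}

/* Hypothetical helper: time saved, as a percentage of "before". */
double percent_saved(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}
```

speedup(T_ORIG_LOOP, T_BATCH_NOPREFIX) comes to roughly 3.8, and
speedup(T_MODIFIED_LOOP, T_BATCH_NOPREFIX) to roughly 5.7, while
percent_saved(T_BATCH_NOPREFIX, T_BATCH_PREFIX) is only about 0.9, so
in this run nearly all of the gain comes from batching rather than
from the prefixes.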

I ran it under valgrind and found 'strtol()', called via 'atoi()', to
be one of the biggest contributors to the performance overhead. I am
looking at storing the fid in the DB as an integer and fetching it
directly, instead of storing it as a character string and converting
it back each time. That will be a separate patch. I wish this were
under git... (I will try to import it into git.)
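
The idea of keeping the fid as an integer rather than as text could
look roughly like the sketch below. All three function names are
hypothetical and are not taken from the GNU GLOBAL source. The 4-byte
big-endian layout is an assumption chosen so that a record would read
the same on any host; the actual patch may do something different.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Text form, as described above: every lookup pays for a
 * decimal-to-integer conversion (atoi() is commonly implemented
 * on top of strtol()). */
int fid_from_text(const char *text)
{
    assert(text != NULL);
    return atoi(text);
}

/* Assumed binary form: write the fid as 4 bytes, most significant
 * byte first, so the stored value does not depend on host byte order. */
void fid_encode(uint32_t fid, unsigned char out[4])
{
    out[0] = (unsigned char)(fid >> 24);
    out[1] = (unsigned char)(fid >> 16);
    out[2] = (unsigned char)(fid >> 8);
    out[3] = (unsigned char)fid;
}

/* Reading it back is a few shifts and ORs, with no string parsing. */
uint32_t fid_decode(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

A fixed-width record also has a constant size, whereas the decimal
text form grows with the number of digits. Whether the saving shows up
in practice would need to be measured again under valgrind.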

with best regards,
dhruva


