View Issue Details

ID: 0000030
Project: Cinelerra-GG
Category: Website
View Status: public
Last Update: 2019-01-01 15:42
Reporter: ferdnyc
Assigned To: goodguy
Priority: normal
Severity: feature
Reproducibility: always
Status: closed
Resolution: reopened
Summary: 0000030: [GITWEB] Feature Request: Support .tar.xz (and .tar.bz2?) format snapshots
Description: Currently the gitweb instance at https://git.cinelerra-gg.org/git/ allows snapshots to be downloaded in .tar.gz format, which can be requested via a link such as (for the currently-latest commit):

https://git.cinelerra-gg.org/git/?p=goodguy/cinelerra.git;a=snapshot;h=230f4fdce9d4b62d8780702bb406a1f5f629a3c3;sf=tgz

The "sf=tgz" part of the URL selects the snapshot format, with .tar.gz being the default. The gitweb.conf(5) man page indicates that qq{Supported formats include "tgz", "tbz2", "txz" (gzip/bzip2/xz compressed tar archive) and "zip"}, however only "tgz" is offered by default and, indeed, appears to be the only available option.

Attempting to use sf=zip or sf=tbz2 produces an error "403 - Unsupported snapshot format"

Attempting to use sf=txz, OTOH, results in "403 - Snapshot format not allowed"
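
For anyone who wants to retry the other formats, the links differ only in the sf= value. A minimal Python sketch that builds them; the helper name snapshot_url is made up for illustration and is not part of gitweb, and the base URL, project and commit hash are simply the ones quoted above:

```python
# Hypothetical helper (not part of gitweb): build a snapshot link for one format.
BASE = "https://git.cinelerra-gg.org/git/"
COMMIT = "230f4fdce9d4b62d8780702bb406a1f5f629a3c3"


def snapshot_url(project, commit, fmt="tgz"):
    # The links on this gitweb instance separate query parameters with ';'.
    return f"{BASE}?p={project};a=snapshot;h={commit};sf={fmt}"


if __name__ == "__main__":
    # Format names as listed in the gitweb.conf(5) man page.
    for fmt in ("tgz", "tbz2", "txz", "zip"):
        print(snapshot_url("goodguy/cinelerra.git", COMMIT, fmt))
```

Whether a given link returns an archive or a 403 depends entirely on the server-side gitweb configuration.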

Since .xz offers superior compression to .bz2, which offers superior compression to .gz (downloading the above commit as .tar.gz (135MB) and recompressing to .tar.bz2 and .tar.xz reduced the size to 131MB and 126MB, respectively) it'd be nice to have it available as a supported snapshot format.
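
To put those sizes in perspective, here is a rough Python sketch of the relative savings. It uses only the three sizes quoted above; the function name saving_percent is an illustrative choice:

```python
# Archive sizes (MB) for the same commit, as quoted in the description.
sizes_mb = {"tar.gz": 135, "tar.bz2": 131, "tar.xz": 126}


def saving_percent(baseline_mb, size_mb):
    """Percentage by which size_mb is smaller than baseline_mb."""
    return 100.0 * (baseline_mb - size_mb) / baseline_mb


if __name__ == "__main__":
    baseline = sizes_mb["tar.gz"]
    for name, size in sizes_mb.items():
        print(f"{name}: {size} MB, {saving_percent(baseline, size):.1f}% smaller than tar.gz")
```

So for this particular snapshot the gain is a few percent for .tar.bz2 and somewhat more for .tar.xz; other trees may compress differently.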
Additional Information: According to the same man page,
========

Features in %feature
       The gitweb features that are configurable via %feature hash are listed
       below. This should be a complete list, but ultimately the authoritative
       and complete list is in gitweb.cgi source code, with features described
       in the comments.
[...8<...snip...8<...]
       snapshot
           Enable and configure the "snapshot" action, which allows user to
           download a compressed archive of any tree or commit, as produced by
           git-archive(1) and possibly additionally compressed. This can
           potentially generate high traffic if you have large project.

           The value of 'default' is a list of names of snapshot formats,
           defined in %known_snapshot_formats hash, that you wish to offer.
           Supported formats include "tgz", "tbz2", "txz" (gzip/bzip2/xz
           compressed tar archive) and "zip"; please consult gitweb sources
           for a definitive list. By default only "tgz" is offered.

           This feature can be configured on a per-repository basis via
           repository’s gitweb.blame configuration variable, which contains a
           comma separated list of formats or "none" to disable snapshots.
           Unknown values are ignored.
========

I am 100% sure that per-repo config name is a copy-pasteo/typo, and they meant "This feature can be configured on a per-repository basis via repository’s gitweb.snapshot configuration variable", as gitweb.snapshot is one of the variables listed in the git-config(1) manpage, and gitweb.blame is the same variable documented to configure the previous feature listed in this man page.
Tags: No tags attached.

Activities

ferdnyc

2018-12-07 21:58

reporter   ~0000083

More explicit help, from the man page's EXAMPLES section:
========
EXAMPLES
       To enable blame, pickaxe search, and snapshot support (allowing
       "tar.gz" and "zip" snapshots), while allowing individual projects to
       turn them off, put the following in your GITWEB_CONFIG file:

           $feature{'blame'}{'default'} = [1];
           $feature{'blame'}{'override'} = 1;

           $feature{'pickaxe'}{'default'} = [1];
           $feature{'pickaxe'}{'override'} = 1;

           $feature{'snapshot'}{'default'} = ['zip', 'tgz'];
           $feature{'snapshot'}{'override'} = 1;

       If you allow overriding for the snapshot feature, you can specify which
       snapshot formats are globally disabled. You can also add any
       command-line options you want (such as setting the compression level).
       For instance, you can disable Zip compressed snapshots and set gzip(1)
       to run at level 6 by adding the following lines to your gitweb
       configuration file:

           $known_snapshot_formats{'zip'}{'disabled'} = 1;
           $known_snapshot_formats{'tgz'}{'compressor'} = ['gzip','-6'];
========

Sam

2018-12-11 16:38

administrator   ~0000138

I'll get a chance to look at this the day after tomorrow.

PhyllisSmith

2018-12-12 01:21

manager   ~0000157

Thank you for providing a good amount of information for us to look into this and test it out. Goodguy modified /etc/gitweb.conf to add lines something like the following:
   $feature{'snapshot'}{'default'} = ['tgz','txz','tbz2','zip'];
   $feature{'snapshot'}{'override'} = 1;
   $known_snapshot_formats{'txz'}{'disabled'} = 0;
He then tested zip and tbz2, which worked. Something was preventing txz from working, but he did run the xz command on the server and found that it puts a significant CPU load on the server.

He does not think using txz is a good idea: it is very slow for almost no real gain, and using tgz takes less total time (including download time). txz puts a significant CPU load on the server and produces only a few percent gain over tgz.
cpu: tar =  0.5 sec / 231MB
     gz  =  7.5 sec / 134MB
     bz2 = 36.5 sec / 131MB
     xz  = 85.5 sec / 126MB
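
The "total time including download" argument can be sketched in Python from the figures above. This is a simplified model that assumes the archive is fully compressed before the download starts (gitweb may stream instead, which would shorten every total), and the function names are illustrative:

```python
# Server-side figures reported above: (CPU seconds, archive size in MB).
FORMATS = {"gz": (7.5, 134), "bz2": (36.5, 131), "xz": (85.5, 126)}


def total_seconds(fmt, bandwidth_mb_per_s):
    """Compression time plus download time at the given link speed."""
    cpu_s, size_mb = FORMATS[fmt]
    return cpu_s + size_mb / bandwidth_mb_per_s


def break_even_bandwidth(fmt, baseline="gz"):
    """Link speed (MB/s) below which fmt beats the baseline in total time."""
    cpu_s, size_mb = FORMATS[fmt]
    base_cpu_s, base_size_mb = FORMATS[baseline]
    return (base_size_mb - size_mb) / (cpu_s - base_cpu_s)


if __name__ == "__main__":
    for fmt in ("bz2", "xz"):
        print(f"{fmt} only wins below {break_even_bandwidth(fmt):.3f} MB/s")
    for speed in (0.05, 1.0, 10.0):
        totals = {f: round(total_seconds(f, speed), 1) for f in FORMATS}
        print(f"{speed} MB/s: {totals}")
```

Under these assumptions xz saves 8 MB at the cost of 78 extra CPU seconds, so it only comes out ahead on links slower than roughly 0.1 MB/s.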

With your agreement, we would like to close this issue. It might be possible to revisit this in the future or, alternatively, to create a single txz file on a weekly basis that everyone can download, rather than creating a txz file for every single download.


Sam

2018-12-24 11:46

administrator   ~0000364

Since I haven't heard anything further, I assume that the topic is finished; otherwise, please contact me again.

ferdnyc

2018-12-24 22:21

reporter   ~0000370

My apologies for not responding to this! I just double-checked, and I was never sent an email notification of Phyllis's last comment -- I think it was during the period when the notification configs were in flux.

I agree that if xz increases the server load significantly, it doesn't make sense to support it on your end; it's a fairly minor gain over bzip2. Supporting tgz and tbz2 sounds like a perfectly fine solution.

However, I just attempted to load
https://git.cinelerra-gg.org/git/?p=goodguy/cinelerra.git;a=snapshot;h=230f4fdce9d4b62d8780702bb406a1f5f629a3c3;sf=tbz2

and still received a "403 - Unsupported snapshot format" error. Is there something I missed?

PhyllisSmith

2018-12-24 23:55

manager   ~0000371

We thought the request was specifically for .tar.xz only, and that the "?" after .tar.bz2 was rhetorical, as in the summary "Support .tar.xz (and .tar.bz2?) format snapshots" and the description "Since .xz offers superior compression ... it'd be nice to have it available as a supported snapshot format."
Then gg did the timing tests and saw that bz2 took roughly 5x more server CPU time for an archive only 3 MB smaller:
     gz  =  7.5 sec / 134MB
     bz2 = 36.5 sec / 131MB
I am not sure that I could convince the server gurus, including him, to change this. But they are flexible if necessary.

ferdnyc

2018-12-25 06:47

reporter   ~0000372

My apologies, the original request was imprecisely worded. It assumed tbz2 would be redundant, since having xz enabled would make it silly to use bzip2. (And then I read too much into "Goodguy modified /etc/gitweb.conf ...Then tested zip and tbz2 which worked.")

bzip2 does consume more resources than gzip, agreed, and xz even more, all things being equal. But that last qualifier is important. While it's true bzip2 is always slower than gzip, xz compression can often be far faster than bzip2 when it is run multi-threaded (and unxz can be nearly as fast as gunzip).

On my 4-core system, in a tmpfs-mounted ramdisk:
$ time gzip 0a6f0e7.tar : 6.542 total seconds
$ time bzip2 0a6f0e7.tar : 33.576 total seconds
$ time xz 0a6f0e7.tar : 63.470 total seconds
$ time xz -T0 0a6f0e7.tar : 19.990 total seconds

From the downloader's perspective:
$ time gunzip 0a6f0e7.tar.gz : 1.181 total seconds
$ time bunzip2 0a6f0e7.tar.bz2 : 9.895 total seconds
$ time unxz 0a6f0e7.tar.xz : 3.173 total seconds

Now, if the server performing the compression doesn't have multiple cores available, or those resources are being used to serve other requests, then the multi-threaded compression features of xz would of course be irrelevant on the server side.
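
For anyone who wants to repeat this kind of comparison without the command-line tools, here is a minimal Python sketch using the standard-library gzip, bz2 and lzma modules. Note the assumptions: the stdlib lzma module compresses single-threaded, so this does not reproduce the xz -T0 result, and the sample data is a made-up stand-in for a real tar file, so the numbers will not match the ones above:

```python
import bz2
import gzip
import lzma
import time


def compare(data):
    """Return {name: (compressed_size, compress_seconds, decompress_seconds)}."""
    codecs = {
        "gzip": (gzip.compress, gzip.decompress),
        "bzip2": (bz2.compress, bz2.decompress),
        "xz": (lzma.compress, lzma.decompress),  # default container is .xz
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        unpacked = decompress(packed)
        t2 = time.perf_counter()
        assert unpacked == data  # round-trip sanity check
        results[name] = (len(packed), t1 - t0, t2 - t1)
    return results


if __name__ == "__main__":
    # Stand-in payload; replace with the bytes of a real snapshot tar file.
    sample = b"cinelerra snapshot test data\n" * 50_000
    for name, (size, c_s, d_s) in compare(sample).items():
        print(f"{name:6s} {size:>9d} bytes  compress {c_s:.3f}s  decompress {d_s:.3f}s")
```

Timings depend heavily on the input and the machine, so treat any output as indicative only.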

I have no objections to sticking with straight gzip, if that's the decision. I just wanted to explain the thinking behind the request (and why it wasn't _entirely_ a request to waste resources for no discernible benefit... only mostly). Happy holidays!

PhyllisSmith

2018-12-30 06:01

manager   ~0000440

Ferdnyc: appreciate your last note with explanations. GG prefers to leave it as it is for now.

PhyllisSmith

2018-12-30 16:59

manager   ~0000450

An Arch repo builder provided gg with a good reason why this may be useful after all. You can now use tar.gz, tar.bz2, and zip, with the caveat that if this causes problems, it may be reversed.
BTW: here tar.gz took about 5 seconds; tar.bz2 took 35 seconds to create, and zip only took about 3 or 4 seconds.

Issue History

Date Modified Username Field Change
2018-12-07 21:41 ferdnyc New Issue
2018-12-07 21:58 ferdnyc Note Added: 0000083
2018-12-08 00:51 PhyllisSmith Assigned To => PhyllisSmith
2018-12-08 00:51 PhyllisSmith Status new => assigned
2018-12-11 16:38 Sam Note Added: 0000138
2018-12-11 16:38 Sam Assigned To PhyllisSmith => Sam
2018-12-12 01:21 PhyllisSmith Status assigned => acknowledged
2018-12-12 01:21 PhyllisSmith Note Added: 0000157
2018-12-24 11:46 Sam Status acknowledged => closed
2018-12-24 11:46 Sam Resolution open => fixed
2018-12-24 11:46 Sam Note Added: 0000364
2018-12-24 22:21 ferdnyc Status closed => feedback
2018-12-24 22:21 ferdnyc Resolution fixed => reopened
2018-12-24 22:21 ferdnyc Note Added: 0000370
2018-12-24 23:55 PhyllisSmith Note Added: 0000371
2018-12-25 06:47 ferdnyc Note Added: 0000372
2018-12-25 06:47 ferdnyc Status feedback => assigned
2018-12-30 06:01 PhyllisSmith Status assigned => closed
2018-12-30 06:01 PhyllisSmith Note Added: 0000440
2018-12-30 16:59 PhyllisSmith Assigned To Sam => goodguy
2018-12-30 16:59 PhyllisSmith Status closed => resolved
2018-12-30 16:59 PhyllisSmith Note Added: 0000450
2019-01-01 15:42 Sam Status resolved => closed