guix: gnuboot-trisquel-preseed.img: Make it reproducible.

The "4.7 Mcopy" section inside the mtools info manual explains that
mcopy's '-m' argument "Preserve the file modification time.".

So in the commit 9cc02ddde1 ("packages:
roms: Start adding automatic tests."), I vaguely recall having used it
to workaround some reproducibility issues.

Guix 1.4.0 uses mtools 4.0.42. So after retrieving the source with
'guix time-machine --commit=v1.4.0 -- build --system=i686-linux
--source mtools' we have that in the writeit function in mcopy.c (with
arg->preserveTime being set by -m):
	/* preserve mod time? */
	if (arg->preserveTime)
		now = date;
	else
		getTimeNow(&now);

And date is set by the following in mtools 4.0.42:
	if (Source->Class->get_data(Source, &date, &filesize,
				    &type, 0) < 0 ){
		fprintf(stderr, "Can't stat source file\n");
		return -1;
	}

Since Guix is supposed to make images reproducible somehow, and that
mtools isn't patched by Guix to do that, and that it takes the time
from the source file, I used '-m'.

Since I was confident enough that gnuboot-trisquel-preseed.img was
reproducible, in the commit 9cc02ddde1
("packages: roms: Start adding automatic tests."), I also added the
checksum and checked it at build time to make sure the image is really
reproducible.

But when building this image again few days ago the checksum was
different. So I used the Guix diffoscope package to investigate the
issue.

Note that at the time of writing, you either need to use Guix's
diffoscope or to disable guestfs support in diffoscope for it to work,
otherwise diffoscope 277-1 (the version in the Parabola at the time of
writing) produce a python error probably because the partition table
size is 0, and it contains a FAT12 filesystem according to fdisk, but
then the FAT12 filesystem contained within also contains that
partition table. See the upstream bugreport at
https://salsa.debian.org/reproducible-builds/diffoscope/-/issues/390
for more details.

Here the preseed.img.old file corresponds to the checksum in the
commit 9cc02ddde1 ("packages: roms:
Start adding automatic tests."), and preseed.img.new to the one I got
by building again few days ago:
    $ sha512sum preseed.img.old preseed.img.new
    f12a4a941afc9e24288481ed1b44fbfedf52d706e9e8aa01cfb26bf5ccd54ca52afe9ef5497faf2966ba730c1200d8b8691ebb87e6a75cd8966e0edd49bcb3c0  preseed.img.old
    5613d9a5cdd8847d5a688d56c77b8cf8881baa5eef7f373bb05a5ec601e383204e6a57b399d3de913c29386b18e7e3903c9511037922204744e3234cadc8671b  preseed.img.new

And by using diffoscope we have:
    $ diffoscope preseed.img.old preseed.img.new
    --- preseed.img.old
    +++ preseed.img.new
    │┄ Format-specific differences are supported for ext2/ext3/ext4/btrfs/fat filesystems but no file-specific differences were detected; falling back to a binary diff. file(1) reports: DOS/MBR boot sector, code offset 0x3c+2, OEM-ID "mkfs.fat", sectors/cluster 4, root entries 512, sectors 2048 (volumes <=32 MB), Media descriptor 0xf8, sectors/FAT 2, sectors/track 16, serial number 0x1234abcd, label: "MEDIA      ", FAT (12 bit)
    │┄ Installing the 'guestfs' Python module may produce a better output.
    @@ -157,23 +157,23 @@
     000009c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     000009d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     000009e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     000009f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     00000a00: 4d45 4449 4120 2020 2020 2008 0000 5a4b  MEDIA      ...ZK
     00000a10: 6e46 6e46 0000 5a4b 6e46 0000 0000 0000  nFnF..ZKnF......
     00000a20: 5052 4553 4545 4420 4346 4720 1800 0000  PRESEED CFG ....
    -00000a30: 21ec 21ec 0000 0000 21ec 0200 f50d 0000  !.!.....!.......
    +00000a30: 21ec 2859 0000 0000 21ec 0200 f50d 0000  !.(Y....!.......
     00000a40: 4365 0000 00ff ffff ffff ff0f 0000 ffff  Ce..............
     00000a50: ffff ffff ffff ffff ffff 0000 ffff ffff  ................
     00000a60: 0272 002d 0062 006f 006f 000f 0000 7400  .r.-.b.o.o....t.
     00000a70: 2e00 7300 6500 7200 7600 0000 6900 6300  ..s.e.r.v...i.c.
     00000a80: 0173 0068 0075 0074 0064 000f 0000 6f00  .s.h.u.t.d....o.
     00000a90: 7700 6e00 2d00 6100 6600 0000 7400 6500  w.n.-.a.f...t.e.
     00000aa0: 5348 5554 444f 7e31 5345 5220 0000 0000  SHUTDO~1SER ....
    -00000ab0: 21ec 21ec 0000 0000 21ec 0400 3002 0000  !.!.....!...0...
    +00000ab0: 21ec 2859 0000 0000 21ec 0400 3002 0000  !.(Y....!...0...
     00000ac0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     00000ad0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     00000ae0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     00000af0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     00000b00: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     00000b10: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     00000b20: 0000 0000 0000 0000 0000 0000 0000 0000  ................

Here it really look like a timestamp, and since mdir gave no
difference between the 2 files inside the 2 images, I patched mdir
with the following patch:
@@ -438,6 +438,18 @@ static int list_file(direntry_t *entry, MainParam_t *mp UNUSEDP)
                if(*mdir_longname)
                        printf(" %s", mdir_longname);
                printf("\n");
+
+               printf("-> ctime_ms: 0x%hhx\n", entry->dir.ctime_ms);
+               printf("-> ctime[0]: 0x%hhx\n", entry->dir.ctime[0]);
+               printf("-> ctime[1]: 0x%hhx\n", entry->dir.ctime[1]);
+               printf("-> cdate[0]: 0x%hhx\n", entry->dir.cdate[0]);
+               printf("-> cdate[1]: 0x%hhx\n", entry->dir.cdate[1]);
+               printf("-> adate[0]: 0x%hhx\n", entry->dir.adate[0]);
+               printf("-> adate[1]: 0x%hhx\n", entry->dir.adate[1]);
+               printf("-> time[0]: 0x%hhx\n", entry->dir.time[0]);
+               printf("-> time[1]: 0x%hhx\n", entry->dir.time[1]);
+               printf("-> date[0]: 0x%hhx\n", entry->dir.date[0]);
+               printf("-> date[1]: 0x%hhx\n", entry->dir.date[1]);
        } else {
                char tmp[4*MAX_VNAMELEN+1];

And this then gives  the following diff:
 -> ctime[1]: 0x0
 -> cdate[0]: 0x21
 -> cdate[1]: 0xec
--> adate[0]: 0x21
--> adate[1]: 0xec
+-> adate[0]: 0x28
+-> adate[1]: 0x59
 -> time[0]: 0x0
 -> time[1]: 0x0
 -> date[0]: 0x21
@@ -20,8 +20,8 @@
 -> ctime[1]: 0x0
 -> cdate[0]: 0x21
 -> cdate[1]: 0xec
--> adate[0]: 0x21
--> adate[1]: 0xec
+-> adate[0]: 0x28
+-> adate[1]: 0x59
 -> time[0]: 0x0
 -> time[1]: 0x0
 -> date[0]: 0x21

This means that the access date difers. This also explains why it was
not spoted during the creation of the commit
9cc02ddde1 ("packages: roms: Start
adding automatic tests.") as tests were done at the same date.

So this time I created a build VM by adding the following service to
my Guix system configuration (I also had to remove hacks I had that
set the kvm group id to the same ID used by Trisquel run 'guix system
reconfigure' and rebooted):
    (service virtual-build-machine-service-type
            (virtual-build-machine
             (cpu "host")
             (cpu-count 2)
             (auto-start? #f)))

This created a VM whose clock is set to 'a few years ago' according to
the Guix manual[1].

[1]https://guix.gnu.org/manual/devel/en/html_node/Virtualization-Services.html#Virtual-Build-Machines

I then ran built the image as usual:
    $ guix time-machine --commit=v1.4.0 -- build -L resources/guix/ \
      gnuboot-trisquel-preseed.img
      --without-tests=gnuboot-trisquel-preseed.img

I then copied the resulting image, started the build VM with 'herd
start build-vm', deleted the old image from the store (with 'guix gc
-D') and then re-built it (it used the VM to offload the build as
shown in the build logs).

And now both resulting files are now the same despite being built on a
different date.

See also the following blog post for more context into use cases for
this build VM[2]:

[2]https://hpc.guix.info/blog/2024/03/adventures-on-the-quest-for-long-term-reproducible-deployment/

Bug: https://savannah.gnu.org/bugs/?66224
Signed-off-by: Denis 'GNUtoo' Carikli <GNUtoo@cyberdimension.org>
Acked-by: Adrien 'neox' Bourmault <neox@gnu.org>
This commit is contained in:
Denis 'GNUtoo' Carikli 2024-09-23 16:42:55 +02:00 committed by Adrien 'neox' Bourmault
parent 40fcb94e2f
commit 4c3de49fbb
Signed by: neox
GPG Key ID: 57BC26A3687116F6
2 changed files with 11 additions and 7 deletions

View File

@ -16,6 +16,7 @@
#:use-module (gnu packages base)
#:use-module (gnu packages bootloaders)
#:use-module (gnu packages cdrom)
#:use-module (gnu packages check)
#:use-module (gnu packages compression)
#:use-module (gnu packages debian)
#:use-module (gnu packages disk)
@ -226,6 +227,7 @@ manually.")
(native-inputs (list
coreutils
dosfstools
libfaketime
mtools))
(arguments
(list
@ -262,12 +264,14 @@ manually.")
"-n" "MEDIA"
"--invariant"
"preseed.img")
(invoke "mcopy"
(invoke "faketime" "@315532800" ; FAT epoch.
"mcopy"
"-i" "preseed.img"
"-m"
"preseed.cfg"
"::/preseed.cfg")
(invoke "mcopy"
(invoke "faketime" "@315532800" ; FAT epoch.
"mcopy"
"-i" "preseed.img"
"-m"
"shutdown-after-boot.service"

View File

@ -1 +1 @@
f12a4a941afc9e24288481ed1b44fbfedf52d706e9e8aa01cfb26bf5ccd54ca52afe9ef5497faf2966ba730c1200d8b8691ebb87e6a75cd8966e0edd49bcb3c0 preseed.img
d3dc630d1b3971a2f859ec2c72e1bc7b7ab59ea7c3932735868c9709567d58f9e7b4bd1af4ccb1e4f2931f974757cccb4ec2804baa446d540904de0cbf567a48 preseed.img