Following on from last weeks post, I forgot to mention building on OpenBSD/sparc64 via a LDOM running on a Sun T5210, this was even more painful than the Solaris counterpart and took the best part of a month, some of this delay was initially caused by problematic packages which held up the build, not parallelising the builds and again issues with FTP mirrors.
devel/electric-fence
was another of packages which was responsible for holding up the build that I didn’t mention in the previous post. During the build it runs a binary called eftest
and that’s it, it’s stuck there until killed.
The LDOM I was running in was allocated 4 vCPUs but the build was running as a single threaded build. Defining MAKE_JOBS=4
in pkg/etc/mk.conf
and recompressing the bootstrap kit (bootstrap.tar.gz
) helped this situation. To work around FTP issues, bulkbuilds were switched to HTTP only thanks to a pointer from Joerg Sonnenberger. As defined in pkgsrc/mk/defaults/mk.conf
#MASTER_SORT_REGEX= ftp://.*/
# Same as MASTER_SORT, but takes a regular expression for more
# flexibility in matching. Regexps defined here have higher priority
# than MASTER_SORT. This example would prefer ftp transfers over
# anything else.
# Possible: Regexps as in awk(1)
# Default: none
Setting MASTER_SORT_REGEX= http://.*/
in pkg/etc/mk.conf
and recompressing the bootstrap kit ensured builds use HTTP from there on.
The bulkbuild report showed lots of fallout from packages which hadn’t been updated yet to support LibreSSL e.g. net/wget
expects DES support.
Rodent@ fixed lang/python27
with the changes due for subsequent releases of Python.
Bernard Spil fixed Heimdal which failed due to the lack of RAND_EGD in LibreSSL, these fixes will be in the next release of Heimdal (1.6.0?), back porting the changes to 1.5.3 which is the current release available resolved the issue with lack of RAND_EGD but then failed at building a kerberised telnet due to changes in the OpenBSD IPv6 stack which removed functionality telnet was expecting to be there. There is no fix for the issue in the OpenBSD ports as Heimdal is set to build without legacy and insecure protocols such as telnet and rsh.
Due to the connectivity issues on the OpenCSW build cluster, I erased the error report and restarted the bulkbuild on Solaris 10 SPARC and 11 x86 to re-attempt everything that had failed during the previous run for whichever reason. The Solaris 10 SPARC bulkbuild has now finished with a total of 7389 packages built, previously 5701. I discovered a particularly nasty bug with lang/gcc3-c++
which cost 3 days as the configure stage ran over and over again before being killed manually.
My access to the AIX LPAR expired, taking with it what I had previously tried, I requested access again but this time also requested access to a LPAR running SUSE 12 on Power8 as well.
Still no further with the AIX LPAR but managed to getting bulkbuild going on the SUSE 12 one with a little bit of assistance.
The first thing which needed to be done was to specify the ABI and the suffix applied to the library search path. This is because the system is 64 bit without any 32bit libraries installed and by default pkgsrc opts for 32bit unless set otherwise. When attempting to bootstrap initially, it failed with ERROR: bin/digest: missing library: libc.so.6
. I initially set about the wrong path of trying to locate the glibc-32bit rpm for SUSE on Power before realising what was actually required. This may have been a knee-jerk reaction from the past before the days of yum
and such on Linux. With the necessary change to pkgsrc/mk/platform/Linux.mk
the bulkbuild environment setup continued before hanging on the installation of pkgtools/pkg_install
. pkg_add
would hang and CPU utilisation would spike to 100%.
A backtrace of the running process in gdb revealed it was stuck on mpool_get()
.
(gdb) bt
#0 0x0000000010096650 in mpool_get ()
#1 0x0000000010093658 in __bt_search ()
#2 0x000000001009318c in __bt_put ()
#3 0x000000001000b614 in pkgdb_store ()
#4 0x000000001000430c in extract_files ()
#5 0x0000000010006fd0 in pkg_do ()
#6 0x00000000100075a4 in pkg_perform ()
#7 0x0000000010005650 in main ()
Turns out the issue also affects pkgsrc on Linux/ARM and was previously reported in a bug report from 2013 with a workaround. Setting the GCC optimisation level to 0 for pkgtools/libnbcompat
and pkgtools/pkg_install
allowed mk/pbulk/pbulk.sh
to setup a buklbuild environment and a bulkbuild is currently in progress. The bulkbuild was initially aborted to added some critical missing components which caused major breakage.
zypper install libxshmfence-devel gettext-tools gcc-c++
.
With Suse Linux on Power8, that bumps my operating system count to 9 across 5 architectures. Just need to get AIX going to round off the OS count. 🙂