A week of pkgsrc #10

Following on from last week’s post, I forgot to mention building on OpenBSD/sparc64 via an LDOM running on a Sun T5210. This was even more painful than the Solaris counterpart and took the best part of a month. Some of the delay was caused by problematic packages which held up the build, by not parallelising the builds and, again, by issues with FTP mirrors.
devel/electric-fence was another of the packages responsible for holding up the build that I didn’t mention in the previous post. During the build it runs a binary called eftest and that’s it; it’s stuck there until killed.
The LDOM I was running in was allocated 4 vCPUs but the build was running single-threaded. Defining MAKE_JOBS=4 in pkg/etc/mk.conf and recompressing the bootstrap kit (bootstrap.tar.gz) helped this situation. To work around the FTP issues, bulkbuilds were switched to HTTP only, thanks to a pointer from Joerg Sonnenberger. As defined in pkgsrc/mk/defaults/mk.conf:
#MASTER_SORT_REGEX= ftp://.*/
# Same as MASTER_SORT, but takes a regular expression for more
# flexibility in matching. Regexps defined here have higher priority
# than MASTER_SORT. This example would prefer ftp transfers over
# anything else.
# Possible: Regexps as in awk(1)
# Default: none

Setting MASTER_SORT_REGEX= http://.*/ in pkg/etc/mk.conf and recompressing the bootstrap kit ensured builds use HTTP from there on.
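
For reference, the two mk.conf changes described above amount to something like the following. This is a minimal sketch: it appends to a scratch file (the MK_CONF path is a placeholder for this example, not the real pkg/etc/mk.conf), and the step of recompressing bootstrap.tar.gz is left out because it depends on how the bulkbuild environment is laid out.

```sh
#!/bin/sh
# Sketch only: write the two settings to a scratch file rather than the
# live pkg/etc/mk.conf. MK_CONF is a placeholder path chosen for this example.
MK_CONF="${MK_CONF:-./mk.conf.example}"

cat >> "$MK_CONF" <<'EOF'
# Build with four parallel make jobs (the LDOM had 4 vCPUs).
MAKE_JOBS=		4
# Prefer HTTP master sites over anything else when fetching distfiles.
MASTER_SORT_REGEX=	http://.*/
EOF
```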

The bulkbuild report showed lots of fallout from packages which hadn’t been updated yet to support LibreSSL, e.g. net/wget expects DES support.
Rodent@ fixed lang/python27 with the changes due for subsequent releases of Python.
Bernard Spil fixed Heimdal, which failed due to the lack of RAND_EGD in LibreSSL; these fixes will be in the next release of Heimdal (1.6.0?). Backporting the changes to 1.5.3, the current release, resolved the RAND_EGD issue, but the build then failed on the kerberised telnet due to changes in the OpenBSD IPv6 stack which removed functionality telnet was expecting to be there. There is no fix for this in the OpenBSD ports, as Heimdal is set to build there without legacy and insecure protocols such as telnet and rsh.

Due to the connectivity issues on the OpenCSW build cluster, I erased the error report and restarted the bulkbuild on Solaris 10 SPARC and 11 x86 to re-attempt everything that had failed during the previous run, for whatever reason. The Solaris 10 SPARC bulkbuild has now finished with a total of 7389 packages built, up from 5701 previously. I discovered a particularly nasty bug with lang/gcc3-c++ which cost 3 days, as the configure stage ran over and over again before being killed manually.

My access to the AIX LPAR expired, taking with it what I had previously tried. I requested access again, but this time also requested access to an LPAR running SUSE 12 on Power8.
I’m still no further with the AIX LPAR but managed to get a bulkbuild going on the SUSE 12 one with a little bit of assistance.
The first thing which needed to be done was to specify the ABI and the suffix applied to the library search path. This is because the system is 64bit without any 32bit libraries installed, and by default pkgsrc opts for 32bit unless set otherwise. When attempting to bootstrap initially, it failed with ERROR: bin/digest: missing library: libc.so.6. I initially went down the wrong path of trying to locate the glibc-32bit rpm for SUSE on Power before realising what was actually required. This may have been a knee-jerk reaction from the past, before the days of yum and such on Linux. With the necessary change to pkgsrc/mk/platform/Linux.mk, the bulkbuild environment setup continued before hanging on the installation of pkgtools/pkg_install: pkg_add would hang and CPU utilisation would spike to 100%.

A backtrace of the running process in gdb revealed it was stuck on mpool_get().
(gdb) bt
#0 0x0000000010096650 in mpool_get ()
#1 0x0000000010093658 in __bt_search ()
#2 0x000000001009318c in __bt_put ()
#3 0x000000001000b614 in pkgdb_store ()
#4 0x000000001000430c in extract_files ()
#5 0x0000000010006fd0 in pkg_do ()
#6 0x00000000100075a4 in pkg_perform ()
#7 0x0000000010005650 in main ()
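
A backtrace like the one above can also be captured from a hung process without an interactive session. The sketch below assumes gdb is installed and that you are permitted to attach to the process; the helper name is made up for this example.

```sh
#!/bin/sh
# backtrace_of PID: attach gdb to a running process, print its backtrace
# and detach again. Hypothetical helper name, for illustration only.
backtrace_of() {
    pid="$1"
    if [ -z "$pid" ]; then
        echo "usage: backtrace_of PID" >&2
        return 2
    fi
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found in PATH" >&2
        return 127
    fi
    # -batch exits after running the -ex commands; -p attaches to the PID.
    gdb -batch -ex bt -p "$pid"
}

# Example (12345 stands in for the PID of the hung pkg_add):
#   backtrace_of 12345
```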

Turns out the issue also affects pkgsrc on Linux/ARM and was previously reported in a bug report from 2013 with a workaround. Setting the GCC optimisation level to 0 for pkgtools/libnbcompat and pkgtools/pkg_install allowed mk/pbulk/pbulk.sh to set up a bulkbuild environment, and a bulkbuild is currently in progress. The bulkbuild was initially aborted to add some critical missing components whose absence caused major breakage.

zypper install libxshmfence-devel gettext-tools gcc-c++

With SUSE Linux on Power8, that bumps my operating system count to 9 across 5 architectures. Just need to get AIX going to round off the OS count. 🙂

A week of pkgsrc #9

The past few weeks have been pretty hectic. As BSDCan gets closer and closer, I’m thinking about my talk and testing more and more in pkgsrc. Rodent@ added support for Bitrig to pkgsrc-current last month; his patches highlighted an issue with the autoconf scripts (which should be shared across core components) not being pulled in automatically. Joerg Sonnenberger resolved this issue and I regenerated the patch set. With the system bootstrapped, the next thing that was broken was Perl; applying the changes needed for OpenBSD resolved the remaining issues and the bulk build environment was ready. After three days, the first bulkbuild attempt on Bitrig was complete and a report was published. There is now a bulkbuild in progress with devel/gettext-tools and archivers/unzip fixed, which should free up over 8400 packages to be attempted.
For Solaris, my first bulkbuild on Solaris 10 completed after 22 days. In mid-April I also started bulkbuilds on Solaris 11 (x86 and SPARC) using the SunStudio compilers (it’s not possible to use GCC at the moment due to the removal of previously deprecated functionality). The Solaris 11 SPARC bulkbuild is still in progress and the x86 bulkbuild is running. Unfortunately the build cluster had some connectivity issues and needed rebooting during the bulkbuild, but not before lots of packages had failed to fetch distfiles, hence the figures look a lot worse than they could be. Solaris 10 SPARC report, Solaris 11 x86 report.

Through bulk building on multiple operating systems, another issue that’s surfaced is problematic packages that hold the build up. On Bitrig mail/fml4 is an issue; on OpenBSD it’s www/wml; on Solaris there are FTP mirror issues for Ruby extensions, and on OmniOS FTP mirror issues for Xorg. Things need regular kicking, and a brief glance into pkgsrc/mk didn’t reveal any knobs which would allow HTTP to be preferred for fetching distfiles. On Bitrig & OpenBSD I’ve excluded these packages from being attempted via a NOT_FOR_PLATFORM statement in their Makefile until I have a look into the issue.
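
NOT_FOR_PLATFORM takes glob-style patterns of the form OPSYS-version-arch, so an exclusion for every Bitrig release and architecture would be written as a pattern like Bitrig-*-*. pkgsrc does this matching inside make; the shell function below is only an analogy to show how such a pattern behaves against a platform triple. The function name and the example version numbers are made up.

```sh
#!/bin/sh
# platform_matches TRIPLE PATTERN: succeed if the OPSYS-version-arch triple
# matches the glob pattern. Illustrative helper, not part of pkgsrc.
platform_matches() {
    case "$1" in
        $2) return 0 ;;   # unquoted on purpose so $2 is treated as a glob
        *)  return 1 ;;
    esac
}

platform_matches "OpenBSD-5.6-sparc64" "OpenBSD-*-*" && echo "excluded"
platform_matches "FreeBSD-10.1-amd64" "OpenBSD-*-*" || echo "still built"
```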

sevan.mit.edu completed another bulkbuild. pkgsrc-current now ships with MesaLib 10.5.3 as graphics/MesaLib, and version 7 has been re-imported as graphics/MesaLib7 by tnn@. The new MesaLib needed a patch to build successfully on FreeBSD, similar to the one for NetBSD, due to ERESTART not being defined. At present it’s still broken on Tiger as I’ve not looked into it yet.

I revisited AIX again to test out pkgsrc once again, this has turned into a massive yak shaving session. I’ve yet to run a bulkbuild successfully as the scan stage ends with a coredump.
I originally started off using the stock system shell; bootstrap completed successfully but the scan stage of a bulkbuild would just stop without anything being logged. Manually changing the shell used to shells/pdksh in pkg/etc/mk.conf and pbulk/etc/mk.conf resulted in the following error message:
bmake: don't know how to make pbulk-index. Stop
pbulk-scan: realloc failed:

This turned out to be a lack of RAM. My shell account was on an AIX 7.1 LPAR running on a Power8 host with 2 CPUs and 2GB of RAM committed. Unfortunately the OS image IBM provided came with Tivoli support enabled and a bug in the resource management controller, which meant RMC was consuming way more resources than it needed to. I was running with less than 128MB of RAM free.
Stopping Tivoli & RMC freed up about 500MB of RAM, but attempting to bulkbuild again caused the process to fail once more at the same stage. With a heads up from David Brownlee & Joerg Sonnenberger, I bumped the memory and data area resource limits to 256MB.
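
Raising the limits might look like the following in a POSIX-style shell. The exact commands used are not recorded in this post, so treat this as a sketch: ulimit -d (data segment) and -m (resident set size) take values in kilobytes in most shells, and whether -m is honoured varies between systems. The helper names are made up.

```sh
#!/bin/sh
# mb_to_kb MB: convert megabytes to the kilobyte units ulimit expects.
mb_to_kb() {
    echo $(( $1 * 1024 ))
}

# bump_limits MB: set the soft data and memory limits for this shell and
# its children. Hypothetical helper; it only warns if a limit cannot be set.
bump_limits() {
    kb=$(mb_to_kb "$1")
    ulimit -S -d "$kb" || echo "could not set data limit" >&2
    ulimit -S -m "$kb" || echo "could not set memory limit" >&2
}

# Example: bump_limits 256   # 256MB expressed in KB
```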
This allowed the scan to finish with a segfault.
/usr/pkgsrc/pbulk/libexec/pbulk/scan[54]: 11272416 Segmentation fault(coredump).
pscan.stderr logged multiple instances of
bmake: don't know how to make pbulk-index. Stop.
The segfault generated a coredump but it turned out that dbx, the debugger in AIX, was not installed. IBMPDP on Twitter helped by pointing to the path where some components are available for installation; unfortunately, while the dbx package was available there, some of its dependencies were not. While waiting on IBMPDP to get back to me, I fetched a new pkgsrc-current snapshot (I couldn’t update via CVS because it wouldn’t build) and set up my pbulk environment again via mk/pbulk/pbulk.sh.
I should mention that when I initially set up, I’d explicitly set CC=/usr/bin/gcc; then, while trying to get various things to build subsequently, I’d symlinked /usr/bin/cc to /usr/bin/gcc. When I came to set things up with the new snapshot, I did not pass CC=/usr/bin/gcc and found that I was unable to link Perl. I’m not sure whether the Perl build files assume that if /usr/bin/cc exists on AIX it must be XLC, or whether ld(1) behaves differently, but I had to remove the symlink.
Once everything was set up, the bulkbuild failed again at the same place, except this time I had a different message logged.
/bin/sh: There is no process to read data written to a pipe..
I edited the bootstrap/bootstrap script & devel/bmake/Makefile to set shells/pdksh as a dependency & reran the bulkbuild.
The scan stage again completed with a coredump, and this time pscan.stderr just contained Memory fault (core dumped).
I’ve committed these changes so pkgsrc-current now defaults to using shells/pdksh as its shell, but I have not been able to try anything else as the system is inaccessible this weekend due to maintenance.

At present, I’m attempting to bulkbuild pkgsrc-current on 8 operating systems:
OpenBSD (5.6-RELEASE & -current), FreeBSD, Bitrig (current), Mac OS X (Tiger), Solaris (10 & 11) and OmniOS, on 4 architectures (i386, AMD64, SPARC, PowerPC).
If I could get AIX going, that would bump the OS & arch count up by 1. Maybe by the next post. 🙂

Thanks to Patrick Wildt for access to a host running Bitrig and Rodent@ for adding support to pkgsrc.

A week of pkgsrc #4

AnyConnect login banner

Shortly after the last blog post I got access to a couple of AIX LPARs. This would be my first time on an IBM PowerPC system and on AIX. I’d applied for two AIX 7.1 instances, one defined as “AIX 7.1 Porting Image” and the other as plain “AIX 7.1”. At a glance, the difference seemed to be that the porting image had more GNU / common open source tools, e.g. GNU tar, though both images had a version of GCC installed.

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/bin/../libexec/gcc/powerpc-ibm-aix7.1.0.0/4.6.0/lto-wrapper
Target: powerpc-ibm-aix7.1.0.0
Configured with: ./configure --disable-multilib --with-cpu=powerpc --enable-debug=no --with-debug=no --without-gnu-ld --with-ld=/usr/bin/ld --with-pic --enable-threads=aix --with-mpfr=/opt/freeware/lib --with-gmp=/opt/freeware/lib --with-system-zlib=/opt/freeware/lib --with-mpc=/opt/freeware/lib --with-mpicc=mpcc_r --with-libiconv-prefix=/usr --disable-nls --prefix=/software/gnu_c/bin --enable-languages=c,c++
Thread model: aix
gcc version 4.6.0 (GCC)

The stock version came with GCC 4.2 built on AIX 6.1 whereas the porting image came with GCC 4.6.
Alongside the open source tools, each instance also had proprietary tools installed, including IBM’s compiler, XLC. Running cc without any options brings up a man page which describes the different commands, each representing a language at a given level.

c99 – Invokes the compiler for C source files, with a default language level of STDC99 and specifies compiler option -qansialias (to allow type-based aliasing). Use this invocation for strict conformance to the ISO/IEC 9899:1999 standard.

The pkgsrc bootstrap process didn’t work too well when left to work things out for itself via cc, so I opted to use GCC specifically.

export CC=gcc

pkgsrc happily bootstrapped without privilege and I proceeded to install misc/tmux and shells/pdksh on AIX.
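
Put together, an unprivileged bootstrap with GCC forced as the compiler looks roughly like this. The --unprivileged flag is the bootstrap script's option for running as a normal user; the checkout path passed in and the wrapper function itself are assumptions made for this sketch.

```sh
#!/bin/sh
# bootstrap_unprivileged PKGSRCDIR: run the pkgsrc bootstrap as a normal
# user with gcc as the compiler. Hypothetical wrapper for illustration.
bootstrap_unprivileged() {
    dir="$1"
    if [ ! -x "$dir/bootstrap/bootstrap" ]; then
        echo "no bootstrap script under $dir" >&2
        return 1
    fi
    # Subshell so the cd and the CC setting do not leak into the caller.
    ( cd "$dir/bootstrap" && CC=gcc ./bootstrap --unprivileged )
}

# Example: bootstrap_unprivileged "$HOME/pkgsrc"
```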

pkgsrc pkg_info on AIX

security/openssl comes with 4 different configuration settings for AIX: a pair each for the XLC and GCC compilers, with a 32bit or 64bit target. It turned out that pkgsrc just defaulted to aix-cc (XLC with a 32bit target). pkg/49131 is now committed so the correct configuration is used. XLC successfully builds OpenSSL with a 32bit or 64bit ABI but GCC is only able to manage a 32bit target.

To switch compiler to xlc, declare it as the value to PKGSRC_COMPILER in your mk.conf.
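
As a minimal sketch, the setting is a single line in mk.conf. The snippet appends it to a scratch file; MK_CONF is a placeholder path for this example rather than the real location of your mk.conf.

```sh
#!/bin/sh
# Sketch: append the compiler selection to a scratch mk.conf.
# MK_CONF is a placeholder path chosen for this example.
MK_CONF="${MK_CONF:-./mk.conf.example}"

cat >> "$MK_CONF" <<'EOF'
# Build packages with IBM's XL C compiler instead of GCC.
PKGSRC_COMPILER=	xlc
EOF
```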

Over the week I attempted to compile components of GCC 4.8 without much success, starting off with lang/gcc48-cc++ & falling back to lang/gcc48-libs.
The build process was very unstable; again, as with Tiger/PowerPC, the build would spin off and hang, pegging the CPU until killed. Attempting to restrict the processor time via ulimit didn’t have much effect.
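
For reference, restricting CPU time for a single command can be done in a subshell so the limit does not stick to the login shell. This is a sketch of the general technique rather than the exact invocation used here, and the helper name is made up. Note that the limit applies per process, so a build that keeps spawning new processes is not capped as a whole.

```sh
#!/bin/sh
# run_with_cpu_limit SECONDS CMD...: run CMD with a CPU-time limit.
# When the limit is exceeded the kernel sends SIGXCPU to the process.
run_with_cpu_limit() {
    secs="$1"
    shift
    # The subshell keeps the limit from applying to the calling shell.
    ( ulimit -t "$secs" && "$@" )
}

run_with_cpu_limit 5 echo "finished within the limit"
```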

Alongside trying to get GCC built on AIX, I kicked off building meta-pkgs/bulk-medium on sevan.mit.edu, the previously reported unfixed components prevented some of the packages from building again (ruby, MesaLib, cmake).

I began looking into fixing devel/cmake so that it would link against the correct version of the curl libraries & use the matching header files. Modules/FindCURL.cmake in the cmake source references 4 variables which provide some control, but I was unsuccessful in passing these to the pkgsrc make process. While trying to resolve this issue I also discovered that on more recent versions of Mac OS the dependencies from pkgsrc were ignored, with the build opting for the Apple supplied versions even though the pkgsrc versions were installed.

-- Found ZLIB: /usr/lib/libz.dylib (found version "1.2.5")
-- Found CURL: /usr/lib/libcurl.dylib (found version "7.30.0")
-- Found BZip2: /usr/lib/libbz2.dylib (found version "1.0.6")
-- Looking for BZ2_bzCompressInit in /usr/lib/libbz2.dylib
-- Looking for BZ2_bzCompressInit in /usr/lib/libbz2.dylib - found
-- Found LibArchive: /usr/lib/libarchive.dylib (found version "2.8.4")
-- Found EXPAT: /usr/lib/libexpat.dylib (found version "2.0.1")
-- Looking for wsyncup in /usr/lib/libcurses.dylib
-- Looking for wsyncup in /usr/lib/libcurses.dylib - found
-- Looking for cbreak in /usr/lib/libncurses.dylib
-- Looking for cbreak in /usr/lib/libncurses.dylib - found

mail/mailman had a missing README in PLIST which was handled differently between Tiger & newer releases. pkg/49143 was committed to fix that.