A week of pkgsrc #10

Following on from last weeks post, I forgot to mention building on OpenBSD/sparc64 via a LDOM running on a Sun T5210, this was even more painful than the Solaris counterpart and took the best part of a month, some of this delay was initially caused by problematic packages which held up the build, not parallelising the builds and again issues with FTP mirrors.
devel/electric-fence was another of packages which was responsible for holding up the build that I didn’t mention in the previous post. During the build it runs a binary called eftest and that’s it, it’s stuck there until killed.
The LDOM I was running in was allocated 4 vCPUs but the build was running as a single threaded build. Defining MAKE_JOBS=4 in pkg/etc/mk.conf and recompressing the bootstrap kit (bootstrap.tar.gz) helped this situation. To work around FTP issues, bulkbuilds were switched to HTTP only thanks to a pointer from Joerg Sonnenberger. As defined in pkgsrc/mk/defaults/mk.conf
#MASTER_SORT_REGEX= ftp://.*/
# Same as MASTER_SORT, but takes a regular expression for more
# flexibility in matching. Regexps defined here have higher priority
# than MASTER_SORT. This example would prefer ftp transfers over
# anything else.
# Possible: Regexps as in awk(1)
# Default: none

Setting MASTER_SORT_REGEX= http://.*/ in pkg/etc/mk.conf and recompressing the bootstrap kit ensured builds use HTTP from there on.

The bulkbuild report showed lots of fallout from packages which hadn’t been updated yet to support LibreSSL e.g. net/wget expects DES support.
Rodent@ fixed lang/python27 with the changes due for subsequent releases of Python.
Bernard Spil fixed Heimdal which failed due to the lack of RAND_EGD in LibreSSL, these fixes will be in the next release of Heimdal (1.6.0?), back porting the changes to 1.5.3 which is the current release available resolved the issue with lack of RAND_EGD but then failed at building a kerberised telnet due to changes in the OpenBSD IPv6 stack which removed functionality telnet was expecting to be there. There is no fix for the issue in the OpenBSD ports as Heimdal is set to build without legacy and insecure protocols such as telnet and rsh.

Due to the connectivity issues on the OpenCSW build cluster, I erased the error report and restarted the bulkbuild on Solaris 10 SPARC and 11 x86 to re-attempt everything that had failed during the previous run for whichever reason. The Solaris 10 SPARC bulkbuild has now finished with a total of 7389 packages built, previously 5701. I discovered a particularly nasty bug with lang/gcc3-c++ which cost 3 days as the configure stage ran over and over again before being killed manually.

My access to the AIX LPAR expired, taking with it what I had previously tried, I requested access again but this time also requested access to a LPAR running SUSE 12 on Power8 as well.
Still no further with the AIX LPAR but managed to getting bulkbuild going on the SUSE 12 one with a little bit of assistance.
The first thing which needed to be done was to specify the ABI and the suffix applied to the library search path. This is because the system is 64 bit without any 32bit libraries installed and by default pkgsrc opts for 32bit unless set otherwise. When attempting to bootstrap initially, it failed with ERROR: bin/digest: missing library: libc.so.6. I initially set about the wrong path of trying to locate the glibc-32bit rpm for SUSE on Power before realising what was actually required. This may have been a knee-jerk reaction from the past before the days of yum and such on Linux. With the necessary change to pkgsrc/mk/platform/Linux.mk the bulkbuild environment setup continued before hanging on the installation of pkgtools/pkg_install. pkg_add would hang and CPU utilisation would spike to 100%.

A backtrace of the running process in gdb revealed it was stuck on mpool_get().
(gdb) bt
#0 0x0000000010096650 in mpool_get ()
#1 0x0000000010093658 in __bt_search ()
#2 0x000000001009318c in __bt_put ()
#3 0x000000001000b614 in pkgdb_store ()
#4 0x000000001000430c in extract_files ()
#5 0x0000000010006fd0 in pkg_do ()
#6 0x00000000100075a4 in pkg_perform ()
#7 0x0000000010005650 in main ()

Turns out the issue also affects pkgsrc on Linux/ARM and was previously reported in a bug report from 2013 with a workaround. Setting the GCC optimisation level to 0 for pkgtools/libnbcompat and pkgtools/pkg_install allowed mk/pbulk/pbulk.sh to setup a buklbuild environment and a bulkbuild is currently in progress. The bulkbuild was initially aborted to added some critical missing components which caused major breakage.

zypper install libxshmfence-devel gettext-tools gcc-c++.

With Suse Linux on Power8, that bumps my operating system count to 9 across 5 architectures. Just need to get AIX going to round off the OS count. 🙂

A week of pkgsrc #9

The past few weeks have been pretty hectic, as the time for BSDcan gets shorter and shorter, I’m thinking about my talk and testing more and more in pkgsrc. Rodent@ added support for Bitrig to pkgsrc-current last month, his patches highlighted an issue with the autoconf scripts (which should be shared across core components) not being pulled in automatically. Joerg Sonnenberger resolved this issue and I regenerated the patch set again. With the system bootstrapped the next thing which was broken was Perl, applying the changes needed for OpenBSD resolved any remaining issues and the bulk build environment was ready. After three days, the first bulkbuild attempt on Bitrig was complete and a report was published. There is now a bulkbuild in progress with devel/gettext-tools and archivers/unzip fixed, that should free over 8400 packages to be attempted to be built.
For Solaris, my first bulkbuild on Solaris 10 completed after 22 days. Mid-April I also started off bulkbuilds on Solaris 11 (x86 and SPARC) using the SunStudio compilers (It’s not possible to use GCC at the moment due to removed functionality that was previously deprecated). The Solaris 11 SPARC bulkbuild is still in progress and the x86 bulkbuild is running. Unfortunately the build cluster had some connectivity issues and needed rebooting during the bulkbuild but not until lots of packages had failed to fetch distfiles, hence the figures look a lot worse than they could be. Solaris 10 SPARC report, Solaris 11 x86 report.

Through bulk building on multiple operating systems another issue that’s surfaced is problematic packages that hold the build up. On Bitrig mail/fml4 is an issue, on OpenBSD www/wml, FTP mirror issues for ruby extension on Solaris, Xorg FTP mirror issues on OmniOS. Things need regular kicking, a brief glance into pkgsrc/mk didn’t reveal any knobs which would allow the preference of HTTP for fetching distfiles. On Bitrig & OpenBSD I’ve excluded these packages from being attempted via NOT_FOR_PLATFORM statement in their Makefile until I have a look into the issue.

sevan.mit.edu completed another bulkbuild, pkgsrc-current now ships with MesaLib 10.5.3 as graphics/MesaLib, version 7 has now been re-imported as graphics/MesaLib7 by tnn@, the new MesaLib needed a patch for FreeBSD, similar to NetBSD to build successfully, due to ERESTART not being defined. At present, it’s still broken on Tiger as I’ve not looked into yet.

I revisited AIX again to test out pkgsrc once again, this has turned into a massive yak shaving session. I’ve yet to run a bulkbuild successfully as the scan stage ends with a coredump.
I originally started off with using the stock system shell, bootstrap completed successfully but scan stage of a bulkbuild would just stop without anything being logged. Manually changing the shell used to shells/pdksh in pkg/etc/mk.conf and pbulk/etc/mk.conf resulted in the following error message:
bmake: don't know how to make pbulk-index. Stop
pbulk-scan: realloc failed:

This turned to be a lack of RAM, my shell account was to a AIX 7.1 LPAR running on a Power8 host with 2 CPUs and 2GB of RAM committed, unfortunately the OS image IBM provided came with Tivoli support enabled and a bug in the resource management controller which meant RMC was consuming way more resource than it needed to. I was running with less than 128MB of RAM.
Stopping Tivoli & RMC freed up about 500MB of RAM, attempting to bulkbuild again, caused the process to fail once again at the same stage. With a heads up from David Brownlee & Joerg Sonnenberger, I bumped the memory and data area resource limits to 256MB.
This allowed the scan to finish with a segfault.
/usr/pkgsrc/pbulk/libexec/pbulk/scan[54]: 11272416 Segmentation fault(coredump).
pscan.stderr logged multiple instances of
bmake: don't know how to make pbulk-index. Stop.
The segfault generated a coredump but it turned out that dbx, the debugger in AIX was not installed. IBMPDP on twitter helped by pointing to the path where some components are available for installation, unfortunately, while the dbx package was available there, some of its dependencies were not. Waiting on IBMPDP to get back to me, I fetched a new pkgsrc-current snapshot (I couldn’t update via CVS because it wouldn’t build) and re-setup my pbulk environment via mk/pbulk/pbulk.sh.
I should mention that initially when I setup, I’d explicitly set CC=/usr/bin/gcc last time, then while trying to get various things to build subsequently, I’d symlink /usr/bin/cc to /usr/bin/gcc. When I came to set thing up with the new snapshot, I did not pass CC=/usr/bin/gcc this time round and found that I was unable to link Perl, not sure if this was the Perl build files assuming if on AIX & /usr/bin/cc exists, it’s XLC or if ld(1) takes on different behaviour but I had to remove this symlink.
Once everything was setup, the bulkbuild failed agin at the same place, except this time I had a different message logged.
/bin/sh: There is no process to read data written to a pipe..
I edited the bootstrap/bootstrap script & devel/bmake/Makefile to set shells/pdksh as a dependency & rerun bulkbuild.
The scan stage again completed with a coredump with this time pscan.stderr just contained Memory fault (core dumped).
I’ve committed these changes so pkgsrc-current now defaults to using shells/pdksh as its shell but have not been able to try anything else as this weekend the system is unaccessible due to maintenance.

At present, I’m attempting to bulkbuild pkgsrc-current on 8 Operating systems
OpenBSD (5.6-RELEASE & -current), FreeBSD, Bitrig (current), Mac OS X (Tiger), Solaris (10 & 11), OmniOS on 4 architectures (i386, AMD64, SPARC, PowerPC).
If I could get AIX going that would bump the OS & arch could up by 1. Maybe by the next post perhaps. 🙂

Thanks to Patrick Wildt for access to host running Bitrig and Rodent@ for adding support to pkgsrc.

A week of pkgsrc #8

Or should that be a month of pkgsrc!
Since I wrote my last post I received the good news that I’ll be presenting at BSDCan this year in Ottawa. I’m really stoked about this and looking forward to presenting there. In preparation for the big day, I’ll be speaking about different aspects of what I’ve been working on at various events/meetings. My first speaking opportunities will be at Oracle’s SolarisSIG (LOSUG in a past life) tomorrow on cross platform packaging with pkgsrc.
I didn’t have any hosts running Solaris to attempt bulkbuilds on for this, so I asked for shell access at various places, this landed me with access to Solaris 9 to 11 on x86 & SPARC and OmniOS.
Shell access to a Solaris 10u9 host came first and I quickly found that mandoc would not build due to the use of mkdtemp(), dirfd() and vasprintf() whilst attempting to setup a bulkbuild environment using the mk/pbulk/pbulk.sh script in pkgsrc.
Following up with the mandoc developers regarding the issue meant the issue was fixed by Ingo Schwarze within a couple of days with access to Solaris via OpenCSW, the next release of mandoc due Summer time will contain these fixes. I’m undecided whether to back port the fixes to the current release or not. If you have a Solaris 10 host with a “recent” patch cluster (unsure how recent) you should be unaffected by this issue.

On OmniOS, the shells/standalone-tcsh package reared its head again. OmniOS ships with tcsh as standard which shells/standalone-tcsh clobbers again during bulkbuild. The solution to this may be that pkgsrc becomes aware of the various Illumos distros instead of treating everything as SunOS, but nothing has been decided for definite and needs further discussion.
The bulkbuilds on OmniOS using the toolchain published in IPS for that release show issues in pkgsrc where libgomp bundled with GCC is not being found by gettext, this causes lots of packages to fail.
The first bulkbuild which completed showed it was possible to build 9547 out 15245 packages without issue but the inability to build databases/shared-mime-info due to the libgomp issue caused 2274 packages to fail. This was the large impact of the issue mentioned but there are more packages which suffer from the same problem.

With the resumption of bulk building pkgsrc on FreeBSD, I found lots of packages which would fail to package due to use of the -o flag with pax(1) which is not present in the FreeBSD version. I’ve not yet addresses the issue with those packages but I did raise a patch for FreeBSD-current so that it will be available in future versions of FreeBSD. It is now committed and will treacle down to 10-STABLE at some point for inclusion in FreeBSD 10.2-RELEASE.

Bulk builds have also resumed on OpenBSD/amd64 and Sparc64, there is a third bulkbuild in progress on amd64 while the sparc64 host is still running the first attempt. The use of unprivileged user account and the lack of nullfs support prevents the setup of parallel runs to speed up the process. Building on OpenBSD was extremely useful for indicating what software in pkgsrc is effected by LibreSSL without having to put effort into integrating LibreSSL into pkgsrc first. It also raised an instance of copying overlapping buffers destructively (using memcpy() vs memmove()), a fix for this issue is yet to be committed pending further investigation. MAKE_JOBS was raised to 8 on OpenBSD/sparc64 to improve performance whilst compiling, the build environment is hosted on a Sun T5210 with a dedicated disk mounted with soft writes & the kern.bufcachepercent value bumped from 20 to 50.

With the pkgsrc-2015Q1 release announced, the figures for packages indicate that Darwin/PowerPC exceeded Darwin/x86.

11224 binary packages built with gcc for Darwin 8.11.0/powerpc
10019 binary packages built with gcc for Darwin 10.8.0/i386

My first bulkbuild report sent to to the pkgsrc-bulk mailing list back in September 2014 shows a total of 8501 packages could be built. There is more work required to keep this position in the future, as a new version of MesaLib is to be committed at some point soon. There is a bulkbuild in progress on sevan.mit.edu, upon completion I’ll be placing a request for the host to be wiped and setup again, the bulk builds are getting slower and slower and I suspect the IDE disk may be failing.

Thanks to Phil Stracchino, Sascha Curth and the OpenCSW Project for access to hosts running Solaris. Ian Kremlin, Brandon Mercer & Rodent @ NetBSD for access to hosts running OpenBSD.

UNIX/WORLD Review Sun 2 Workstation, October 1984

I was looking around for information on Lions’ Commentary when I came across this site which happened to be of a journalist who reviewed the Sun 2 workstation for the UNIX/WORLD magazine back in 1984.
The reference to a Sun employee demonstrating the dual head capability made me smile & reminded me of the Project Looking Glass demonstration.

http://www.youtube.com/watch?v=UKZ6yOeVghM

What took me there was his derivative copy of the UNIX v6 source code but I also grabbed the v7 documentation which he posted up as updated PDF (changelog on the page)

One of the following problems exists: Hardware failure Unformatted disk

If you’re trying to install Solaris (SPARC) on a disc which previously had another O/S (*BSD) on it, you will more then likely come across the following error after the system identification stage is complete:
One or more disks are found, but one of the following problems exists:
> Hardware failure

> Unformatted disk.
The problem is to do with the disc label, at the shell fireup format with the e switch
format -e
select the disc which you are trying to install onto, if you are told that the disc doesnt not contain a disk label, would you like to write a lablel, say yes & specify the type as SMI Label, If youre not asked about the label & presented with a menu, goto the partition section, choose label, specify the type as SMI.
quit & reboot & all should be well 🙂

Running Solaris 9 on Mac Virtual PC

Solaris 9 normally wont allow you to install onto a virtual machine on Virtual PC, it bombs out with the error
486 Processor Detected
This Processor is not supported by this release of Solaris

Thanx to this simple hack found on Kernel Thread its possible to install Solaris 9 without a hitch on VPC

Simply run the following commands against the .iso of the install CD or Software #1 CD:
perl -pi -e 's#Pentium II#ff/08.....#g' filename.iso
&
VPC 6 users: perl -pi -e 's#GenuineIntel#ConnectixCPU#g' filename.is
or
VPC 7 users: perl -pi -e 's#GenuineIntel#Virtual CPU #g' filename.iso