Former things – OpenIndiana Hackathon

Gates at EveryCity’s former office, near London Bridge – 31/07/2010

At July’s London OpenSolaris User Group (LOSUG) meeting, Alasdair Lumsden invited various folks to the EveryCity office for a hackathon to start building a new distro on Saturday 31st of July, 2010. The hackathon was attended by folks from the LOSUG community in person and remotely on IRC from the OpenSolaris community at large (transcripts for day 1, 1&2. On the day, work began to attempt to build a distro from the published sources. I spent some time setting up the mailman instance (on Solaris 10 with Exim!) and creating various lists for use by the project that day.

Illumos webinar presented by Garrett D’Amore on 3/8/2010 18:00(BST)

Over the course of the next few weeks with the help with the help of Alan Coopersmith and Rich Lowe we attempted to build and document the various consolidations, Chris Ridd suffered the XNV (X11) consolidation which was the most painful to build due to missing components which were never pushed out publicly, if I recall correctly. I initially started on building the Sun FreeWare (SFW) consolidation and when that was done and documented, moved on to OSNET? (terminology might be wrong, I’m referring to the core-os consolidation). The confluence wiki wasn’t archive friendly it seems so trying to fish the initial documentation out from archive.org proved to be a bit of a challenge since the navigation menu wont load.

At some point Garrett D’Amore showed up on IRC and began lording it over us hanging out and I just wondered who is this person? as I saw OpenIndiana as Alasdair’s thing.

This was really my first actual involvement as a member of an open source project and I was very green, I recall that my patches to the SFW repo were committed by someone else but it wasn’t just being green on a technical ability level, it was also in my thinking. Thinking that another project that’s been around a little longer have a process down, why does anyone need to do anything different. That sort of rigid thinking would go on to receive heckles on more than one occasion 🙂 , many years later by Phil Harmon when a couple of us robots would attend to what followed LOSUG as Solaris SIG (Specialist Interest Group?) in London.

According to my mail archive, I handed over the mailman details to Alasdair a few months later, and left the project on 23 Nov 2010. During the few months on the project I made friends many of whom I still have contact with today. Chris Ridd, James O’Gorman, Andrew Watkins, Alan Coopersmith, Jeppe Fihl-Pearson, Peter Tribble, Andrzej Szeszo to name a few, but there were many more folks involved.

Apologies if I upset or annoyed anyone with my “obligatory F*ee*SD reference” comments in those days, I must admit it must’ve been quite nauseating.
Over the years, since leaving OpenIndiana, I have often wondered if things would’ve worked out differently if Garrett had been told to fuck off.

A week of pkgsrc #12

To fill in the gap since the last post, I thought I’d get the notes which had been collecting up, posted here. pkgsrc got a mention in the Quarterly FreeBSD status report. My bulkbuild effort started on FreeBSD/amd64 10.1-RELEASE but thanks to my friend James O’Gorman, I was able to expand to FreeBSD 11-CURRENT and recently switched over from 10.1-RELEASE to 10.2-RELEASE.
I got the idea to try to pkgsrc on Android after someone posted a screenshot of their Nexus 7 tablet with the bootstrap process completed.

There are several projects on the google play store for running the user land built from a Linux/arm distro in a chroot on Android.
The first project I tried was Debian noroot (based on the tweet that inspired me), it spawned a full X11 desktop to run & so the process was painfully slow.

Switching to GNUroot Debian which just ran a shell in the chroot was much faster at extracting the pkgsrc archive though bootstrap still took long. The best result was with Linux deploy using an Arch Linux user land, everything was very snappy.

On Mac OS X Tiger PowerPC, GCC 5 appears to no longer require switching off multilib support when building on a 32-bit PowerPC CPU, my hardware has changed but the CPU is still a G4. The same changes to force dwarf2 and removing the space in-between flags and paths fed to the linker were otherwise required, as with previous versions of GCC.

I spent a little time with OmniOS and “addressed” the outstanding issues which prevented it from working out of the box. shells/standalone-tcsh was excluded on OmniOS which prevented the version of tcsh shipped with the OS from being clobbered during bulkbuilds. The other issue was what appeared to be a problem with gettext but turned to be an issue with the compiler shipped with OmniOS. This became a topic of discussion on what the correct solution to the problem is. The GCC provided with OmniOS is built with Fortran support and includes the OpenMP libraries (I’m guessing this is the reason for the libraries) in its private lib directory inside /opt/gcc-4.8.1/lib, it turns out that gettext will make use of OpenMP libraries if it detects them during configure stage which I’ve not been able to find a concrete answer for why, the GCC documentation don’t say more than a paragraph about the OpenMP libraries themselves (libgomp) either. The problem was that GCC was exposing its private library in the link path but not in the run path, this meant you could produce binaries which would compile fine but would not run without having to play around with the runtime linker. In my case I’d previously added the private library locate to the runtime linkers search path as a workaround, I disabled the OpenMP support in devel/gettext-tools and that’s where the discussion began. Basically, it’s not possible to expose the private library location to the linker because that would cause issues with upgrades. The location should not be exposed by the compiler in the first place (I guess this was for the convenience of building the actual release of OS?). Richard Palo pursued the issue further and I’m informed that future releases of OmniOS will move libgomp out from this private location to /usr/lib so that it’s in the default library search path.

With the introduction of the GPLv3 license, GNU projects have been switching to the new license. This causes problems for projects outside the GNU eco-system which utilise them if the terms of the new license are unacceptable for them. Each project has dealt with it differently, for OpenBSD they maintain the last version which was available under GPLv2 & extend the functionality it provides. Bitrig has inherited some of this through the fork. Through the bulkbuilds it was revealed that the upstream version of binutils has no support for OpenBSD/amd64 or Bitrig at all. Adding rudimentary support was easily achieved by lifting some of the changes from the OpenBSD CVS repo. While at present I’m running bulkbuilds against a patched devel/binutils which I’ve not upstreamed or committed for both OpenBSD & Bitrig, I am thinking that for OpenBSD we should actually just use the native version and not attempt to build the package. For Bitrig, there is already a separate package in their ports tree for a newer version of binutils, it’s pulled in alongside other modern versions of tools under the meta/bitrig-syscomp package so it makes sense to mimic that behaviour.

Coming to the realisation that stock freedesktop components were not going to build on OpenBSD, I switched to using X11_TYPE=native to utilise what’s provided by Xenocara. Despite the switch, pkgsrc still attempted to ignore the native version of MesaLib and try to build its own, the build would fail and prevent a couple of thousand packages from building.
This turned out to be because of a test to detect the presence of X11 in mk/defaults/mk.conf, it was testing for the presence of an old path which no longer exists. As this test would fail, the native components would be ignored & pkgsrc components would be preferred. The tests for OpenBSD & Bitrig were removed & now default to a default of an empty PREFER_PKGSRC variable. The remaining platforms need to be switched over after testing now.

As Mac OS X on PowerPC gets older and older with time, the requirement for defining MACOSX_DEPLOYMENT_TARGET grows ever more redundant, Ruby now ships with it & unless it’s defined, you will find that it’s not possible to build the ruby interpreter any more. I am considering setting MACOSX_DEPLOYMENT_TARGET="10.4" for PowerPC systems running Tiger or Leopard so that packages could be shared between the two but have not had a chance to test on Leopard yet to commit it. I somehow ended up on a reply list for a ticket in the Perl RT for dealing with this exact issue there. They opted to cater for both legacy & modern version of OS X by setting the necessary variables where necessary.

Getting through backlogged notes to be continued

A week of pkgsrc #11

It’s been a while since the last post in the series, the details of what was covered in these posts was the partial basis of my talk at BSDCan and I got to repeat the talk again in Berlin, I was much less nervous the second time, not having a fire alarm going off during the talk may have helped. I will cover briefly some things that were mentioned in the talks which I hadn’t written up here, for the sake of completeness.
Thanks to the DragonFlyBSD folks, I have access to a build server for doing regular bulkbuilds on. As I’m running these as a unprivileged user, there’s not much parallelism in the package builds, it’s one package at a time. The system aptly named Monster is a 48 Opteron CPU server with a 128GB of RAM so I can at least run with MAKE_JOBS set to 96. At the start of the bulkbuilds some deadlock issues in DragonFlyBSD were revealed by pkgsrc which Mat Dillon addressed promptly.

On the Bitrig front, I managed to add support for the OS to lang/python27 which was the package causing the biggest breakage and now in the process of trying to get the support added upstream, there appears to be a bug report from 2013 in the Python bug tracker to add support ubut it was marked as won’t fix, I’m hoping the decision will be changed but will have to wait and see.
With Python 2.7 built successfully it was onto the next set of breakages, gettext!
I had taken a patch from OpenBSD ports for getting devel/gettext-tools building but was asked to back it out as it was not the correct solution to the problem. I decided to reapply the fix in my build just to progress to the next hurdle. The next major breakage was with devel/p5-gettext which needed to be told to include libiconv, I’m now stuck at getting converters/help2man building.
During this process I found that we were missing some necessary flags for creating shared libraries which were highlighted by clang:
relocation R_X86_64_32S can not be used when making a shared object; recompile with -fPIC

This turned out to be bug in the platform support, the necessary fPIC flags were defined but under a if statement for version of OS running with a.out binaries still. mk/platform/Bitrig.mk was stripped of anything related to a.out and everything was rebuilt again from scratch.

OpenBSD and Bitrig probably have many more breakages due to the fact that their architecture is detected as amd64 and not under the x86_64 banner by the build system. One example is x11/libdrm which is set to add sysutils/libpciaccess as a dependency if the host is a i386 or x86_64.
At present libdrm fails at the configure stage with
checking for PCIACCESS... no
configure: error: Package requirements (pciaccess >= 0.10) were not met:

No package 'pciaccess' found

Consider adjusting the PKG_CONFIG_PATH environment variable if you
installed software in a non-standard prefix.

Alternatively, you may set the environment variables PCIACCESS_CFLAGS
and PCIACCESS_LIBS to avoid the need to call pkg-config.
See the pkg-config man page for more details.

Trying to add OpenBSD to the x86_64 arch list revealed a problem in pkgsrc, the culprit being devel/bmake.
The problem is that there are three separate points where the architecture is defined. In the bootstrap script, in BSDMake’s on source and the settings it passes onto pkgtools/pkg_install. Unfortunately the settings defined at the start in bootstrap are ignored at the bootstrap stage & are not necessarily what pkg_install is built with. To add to this, it’s possible that BSDMake may need to work out what the system is for itself rather than to be expected to have settings passed to itself. That is they should build with settings passed down in succession or independently.
With severe bludgeoning of code between devel/bmake and pkgtools/pkg_install, I managed to get it to
pkg_add: OpenBSD/x86_64 5.7 (pkg) vs. OpenBSD/amd64 5.7 (this host)
pkg_install performs a check of the OS it’s running on against the settings it was built with (the settings bmake passed it during bootstrap), removing the check revealed there was nothing else preventing things from working but the check needs to be there.

For OmniOS, a major components components in the OS which caused many packages to break was the bundled gettext, failing during builds as it could not find the libgomp from (the also bundled) GCC. As a temporary work around to see how the build would progress if libgomp could be found, I added the lib directory to the search path of ld using crle(1).

Configuration file [version 4]: /var/ld/ld.config
Platform: 32-bit LSB 80386
Default Library Path (ELF): /lib:/usr/lib:/opt/gcc-4.8.1/lib
Trusted Directories (ELF): /lib/secure:/usr/lib/secure (system default)

Command line:
crle -c /var/ld/ld.config -l /lib:/usr/lib:/opt/gcc-4.8.1/lib

It was possible to build 13398 packages out of 16536 possible packages with this workaround in place.

With the help of Joerg Sonnenberger, at pkgsrcCon I added support for fetching the OS version info in OmniOS & SmartOS for use in build build reports, this should mean that these operating systems will be reported correctly rather than as SunOS 5.11.

sevan.mit.edu is back online as a G4 Mac Mini with 128GB SSD. It’s yet to complete its first bulkbuild since the rebuild but it’s nearly finished as I type this.

It’s now possible to build more than 14100 packages on FreeBSD 10.1-RELEASE with pkgsrc.

A week of pkgsrc #9

The past few weeks have been pretty hectic, as the time for BSDcan gets shorter and shorter, I’m thinking about my talk and testing more and more in pkgsrc. Rodent@ added support for Bitrig to pkgsrc-current last month, his patches highlighted an issue with the autoconf scripts (which should be shared across core components) not being pulled in automatically. Joerg Sonnenberger resolved this issue and I regenerated the patch set again. With the system bootstrapped the next thing which was broken was Perl, applying the changes needed for OpenBSD resolved any remaining issues and the bulk build environment was ready. After three days, the first bulkbuild attempt on Bitrig was complete and a report was published. There is now a bulkbuild in progress with devel/gettext-tools and archivers/unzip fixed, that should free over 8400 packages to be attempted to be built.
For Solaris, my first bulkbuild on Solaris 10 completed after 22 days. Mid-April I also started off bulkbuilds on Solaris 11 (x86 and SPARC) using the SunStudio compilers (It’s not possible to use GCC at the moment due to removed functionality that was previously deprecated). The Solaris 11 SPARC bulkbuild is still in progress and the x86 bulkbuild is running. Unfortunately the build cluster had some connectivity issues and needed rebooting during the bulkbuild but not until lots of packages had failed to fetch distfiles, hence the figures look a lot worse than they could be. Solaris 10 SPARC report, Solaris 11 x86 report.

Through bulk building on multiple operating systems another issue that’s surfaced is problematic packages that hold the build up. On Bitrig mail/fml4 is an issue, on OpenBSD www/wml, FTP mirror issues for ruby extension on Solaris, Xorg FTP mirror issues on OmniOS. Things need regular kicking, a brief glance into pkgsrc/mk didn’t reveal any knobs which would allow the preference of HTTP for fetching distfiles. On Bitrig & OpenBSD I’ve excluded these packages from being attempted via NOT_FOR_PLATFORM statement in their Makefile until I have a look into the issue.

sevan.mit.edu completed another bulkbuild, pkgsrc-current now ships with MesaLib 10.5.3 as graphics/MesaLib, version 7 has now been re-imported as graphics/MesaLib7 by tnn@, the new MesaLib needed a patch for FreeBSD, similar to NetBSD to build successfully, due to ERESTART not being defined. At present, it’s still broken on Tiger as I’ve not looked into yet.

I revisited AIX again to test out pkgsrc once again, this has turned into a massive yak shaving session. I’ve yet to run a bulkbuild successfully as the scan stage ends with a coredump.
I originally started off with using the stock system shell, bootstrap completed successfully but scan stage of a bulkbuild would just stop without anything being logged. Manually changing the shell used to shells/pdksh in pkg/etc/mk.conf and pbulk/etc/mk.conf resulted in the following error message:
bmake: don't know how to make pbulk-index. Stop
pbulk-scan: realloc failed:

This turned to be a lack of RAM, my shell account was to a AIX 7.1 LPAR running on a Power8 host with 2 CPUs and 2GB of RAM committed, unfortunately the OS image IBM provided came with Tivoli support enabled and a bug in the resource management controller which meant RMC was consuming way more resource than it needed to. I was running with less than 128MB of RAM.
Stopping Tivoli & RMC freed up about 500MB of RAM, attempting to bulkbuild again, caused the process to fail once again at the same stage. With a heads up from David Brownlee & Joerg Sonnenberger, I bumped the memory and data area resource limits to 256MB.
This allowed the scan to finish with a segfault.
/usr/pkgsrc/pbulk/libexec/pbulk/scan[54]: 11272416 Segmentation fault(coredump).
pscan.stderr logged multiple instances of
bmake: don't know how to make pbulk-index. Stop.
The segfault generated a coredump but it turned out that dbx, the debugger in AIX was not installed. IBMPDP on twitter helped by pointing to the path where some components are available for installation, unfortunately, while the dbx package was available there, some of its dependencies were not. Waiting on IBMPDP to get back to me, I fetched a new pkgsrc-current snapshot (I couldn’t update via CVS because it wouldn’t build) and re-setup my pbulk environment via mk/pbulk/pbulk.sh.
I should mention that initially when I setup, I’d explicitly set CC=/usr/bin/gcc last time, then while trying to get various things to build subsequently, I’d symlink /usr/bin/cc to /usr/bin/gcc. When I came to set thing up with the new snapshot, I did not pass CC=/usr/bin/gcc this time round and found that I was unable to link Perl, not sure if this was the Perl build files assuming if on AIX & /usr/bin/cc exists, it’s XLC or if ld(1) takes on different behaviour but I had to remove this symlink.
Once everything was setup, the bulkbuild failed agin at the same place, except this time I had a different message logged.
/bin/sh: There is no process to read data written to a pipe..
I edited the bootstrap/bootstrap script & devel/bmake/Makefile to set shells/pdksh as a dependency & rerun bulkbuild.
The scan stage again completed with a coredump with this time pscan.stderr just contained Memory fault (core dumped).
I’ve committed these changes so pkgsrc-current now defaults to using shells/pdksh as its shell but have not been able to try anything else as this weekend the system is unaccessible due to maintenance.

At present, I’m attempting to bulkbuild pkgsrc-current on 8 Operating systems
OpenBSD (5.6-RELEASE & -current), FreeBSD, Bitrig (current), Mac OS X (Tiger), Solaris (10 & 11), OmniOS on 4 architectures (i386, AMD64, SPARC, PowerPC).
If I could get AIX going that would bump the OS & arch could up by 1. Maybe by the next post perhaps. 🙂

Thanks to Patrick Wildt for access to host running Bitrig and Rodent@ for adding support to pkgsrc.

A week of pkgsrc #8

Or should that be a month of pkgsrc!
Since I wrote my last post I received the good news that I’ll be presenting at BSDCan this year in Ottawa. I’m really stoked about this and looking forward to presenting there. In preparation for the big day, I’ll be speaking about different aspects of what I’ve been working on at various events/meetings. My first speaking opportunities will be at Oracle’s SolarisSIG (LOSUG in a past life) tomorrow on cross platform packaging with pkgsrc.
I didn’t have any hosts running Solaris to attempt bulkbuilds on for this, so I asked for shell access at various places, this landed me with access to Solaris 9 to 11 on x86 & SPARC and OmniOS.
Shell access to a Solaris 10u9 host came first and I quickly found that mandoc would not build due to the use of mkdtemp(), dirfd() and vasprintf() whilst attempting to setup a bulkbuild environment using the mk/pbulk/pbulk.sh script in pkgsrc.
Following up with the mandoc developers regarding the issue meant the issue was fixed by Ingo Schwarze within a couple of days with access to Solaris via OpenCSW, the next release of mandoc due Summer time will contain these fixes. I’m undecided whether to back port the fixes to the current release or not. If you have a Solaris 10 host with a “recent” patch cluster (unsure how recent) you should be unaffected by this issue.

On OmniOS, the shells/standalone-tcsh package reared its head again. OmniOS ships with tcsh as standard which shells/standalone-tcsh clobbers again during bulkbuild. The solution to this may be that pkgsrc becomes aware of the various Illumos distros instead of treating everything as SunOS, but nothing has been decided for definite and needs further discussion.
The bulkbuilds on OmniOS using the toolchain published in IPS for that release show issues in pkgsrc where libgomp bundled with GCC is not being found by gettext, this causes lots of packages to fail.
The first bulkbuild which completed showed it was possible to build 9547 out 15245 packages without issue but the inability to build databases/shared-mime-info due to the libgomp issue caused 2274 packages to fail. This was the large impact of the issue mentioned but there are more packages which suffer from the same problem.

With the resumption of bulk building pkgsrc on FreeBSD, I found lots of packages which would fail to package due to use of the -o flag with pax(1) which is not present in the FreeBSD version. I’ve not yet addresses the issue with those packages but I did raise a patch for FreeBSD-current so that it will be available in future versions of FreeBSD. It is now committed and will treacle down to 10-STABLE at some point for inclusion in FreeBSD 10.2-RELEASE.

Bulk builds have also resumed on OpenBSD/amd64 and Sparc64, there is a third bulkbuild in progress on amd64 while the sparc64 host is still running the first attempt. The use of unprivileged user account and the lack of nullfs support prevents the setup of parallel runs to speed up the process. Building on OpenBSD was extremely useful for indicating what software in pkgsrc is effected by LibreSSL without having to put effort into integrating LibreSSL into pkgsrc first. It also raised an instance of copying overlapping buffers destructively (using memcpy() vs memmove()), a fix for this issue is yet to be committed pending further investigation. MAKE_JOBS was raised to 8 on OpenBSD/sparc64 to improve performance whilst compiling, the build environment is hosted on a Sun T5210 with a dedicated disk mounted with soft writes & the kern.bufcachepercent value bumped from 20 to 50.

With the pkgsrc-2015Q1 release announced, the figures for packages indicate that Darwin/PowerPC exceeded Darwin/x86.

11224 binary packages built with gcc for Darwin 8.11.0/powerpc
10019 binary packages built with gcc for Darwin 10.8.0/i386

My first bulkbuild report sent to to the pkgsrc-bulk mailing list back in September 2014 shows a total of 8501 packages could be built. There is more work required to keep this position in the future, as a new version of MesaLib is to be committed at some point soon. There is a bulkbuild in progress on sevan.mit.edu, upon completion I’ll be placing a request for the host to be wiped and setup again, the bulk builds are getting slower and slower and I suspect the IDE disk may be failing.

Thanks to Phil Stracchino, Sascha Curth and the OpenCSW Project for access to hosts running Solaris. Ian Kremlin, Brandon Mercer & Rodent @ NetBSD for access to hosts running OpenBSD.

Configuring OpenSolaris with IPv6 connectivity

To configure OpenSolaris to use IPv6 NDP (neighbour discovery protocol) create an empty file named in the following convention:
/etc/hostname6.interface#:#
first hash being the interface number & the second being a user defined number for a logical interface
eg
/etc/hostname6.e1000g0:1

If you’re having DNS resolution issues, do
cp /etc/nsswitch.dns /etc/nsswitch.conf

To configure OpenSolaris to use a static IPv6 address
create a file using the same convention as mentioned during the NDP stage above & inside it add
addif ipv6address/mask up
eg
addif 2a01:300:200::1/64 up

To configure your default IPv6 router on OpenSolaris
create a file named /etc/defaultrouter6 & add the ip address inside

The instructions above make the changes persist across reboots, if you’d like to make changes to a current session, the configuring an IPv6 network section of the IP services Solaris administration guide is a handy reference.
These instructions should also apply to Solaris as well though I haven’t tested it.
The source of information for this article was the IPv6hostsolaris wiki article.