Former things – NetBSD

I assume most people know but only a few heard it from me, so here it is: I resigned from the NetBSD project (and by association, pkgsrc) around about a year ago. It’s not that I’ve been inundated with questions about it but more that on a rare occasion over the past year there would be some reference made which I would fall silent to and I sort of felt bad about that, hence, this post.

Lone Lady – Former Things (WARP331) on Warp Records

LFS, round #6

It’s been a while since I wrote a technical post in this series, since the last post I made a build of what I called Viewpoint Linux Distribution available. This post will cover the time between the last post (round #5) and the launch of the distro.

By the time I’d written the previous post, things had roughly taken shape and I was thinking about what would sit on top via packaged software. Being interested in Guix from afar I thought about using that as there had been some interesting talks about it at FOSDEM‘s Declarative and Minimalistic Computing devroom a month prior. Didn’t end up going down that route as Guix requires GNU Guile, GnuTLS, and various extensions for Guile. It’s not so much of a problem what its requirements are but that I would have to ship and maintain copies of these in the base OS and I didn’t want to do that so I stuck with what I knew. I’ve spent a lot of time with pkgsrc and am comfortable working with it. pkgsrc gives you control where it satisfies dependencies and as long as you have a shell & compiler installed it can get itself to a working state. Unless specified, the bootstrap process on Linux opts to satisfy all dependencies from itself and ignore anything already installed on the system. This behaviour can be overridden by specifying --prefer-native yes when bootstrapping and in this scenario it was preferable since the OS was using recent if not latest available versions of things. Despite preferring native components, when it came to building packages, things that were present on the OS were being built again anyway, specifically readline.

$ cd /usr/pkgsrc/shells/bash ; bmake show-var VARNAME=IS_BUILTIN.readline
no

After some investigation it turned out the builtin detection mechanism was not working and so dependencies would always get built, this was due to a difference between where libraries are installed when following the LFS guide and where pkgsrc expects to find them. The instructions in the LFS guide specify /usr/lib for libdir and pkgsrc expects to find them in /usr/lib${LIBABISUFFIX} which in this case would expand to /usr/lib64. Just to move thing along I patched pkgsrc/mk/platform/Linux.mk to include /usr/lib for _OPSYS_SYSTEM_RPATH / _OPSYS_LIB_DIRS and builtin detection then started working. With a working packaging system, I began packaging BCC and bpftrace though opted to use the bpftrace binary which the project produces with every release in the end. This made things easier as there is a working environment out of the box to start with and if BCC is needed, it can be installed, but since the BPF Performance Tools book is largely about using bpftrace, you get to start off without dealing with packaging. By keeping the packaging system a separate component, it also saves on shipping a bootstrap kit for the packaging system with every release and likely stale packages depending on how quickly things evolve. I dislike the idea of having to run a package update on first boot to shed the stale packages which are shipped with the OS.

After testing various things out I set out to make a new build of the distro to publish, this time opting to use lib64 as the libdir to reduce the need for changes to pkgsrc, I have not attempted any large runs of bulkbuilds but Emacs 21 package was definitely not happy as it expected to find some things in /usr/lib.

There are various packages which ship with DTrace USDT probes which bpftrace can also make use of. This involves building those packages with DTrace support enabled and using SystemTap which provides a Python script called dtrace to do the relevant work, on Linux. I created a package but since it require Python, it created a circular dependency when using Python 3, as Python 3 has USDT probes. As a workaround to sidestep the issue, my SystemTap package uses Python 2, which is still supported by SystemTap. To enable building with DTrace support I introduced a “dtrace tool” which pulled in SystemTap as dependency on Linux when USE_TOOLS+=dtrace was specified, and nothing on other platforms. I then added USE_TOOLS+=dtrace across the tree where dtrace was a supported option.

bpftrace listing the USDT probes found in libpython built from the Python 3.8 package in pkgsrc

With the OS rebuild, I dropped nscd(8) from the system, the thought of having up to three caching resolvers seemed a bit excessive (nscd/systemd-resolved/unbound). This post highlights why you might not want nscd support on your system. As part of the rebuild I began populating the repository with sources for everything that would ship with distro, it was a tedious process that slowed that as I progressed through the build and imported more and more components, because on the initial import I would roll the tree back to the start to import into a branch, update to the tip of the tree, merge the branch, repeat. I used the hg-git mercurial plugin to convert and push the tree to a Git mirror

The kernel config used started life as the default config which gets created when you run make defconfig and built up from there to cover what the LFS guide suggests and those required by BCC / bpftrace. Testing that X11 worked ok revealed that I was missing various options, from lack of mouse support to support for emulated graphics, the safe bet being the VMware virtual card to use on VirtualBox (VMSVGA which is default) and QEMU, other options resulted in offset problems with the cursor where it would appear on one place on the screen but clicks and drags would register at a different location. Everything works out of the box with VMware option.

I’ve been really impressed by how quickly the system boots and shuts down (not having an initrd image to load and minimal drivers to probe for, account for that), I hope I don’t end up loosing that. I used the work leading up to the release as an excuse to start using org-mode on Emacs. Following the beginners guide I now have a long list of todo items which I work through. The next big item is build infrastructure so I can turn around releases quicker.

Something blogged (on pkgsrcCon 2019)

pkgsrcCon 2019 was held in Cambridge this year. The routine as usual was social on the Friday evening, day of talks on the Saturday, hacking on things on the Sunday.
Armed with a bag of bits to record the talks, I headed up to Cambridge on Friday evening to meet up with sborrill before heading to the Castle Inn. Not much to report from Friday night, drinks were drunk, food was ate, PowerPC hardware was passed on.

Saturday morning started off with a brief intro by sborrill and pr1w1, followed by the first talk of the day by nia about audio(9) in NetBSD and her work to improve support in 3rd party software. [slides]

bsiegert was next, speaking about spellcheckers and how without careful consideration when developing an API, one can prevent downstream consumers from moving forward with a project. [slides]

This was preceeded by agc who gave the first of two talks on the work and methodoligies the team working on the OpenConnect appliances at Netflix use – life consuming FreeBSD-CURRENT and their development model (regular releases from the bleeding edge). The release they produce caters for several generation of hardware which is continuosly evolving, ranging from appliances based on conventional hard disk drives to all flash SSD appliances, and now, nvme based appliances.

We breaked for lunch at this point and went for a wonder to find the canteen.
First talk after lunch was by Natasa Milic-Frayling on Software preservation and digital continuity and the challenges with keeping legacy software running. Having successfully virtualised Windows based systems running all the way back to Windows 95, there were other challenges such as archiving distributed systems where proprietory software to be archived ran on one workstation, but data was sourced from other systems such as via ODBC or Active Directory (LDAP), increasing the scope of systems required to be archived in order for an application to function. As we move forward in time, legacy hardware becomes more fragile and harder to source parts for, unmaintained software which is no longer supported becomes a growing security risk, so, system get decommissioned but the data that lived on such systems may still be useful for reference.

After Natasa’s talk it was my turn to speak, I gave an update on the state pkgsrc support on OS X Tiger, followed by various books I’d read over the last 8 months and what I’d been working on in $dayjob, linking things back to Cambridge one way or another, turns out I had been reading about folks in Cambridge doing incredible work, it just happened that they were in Cambridge, Massachusetts along with the Mac Mini I was working form 🙂 [slides]

wiz followed on after me and spoke about the various groups in The NetBSD Foundation and the role they play in the day to day running of the project. From the board to the pkg-bug-handler teams and many others. [slides]

schmonz was our remote speaker, he gave a status update on the qmail distribution in pkgsrc and a new project called notqmail which provides a home as the upstream of this initiative. [demo and further info]

khorben revisited the topic of package signing with the work that happened in the past such as in EdgeBSD, improvements in NetBSD which could ease it such as developments in ptrace(2), and then opened up to a discussion around what’s required to provide signed packages.

agc wrapped up the day with the final talk, “large scale packaging” about how the packages which are shipped to the OpenConnect appliances are put together, using a cut back version of FreeBSD ports containing just the packages required.

Saturday night I headed back to London as I had a club night to attend before heading back on the Sunday. Somewhat drained, I made it home with the kit I’d carried up, showered, changed, and was back in Farringdon before midnight.

Come 2:30am, there was a powercut and that was the end of the night. I was a bit miffed as I missed the same event the previous year due to a clash of events so I was really looking forward to this.

1/08/19 – The recording of the set is now available

I got some rest and made it back to Cambridge for lunch on Sunday. We then headed back to the Centre for Computing History for the last part of the con. I spent some time looking at the machines on display, played Sonic the Hedgehog and Street Fighter 2 both of which I hadn’t done in a long while and then helped debug an issue with booting NetBSD on QEMU as a guest when QEMU is invoked with --cpu=host. The host system in this case was flashed with libreboot which lacks microcode for the CPU and NetBSD crashes on probing the CPU. Explicitly specifying a CPU as a workaround in this case allowed the system to boot e.g --cpu=coreduo.

IBM ThinkPad X61s without microcode update

cpu0 at mainbus0 apid 0
cpu0: Genuine Intel(R) CPU 1400 @ 1.66GHz, id 0x6e8
cpu1 at mainbus0 apid 1
cpu1: Genuine Intel(R) CPU 1400 @ 1.66GHz, id 0x6e8

IBM ThinkPad X61s with microcode update

cpu0 at mainbus0 apid 0
cpu0: Genuine Intel(R) CPU L2400 @ 1.66GHz, id 0x6e8
cpu1 at mainbus0 apid 1
cpu1: Genuine Intel(R) CPU L2400 @ 1.66GHz, id 0x6e8

Post con, I spent a couple of days gallivanting around London with wiedi, weather was really hot but we managed to get a lot of milage down, from Brick Lane to Notting Hill Gate via the South Bank and Maida Vale, followed by a visit to Wembley for the London Hack Space. As always, it was a fun few days.

pkgsrcCon 2019 group photo

A week of pkgsrc #13

With the lead up to the release of pkgsrc-2019Q2 I picked up the ball with the testing on OS X Tiger again. It takes about a month for a G4 Mac Mini to attempt a bulk build of the entire pkgsrc tree with compilers usually taking up most days without success. To reduce the turnaround time, I switched to attempting a small subset of packages for a quicker turnaround using meta-pkgs/bulk-large. After a couple of days of compiling I received a report in my mailbox showing breakages in key packages such as OpenSSL and libxml2.

security/openssl issue was a regression upstream which was resolved by bringing the packages up to date.

textproc/libxml2 breakage was due to -Wno-array-bounds being passed to the compiler and the ancient version of GCC in Tiger not supporting it, resulting in a hard cc1: error: unrecognized command line option "-Wno-array-bounds" error. The use of BUILDLINK_TRANSFORM.Darwin here allowed the option to be removed dynamically, on the fly, confined to only being applied during builds on Darwin.

security/libgpg-error needed the definition of __DARWIN_UNIX03 to make use of unsetenv(3) which returns an integer, unsetenv is called inside an if statement and is unable to test the result as Tiger still used the old implementation by default, which returns void. This results in the error sysutils.c:178: error: void value not ignored as it ought to be when building. The fix for this came from macports.

There was many breakages due to the lack of support of strnlen(3) in Tiger, Apple introduced support for this function in Lion, as a workaround, pkgtools/libnbcompat now includes an implementation which will be used in place for packages which specify strnlen as a requirement using USE_FEATURES and the OS is marked as missing the features using _OPSYS_MISSING_FEATURES.

databases/sqlite3 has issues with readline included with Tiger, as a workaround locally, I switched to using gnu readline.

As find(1) in Tiger lacks support for the + parameter for -exec, the ability to install Python egg modules is currently broken, I worked around this locally to progress with the bulk build. Next step possibly is to make pkgsrc aware of find as a tool and substitute legacy versions with a version from sysutils/coreutils perhaps.

devel/re2c was broken due to the ageing version of GNU Bison being called, which again lacked support for a feature.

  GEN      src/ast/parser.cc
/usr/bin/bison: option `--defines' doesn't allow an argument
Usage: /usr/bin/bison [-dhklntvyV] [-b file-prefix] [-o outfile] [-p name-prefix]
       [--debug] [--defines] [--fixed-output-files] [--no-lines]
       [--verbose] [--version] [--help] [--yacc]
       [--no-parser] [--token-table]
       [--file-prefix=prefix] [--name-prefix=prefix]
       [--output=outfile] grammar-file

Making use of the version in pkgsrc instead resolved the issue by specifying bison in USE_TOOLS.

With the advancement of language development and new standards being defined, pkgsrc grew support for specifying which versions of C & C++ language standards a package may require e.g USE_LANGUAGES=c++03. This in turn passes the relevant standard to the compiler using --std= option. If the compiler being used doesn’t support the standard specified all tests in GNU configure to determine the compiler and language support will fail, resulting in a cryptic configure: error: C++ preprocessor "/lib/cpp" fails sanity check (cpp is in /usr/bin on Tiger). As a local workaround I have commented out the block in pkgsrc/mk/compiler.mk so that the standard is not set. Not sure whether a knob to ignore setting the stand standard is worth it or to move forward by enforcing the use of a new (external) toolchain.

With these change, the bulk build result went from 673 packages out of 1878 to 1067 out of 1878. The resulting packages and bootstrap kit are now up on sevan.mit.edu.

Next step is to address support for find in the pkgsrc tools infrastructure, remove setting 64bit mode on G5 macs as a bootstrap mode as Tiger really doesn’t support it.

Thanks to viva PowerPC for the plug.

A glimpse of sevan.mit.edu as it runs currently:

Something blogged (on pkgsrcCon 2018)

pkgsrcCon 2018 badge

For this years pkgsrcCon, the baton was passed on to Pierre Pronchery & Thomas Merkel, location Berlin. It wasn’t clear whether I would be able to attend this year until the very last minute, booking plane tickets and accommodation a couple of days before. The day before I flew out was really hectic and I did not get any sleep. I left home around 1:30am and made my way to St Pancras for the train to the  airport as It was an early morning flight. Once we’d boarded the flight, I passed out without any recollection and came to just before we were going to descend.  With less than two hours sleep and plenty of time to spare until the evening social I slowly made my way from the airport to the city centre. I roamed the city centre looking at the street art and jumping on and off metro stations between Jannowitzbruke and Westkreuz.

In the evening I headed to the social event and met up with others.  I heard about the state of The Unleashed Operating System, the latest buzzword soup to add to current projects for instant success (block chain, AI and something else I forgot the to note down), debug work flow and lots of other things over food and drinks. I had taken my ThinkPad X60s alongside a 12″ iBook G4 to Berlin, with the plan to see if jdolecek with the machine in person could shed any light on a deadlock issue I experience when hammering the CPU with a kernel build while 2 or more CVS operations are in progress and to also get setup with Yubikey on NetBSD so I could commit from any machine with a USB port without having to copy my keys around to different machines.

On the first day of talks, between preparing slides, spz pointed out what I needed to get the Yubikey working with SSH on NetBSD. The current Yubikeys support multiple features (U2F, OTP, CCID). In order to use a Yubikey as a CCID device with pcscd, pcscd expects a ugen(4) device which doesn’t work on NetBSD because the USB keyboard driver ukbd(4) attaches to the Yubikey and prevents other modes of access. The workaround is to roll a new kernel with the device explicitly hard coded to use ugen(4). With that in place, I was able to checkout the developers source repo on the iBook. For the X60s, jdolecek rolled a fresh kernel with with the DIAGNOSTIC and LOCKDEBUG which we booted and reproduced the deadlock, unfortunately, there was no new information provided by these option, possibly indicating that it’s not a locking issue. A new kernel was rolled with an extra printf() which I tested with, however I couldn’t reproduce the issue with this change. Instead the CVS update just got slower and slower while the system operated normally (without deadlock). Unsure if it was the lack of bandwidth or a new behaviour, I gave up after several hours of the cvs update running. To rule out connectivity, my next plan is to test locally using a mirror of the repository using rsync and perform the CVS operations from that source instead.

I gave a talk titled “Something Old, Something New, Something Borrowed”. Something Old was about NetBSD/macppc, a homage to my first pkgsrcCon back in 2014 where I gave a short talk about pkgsrc on Tiger PowerPC with my 12″ PowerBook, Something New was about Upspin, Something Borrowed was about Minix3. Slides.

For coverage of the different talks and leot‘s experience in Berlin, see this blog post.

After a day of presentations we headed over to a restaurant where we chatted some more over dinner. While we waited for our food to arrive, we were given a demo of  the J programming language by Martin, he described J as a pocket calculator on drugs. I’d previously heard of the K programming language and its cryptic syntax but had not seen anything with such a minimal and cryptic syntax in action.

For the following morning, uwe gave a talk about the Forth programming language and experiences with it, especially with OpenFirmware which originates from Sun Microsystems and was also used by Apple on the PowerPC based Macs. The cool thing about OpenFirmware is that hardware can itself contain a driver for OpenFirmware which gets loaded when the device is initialised, making OpenFirmware aware of how to interact with the device without any manual loading of software by the operator.uwe mentioned NetBSD/ofppc which builds on that by having a kernel without drivers for on-board hardware, instead relying on OpenFirmware to communicate with devices instead. He also described how it is possible to learn about how a system works by using the ccommand in OpenFirmware to disassemble modules, though he warned that unfortunately this didn’t work so well on Macs as symbols had been stripped from OpenFirmware environment. There was lots of references to follow up material to read up on, from The Evolution of Forth paper to uwe‘s notes.

After lunch time, we packed up and headed to a computer museum. On the way we spoke about emacs and workflows like using M-x make-frame-on-display.

On the Monday, I met up with sborrillfor breakfast before heading to the spy museum and doing some sightseeing. I headed back to c-base where I met Pierre, Youri & Sebastian. As it approached the evening, It was time to make way to the airport again. I was back home by around 1am, Tuesday morning, so glad I went 🙂

21/07/18 10:20BST – There has been backlash from residents to stop Google from setting up a new campus in Kreuzberg, Berlin.

Lessons learnt from adding OpenBSD/x86_64 support to pkgsrc

Before even getting into the internals of operating systems to learn about differences among a group of operating systems, It’s fairly evident that something as simple as naming is different between operating systems.

For example, the generations of trusty 32bit x86 PC is commonly named i386 in most operating systems, FreeBSD may also refer to it as just pc, Solaris & derivatives refer to it as i86pc, Mac OS X refers to it as i486 (NeXTSTEP never ran on a 386, it needed a minimum of a 486 and up until Sierra, machine(1) would report i486 despite being on a Core i7 system), this is one of the many architectures which needed to hadled within pkgsrc. To simplify things and reduce lengthy statements, all variants for an arch are translated to a common name whiche is then used for reference in pkgsrc. This means that all the examples above are grouped together under the MACHINE_ARCH i386. In the case of 64bit x86 or commonly referred to as amd64, we group under x86_64 or at least tried to. The exception to this grouping was OpenBSD/amd64, this resulted in the breakage of many packages because any special attention required was generally handled under the context of MACHINE_ARCH=x86_64. In some packages, developers had added a new exception for MACHINE_ARCH=amd64 when OPSYS=OPENBSD but it was not a sustainable strategy because to be affective, the entire tree would need to be handled. I covered the issue at the time in A week of pkgsrc #11 but to summarise, $machine_arch may be set at the start in the bootstrap script but as the process works through the list of tasks, the value of this variable is overriden despite being passed down the chain at the begining of a step. After some experimentation and the help of Jonathan Perkin, the hurdles were removed and thus OpenBSD/x86_64 was born in pkgsrc 😉

The value of this exercise for me was that I learnt the number of places within the internals of pkgsrc I could set something (by the nature of coupling components which share the same conventions (pkgtools, bsd make)) and really the only place I should be seeking to set something is at the start of the process and have that carry through, rather than trying to short circuit the process and repeat myself.

Thanks to John Klos, I was given control of a IBM Power 8+ S822LC running Ubuntu, which started setting up for pkgsrc bulk builds.
First issue I hit was pkgsrc not being able to find libc.so, this turned out to be the lack of handling for the multilib paths found on Debian & derivates for PowerPC based systems.
This system is a little endian 64bit PowerPC machine which is a new speciality in itself and so I set out to make my first mistake. Adding a new check for the wrong MACHINE_ARCH, long forgotten about the previous battle with OpenBSD/x86_64 I added a new statment to resolve the relevant paths for ppc64le systems. Bootstrap was happy with that & things moved forward. At this point I was pointed to lang/python27 most likely being borken by Maya Rashish, John had previously reported the issue and we started to poke at things. As we started rummaging through the internals of pkgsrc (pkgsrc/mk) I started to realise we’re heading down the wrong path of marking things up in multiple places again, rather than setting things once & propogating through.

It turned out that I only need to make 3 changes to add support for Linux running on little endian 64bit PowerPC to pkgsrc (2 additions & 1 correction 😉 )
First, add a case in the pkgsrc/bootstrap/bootstrap script to set $machine_arch to what we want to group under when the relevant machine type is detected. In this case it was when Linux running on a ppc64le host, set $machine_arch to powerpc64le. As this is a new machine arch, also ensure it’s listed in the correct endianness category of pkgsrc/mk/bsd.prefs.mk, in this case add powerpc64le to _LITTLEENDIANCPUS.
Then correct the first change to replace the reference to ppc64le for handling the multilib paths in pkgsrc/mk/platform/Linux.mk.

The bulkbuild is still in progress as I write this post but 5708/18148 packages in an the only fall out so far appears to be the ruby interepreters.

12″ PowerBook G4 PT5 – Electronic Battle Weapon

Preparation for a trip started off a little earlier this christmas. I planned to take my PowerBook on the road with me to Hamburg for 33c3. Previous attempts to use this machine as my primary system on the road in the past had been thwarted by leaving too little time to build & prepare before departure.
The system has been dual booting NetBSD & Mac OS X Tiger for some time now, recently I’ve been doing almost daily upgrades to NetBSD-HEAD on the system using the generated iso images from NYFTP.
My plan was to get the machine installed with a current build FireFox on NetBSD & bring the existing installed packages up to date. I managed to update the existing packages without any problems but it didn’t look like FireFox was going to build successfully. The package as-is currently in pkgsrc does not build on NetBSD/macppc. I was pointed to a patch in pkg/48595 which was pending commit and required testing. It cleared up the initial issue I ran into but the build still failed (see previous link on updates about the failure), though it took a little longer to fail in the day. After several days of failed build attempts I made sure I had an up to date copy of TenFourFox installed on Tiger and settled for Dillo on NetBSD instead.

My usage of Dillo stayed somewhat basic during the trip, despite having the Mozilla certificate bundle installed, I could see any obvious way to point Dillo to it & have it use it. Hence, any site using SSL I visited generated a certificate warning. Perhaps the config should’ve been done in wget?

www/dillo pkgdepgraph

Alexander Nasonov created packages for Dillo & Links-gui targeted for running under a minimal chroot but I did not get around to trying them out. There are chrooted browser packages for other browsers in his pkgsrc github repo. The screenshot above shows the www/dillo package’s dependencies, generated using pkgtools/pkgdepgraph

Moving on, the AirPort Extreme card in the laptop is based on a Broadcom chipset which has a flaw, it’s incapable of addressing memory above 1GB (30 bits) which means the driver needs to care for that or else the card doesn’t work. This is not unique to this Broadcom chipset, the BCM4401 10/100 ethernet interfaces which use the bce(4) driver also suffer from the same problem (unable to address memory allocated above 30 bits), the BCM580x ethernet interfaces which use the bge(4) driver suffer from not being able to address more than 40 bits. Going back to the wireless chipset, the bwi(4) driver which is used in the BSDs, originated from DragonFly BSD. This driver was put together by Sepherosa Ziehau using the documentation from a reversing effort in the Linux community. The bwi driver was then imported in to Free/Open/NetBSD and was eventually removed from DragonFly BSD. A new wireless subsystem was introduced in DragonFly which required change to drivers to work again and the bwi driver was never adapted. It now lives on in the other BSDs.

The version of bwi(4) driver came to NetBSD from OpenBSD, ported by Taylor R. Campbell back in 2009. At the time neither version of drivers could handle the 30 bit bug so you either ran with less than 1GB of RAM or used another card. In 2014 Stefan Sperling committed a workaround for this in OpenBSD. I wanted this fix in NetBSD so my wifi could also work & asked the NetBSD developers if such a change was appropriate in NetBSD. I was introduced to bus_dma(9) and the bus_dmatag_subregion() function, the bce(4) driver was my reference on how to use the function. Looked fairly straight forward, a single call this function and off you go, wasn’t too sure how it would fit into the bwi driver but I thought I’d have a go.

This was one of the things I was hoping to work on during my trip but It turned out to be the only thing I attempt. I happened to meet Stefan at 33c3 and we discussed the driver, the work around and the mighty days of the past when Damien Bergamini was hacking on the OpenBSD WiFi stack. In the OpenBSD driver Stefan had opted to deal with the issue of allocating memory in a specific region directly in the driver rather than adding a new interface to the kernel for  such a task so with a bit of thought about the past and a review of the driver, I was given a diff of the changes and suggestions about where I could start making changes.

I still don’t know yet if it’s possible to lift the changes from OpenBSD and apply them to the NetBSD version of the driver, because the DMA framework is different between the systems.
Partially implementing the change Stefan made without all the bounce buffers he’d added in the OpenBSD driver didn’t work and using the bus_dmatag_subregion() function didn’t work either. I pursued the bus_dmatag_subregion() path during 33c3 and didn’t get anywhere. At this point I started looking deeper in the system by looking at the implementation. It was at this point that I discovered this function was defined to EOPNOTSUPP on PowerPC based systems. No matter what I had tried with this function it was a waste of time^W^W^Wvaluable learning experience about keeping documentation up to date & consistent.

At this point I started looking into adding support for tagged subregions so I could make use of the function. The implementation is fairly simple, a public function for a developer to use which performs various tests and a private function which is called to deal with the memory allocation. Unfortunately there were some missing members from the data structure on the powerpc side of NetBSD which needed further investigation and I stopped there for the time being.

For the trip I relied on a tiny Realtek RTL8188CUS based wifi adapter to get me network access. The card worked on the 802.1x enabled SSID at 33c3 using wpa_supplicant(8) on NetBSD/maccppc.

urtwn0 at uhub4 port 5
urtwn0: Planex Communications Inc. GW-USNANO2, rev 2.00/2.00, addr 2
urtwn0: MAC/BB RTL8188CUS, RF 6052 1T1R, address 00:22:cf:xx:xx:xx
urtwn0: 1 rx pipe, 2 tx pipes
urtwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
urtwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps

The driver for this card is now enabled by default in the GENERIC config file for NetBSD/macppc along with a group of other drivers for USB peripherals.

Thanks to Stefan for his help and advice with the bwi driver and Alex for the chrooted browser packages! 🙂

pkgsrcCon 2016 videos

The videos from some of the talks that took place at pkgsrcCon in Kraków, Poland during the Summer are now available on the Internet Archive.
This years conference drew speakers from many different project, not necessarily BSD related though Net/Free/OpenBSD were represented. Talks were on a pretty diverse range of topics around software but unfortunately not all talks were recorded. I had the opportunity to tag on Mateusz Kocielski’s slot on security & give a brief talk on the work of the pkgsrc security team (slides), whilst he covered the work of the NetBSD security team (see files security-team & flash-die).

The schedule for the conf is available here

Adventures in Open Source Software: Dealing with Security

I had the opportunity to give this talk at the London chapter of DefCon, DC4420 and censecutivily at London Perl Mongers technical meeting last week.
The subject of the talk was all the factors outside of doing security research which can make the process of dealing with advisories a daunting or a seamless process. As observed during the last year, while working as part of a security team.

Slides

Intro:
pkgsrc is a crossplatform packaging system by the NetBSD project, forked from the FreeBSD ports in the late 90’s, initially the primary target was NetBSD but with the portability focus of the project, the list of supported flatforms has grown to a list of 23 operating systems (16 out of those 23 are currently actively worked on). Within the pkgsrc project, there is a dedicated security team whose responsibility is to audit published vulnerabilities and ensure that those which apply to software we offer packages for are listed in a file. Users download this file & use it to check their installed packages.
Other open source projects have teams who are more involved and participate in the security research process and publish their own advisories, such as Debian or Redhat but that is not  main focus of our team.
There may be several reasons for this, during my talk I refered to the aquisition of a security company by Redhat, but looking up Redhat on Wikipedia, there doesn’t appear to be anything to suggest that.
I can say that for the pkgsrc-security team, the role is focused on filtering information and ensuring that items are listed in the pkg-vulnerabilities file, maintainers are notified (if there is one) and co-ordinating with the release engineering team so that necessary commits are pulled into the relevant branches. This is because we try to avoid dealing with development within our tree and opt to co-ordinate with upstream to submit fixes. Majority of our changes focus on removing assumptions to ensure things are built in a consistent manner and allow the software to be packaged how we like.

Dealing with advisories:
The advisories which we receive range in quality / detail.
A personal favourite are the drupal advisories. We offer the drupal core as a package but not any of the 3rd party modules. Their advisories clearly indicate if they apply to core or any published 3rd party modules and which scenario is needed for the vulnerability to be exploitable eg the user must be able to upload content.

The opposite of that is independently published advisories without any co-ordination with affected parties or independently published advisories which are disputed or not acknowledge from upstream. In these situation the role becomes more involved in order to work out if there is clearly an issue or not.

Then there’s Oracle advisories, we can confirm there’s one or more problems in the following versions of software, no more details than that. Upgrade to this version at a minimum to fix said issue(s). Here’s a chart so you can evaluate the risk.

It can be that upstream has actually made an announcement with the details of an issue in public but the mitre website will still lists the CVE as reserved. Ideally you’d like to list the mitre site in pkg-vulnerabilities as it’s where IDs are assigned and it’s self referencing (url will contain the CVE id). But it’s a terrible thing to do to a user. “You have a package installed which is vulnerable to the following type of issue follow this link to not find out any more information about it. Go fish”. Or maybe you have no choice.

Project Websites:
If the published advisory come via a Linux distribution it can be common that the fix references a binary package for users to install or perhaps further information required. In the 2000’s Soureceforge was a popular host for open source projects usually complimented with a separate web page of some kind, it’s now common to have projects which solely exists as an authoritative repo on github. In either scenario, a dedicated section for publishing security information is usually not found. This trend is also prevelant in large commercially backed projects, which play an extremely critical role. Projcts such as ICU (International Components for Unicode), a project by IBM which deals with unicode, an issue in ICU can mean an issue in chrome/chromium, java.

There are also projects like Qemu which have a security page for submitting vulnerability information but never publish advisories themselves. It is common for advisories to reference a git commit email. KVM completely lacks any links related to security. Qemu has a strong link with Xen & KVM which rely on Qemu in one way or another.
While we do not offer KVM as a package, we do at present have four different versions of Xen in our tree and Qemu. This becomes a bit of a timesink when there are multiple advisories to address.

Commercial Repositories:
There are opensource projects with no publicly accessible source code repo. This makes the evaluation of the range of effected verisons difficult if the project only chooses to cover their supported versions.
ISC up until recently (past two years?) required paid membership to access BIND’s repo.
In the talk I refered to the ICU project here, this was incorrect. ICU advisories are either reserved or the bug report access blocked from public view.

OpenSSL:
Relationships with projects are important and they play a critical role in not only sharing information but code as well. Changes for 3rd party software really needs to be passed to the 3rd party to take care of. If relationships have tourned sour, it makes sharing somewhat difficult and has further implications when developing a project with the support for other projects in mind.
The LibreSSL project published a patches page which covered the changes needed to get affected software built but also co-ordinated with upstream projects to get the fixes integrated. Some projects needed more pressure^Wpersuasion than others to accept the patches.

Key components & deadware:
As mentioned previously, you want changes to go back up stream and not to carry changes in your own tree. But there are scenarios where this is possible, for example, the project is no longer developed. This is a huge problem if the project is widely used because you end up carrying local patches which hinders progress when auditing for vulnerabilites by consumers of the software downstream. It’s no longer a case of ensuring you have a specific version number but which patches are also applied to that baseline version, that then opens further questions about the patches, have you created a new issue that didn’t exist previously??
The widely used unzip utility is such an example,  are you patched for CVE-2015-7696?

An example fragmentation of fixes being carried locally is libwmf, with the announcement of some CVEs earlier this year, Jason Unovitch from the FreeBSD project discovered that there were unpatched vulnerabilities in this library going back to 2004, with patches spread across different Linux distributions, none carrying fixes for all advisories, in one case a hunk of the patch didn’t even apply. Development for libwmf stopped in the early 2000’s but it still exists as a project on sourceforge.

Jasper is another commonly used graphics library, this time for jpeg-2000, again development ceased long ago. In this case Slackware put out an advisory for their package to cover vulnerabilities from the past, going back to 2008, at which point we realised that we didn’t have the fixes either. The version in OpenBSD ports was vulnerable to the issues listed from 2014 but the vulnerabilities from prior (2008) were fixed because they’d been flagged up by the compiler in OpenBSD.

Widely Deployed:
Popular projects which have a large install base can greatly increase impact of a mistake, hence local changes should be kept to a minimum to ease maintanence and auditability.
Projects can see a very fast release cycle, especially ones which have advisories published about them regularly.
Keeping local changes to a minimum reduces the necessary effort to update. With projects which rely on downstream consumers to publish information it makes the process more difficult. Both KVM & QEMU projects do not publish any advisories themselves, at best you may have a git commit email which may be the patch you carry locally. Thankfully the Xen project publish advisories on Qemu as it can be a dependency. They are able to flesh out the details of the issue a little better than a vague commit message.
I’m unsure what happens if you’re not a Linux distro and utilise KVM.

Co-ordinating with upstream:
As I mentioned, relationships are important. An understanding and tolerrance for difference is absolutely essential in the world of software just as it is in day to day life. A common topic of disagreement is licensing, the terms expressed by said licenses and the strong opinions expressed by the participants in the disagreement. Whatever ones belief, the need to co-ordinate with people from different groups is absolutely necessary.
Of the fixes upstreamed from LibreSSL, the author of stunnel rejected a fix initially but eventually changed his mind. The change in question was a 2 line addition to add an ifdef statement so that RAND_egd function was only used if the SSL library being linked to offered such a function (detected by autoconf already). The author rejected the change based on terms of licensing of his project when the change was submitted.

Taking bigger leaps in a software project by trying to clean up a popular target can amass a large collection of local patches which need to make their way upstream. As observed by the Alpine Linux project, a Linux distribution with a new libc called musl libc. While it’s possible to build over 13000 packages with Debian 8 on pkgsrc, the package count is less than 9000 on Alpine, despite both being Linux distributions.

The submission process can be quite daunting depending on the project, to filter out submissions which may not be sound and reduce the workload of developers working on a project, some opt to requiring certain things such as results from a test suite or alike. It doesn’t help matters if project has multiple branches developed in parallel without changes being in sync.
Dealing with GNU toochain such as GCC can be very much like this. Again, local changes amass, slow transforming the local version of the toolchain to an extended version of upstream. While the toolchain may offer security features such as SSP (stack smashing protection), it’s not just the simple case of being able to switch it on, in some cases it either doesn’t work or worse, it results in broken binaries. Work to enable some of these features in pkgsrc began in the summer.

While not specifically security related, I was reminded of an incident on the OpenBSD mailing lists with the author of the once popular ion window manager. The ion developer was frustrated with operating system projects packaging older versions of his software with local changes as users where following up with him for support (in this case he was referring to RedHat). So he came on to the OpenBSD mailing list to ask about updating the software to the latest version available but that was not possible due to incompatible license changes. As usual, the combination of making demanding + licensing drama didn’t work out to well.

Conclusion

  • Organise the information on your site, for an Open Source software project your source control repo should be a single click away from your websites main page.
  • Provide a dedicated security page or section where you post advisories.
  • Write descriptive commit messages.
  • Participate in the community and help your neighbours
  • Even if you stop developing software, the code may live on longer than envisioned, think about what happens if/when you decide to stop (who become the authoritative repo).
  • Don’t make changes locally which do not go upstream by default, for it’ll surely bite you or a member of the project later down the line.
  • Publish actual advisories for your project, don’t pass the buck.
  • Technical problems are best solved with technical solutions eg a bug can still continue to exist despite adhering to a license.
  • Make the submission process to your project effortless for both parties, not just one or the other.

A week of pkgsrc #12

To fill in the gap since the last post, I thought I’d get the notes which had been collecting up, posted here. pkgsrc got a mention in the Quarterly FreeBSD status report. My bulkbuild effort started on FreeBSD/amd64 10.1-RELEASE but thanks to my friend James O’Gorman, I was able to expand to FreeBSD 11-CURRENT and recently switched over from 10.1-RELEASE to 10.2-RELEASE.
I got the idea to try to pkgsrc on Android after someone posted a screenshot of their Nexus 7 tablet with the bootstrap process completed.

There are several projects on the google play store for running the user land built from a Linux/arm distro in a chroot on Android.
The first project I tried was Debian noroot (based on the tweet that inspired me), it spawned a full X11 desktop to run & so the process was painfully slow.

Switching to GNUroot Debian which just ran a shell in the chroot was much faster at extracting the pkgsrc archive though bootstrap still took long. The best result was with Linux deploy using an Arch Linux user land, everything was very snappy.

On Mac OS X Tiger PowerPC, GCC 5 appears to no longer require switching off multilib support when building on a 32-bit PowerPC CPU, my hardware has changed but the CPU is still a G4. The same changes to force dwarf2 and removing the space in-between flags and paths fed to the linker were otherwise required, as with previous versions of GCC.

I spent a little time with OmniOS and “addressed” the outstanding issues which prevented it from working out of the box. shells/standalone-tcsh was excluded on OmniOS which prevented the version of tcsh shipped with the OS from being clobbered during bulkbuilds. The other issue was what appeared to be a problem with gettext but turned to be an issue with the compiler shipped with OmniOS. This became a topic of discussion on what the correct solution to the problem is. The GCC provided with OmniOS is built with Fortran support and includes the OpenMP libraries (I’m guessing this is the reason for the libraries) in its private lib directory inside /opt/gcc-4.8.1/lib, it turns out that gettext will make use of OpenMP libraries if it detects them during configure stage which I’ve not been able to find a concrete answer for why, the GCC documentation don’t say more than a paragraph about the OpenMP libraries themselves (libgomp) either. The problem was that GCC was exposing its private library in the link path but not in the run path, this meant you could produce binaries which would compile fine but would not run without having to play around with the runtime linker. In my case I’d previously added the private library locate to the runtime linkers search path as a workaround, I disabled the OpenMP support in devel/gettext-tools and that’s where the discussion began. Basically, it’s not possible to expose the private library location to the linker because that would cause issues with upgrades. The location should not be exposed by the compiler in the first place (I guess this was for the convenience of building the actual release of OS?). Richard Palo pursued the issue further and I’m informed that future releases of OmniOS will move libgomp out from this private location to /usr/lib so that it’s in the default library search path.

With the introduction of the GPLv3 license, GNU projects have been switching to the new license. This causes problems for projects outside the GNU eco-system which utilise them if the terms of the new license are unacceptable for them. Each project has dealt with it differently, for OpenBSD they maintain the last version which was available under GPLv2 & extend the functionality it provides. Bitrig has inherited some of this through the fork. Through the bulkbuilds it was revealed that the upstream version of binutils has no support for OpenBSD/amd64 or Bitrig at all. Adding rudimentary support was easily achieved by lifting some of the changes from the OpenBSD CVS repo. While at present I’m running bulkbuilds against a patched devel/binutils which I’ve not upstreamed or committed for both OpenBSD & Bitrig, I am thinking that for OpenBSD we should actually just use the native version and not attempt to build the package. For Bitrig, there is already a separate package in their ports tree for a newer version of binutils, it’s pulled in alongside other modern versions of tools under the meta/bitrig-syscomp package so it makes sense to mimic that behaviour.

Coming to the realisation that stock freedesktop components were not going to build on OpenBSD, I switched to using X11_TYPE=native to utilise what’s provided by Xenocara. Despite the switch, pkgsrc still attempted to ignore the native version of MesaLib and try to build its own, the build would fail and prevent a couple of thousand packages from building.
This turned out to be because of a test to detect the presence of X11 in mk/defaults/mk.conf, it was testing for the presence of an old path which no longer exists. As this test would fail, the native components would be ignored & pkgsrc components would be preferred. The tests for OpenBSD & Bitrig were removed & now default to a default of an empty PREFER_PKGSRC variable. The remaining platforms need to be switched over after testing now.

As Mac OS X on PowerPC gets older and older with time, the requirement for defining MACOSX_DEPLOYMENT_TARGET grows ever more redundant, Ruby now ships with it & unless it’s defined, you will find that it’s not possible to build the ruby interpreter any more. I am considering setting MACOSX_DEPLOYMENT_TARGET="10.4" for PowerPC systems running Tiger or Leopard so that packages could be shared between the two but have not had a chance to test on Leopard yet to commit it. I somehow ended up on a reply list for a ticket in the Perl RT for dealing with this exact issue there. They opted to cater for both legacy & modern version of OS X by setting the necessary variables where necessary.

Getting through backlogged notes to be continued

A week of pkgsrc #11

It’s been a while since the last post in the series, the details of what was covered in these posts was the partial basis of my talk at BSDCan and I got to repeat the talk again in Berlin, I was much less nervous the second time, not having a fire alarm going off during the talk may have helped. I will cover briefly some things that were mentioned in the talks which I hadn’t written up here, for the sake of completeness.
Thanks to the DragonFlyBSD folks, I have access to a build server for doing regular bulkbuilds on. As I’m running these as a unprivileged user, there’s not much parallelism in the package builds, it’s one package at a time. The system aptly named Monster is a 48 Opteron CPU server with a 128GB of RAM so I can at least run with MAKE_JOBS set to 96. At the start of the bulkbuilds some deadlock issues in DragonFlyBSD were revealed by pkgsrc which Mat Dillon addressed promptly.

On the Bitrig front, I managed to add support for the OS to lang/python27 which was the package causing the biggest breakage and now in the process of trying to get the support added upstream, there appears to be a bug report from 2013 in the Python bug tracker to add support ubut it was marked as won’t fix, I’m hoping the decision will be changed but will have to wait and see.
With Python 2.7 built successfully it was onto the next set of breakages, gettext!
I had taken a patch from OpenBSD ports for getting devel/gettext-tools building but was asked to back it out as it was not the correct solution to the problem. I decided to reapply the fix in my build just to progress to the next hurdle. The next major breakage was with devel/p5-gettext which needed to be told to include libiconv, I’m now stuck at getting converters/help2man building.
During this process I found that we were missing some necessary flags for creating shared libraries which were highlighted by clang:
relocation R_X86_64_32S can not be used when making a shared object; recompile with -fPIC

This turned out to be bug in the platform support, the necessary fPIC flags were defined but under a if statement for version of OS running with a.out binaries still. mk/platform/Bitrig.mk was stripped of anything related to a.out and everything was rebuilt again from scratch.

OpenBSD and Bitrig probably have many more breakages due to the fact that their architecture is detected as amd64 and not under the x86_64 banner by the build system. One example is x11/libdrm which is set to add sysutils/libpciaccess as a dependency if the host is a i386 or x86_64.
At present libdrm fails at the configure stage with
checking for PCIACCESS... no
configure: error: Package requirements (pciaccess >= 0.10) were not met:

No package 'pciaccess' found

Consider adjusting the PKG_CONFIG_PATH environment variable if you
installed software in a non-standard prefix.

Alternatively, you may set the environment variables PCIACCESS_CFLAGS
and PCIACCESS_LIBS to avoid the need to call pkg-config.
See the pkg-config man page for more details.

Trying to add OpenBSD to the x86_64 arch list revealed a problem in pkgsrc, the culprit being devel/bmake.
The problem is that there are three separate points where the architecture is defined. In the bootstrap script, in BSDMake’s on source and the settings it passes onto pkgtools/pkg_install. Unfortunately the settings defined at the start in bootstrap are ignored at the bootstrap stage & are not necessarily what pkg_install is built with. To add to this, it’s possible that BSDMake may need to work out what the system is for itself rather than to be expected to have settings passed to itself. That is they should build with settings passed down in succession or independently.
With severe bludgeoning of code between devel/bmake and pkgtools/pkg_install, I managed to get it to
pkg_add: OpenBSD/x86_64 5.7 (pkg) vs. OpenBSD/amd64 5.7 (this host)
pkg_install performs a check of the OS it’s running on against the settings it was built with (the settings bmake passed it during bootstrap), removing the check revealed there was nothing else preventing things from working but the check needs to be there.

For OmniOS, a major components components in the OS which caused many packages to break was the bundled gettext, failing during builds as it could not find the libgomp from (the also bundled) GCC. As a temporary work around to see how the build would progress if libgomp could be found, I added the lib directory to the search path of ld using crle(1).

Configuration file [version 4]: /var/ld/ld.config
Platform: 32-bit LSB 80386
Default Library Path (ELF): /lib:/usr/lib:/opt/gcc-4.8.1/lib
Trusted Directories (ELF): /lib/secure:/usr/lib/secure (system default)

Command line:
crle -c /var/ld/ld.config -l /lib:/usr/lib:/opt/gcc-4.8.1/lib

It was possible to build 13398 packages out of 16536 possible packages with this workaround in place.

With the help of Joerg Sonnenberger, at pkgsrcCon I added support for fetching the OS version info in OmniOS & SmartOS for use in build build reports, this should mean that these operating systems will be reported correctly rather than as SunOS 5.11.

sevan.mit.edu is back online as a G4 Mac Mini with 128GB SSD. It’s yet to complete its first bulkbuild since the rebuild but it’s nearly finished as I type this.

It’s now possible to build more than 14100 packages on FreeBSD 10.1-RELEASE with pkgsrc.

My talk at BSDCan 2015

List of talks and speakers printed on the back of t-shirt given to attendees

I gave a talk titled “Adventures in building open source software” where I covered the last year of playing with CoovaChilli and pkgsrc. This was my first time speaking at a conference, sadly the recording is in two parts because the fire alarm went off and we had to evacuate the building.

My slides

A week of pkgsrc #10

Following on from last weeks post, I forgot to mention building on OpenBSD/sparc64 via a LDOM running on a Sun T5210, this was even more painful than the Solaris counterpart and took the best part of a month, some of this delay was initially caused by problematic packages which held up the build, not parallelising the builds and again issues with FTP mirrors.
devel/electric-fence was another of packages which was responsible for holding up the build that I didn’t mention in the previous post. During the build it runs a binary called eftest and that’s it, it’s stuck there until killed.
The LDOM I was running in was allocated 4 vCPUs but the build was running as a single threaded build. Defining MAKE_JOBS=4 in pkg/etc/mk.conf and recompressing the bootstrap kit (bootstrap.tar.gz) helped this situation. To work around FTP issues, bulkbuilds were switched to HTTP only thanks to a pointer from Joerg Sonnenberger. As defined in pkgsrc/mk/defaults/mk.conf
#MASTER_SORT_REGEX= ftp://.*/
# Same as MASTER_SORT, but takes a regular expression for more
# flexibility in matching. Regexps defined here have higher priority
# than MASTER_SORT. This example would prefer ftp transfers over
# anything else.
# Possible: Regexps as in awk(1)
# Default: none

Setting MASTER_SORT_REGEX= http://.*/ in pkg/etc/mk.conf and recompressing the bootstrap kit ensured builds use HTTP from there on.

The bulkbuild report showed lots of fallout from packages which hadn’t been updated yet to support LibreSSL e.g. net/wget expects DES support.
Rodent@ fixed lang/python27 with the changes due for subsequent releases of Python.
Bernard Spil fixed Heimdal which failed due to the lack of RAND_EGD in LibreSSL, these fixes will be in the next release of Heimdal (1.6.0?), back porting the changes to 1.5.3 which is the current release available resolved the issue with lack of RAND_EGD but then failed at building a kerberised telnet due to changes in the OpenBSD IPv6 stack which removed functionality telnet was expecting to be there. There is no fix for the issue in the OpenBSD ports as Heimdal is set to build without legacy and insecure protocols such as telnet and rsh.

Due to the connectivity issues on the OpenCSW build cluster, I erased the error report and restarted the bulkbuild on Solaris 10 SPARC and 11 x86 to re-attempt everything that had failed during the previous run for whichever reason. The Solaris 10 SPARC bulkbuild has now finished with a total of 7389 packages built, previously 5701. I discovered a particularly nasty bug with lang/gcc3-c++ which cost 3 days as the configure stage ran over and over again before being killed manually.

My access to the AIX LPAR expired, taking with it what I had previously tried, I requested access again but this time also requested access to a LPAR running SUSE 12 on Power8 as well.
Still no further with the AIX LPAR but managed to getting bulkbuild going on the SUSE 12 one with a little bit of assistance.
The first thing which needed to be done was to specify the ABI and the suffix applied to the library search path. This is because the system is 64 bit without any 32bit libraries installed and by default pkgsrc opts for 32bit unless set otherwise. When attempting to bootstrap initially, it failed with ERROR: bin/digest: missing library: libc.so.6. I initially set about the wrong path of trying to locate the glibc-32bit rpm for SUSE on Power before realising what was actually required. This may have been a knee-jerk reaction from the past before the days of yum and such on Linux. With the necessary change to pkgsrc/mk/platform/Linux.mk the bulkbuild environment setup continued before hanging on the installation of pkgtools/pkg_install. pkg_add would hang and CPU utilisation would spike to 100%.

A backtrace of the running process in gdb revealed it was stuck on mpool_get().
(gdb) bt
#0 0x0000000010096650 in mpool_get ()
#1 0x0000000010093658 in __bt_search ()
#2 0x000000001009318c in __bt_put ()
#3 0x000000001000b614 in pkgdb_store ()
#4 0x000000001000430c in extract_files ()
#5 0x0000000010006fd0 in pkg_do ()
#6 0x00000000100075a4 in pkg_perform ()
#7 0x0000000010005650 in main ()

Turns out the issue also affects pkgsrc on Linux/ARM and was previously reported in a bug report from 2013 with a workaround. Setting the GCC optimisation level to 0 for pkgtools/libnbcompat and pkgtools/pkg_install allowed mk/pbulk/pbulk.sh to setup a buklbuild environment and a bulkbuild is currently in progress. The bulkbuild was initially aborted to added some critical missing components which caused major breakage.

zypper install libxshmfence-devel gettext-tools gcc-c++.

With Suse Linux on Power8, that bumps my operating system count to 9 across 5 architectures. Just need to get AIX going to round off the OS count. 🙂

A week of pkgsrc #9

The past few weeks have been pretty hectic, as the time for BSDcan gets shorter and shorter, I’m thinking about my talk and testing more and more in pkgsrc. Rodent@ added support for Bitrig to pkgsrc-current last month, his patches highlighted an issue with the autoconf scripts (which should be shared across core components) not being pulled in automatically. Joerg Sonnenberger resolved this issue and I regenerated the patch set again. With the system bootstrapped the next thing which was broken was Perl, applying the changes needed for OpenBSD resolved any remaining issues and the bulk build environment was ready. After three days, the first bulkbuild attempt on Bitrig was complete and a report was published. There is now a bulkbuild in progress with devel/gettext-tools and archivers/unzip fixed, that should free over 8400 packages to be attempted to be built.
For Solaris, my first bulkbuild on Solaris 10 completed after 22 days. Mid-April I also started off bulkbuilds on Solaris 11 (x86 and SPARC) using the SunStudio compilers (It’s not possible to use GCC at the moment due to removed functionality that was previously deprecated). The Solaris 11 SPARC bulkbuild is still in progress and the x86 bulkbuild is running. Unfortunately the build cluster had some connectivity issues and needed rebooting during the bulkbuild but not until lots of packages had failed to fetch distfiles, hence the figures look a lot worse than they could be. Solaris 10 SPARC report, Solaris 11 x86 report.

Through bulk building on multiple operating systems another issue that’s surfaced is problematic packages that hold the build up. On Bitrig mail/fml4 is an issue, on OpenBSD www/wml, FTP mirror issues for ruby extension on Solaris, Xorg FTP mirror issues on OmniOS. Things need regular kicking, a brief glance into pkgsrc/mk didn’t reveal any knobs which would allow the preference of HTTP for fetching distfiles. On Bitrig & OpenBSD I’ve excluded these packages from being attempted via NOT_FOR_PLATFORM statement in their Makefile until I have a look into the issue.

sevan.mit.edu completed another bulkbuild, pkgsrc-current now ships with MesaLib 10.5.3 as graphics/MesaLib, version 7 has now been re-imported as graphics/MesaLib7 by tnn@, the new MesaLib needed a patch for FreeBSD, similar to NetBSD to build successfully, due to ERESTART not being defined. At present, it’s still broken on Tiger as I’ve not looked into yet.

I revisited AIX again to test out pkgsrc once again, this has turned into a massive yak shaving session. I’ve yet to run a bulkbuild successfully as the scan stage ends with a coredump.
I originally started off with using the stock system shell, bootstrap completed successfully but scan stage of a bulkbuild would just stop without anything being logged. Manually changing the shell used to shells/pdksh in pkg/etc/mk.conf and pbulk/etc/mk.conf resulted in the following error message:
bmake: don't know how to make pbulk-index. Stop
pbulk-scan: realloc failed:

This turned to be a lack of RAM, my shell account was to a AIX 7.1 LPAR running on a Power8 host with 2 CPUs and 2GB of RAM committed, unfortunately the OS image IBM provided came with Tivoli support enabled and a bug in the resource management controller which meant RMC was consuming way more resource than it needed to. I was running with less than 128MB of RAM.
Stopping Tivoli & RMC freed up about 500MB of RAM, attempting to bulkbuild again, caused the process to fail once again at the same stage. With a heads up from David Brownlee & Joerg Sonnenberger, I bumped the memory and data area resource limits to 256MB.
This allowed the scan to finish with a segfault.
/usr/pkgsrc/pbulk/libexec/pbulk/scan[54]: 11272416 Segmentation fault(coredump).
pscan.stderr logged multiple instances of
bmake: don't know how to make pbulk-index. Stop.
The segfault generated a coredump but it turned out that dbx, the debugger in AIX was not installed. IBMPDP on twitter helped by pointing to the path where some components are available for installation, unfortunately, while the dbx package was available there, some of its dependencies were not. Waiting on IBMPDP to get back to me, I fetched a new pkgsrc-current snapshot (I couldn’t update via CVS because it wouldn’t build) and re-setup my pbulk environment via mk/pbulk/pbulk.sh.
I should mention that initially when I setup, I’d explicitly set CC=/usr/bin/gcc last time, then while trying to get various things to build subsequently, I’d symlink /usr/bin/cc to /usr/bin/gcc. When I came to set thing up with the new snapshot, I did not pass CC=/usr/bin/gcc this time round and found that I was unable to link Perl, not sure if this was the Perl build files assuming if on AIX & /usr/bin/cc exists, it’s XLC or if ld(1) takes on different behaviour but I had to remove this symlink.
Once everything was setup, the bulkbuild failed agin at the same place, except this time I had a different message logged.
/bin/sh: There is no process to read data written to a pipe..
I edited the bootstrap/bootstrap script & devel/bmake/Makefile to set shells/pdksh as a dependency & rerun bulkbuild.
The scan stage again completed with a coredump with this time pscan.stderr just contained Memory fault (core dumped).
I’ve committed these changes so pkgsrc-current now defaults to using shells/pdksh as its shell but have not been able to try anything else as this weekend the system is unaccessible due to maintenance.

At present, I’m attempting to bulkbuild pkgsrc-current on 8 Operating systems
OpenBSD (5.6-RELEASE & -current), FreeBSD, Bitrig (current), Mac OS X (Tiger), Solaris (10 & 11), OmniOS on 4 architectures (i386, AMD64, SPARC, PowerPC).
If I could get AIX going that would bump the OS & arch could up by 1. Maybe by the next post perhaps. 🙂

Thanks to Patrick Wildt for access to host running Bitrig and Rodent@ for adding support to pkgsrc.

Cross-platform software packaging with pkgsrc

I recently gave a brief talk titled “cross-platform packaging, from Oracle Solaris to Oracle Linux & many more Operating systems using pkgsrc” at Solaris SIG (formerly London OpenSolaris User Group), where I tried to describe the benefits of pkgsrc for a system administrator. This was the second talk on the subject of software packaging at these meetings, the first was by Mandy Waite back in 2009 on the OpenSolaris SourceJuicer which used the spec files usually offered for generating rpm packages. There were no slides and I was quite nervous so I thought I’d write this post to clarify things.

pkgsrc is a portable framework for building software with more than 12000 packages available (with variants of packages, more than 15000) on 22 different platforms (see platform notes for level of support).

pkgsrc is inspired from FreeBSD ports. As with ports, each unique package is assigned a sub-directory in a parent directory named after the relevant category. This directory contains a minimum of a Makefile, a file containing cryptographic checksums of the source files, a file containing a description and a packing list which contains a list of files which will be installed. Sometimes there may be additional files which need to be installed or patches which need to be included as well. Though patches exist in the tree, we strive to upstream changes as much as possible to a) be good netizens b) make maintenance easier going forward.

Packages which are available to be built from pkgsrc generally do not have changes outside what’s needed to integrate with the framework or build correctly on various systems where additional flags need to be specified. There is a mechanism in place for packages to offer build time options to enable or disable functionality.

We try to use components offered by host operating system if viable but offer the facility to rebuild against versions available from pkgsrc.

If you’re mandated to use specific packages for system wide use but require tools for use personally, there is a unprivileged mode where packages are installed within ~/pkg (or another prefix of your choice) and the OS is left untouched.

There is a dedicated security team responsible for keeping a vulnerabilities file up to date so that packages installed from the tree that are affected will be flagged. There is also support for signed packages which utilises gpg for signing.

To start using pkgsrc you either need to run the bootstrap script in pkgsrc/bootstrap (fetch & uncompress pkgrc.tar.gz) or obtain a prebuilt bootstrap kit for your OS from pkgsrc.org. If you bootstrap manually using the script you’ll end up with your own copy of the bootstrap kit which you can distribute amongst other systems running the same operating system. With a system bootstrapped, you have the necessary tools to build software from the pkgsrc tree or fetch and install prebuilt binaries.

Assuming that your bootstrap kit is /usr/pkg & you uncompressed your pkgsrc archive to /usr/pkgsrc and you wanted to install nginx
cd /usr/pkgsrc/www/nginx
/usr/pkg/bin/bmake install

You can setup a build environment using the pkgsrc/mk/pbulk/pbulk.sh script which will allow you to build packages in bulk or individually for the purpose of distribution across multiple system.

Solaris SIG name badge

A week of pkgsrc #8

Or should that be a month of pkgsrc!
Since I wrote my last post I received the good news that I’ll be presenting at BSDCan this year in Ottawa. I’m really stoked about this and looking forward to presenting there. In preparation for the big day, I’ll be speaking about different aspects of what I’ve been working on at various events/meetings. My first speaking opportunities will be at Oracle’s SolarisSIG (LOSUG in a past life) tomorrow on cross platform packaging with pkgsrc.
I didn’t have any hosts running Solaris to attempt bulkbuilds on for this, so I asked for shell access at various places, this landed me with access to Solaris 9 to 11 on x86 & SPARC and OmniOS.
Shell access to a Solaris 10u9 host came first and I quickly found that mandoc would not build due to the use of mkdtemp(), dirfd() and vasprintf() whilst attempting to setup a bulkbuild environment using the mk/pbulk/pbulk.sh script in pkgsrc.
Following up with the mandoc developers regarding the issue meant the issue was fixed by Ingo Schwarze within a couple of days with access to Solaris via OpenCSW, the next release of mandoc due Summer time will contain these fixes. I’m undecided whether to back port the fixes to the current release or not. If you have a Solaris 10 host with a “recent” patch cluster (unsure how recent) you should be unaffected by this issue.

On OmniOS, the shells/standalone-tcsh package reared its head again. OmniOS ships with tcsh as standard which shells/standalone-tcsh clobbers again during bulkbuild. The solution to this may be that pkgsrc becomes aware of the various Illumos distros instead of treating everything as SunOS, but nothing has been decided for definite and needs further discussion.
The bulkbuilds on OmniOS using the toolchain published in IPS for that release show issues in pkgsrc where libgomp bundled with GCC is not being found by gettext, this causes lots of packages to fail.
The first bulkbuild which completed showed it was possible to build 9547 out 15245 packages without issue but the inability to build databases/shared-mime-info due to the libgomp issue caused 2274 packages to fail. This was the large impact of the issue mentioned but there are more packages which suffer from the same problem.

With the resumption of bulk building pkgsrc on FreeBSD, I found lots of packages which would fail to package due to use of the -o flag with pax(1) which is not present in the FreeBSD version. I’ve not yet addresses the issue with those packages but I did raise a patch for FreeBSD-current so that it will be available in future versions of FreeBSD. It is now committed and will treacle down to 10-STABLE at some point for inclusion in FreeBSD 10.2-RELEASE.

Bulk builds have also resumed on OpenBSD/amd64 and Sparc64, there is a third bulkbuild in progress on amd64 while the sparc64 host is still running the first attempt. The use of unprivileged user account and the lack of nullfs support prevents the setup of parallel runs to speed up the process. Building on OpenBSD was extremely useful for indicating what software in pkgsrc is effected by LibreSSL without having to put effort into integrating LibreSSL into pkgsrc first. It also raised an instance of copying overlapping buffers destructively (using memcpy() vs memmove()), a fix for this issue is yet to be committed pending further investigation. MAKE_JOBS was raised to 8 on OpenBSD/sparc64 to improve performance whilst compiling, the build environment is hosted on a Sun T5210 with a dedicated disk mounted with soft writes & the kern.bufcachepercent value bumped from 20 to 50.

With the pkgsrc-2015Q1 release announced, the figures for packages indicate that Darwin/PowerPC exceeded Darwin/x86.

11224 binary packages built with gcc for Darwin 8.11.0/powerpc
10019 binary packages built with gcc for Darwin 10.8.0/i386

My first bulkbuild report sent to to the pkgsrc-bulk mailing list back in September 2014 shows a total of 8501 packages could be built. There is more work required to keep this position in the future, as a new version of MesaLib is to be committed at some point soon. There is a bulkbuild in progress on sevan.mit.edu, upon completion I’ll be placing a request for the host to be wiped and setup again, the bulk builds are getting slower and slower and I suspect the IDE disk may be failing.

Thanks to Phil Stracchino, Sascha Curth and the OpenCSW Project for access to hosts running Solaris. Ian Kremlin, Brandon Mercer & Rodent @ NetBSD for access to hosts running OpenBSD.

Virtualising retail Mac OS X images on OS X with virtualbox

For testing changes related to OS X in pkgsrc I revisited trying to get virtual machines of the various releases of OS X running to improve test coverage. At present I’m confined to testing on Tiger and Mavericks though I also have machines running Leopard and Lion but they need setting up.
By default, it’s not possible to boot an instance of Mac OS X from a genuine install image, on a Mac host, running OS X using virtualbox.
Searching around reveals using modified images intended for building Hackintosh as the solution most people use. Virtualbox supports OS X guests but when following the usual steps in the wizard to create a new VM & pointing it to your unmodified OS image, nothing much happens.
Depending on the version of OS X you’re trying to boot you’ll either end up with a XNU hang/panic or just dropped straight to an EFI prompt.
Again, depending on the version of OS X being attempted the issue differs. I’ve managed to install 10.7 to 10.10 successfully on virtualbox so far. 10.5 & 10.6 remain to be done.

10.7 – Lion

With the release of 10.7, Apple changed the way OS was packaged, the digital distribution came with a disk image named InstallESD.dmg nested inside an application named Install Mac OS X Lion.app. It’s possible to use this disk image with virtualbox as-is without change however the system will not boot from the image because it fails a test by the boot loader to ensure the image is being booted on a genuine Mac. In my case it is, but unfortunately the cpuid virtualbox presents to the operating system is not one that the OS recognise & so it fails.
The solution to this is to tell virtualbox to mask the cpuid of the guest, unfortunately depending on the version of hardware? or virtualbox that you’re using you may have to experiment with which ID works. I first tried the ID 00000001 000306a9 00020800 80000201 178bfbff listed in the post by BitTorrent engineering but it did not work on a Mid-2012 MacBookAir5,1 with VirtualBox 4.3.22 r98236.
Searching around I found the ID 1 000206a7 02100800 1fbae3bf bfebfbff to try in a comment on another guide which did work.

To create a working VM of Lion in virtual
1) Create a VM in virtualbox named something, type Mac OS X, version Mac OS X 10.7 Lion (64 bit).
2) Before booting the VM, switch to Terminal and change the cpuid of the guest by running
VBoxManage modifyvm something --cpuidset 1 000206a7 02100800 1fbae3bf bfebfbff
3) Right click on Install Mac OS X Lion.app, select “Show Package Contents” and navigate to Contents/SharedSupport. Copy InstallESD.dmg to a locate on your disk which is navigatable.
4) Start the VM & when asked for an install disk, point to the InstallESD.dmg which you copied out in the previous step. The system should boot without any need for further modification (most guides recommend other changes such as switching to a PIIX chipset).

10.8 – Mountain Lion and newer

With Montain Lion, InstallESD.dmg was changed once again, this time to contain multiple partitions (EFI, Boot/Rescue, Install), unfortunately it’s not possible to boot these images successfully as the notion of multiple partitions is not applicable to media such as optical so what happens is that the system is able to boot from the disk image & load the kernel but unable to continue to load the install environment.
What needs to happen is a new “flattened” image needs to be generated which is on a single partition & contains everything from the boot partition.
There is no need to modify any settings for the VM such as cpuid as previous or chipset as recommended by other guides like Engadgets

To flatten the image a tool called iESD is used.
iESD can either be installed via gem(1) or if you’re a pkgsrc user, I’ve created a WiP package.

The instruction in the Engadget guide pretty much covers everything needed. Just make sure that the disk images are fully detached before issuing the hdiutil commands, quickest way being to open Disk Utility.app, selecting mounted disk images & pressing eject or checkout the output of hdiutil info & using hdiutil detach $devicename to detach all device names associated with the disk images.

A week of pkgsrc #7

Time again to write up what’s been happening on the pkgsrc front since last time, this time the focus hasn’t been so much around pkgsrc on Darwin/PowerPC but more about pkgsrc in general & not necessarily code related.
At the end of the last post I mentioned a gentleman who’d been working on pkgsrc/Haiku and posting videos of his progress, I managed to make contact with him (James) & discussed his work that he’d been doing on pkgsrc. He sent me copy of the repo he’d be working off so I could assist with the aim of getting everything upstreamed as in the current state everything would need to be reintegrated per quarterly release rather than only having to pay attention if a new issue has arisen.
After getting the correct version of Haiku installed in virtualbox, I discovered a nasty bug in the Haiku network kit, it was unable to detect when the end of the file had been reached & would continue (restart?), this was revealed when I tried to download the pkgsrc tar ball from via WebPositive, ftp from the terminal was not affected however. pkgsrc bootstrapped unprivileged without issue. Hint: use the nightly snapshots until there is a newer release than Alpha 1 available.
The integration of pkgsrc into the user-land on Haiku is not currently possible due to the way the user-land is constructed, from what I understood, each Haiku package contains a piece of the filesystem, all the packages are union mounted to construct the user-land dynamically when the system comes up. That aside, with my system bootstrapped, I attempted to build Perl and ran into another bug, it seems that the library path for libperl is not populated on haiku hence perl is able to “build” but unable to run, the workaround for this in the tree I was given was to symlink libperl into ~/pkg/lib & move on. I tried various things but was unsuccessful, I believe the problem is pkgsrc specific as the version of Perl available in haikuports do not need any special treatment and the rpath is passed in correctly.
The problem was trying to isolate the required change to fix the problem, whereas in pkgsrc a policy file is passed to the build to set how Perl should be built, haikuports clobbers the source & patches in a replacement, I stopped at that point.

Haiku nightly running in virtualbox

At around about this time I received the good news about the NetBSD Foundation membership and commit bit so my focus moved to reading the various developer documentation & getting familiar with processes.

sevan.mit.edu finished a bulkbulid attempt of the entire tree which took the longest time so far to complete, through all the build attempts I uncovered a new bug, the range to use for numerical IDs of UID/GID, is not sufficient to cover all the packages in the tree that need to create an account. On further discussion with asau@, it was suggested the IDs are allocated randomly and should be fixed for consistency across builds. I started doing bulkbuilds of the entire tree on FreeBSD 10.1-RELEASE and stumbled across a very nasty bug. There is a version of tcsh package in the pkgsrc tree called shells/standalone-tcsh, this is tcsh built as a static binary & set to install to /bin (the only package in the pkgsrc tree which violates the rules and places files outside of $PREFIX by default?), this ended up overwriting the system bundled version of tcsh in FreeBSD & then deleting /bin/tcsh when the package is removed, this was fixed promptly by dholland@. It was also discovered that Python’s configure script had not been changed when FreeBSD switched to elf binaries and so still trimmed the name of libraries to account for the old linker which could not handle a minor version in a libraries filename (libpython.so.1, not libpython.so.1.0). All versions of Python had been patched in the pkgsrc tree to remove this change so that it used a consistent naming convention across all platforms. After discussing this with bapt@ (FreeBSD) at FOSDEM, it turned out to be bug and should be fixed in future Python versions once the fixes are upstreamed.

Koobs@ (FreeBSD) dug out the commit which introduced the change & the bug report.

The opportunity to join the pkg-security team came up & for the past few weeks I’ve been getting familiar with the processes of dealing with security advisories & listing them so that users who fetch the pkg-vulnerabilities database are notified if they have any vulnerable packages installed. The general advisory process is a little infuriating, based on my recent experience I’d say at the top of my list are the Oracle security advisories as they do not divulge any details other than “unknown” in version(s) X, PHP for the frequency, OpenSSL for the impact. On the one hand I was quite impressed that CVE IDs were becoming so familiar that I could spot, on the fly, an advisory that had been accounted for, but on the other hand quite upset that I was using brain capacity on this. The availability of information is quite frustrating too, issues which are assigned an ID but cannot by checked on Mitre’s site take extra effort to find the necessary information to include (Mitre are responsible for allocating the CVEs!), I should note that this is from public advisories, say from a distribution. Example, CVE-2015-0240 was announced today, the Redhat security team published a blog post covering the issue, the Mitre site at present says:
“** RESERVED ** This candidate has been reserved by an organisation or individual that will use it when announcing a new security problem. When the candidate has been publicised, the details for this candidate will be provided.”
The wording & the lack of information can also be frustrating because it’s not clear what is affected. Looking at it positively, the requirement for clarification on these discrepancies means I get lots of opportunities to approach new people in different communities to ask questions.

I created a new wiki article on the NetBSD wiki to start documenting the bootstrap process of pkgsrc on Solarish, Illumos based distributions. At present the article covers what’s required to bootstrap successfully on OmniOS, Tribblix, OpenIndiana and OpenSXCE.

One thing that’s clearly evident is my workflow needs attention, at the moment things are very clumsy, involving lots of switching around but hopefully that will be addressed in the coming month. The first thing I’ve done is setup templates for emails with the correct preferences specified so that I just need to fill the necessary information & hit send, the necessary settings are automatically applied. Still thinking about how to deal with the scenario where the system that work is being carried out on is different to the system where the patch is going to be committed from, this also happens to be a different system which a developer is using. How to deal with that in as few steps from reading, say a bug report, to generating a patch, testing it & committing a fix.

For testing patches on Mac OS X, I revisited running OS X as guest on a Mac running OS X with virtualbox. Attempts in the past had not been successful & it seemed from search results that the only approach taken was to use modified OS images for hackintosh which I did not want to take. I have a genuine machine & genuine license, I shouldn’t have to resort to 3rd party images to run this. After whinging on twitter & referencing some older links I was able to successfully virtualise Mac OS X 10.7 to 10.10 in virtualbox. Will follow up with the details on that in a separate post.

OS X Lion as a virtualbox guest