Posted on 2018-06-17
Determining Boot Duration Issues on Our Ubuntu 18.04 (Bionic Beaver) Preview Images for AML-S905X-CC (Le Potato)
One of the major issues with our previous Ubuntu 18.04 preview release images was the long 5 minute boot time. We did a little debugging on that front to find out the exact reason and to derive a resolution. In this article we go over the steps that we took to help people understand how to approach similar issues. Please note that the problem is not related or relevant for official Ubuntu images for x86 and Armbian Ubuntu 18.04 images.
To start off with a background of the boot process for our images, we have to begin with the disk layout for our Ubuntu images. Our images are released as a zip file of the raw block data. The raw image is 4GB in size using a MBR partition system composed of two primary partitions.
Disk Image (4GB):
- MBR Partition System (512 Byte Sectors)
- Empty Space - Sector 0 to 2047 (0MB to 1MB) - MBR and u-boot
- Partition 1 - Sector 2048 to 524287 (1MB to 256MB) - FAT (EFI) - /boot
- Partition 2 - Sector 524288 to 8191999 (256MB to End) - btrfs
When you flash this image onto a MicroSD card using
Win32DiskImager, you only have to flash 4GB even if the MicroSD card is much larger. Upon boot, the image has a run-once script,
lc_repart_disk_once, that determines the actual MicroSD card size and re-formats the disk to make use of the empty space > 4GB.
It computes the last incomplete gigabyte (1024 ^ 3) and creates primary partition 3 in that space for a swap partition. Then it extends primary partition 2 to use all of the intermediary space. BTRFS can be resized online so there no need to reboot to extend it. The swap partition is added to
/etc/fstab and turned on.
Since most of the early kernel actions happen when there isn't a GUI, we used our trusty UART to USB adapter to get early access to the system. We connected it to the three pin 2J1 connector which is highlighted in red on the picture below.
[caption id="attachment_982" align="alignnone" width="920"] 2J1 Connector Highlighted in Red: Ground (BLACK) TX (WHITE) RX (GREEN)[/caption]
We use Ubuntu/Debian internally so we ran
sudo minicom -b 115200 -D /dev/ttyUSB0 on our computers after plugging in the UART cable. The baud rate for the board is set to 115200 in software. We had to disable the hardware flow control by pressing Control+A, O, Serial port setup, F.
Problem Isolation and Resolution
Ubuntu 18.04 like Ubuntu 16.04 before it uses systemd as the init system which allows for clear dependencies and parallel process execution. There are two valuable tools:
journalctl for reading logs and
systemd-analyze for determining the process tree that took the longest.
It takes about 20 seconds for the board to get to UART TTY prompt. We login using
computer as the username and password respectively. We crawled through the boot logs using
sudo journalctl to find that
lc_repart_disk_once was timing out after 5 minutes and getting restarted. The other way to determine this was by running
sudo systemd-analyze criticial-chain. After a system is fully booted (or timed out), this will give you the process tree that took the longest time with each process itemized.
Next, we enabled debugging on the
lc_repart_disk_once shell script by adding
set -x to enable verbose output that can be examined via
sudo journalctl -u lc_repart_disk_once. We noticed that the
mkswap command in the script seems to hang for a few minutes even though it should only take a few seconds.
To trace what goes wrong with a process, we installed the handy
strace utility via
sudo apt-get install strace. This utility will report userspace and kernel interactions. By prefixing the
mkswap command with
strace a script, we were able to determine exactly what
mkswap was doing.
After restoring the filesystem to its original state, we restarted the system.
sudo journalctl -u lc_repart_disk_once reported that
mkswap was hanging on a read from
/dev/random which is a system entropy issue. The annoying thing with
/dev/random is that reading from it is a blocking call when system entropy gets low and won't unblock until system entropy recovers, which can be quite slow. We checked
sudo cat /proc/sys/kernel/random/entropy_avail and sure enough it was below 100, which will cause reads from
/dev/random to block.
Luckily, the Amlogic S905X found in Le Potato has a built-in hardware random number generator (RNG) and BayLibre upstreamed support for it in Linux 4.8. All that was missing is the
rng-tools daemon that will back the
/dev/random with the HWRNG in
/dev/hwrng. By installing it via
sudo apt-get install rng-tools, we were able to let
mkswap finish within seconds instead of hanging on entropy.
This problem is not readily transparent or an issue with the application code. It is sometimes critical for an user or developer to understand how Linux (and system level design) works in order to develop an effective solution and not resort to workarounds like patching base utilities or working around system level problems in application level logic.
With this all being said and done, we have released our Ubuntu 18.04 Preview Image 3 which now boots in two minutes to the Gnome Display Manager instead of five minutes on the first run. Second boots takes less than 45 seconds. The headless boot times have not changed from the previous 20 seconds. This is a tremendous improvement to usability.
We have started putting images in Google Drive for faster downloads. You can find them in the README.txt. Other changes in PI3 include Linux LTS 4.14.50, defaulting to Wayland in GDM for the libre user, increased compressed memory pool, and a few more resolutions. Outstanding issues include overlay implementation, upstream Linux support for 2K/4K HDMI output, and VPU work for accelerated video decoding. Ely, a community member, has contributed work towards open source hardware video decoding which is very exciting.
We expect another preview release before we have a formal release. We are currently getting infrastructure in place to host repositories for the formal release so you can
sudo apt-get update && sudo apt-get upgrade to keep everything up to date instead of re-flashing MicroSD cards.
By the time the next Linux LTS rolls around (4.19 in October), we should have an unified image for all three of the current CC and CM platforms.