Determining Boot Duration Issues on Our Ubuntu 18.04 (Bionic Beaver) Preview Images for AML-S905X-CC (Le Potato)

One of the major issues with our previous Ubuntu 18.04 preview images was the five-minute boot time. We did some debugging to find the exact cause and a fix. In this article we go over the steps we took, so that people can see how to approach similar issues. Please note that the problem does not affect the official Ubuntu images for x86 or the Armbian Ubuntu 18.04 images.

Background

To give some background on the boot process, we have to begin with the disk layout of our Ubuntu images. Our images are released as a zip file of the raw block data. The raw image is 4GB in size and uses an MBR partition table with two primary partitions.

Disk Image (4GB):

  • MBR Partition System (512 Byte Sectors)
    • Empty Space – Sector 0 to 2047 (0MB to 1MB) – MBR and u-boot
    • Partition 1 – Sector 2048 to 524287 (1MB to 256MB) – FAT (EFI) – /boot
    • Partition 2 – Sector 524288 to 8191999 (256MB to End) – btrfs
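
The sector numbers in the layout above map to the megabyte offsets through the 512-byte sector size. A minimal sketch of that arithmetic follows; the function name sector_to_mib is ours for illustration and is not part of the image.

```shell
#!/bin/sh
# Convert sector numbers from the layout above into MiB offsets,
# assuming the 512-byte sectors stated in the layout.
SECTOR_SIZE=512

sector_to_mib() {
    # $1: sector number; prints its byte offset in whole MiB.
    echo $(( $1 * SECTOR_SIZE / 1024 / 1024 ))
}

echo "Partition 1 starts at $(sector_to_mib 2048) MiB"      # 1
echo "Partition 2 starts at $(sector_to_mib 524288) MiB"    # 256
echo "Image ends at $(sector_to_mib 8192000) MiB"           # 4000
```

The last sector of partition 2 is 8191999, so the image holds 8192000 sectors, which is 4000 MiB.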

When you flash this image onto a MicroSD card using dd or Win32DiskImager, you only have to flash 4GB even if the MicroSD card is much larger. On first boot, the image runs a run-once script, lc_repart_disk_once, that determines the actual MicroSD card size and repartitions the disk to make use of the empty space beyond 4GB.

It computes the last incomplete gigabyte (1024^3 bytes) of the card and creates primary partition 3 in that space as a swap partition. Then it extends primary partition 2 to use all of the space in between. BTRFS can be resized online, so there is no need to reboot to extend it. The swap partition is added to /etc/fstab and turned on.
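
The placement of the swap partition comes down to integer division and a remainder. The following is only a sketch of that arithmetic under stated assumptions, not the actual lc_repart_disk_once script: the function names, the 16,000,000,000-byte card, and the /dev/mmcblk0 device name in the comment are all hypothetical.

```shell
#!/bin/sh
# Sketch of the swap placement arithmetic; not the real lc_repart_disk_once.
SECTOR_SIZE=512
GIB_SECTORS=$(( 1024 * 1024 * 1024 / SECTOR_SIZE ))   # sectors in one GiB

swap_start_sector() {
    # $1: total sectors on the card; prints the first sector of the
    # trailing incomplete gigabyte.
    echo $(( $1 / GIB_SECTORS * GIB_SECTORS ))
}

swap_size_sectors() {
    # $1: total sectors on the card; prints the leftover sector count.
    echo $(( $1 % GIB_SECTORS ))
}

# Hypothetical card of 16,000,000,000 bytes. On a real system the count
# could come from: blockdev --getsz /dev/mmcblk0 (device name assumed).
total_sectors=$(( 16000000000 / SECTOR_SIZE ))

echo "swap starts at sector $(swap_start_sector "$total_sectors")"
echo "swap spans $(swap_size_sectors "$total_sectors") sectors"
echo "partition 2 can grow up to sector $(( $(swap_start_sector "$total_sectors") - 1 ))"
```

For this hypothetical card the swap partition would start at sector 29360128 and span 1889872 sectors (about 922 MiB). A card whose size is an exact multiple of 1 GiB would leave a remainder of zero; how the real script handles that case is not covered here.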

Setup

Since most of the early kernel actions happen when there isn’t a GUI, we used our trusty UART to USB adapter to get early access to the system. We connected it to the three pin 2J1 connector which is highlighted in red on the picture below.

AML-S905X-CC UART 2J1 Connector
2J1 Connector Highlighted in Red: Ground (BLACK) TX (WHITE) RX (GREEN)

We use Ubuntu/Debian internally so we ran sudo minicom -b 115200 -D /dev/ttyUSB0 on our computers after plugging in the UART cable. The baud rate for the board is set to 115200 in software. We had to disable the hardware flow control by pressing Control+A, O, Serial port setup, F.

Problem Isolation and Resolution

Ubuntu 18.04, like Ubuntu 16.04 before it, uses systemd as the init system, which allows for clear dependencies and parallel process execution. It comes with two valuable tools: journalctl for reading logs and systemd-analyze for determining which chain of units took the longest to start.

It takes about 20 seconds for the board to reach the UART TTY prompt. We logged in using libre and computer as the username and password respectively. We crawled through the boot logs using sudo journalctl and found that lc_repart_disk_once was timing out after 5 minutes and getting restarted. The other way to determine this is to run sudo systemd-analyze critical-chain. After a system is fully booted (or has timed out), this prints the chain of units that took the longest to start, with the time for each unit itemized.

Next, we enabled debugging in the lc_repart_disk_once shell script by adding set -x, which produces verbose output that can be examined via sudo journalctl -u lc_repart_disk_once. We noticed that the mkswap command in the script hung for a few minutes even though it should only take a few seconds.
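
To show what set -x does, here is a minimal sketch with a hypothetical stand-in function; it is not the real lc_repart_disk_once script. The shell prints each command to stderr, prefixed with +, before running it. When a script runs as a systemd service with default settings, that stderr output ends up in the journal.

```shell
#!/bin/sh
# Hypothetical stand-in for a run-once script, used only to show set -x.
run_once_demo() (
    set -x                      # trace every command to stderr before it runs
    card_gib=16
    echo "card size: ${card_gib} GiB"
)

run_once_demo
```

The function body runs in a subshell (parentheses instead of braces), so the tracing stays switched off for the rest of the script.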

To trace what goes wrong with a process, we installed the handy strace utility via sudo apt-get install strace. This utility reports the system calls a process makes, that is, its interactions between userspace and the kernel. By prefixing the mkswap command in the script with strace, we were able to determine exactly what mkswap was doing.

After restoring the filesystem to its original state, we restarted the system. sudo journalctl -u lc_repart_disk_once reported that mkswap was hanging on a read from /dev/random which is a system entropy issue. The annoying thing with /dev/random is that reading from it is a blocking call when system entropy gets low and won’t unblock until system entropy recovers, which can be quite slow. We checked sudo cat /proc/sys/kernel/random/entropy_avail and sure enough it was below 100, which will cause reads from /dev/random to block.
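
The entropy check can be wrapped in a small helper. This is a sketch only: the function name entropy_status is ours, and the default threshold of 100 simply mirrors the reading mentioned above rather than any documented kernel limit.

```shell
#!/bin/sh
# Report whether the kernel's entropy estimate is below a threshold.
entropy_status() {
    # $1: file to read (defaults to the kernel's entropy counter)
    # $2: threshold (assumed default of 100, taken from the text above)
    file=${1:-/proc/sys/kernel/random/entropy_avail}
    threshold=${2:-100}
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 0
    fi
    avail=$(cat "$file")
    if [ "$avail" -lt "$threshold" ]; then
        echo "low ($avail)"
    else
        echo "ok ($avail)"
    fi
}

entropy_status
```

Taking the file path as a parameter makes it possible to try the helper against a sample file on a machine other than the board.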

Luckily, the Amlogic S905X found in Le Potato has a built-in hardware random number generator (RNG), and BayLibre upstreamed support for it in Linux 4.8. All that was missing was the rng-tools daemon, which feeds /dev/random from the HWRNG exposed at /dev/hwrng. After installing it via sudo apt-get install rng-tools, mkswap finished within seconds instead of hanging on entropy.
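
Before installing rng-tools, it can be worth confirming that the kernel actually exposes a hardware RNG. A minimal sketch, with a helper name (hwrng_present) of our own choosing:

```shell
#!/bin/sh
# Check that a hardware RNG character device exists.
hwrng_present() {
    # $1: device path, defaults to /dev/hwrng
    dev=${1:-/dev/hwrng}
    [ -c "$dev" ]
}

if hwrng_present; then
    echo "hardware RNG device found"
    # Name of the driver currently backing /dev/hwrng, if sysfs exposes it.
    cat /sys/class/misc/hw_random/rng_current 2>/dev/null || true
else
    echo "no /dev/hwrng found; rng-tools would have no hardware source here"
fi
```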

This problem is not obvious, and it is not an issue with the application code. It is sometimes critical for a user or developer to understand how Linux (and system-level design) works in order to develop an effective solution, rather than resorting to workarounds like patching base utilities or working around system-level problems in application-level logic.

Results

With all of this done, we have released our Ubuntu 18.04 Preview Image 3, which now boots to the Gnome Display Manager in two minutes instead of five on the first run. Subsequent boots take less than 45 seconds. The headless boot time is unchanged from the previous 20 seconds. This is a tremendous improvement to usability.

Other Thoughts

We have started putting images in Google Drive for faster downloads. You can find them in the README.txt. Other changes in PI3 include Linux LTS 4.14.50, defaulting to Wayland in GDM for the libre user, increased compressed memory pool, and a few more resolutions. Outstanding issues include overlay implementation, upstream Linux support for 2K/4K HDMI output, and VPU work for accelerated video decoding. Ely, a community member, has contributed work towards open source hardware video decoding which is very exciting.

We expect another preview release before we have a formal release. We are currently getting infrastructure in place to host repositories for the formal release so you can sudo apt-get update && sudo apt-get upgrade to keep everything up to date instead of re-flashing MicroSD cards.

By the time the next Linux LTS rolls around (4.19 in October), we should have a unified image for all three of the current CC and CM platforms.

2 thoughts on “Determining Boot Duration Issues on Our Ubuntu 18.04 (Bionic Beaver) Preview Images for AML-S905X-CC (Le Potato)”

  1. When you say “Mali GBM support, upstream Linux support for 2K/4K”, does this pertain to any and all 4K support? I’ve been trying to find a distro for this device that allows me to run in 4K. I’d settle for a how-to if there are no distros.

    1. Amlogic has a BSP based on Linux 4.9 that supports HDMI 2.0 2K/4K, but most of the code is not mainline. We have been sponsoring work for mainline, but it doesn’t have the code to drive HDMI 2.0 yet since HDMI 2.0 is much more complex. Your best bet is to look at the CoreELEC Linux 3.14 kernel based on an older Amlogic BSP, or Amlogic’s current BSP on Linux 4.9.
