Building Armbian for Rockchip RK3128

Technical note on building Armbian for Rockchip RK3128

Introduction

I have had an RK3128 TV box sitting around for a long time. I got Armbian booting on it years ago, but the setup was quite complicated, with a lot of manual steps and one-off fixes. Even though I use it day to day as a small Linux box, I never shared it with anyone else because reproducing it would require too much manual work, and I was too busy to write a proper guide.

After I posted the note about running Debian on the S805 box, someone asked about the RK3128 one. He was reusing old TV boxes for a small project to reduce electronic waste, which was reason enough to revisit it.

This article is not an installation guide. It is a technical note on cleaning up that old setup and figuring out which parts were actually necessary to make it reproducible.

RK312x Booting note

Booting process

At a high level, the boot flow is simple: power on, run Miniloader, load U-Boot, load the kernel, then boot the operating system. The real details are messier, but this is the part that matters for the rest of this note.

Power on
   |
   v
Boot ROM
   |
   v
Miniloader
   |  - initialize DRAM
   |  - set up basic storage access
   v
U-Boot
   |  - load kernel
   |  - load device tree
   |  - set bootargs
   v
Linux kernel
   |
   v
init
   |
   v
Userspace / OS

When the board powers on, the SoC first executes its internal boot ROM. It then looks for a Miniloader on the configured storage. If it finds one, it loads it into the small internal SRAM and starts it. Miniloader is Rockchip’s closed-source first-stage bootloader. It is usually specific to a CPU family and, in practice, often tied to the board’s DRAM initialization and storage setup as well. Its main job is to bring up enough hardware, especially DRAM, so the next stage can run normally.

Once that is done, Miniloader loads U-Boot. U-Boot then takes over the more flexible part of the process: reading the kernel, device tree, and optional initramfs from storage, setting the kernel command line, and jumping into the kernel.

From there, the Linux kernel initializes the rest of the hardware, mounts the root filesystem, starts init, and brings the operating system up.

Boot mode

Besides the normal boot path, Rockchip also has two special USB boot modes:

MaskROM mode is the lowest-level recovery mode. Rockchip CPUs have a small internal ROM inside the SoC that contains the ROM code for this mode. That is why people sometimes say Rockchip is “like a rock”: unless the chip is physically damaged, there is usually still a recovery path. The board enters this mode when it cannot find any valid Miniloader on any storage device. The functionality is limited, but it is still enough to connect over USB and download a higher-stage bootloader.

Loader mode is one step higher. In this case, the chip has already managed to load and execute Miniloader, and Miniloader exposes the USB interface used by Rockchip’s flashing tools. Because DRAM and basic storage access are already initialized at this point, Loader mode is more convenient than MaskROM mode for normal flashing work.

Normal boot:  Boot ROM -> Miniloader -> U-Boot -> Kernel -> OS
Loader mode:  Boot ROM -> Miniloader -> USB flashing interface
MaskROM mode: Boot ROM -> USB flashing interface

Storage types

NAND vs eMMC

This class of TV box is old enough that it may come with either of two storage types: eMMC or NAND. In short:

  • NAND is a raw storage device. Wear management and other low-level handling must be done in software.
  • eMMC is NAND plus a controller. Wear leveling and device commands are handled by the controller.

Here are photos of my two boards with different storage types. This is not a perfect rule, but in practice the chip with visible pins is usually NAND, while the package without visible pins is usually eMMC. Identifying the storage type matters because I may need to adjust the Armbian boot setup in the next steps.

Booting into MaskROM mode

  • Normally, you just need to boot into Loader mode to update or install a new ROM. On this board, the easiest way to do that is to keep holding the reset button while powering it on. But sometimes you break things badly enough that you need MaskROM mode to recover it.
  • For NAND devices, you can short either ALE or CLE (pin 16 or pin 17) to GND while powering on the board. The board will then fail to recognize the NAND and enter MaskROM mode. You may also find advice telling you to short D0 or other data pins to GND. NEVER DO THAT. EVER. It may appear to work a few times, but it can damage the CPU’s NAND interface and leave the board unable to detect NAND again. You can also short pin 16 and pin 17 together.
  • If counting the pins directly is inconvenient, first find the side that has pin 1, then count backward from pin 24 on that side. The two pins I use are the 8th and 9th pins from pin 24. In the photo below, I soldered a wire to pin 16. Whenever I need to enter MaskROM mode, I just short that wire to the USB port.

Disable storage for MaskROM mode

  • For eMMC devices, this type of board often still has the NAND pads routed out, so you can use the same pins described above to get the same effect.

Rockchip Partitioning


Like many other vendors, Rockchip uses its own proprietary partition table together with its NAND driver. In this partitioning system, everything is counted in 512-byte blocks.

The first 0x2000 blocks (4 MiB) form a small region used for Miniloader, partition layout, and system parameters. I call it the Loader partition. After that region, the rest of the storage is split into partitions such as uboot, trust, system, and so on.

Partitioning Rockchip storage is actually very simple. We upload a file called parameter.txt to address 0x0 of the storage, and then Miniloader uses it to partition the device. After it is written, this parameter data is stored at both the beginning and the end of the Loader partition. The layout of the Loader partition is below:

Parameter
Partition table
MiniLoader
Parameter

Below is a sample of the parameter I took from a ROM running Android 7.1.2.

FIRMWARE_VER:7.1.2
MACHINE_MODEL: Karabox_k2
MACHINE_ID:007
MANUFACTURER:RK30SDK
MAGIC: 0x5041524B
ATAG: 0x60000800
MACHINE: 312x
CHECK_MASK: 0x80
KERNEL_IMG: 0x60408000
#RECOVER_KEY: 1,1,0,20,0
CMDLINE:earlycon=uart8250,mmio32,0x20068000 earlyprintk console=ttyS2 androidboot.baseband=N/A androidboot.selinux=disabled androidboot.hardware=rk30board androidboot.console=ttyS2 init=/init initrd=0x62000000,0x00800000 mtdparts=rk29xxnand:0x00002000@0x00002000(uboot),0x00002000@0x00004000(trust),0x00002000@0x00006000(misc),0x00008000@0x00008000(resource),0x00006000@0x00010000(kernel),0x00006000@0x00016000(boot),0x00010000@0x0001C000(recovery),0x00020000@0x0002C000(backup),0x00040000@0x0004C000(cache),0x00008000@0x0008C000(metadata),0x00002000@0x00094000(kpanic),0x00400000@0x00096000(system),0x00020000@0x00496000(radical_update),-@0x004B6000(userdata)

The partition information is written inside CMDLINE. Miniloader reads it and updates the partition table from it. Later, the rknand driver reads that partition table and creates logical devices named rknand_<partition_name>. For example, for the boot partition you will see a device named rknand_boot. Then we can mount /dev/rknand_boot as an ext4 filesystem.
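To make the layout easier to read, the mtdparts string can be decoded with a few lines of shell. This is a minimal sketch, not part of the original setup: the function name list_rk_parts is made up for illustration, and it only assumes the SIZE@OFFSET(name) format and the 512-byte block unit described above.

```shell
#!/bin/sh
# Sketch: list Rockchip partitions from an mtdparts= string.
# Sizes and offsets are counted in 512-byte blocks.
# list_rk_parts is a hypothetical helper name, not a Rockchip tool.

list_rk_parts() (
    spec=${1#*mtdparts=}      # drop everything before the partition spec
    spec=${spec#*:}           # drop the device name (e.g. rk29xxnand)
    spec=${spec%% *}          # drop any kernel arguments that follow

    IFS=,
    set -f                    # no globbing while splitting on commas
    for entry in $spec; do
        size=${entry%%@*}
        rest=${entry#*@}
        offset=${rest%%"("*}
        name=${rest#*"("}
        name=${name%")"}
        if [ "$size" = "-" ]; then
            # "-" means: take whatever is left on the device
            echo "$name $offset rest"
        else
            echo "$name $offset $(( $size * 512 / 1048576 ))MiB"
        fi
    done
)

list_rk_parts "mtdparts=rk29xxnand:0x00002000@0x00002000(uboot),0x00400000@0x00096000(system),-@0x004B6000(userdata)"
```

With the three entries above, uboot comes out as 4 MiB starting at block 0x2000, which matches the 4 MiB Loader region in front of it, and system comes out as 2048 MiB.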

RKDevTool

RKDevTool is a Windows tool provided by Rockchip. It can be used to write almost anything to NAND or eMMC. It gets updated quite frequently, but in my experience version 2.69 is the most stable. There is also a Linux version if you prefer that.

RKDevTool v2.69

Some notes for this tool:

  • Miniloader and parameter use the same address, 0x0. That is intentional, because parameter is input for Miniloader, not a separate raw text blob written to storage by itself.
  • In MaskROM mode, no button except Run works. After Miniloader is downloaded, the other buttons start working. At that point, even if the tool still says the board is in MaskROM mode, it is effectively in Loader mode.

Building U-Boot

One thing I like about Rockchip is that they publish a fairly complete vendor U-Boot tree together with the extra binaries needed for the older boot flow. Their basic U-Boot documentation is here: https://opensource.rock-chips.com/wiki_U-Boot.

For RK3128, those vendor changes still matter, especially around the old storage and loader flow. I also made a few tweaks in my own fork to make it work better on these TV boxes: https://github.com/chieunhatnang-personal/u-boot-rk3128-tvbox. The main changes are:

  • Enable booting from all available devices in the order USB -> SD card -> eMMC -> NAND -> PXE. If it can’t find a boot script (boot.scr) or ext file on a device, it moves on to the next one.
  • Add a few utility commands
  • Add a 9-second wait for Ctrl+C to stop autoboot and drop to the U-Boot command prompt
  • Add support for the reset key. Pressing it puts the board into MaskROM mode, similar to the stock U-Boot.

To build it, we need a 32-bit ARM cross toolchain and the rkbin directory, because Rockchip’s build scripts still depend on binaries and helper tools from there. Rockchip’s own vendor documentation uses gcc-arm-8.3-2019.03-x86_64-arm-linux-gnueabihf for 32-bit ARM boards. In my case, I built it with gcc-linaro-6.3.1-2017.05-x86_64_arm-linux-gnueabihf, which worked fine for this tree. The official rkbin repository is here: https://github.com/rockchip-linux/rkbin

After getting those, we need to adjust the paths in make.sh. Then we can simply run:

./make.sh rk3128
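Since the build depends on two external directories, a quick pre-flight check can save a failed run. The sketch below is only an illustration: the variable names TOOLCHAIN_BIN and RKBIN_DIR, the helper check_build_inputs, and the relative paths are assumptions about where things were unpacked, not values taken from make.sh.

```shell
#!/bin/sh
# Hypothetical layout: adjust both paths to wherever you unpacked things.
TOOLCHAIN_BIN=${TOOLCHAIN_BIN:-../gcc-linaro-6.3.1-2017.05-x86_64_arm-linux-gnueabihf/bin}
RKBIN_DIR=${RKBIN_DIR:-../rkbin}

# $1 = toolchain bin directory, $2 = rkbin checkout
check_build_inputs() {
    cbi_status=0
    if [ ! -x "$1/arm-linux-gnueabihf-gcc" ]; then
        echo "missing compiler: $1/arm-linux-gnueabihf-gcc"
        cbi_status=1
    fi
    if [ ! -d "$2/tools" ]; then
        echo "missing rkbin tools directory: $2/tools"
        cbi_status=1
    fi
    return $cbi_status
}

check_build_inputs "$TOOLCHAIN_BIN" "$RKBIN_DIR" \
    || echo "fix the paths before running ./make.sh rk3128"
```

The function returns non-zero when either input is missing, so it can also be chained directly in front of the build command.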

After the build finishes, the useful output files at the project root are:

rk3128_loader_v2.12.263.bin -> Rockchip USB loader / Miniloader bundle
uboot.img                   -> U-Boot image in Rockchip format
trust.img                   -> Trust / TEE image

The most important one for recovery work is rk3128_loader_v2.12.263.bin. This is the loader image that RKDevTool can download while the board is in MaskROM mode. Once that succeeds, the board effectively moves into Loader mode, and then the rest of the images can be written normally.
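On Linux, the same sequence can be sketched with the open-source rkdeveloptool, which is an assumption on my side: this note uses RKDevTool, and I have not confirmed that rkdeveloptool behaves identically on these NAND boards. The script below only prints the commands as a dry run. The offsets are the uboot and trust entries from the sample Android parameter file shown earlier, so a different parameter file means different offsets.

```shell
#!/bin/sh
# Dry-run sketch: print the rkdeveloptool calls instead of running them.
# Offsets are in 512-byte blocks and come from the sample parameter file;
# they are NOT guaranteed to match another layout.
LOADER=rk3128_loader_v2.12.263.bin

flash_plan() {
    echo "rkdeveloptool db $LOADER"           # MaskROM: download the loader into RAM
    echo "rkdeveloptool wl 0x2000 uboot.img"  # write the uboot partition
    echo "rkdeveloptool wl 0x4000 trust.img"  # write the trust partition
    echo "rkdeveloptool rd"                   # reset the device
}

flash_plan
```

Printing the plan first makes it easy to compare the offsets against the parameter file before anything is written to storage.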

Building Linux kernel 4.4.194

This was the hardest part by far. Rockchip does publish kernel source at https://github.com/rockchip-linux/kernel, but the tree has many branches and very little guidance on which one actually fits an old TV box board. I spent quite a bit of time reading through the Armbian RK322x work, especially jock’s thread here: https://forum.armbian.com/topic/34923-csc-armbian-for-rk322x-tv-box-boards. Even with that as a reference, this part was still messy.

The first attempt

Out of all the branches in Rockchip’s kernel tree, I started with the one based on Linux 4.4.194. The main reason was practical: I wanted something close enough to the older RK322x Armbian userspace that I could reuse jock’s root filesystem with as few surprises as possible. I even changed the kernel sub-version to match his build (4.4.194-rk322x) so that I could reuse his drivers.

Inside that tree, the two obvious starting points were rk3128_linux.config and rk3128_linux_spi_nand.config. Both built without much trouble, and I could get some basic parts of the system working, such as CPU bring-up and USB. But a lot of the board-specific hardware was still broken. I also tried most of the DTS files with the rk3128 or rk3128h prefix under the vendor DTS directory, and the result was still a mess. At that point it became clear that “building the kernel” mostly meant iterating on DTS and driver fixes, not just running make.

CPU stepping

CPU frequency scaling was another part that looked simple at first and then turned into a small detour. The board would boot, but the CPU stayed at 600 MHz. That was enough to keep the system usable, so it did not immediately look broken in the same obvious way as storage or Wi-Fi, but it was clearly not the intended behavior.

My first assumption was the usual one: maybe the DTS was missing the correct CPU OPP table, or maybe the regulator wiring was incomplete. The vendor RK3128 Android DTS did have a more complete CPU voltage table, and the RK322x tree I was reusing also had working DVFS on very similar Cortex-A7 TV boxes. So I started by merging those ideas into rk3128-linux.dts.

That still did not actually enable scaling. The useful clue was not the current frequency itself, but the absence of the usual cpufreq sysfs nodes. There was no policy0, no scaling_max_freq, and no scaling_available_frequencies. That meant the problem was deeper than “the governor picked a low speed”. The cpufreq policy had never been created at all.
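The "policy was never created" case can be told apart from "the governor picked a low speed" with a small check of the three sysfs nodes mentioned above. This is a minimal sketch; the helper name cpufreq_missing_nodes and the optional root argument (there so the check can run against a copy of the tree) are illustrative.

```shell
#!/bin/sh
# Print every expected cpufreq node that does not exist.
# $1 = sysfs CPU root (defaults to the real one).
cpufreq_missing_nodes() {
    root=${1:-/sys/devices/system/cpu}
    for node in cpufreq/policy0 \
                cpu0/cpufreq/scaling_max_freq \
                cpu0/cpufreq/scaling_available_frequencies; do
        [ -e "$root/$node" ] || echo "missing: $node"
    done
}

cpufreq_missing_nodes
```

No output means the policy exists and the remaining question is which frequency the governor picks. Three "missing" lines point back at the OPP table and the CPU supply regulator.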

Normally, this kind of CPU stepping needs a programmable regulator. On this class of cheap hardware, I first assumed the CPU rail would just be fixed and that real DVFS was therefore not possible. But Rockchip took a cheaper path here: the SoC already has PWM outputs, and the board uses those PWM lines to control the CPU and logic rails. So from the kernel side, the important part was not finding a fancy PMIC. It was describing the rail as a pwm-regulator correctly. That is also why the RK322x reference DTS ended up being a better clue than the leftover rk816 node in the original board DTS. Once I switched the CPU rail to a PWM regulator on pwm1, added the logic rail on pwm2, and pointed cpu-supply, mali-supply, and center-supply to those rails, cpufreq finally came up correctly.

There was one more small mismatch after that. The OPP table I had copied still used a 1.425 V maximum, while the PWM regulator definition on this board only allowed a lower ceiling. Because of that, the kernel rejected every CPU OPP as unsupported by the regulator. After aligning the regulator limits and the OPP table, the cpufreq policy appeared as expected.

At that point the board finally exposed the proper frequency table. The simplest way to check it was:

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
216000 408000 600000 816000 1008000 1200000

For quick testing, these were the sysfs knobs I ended up using most often:

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo interactive > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo 1200000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq

One small trap here was that the governor did not stay where I set it. Even though the kernel tree itself was built with interactive as the default governor, after reboot the board still came back as ondemand. That turned out not to be a kernel problem at all. It was a userspace leftover from the RK3229 root filesystem:

cat /etc/default/cpufrequtils

ENABLE=true
MIN_SPEED=600000
MAX_SPEED=1500000
GOVERNOR=ondemand

That file is applied during boot and writes directly to the same cpufreq sysfs nodes, so in practice it overrides the boot-time governor and min/max limits. On RK3128, that MAX_SPEED=1500000 setting was clearly just old RK3229 baggage, because the actual RK3128 frequency table stopped at 1200000. The safer version for this board was:

ENABLE=true
MIN_SPEED=216000
MAX_SPEED=1200000
GOVERNOR=interactive
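A reused root filesystem can carry limits like this silently, so it is worth comparing the file against the frequency table the kernel actually exposes. A minimal sketch, assuming the defaults file uses plain shell variable syntax as shown above; the helper name check_cpufrequtils is made up for illustration.

```shell
#!/bin/sh
# $1 = cpufrequtils defaults file
# $2 = file listing the available frequencies, separated by spaces
check_cpufrequtils() (
    . "$1"                      # defines MIN_SPEED, MAX_SPEED, GOVERNOR
    for speed in "$MIN_SPEED" "$MAX_SPEED"; do
        case " $(cat "$2") " in
            *" $speed "*) echo "ok: $speed" ;;
            *)            echo "not in OPP table: $speed" ;;
        esac
    done
)

conf=/etc/default/cpufrequtils
freqs=/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
if [ -r "$conf" ] && [ -r "$freqs" ]; then
    check_cpufrequtils "$conf" "$freqs"
fi
```

Run against the inherited RK3229 file, this flags MAX_SPEED=1500000 because the RK3128 table stops at 1200000. The function body runs in a subshell, so the sourced variables do not leak into the calling shell.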

The important correction is that the governor was not the real bug either. I originally suspected that because Firefox and video playback would trigger random hard lockups, and one of the traces happened to involve the ondemand workqueue. That turned out to be misleading.

What actually failed was the voltage table I first copied from RK3229. RK3229 and RK3128 are close relatives, but not identical. The RK3229 datasheet allows a lower CPU voltage range: roughly 0.9 V minimum and 1.0 V typical. RK3128 is specified higher: roughly 1.0 V minimum and 1.1 V typical. In other words, the low-voltage DVFS values that are reasonable on RK3229 are already too aggressive for RK3128.

That mistake showed up very clearly in the early OPP table I used. I had copied low points like:

  • 216 MHz at 950 mV
  • 408 MHz at 950 mV
  • 600 MHz at 975 mV

Those values are in the RK3229 comfort zone, but they leave RK3128 with little or no margin. The old vendor Rockchip code also adds another complication on top of that: it can pick different voltage variants (L0, L1, L2) from efuse leakage data. So even when the visible OPP table looked acceptable, the actual voltage used on a particular chip could end up a bit lower again.

That is why the board behaved in such a confusing way. A fixed 1200 MHz stress test could run for hours without problems, so the top speed itself was clearly not the issue. The crashes happened instead during dynamic transitions, especially when the system was lightly loaded and the governor tried to walk the CPU back down into those low-voltage states. Firefox was a very good trigger because it causes a lot of bursty load changes: wake up, render, idle, wake up again, start a video, and so on. The system was not failing because it needed more peak frequency. It was failing because some of the low-frequency, low-voltage states were not stable on RK3128.

So the real fix was not “use a different governor” and not “work around some kernel bug”. The real fix was to stop treating RK3128 like RK3229 in the DVFS table:

  • keep the PWM regulator wiring, because that part was correct
  • disable the leakage-based voltage selection for the CPU OPP table
  • use fixed voltages instead of per-chip lower-voltage variants
  • raise the CPU regulator floor above the RK3128 minimum
  • put back the low frequencies only with RK3128-safe voltages

The stable table I ended up with was:

  • 216 MHz at 1.05 V
  • 408 MHz at 1.05 V
  • 600 MHz at 1.10 V
  • 816 MHz at 1.20 V
  • 1008 MHz at 1.20 V
  • 1200 MHz at 1.35 V
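Both mismatches in this section, OPPs above the regulator ceiling and OPPs below the safe RK3128 floor, come down to a simple range check. The sketch below is illustrative only: check_opps is a made-up helper, the 1000 mV floor is the rough RK3128 minimum quoted above, and 1350 mV is just the highest voltage in the stable table, used as a stand-in because the exact regulator ceiling is not stated here.

```shell
#!/bin/sh
# Flag OPP entries whose voltage falls outside [min_mv, max_mv].
# Usage: check_opps MIN_MV MAX_MV "MHZ:MV" ...
check_opps() {
    min_mv=$1
    max_mv=$2
    shift 2
    for opp in "$@"; do
        mhz=${opp%%:*}
        mv=${opp#*:}
        if [ "$mv" -lt "$min_mv" ] || [ "$mv" -gt "$max_mv" ]; then
            echo "$mhz MHz at $mv mV: outside $min_mv-$max_mv mV"
        fi
    done
}

# Low points copied from RK3229: all three get flagged.
check_opps 1000 1350 216:950 408:950 600:975

# The stable table: prints nothing.
check_opps 1000 1350 216:1050 408:1050 600:1100 816:1200 1008:1200 1200:1350
```

The same call with 1200:1425 reports the entry as above the ceiling, which is the situation where the kernel rejected the copied OPPs as unsupported by the regulator.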

So the practical lesson here is a bit different from what I first wrote:

  • cpufrequtils may still override your chosen governor at boot, so it is worth checking
  • performance or interactive can make the system look more stable because they reduce how often it enters the low OPPs
  • but if the voltage table itself is wrong, changing the governor only changes how often you hit the problem

For this board, the root cause was the RK3229-style low-voltage OPP table, not the governor itself.

RAM dynamic frequency

RAM frequency scaling looked similar to CPU scaling at first, but it turned out to be much less predictable. The vendor DTS already has a dmc node and a DDR OPP table, and on some RK3128-family boards that dynamic path works fine. In fact, one of my boards is very stable with dynamic DDR scaling, while another is unreliable enough to hang. So this is not one of those cases where dynamic DDR scaling is universally broken. It seems to depend much more on the specific board, its DRAM chips, and the routing.

At first I also suspected the Miniloader, because Rockchip’s old boot flow does DRAM initialization very early. But in this case I was using the same Miniloader on both boards and still getting different results, so the Miniloader was probably not the main variable here. On the unstable board, dynamic DDR scaling was not reliable. With the DMC path enabled normally, the system could hang during boot or later under load. If I disabled DMC completely, the board became stable again, and the RAM stayed at the bootloader-set rate, which on this board looked like about 300 MHz. So the first useful step was not to chase the dynamic governor immediately, but to test fixed DDR targets one by one with overlays.

The first thing that confused me was that the number in the DTS is not necessarily the real frequency you get. The DDR driver asks for a target frequency, but the final rate can be rounded by Rockchip’s DDR clock handling path. In other words, the overlay name is really “request this DDR OPP”, not “guarantee this exact final clock”. On my board, selecting 330 MHz did not actually give 330 MHz. The reported live clock became 396 MHz, which means about 792 MT/s effective.
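One way to read the live rate is the common clock framework's clk_summary file in debugfs. The sketch below is an assumption-heavy illustration: the helper name ddr_rates is made up, it assumes the rate sits in the fourth column of clk_summary as on 4.4-era kernels, and the clock name clk_ddr in the sample line is hypothetical, so the filter simply matches any clock whose name contains "ddr".

```shell
#!/bin/sh
# Read clk_summary-style text on stdin and print DDR-looking clocks
# in MHz plus the effective transfer rate (two transfers per clock).
ddr_rates() {
    awk 'tolower($1) ~ /ddr/ && $4 ~ /^[0-9]+$/ {
        mhz = $4 / 1000000
        printf "%s %d MHz (%d MT/s)\n", $1, mhz, mhz * 2
    }'
}

# Sample line with a hypothetical clock name and the 396 MHz result.
echo "clk_ddr 1 1 396000000 0 0" | ddr_rates

# On the board itself (debugfs must be mounted):
if [ -r /sys/kernel/debug/clk/clk_summary ]; then
    ddr_rates < /sys/kernel/debug/clk/clk_summary
fi
```

For the sample line this prints 396 MHz and 792 MT/s, the same arithmetic as in the paragraph above.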

That rounding also means the results are not monotonic in the way you might expect. On this board:

  • disabling DMC was stable and left the RAM around 300 MHz
  • selecting 330 MHz was also stable, but the real clock became 396 MHz
  • selecting 400 MHz did not give a higher stable result; it fell back to 300 MHz
  • selecting 600 MHz caused a hard kernel hang

That is also why I switched to testing the real DDR steps already present in the RK3128 DMC table: 300, 330, 400, 600, 666, 700, 786, and 800. Even then, those numbers should still be treated as requested operating points, not a promise that the live clock will match them exactly on every board.

So the practical approach here is much more conservative than I first expected:

  • if dynamic DDR scaling works on a board, great, leave it enabled
  • if it does not, switch to a fixed static DDR overlay first
  • start from the lowest stable point and move up one step at a time
  • after each step, check the real live DDR clock instead of trusting the overlay name
  • only keep going if the board survives both boot and real workloads

For this particular board, the useful lesson was that “higher configured DDR” does not necessarily mean “higher real DDR”, and it definitely does not mean “stable”. The safe path was to treat each DDR step as an experiment. In practice, 330 was the best fixed choice here, not because 330 MHz itself was special, but because it rounded to a stable 396 MHz on this board while the higher requests did not behave well.

GPU

For desktop testing, I installed LXDE, mainly because it is light enough that the board stays usable while I debug the rest of the system.

GPU support was another place where I had to correct my own expectations. RK3128 does have a Mali-400 MP, and on the vendor 4.4 kernel it can still be used through the old proprietary Utgard stack. Following jock’s old RK322x legacy media notes, I tried the same userspace direction here, especially libmali-rk-utgard-400-r7p0-x11_1.7-1_armhf.deb together with the armsoc Xorg driver. After also disabling the inherited RK322x Lima Xorg fragment from the reused root filesystem, es2_info finally reported:

EGL_VENDOR: ARM
GL_RENDERER: Mali-400 MP
GL_VERSION: OpenGL ES 2.0

So the GPU is not dead. The board really can run the old Mali userspace stack, but only in a narrow legacy sense. It is good enough for EGL / OpenGL ES 2.0 applications. It is not a modern Mesa/Lima desktop stack, and it does not provide a clean accelerated GLX path for normal Linux desktop software. In Xorg, the system still ended up falling back to DRISWRAST for GLX, so desktop OpenGL and modern browser rendering stayed mostly on software paths.

That also explains why browser testing was so disappointing. Firefox could sometimes feel a bit smoother to move around, but video playback inside the browser was still poor. More importantly, after bringing in the old libMali userspace stack, the kernel became unstable enough to crash randomly. After a lot of trying, I ended up going back to the plain Mesa software path, which on this board means llvmpipe.

That fallback is less exciting, but much more useful in practice. Performance was not dramatically worse for the lightweight desktop workloads I actually care about, while stability was clearly better. With llvmpipe, the system no longer suffered the random crashes I was seeing with the old Mali userspace stack.

Video playback has a similar limitation. The GPU does not decode video; that job belongs to the old Rockchip video blocks and the legacy RKMPP userspace on the 4.4 vendor kernel. That may still help local media players, but browser video sites, especially YouTube, are still a bad workload for this board.

So the honest summary is:

  • the legacy Mali stack works for some OpenGL ES 2.0 applications
  • but for a normal GUI desktop, I ended up back on Mesa llvmpipe
  • llvmpipe was about the same for my practical desktop use, but much more stable
  • so if someone just wants to install a GUI on this board, the software path is the safer choice

RKNAND

rknand was especially important here. RK3128 is older and cheaper than RK3229, and many of these boxes use raw NAND instead of eMMC. On RK3229 boxes, eMMC is much more common, so storage support tends to be less painful there. On RK3128, if rknand does not work, the whole bring-up becomes much less useful.

After trying a long list of DTS combinations, I accidentally tried an rk3126 DTS. That was the first time rknand suddenly started working. I would not generalize that into “RK3126 DTS is the correct DTS for RK3128”, but it was a useful clue: in this vendor tree, the DTS names were not always a reliable guide, and some RK3128 TV boxes were clearly closer to the RK3126 reference setup in the storage path than the obvious file names suggested.

SD Card and UART

The SD card slot was another part that looked easy at first and then consumed a lot more time than expected. This board already had the footprint for a microSD socket, and the surrounding passive parts were populated, so I soldered the socket and started testing. The confusing part was that the slot was not completely dead. On some ROMs the card could at least be detected, and on Android 7.1 it could even identify the card brand, but real read/write access still failed.

That partial success was useful. It meant the controller itself was alive, and the problem was somewhere after the very first stages of card initialization. So the debugging path became: check card-detect, check pinmux conflicts, check slot power, check bus width, and only then suspect the data path.

The first clear problem was card-detect. On this board the cd-gpio never behaved like a real hotplug signal, and the kernel did not see insert/remove events reliably. Switching the SD node to broken-cd was not pretty, but it was the right move here. Polling the slot was much more reliable than trusting a dead card-detect line.

There was also a pinmux trap. On RK3128, sdmmc overlaps with UART2 on part of the data bus, so the old fiq-debugger and stale UART2 early console settings were muddying the water. That overlap turned out not to be the final root cause of the SD failures, but it was still real enough to make debugging harder than it needed to be. Moving the kernel log cleanly to UART1 and keeping UART2 out of the way removed that variable.

After that I used 1-bit mode as a diagnostic baseline. That was important because it proved the basic path was alive: power, clock, command line, and DAT0 were all working. Once that was stable, I switched back to 4-bit mode and confirmed that the extra data lines were also good enough for normal use. So by that stage the slot itself was no longer the mystery.

The last real problem was Linux DMA. In this vendor 4.4.194 tree, mmc0 uses the external DMA path by default. With DMA enabled, the card would enumerate correctly, mmcblk0 would appear, and then actual I/O would fail with errors such as DTO timeout when already completed and mmcblk0: error -110 transferring data. That was the important pattern: enumeration worked, but real transfers did not.

I spent a while trying to save the DMA path because, on paper, DMA is the better mode. I tried reducing the DMA burst size, forcing very small transfers back to PIO, forcing single-block DMA, adding completion fallbacks in the DW-MMC driver, and even adjusting the PL330 DMA request behavior. None of that made mmc0 reliable. The card would still fall over as soon as real data reads started.

One easy mistake here is to assume that U-Boot proves Linux DMA should work. In this case it does not. The working U-Boot path was not a valid reference for Linux external DMA. On this board, the known-good U-Boot behavior was effectively closer to a FIFO/PIO path, while Linux was using the external DMA engine. So “U-Boot can read the card” was not evidence that the Linux DMA path was correct.

The stable result was much less elegant but much more useful: keep the slot in 4-bit mode, but force Linux to use PIO instead of DMA. Once I did that, the card became stable and normal read/write tests passed. So the final DTS direction for this board was:

  • broken-cd
  • a real vcc_sdmmc regulator
  • bus-width = <4>
  • remove dmas and dma-names from &sdmmc so the host falls back to PIO

So PIO was not chosen because it is theoretically better. It was chosen because, on this RK3128 vendor kernel, it is the mode that actually works reliably.
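To confirm that the running kernel really booted with those properties, the live device tree under /proc/device-tree can be inspected. This is a minimal sketch: check_sd_node is a made-up helper, and the node path is an assumption. The glob matches on the 10214000 address that also appears in the dmesg filter in this section, because the node name itself may differ between DTS files.

```shell
#!/bin/sh
# $1 = device-tree node directory of the SD host
check_sd_node() {
    node=$1
    if [ -e "$node/broken-cd" ]; then
        echo "broken-cd: set"
    else
        echo "broken-cd: missing"
    fi
    if [ -e "$node/dmas" ]; then
        echo "dmas: still present (DMA path)"
    else
        echo "dmas: absent (PIO)"
    fi
    if [ -r "$node/bus-width" ]; then
        # A property cell is 4 big-endian bytes; the last byte is
        # enough for small values such as 1, 4 or 8.
        echo "bus-width: $(od -An -tu1 "$node/bus-width" | awk '{print $NF}')"
    fi
}

for node in /proc/device-tree/*@10214000; do
    if [ -d "$node" ]; then
        check_sd_node "$node"
    fi
done
```

On a board using the final DTS direction, the expected result is broken-cd set, dmas absent, and a bus width of 4.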

The commands I kept around for checking the state were:

# MMC runtime state
mount -t debugfs none /sys/kernel/debug 2>/dev/null || true
cat /sys/kernel/debug/mmc0/ios
grep -E 'clock|bus width|timing spec' /sys/kernel/debug/mmc0/ios

# SD/MMC related logs
dmesg | grep -iE '10214000|mmc0|mmcblk0|error|timeout|crc'

# Check whether the block device exists
ls -l /dev/mmcblk*
lsblk | grep mmc

On the stable setup, the runtime state looked like this:

clock:          50000000 Hz
bus width:      2 (4 bits)
timing spec:    2 (sd high-speed)

So even though the host was using PIO, it was still running in normal 4-bit, 50 MHz, SD high-speed mode.

For speed testing, these were the commands I used:

# Write test
sync
dd if=/dev/zero of=/media/pi/FE79-0EBB/sdtest.bin bs=4M count=256 conv=fsync,notrunc oflag=direct status=progress

# Read test from the filesystem
dd if=/media/pi/FE79-0EBB/sdtest.bin of=/dev/null bs=4M iflag=direct status=progress

# Raw device read test
dd if=/dev/mmcblk0 of=/dev/null bs=4M count=256 iflag=direct status=progress

# Remove the temporary file
rm -f /media/pi/FE79-0EBB/sdtest.bin

On this board, the stable 4-bit + PIO setup gave roughly 11.7 MB/s write speed and 23.4 MB/s read speed. That is not spectacular, but it is perfectly usable for this class of old TV box, and more importantly it is stable.

Wi-Fi

Ethernet was easy enough to bring up, but I also wanted Wi-Fi because these boxes are much more useful once they can sit somewhere without a cable. After killing my first board by shorting D0 to GND, I bought a few more boards for testing. Between them, I found three different Wi-Fi chips: ESP8089, SSV6051P, and RTL8189.

ESP8089

The first confusing part was that the SDIO side looked half-correct. I could get the SDIO card itself recognized, which already told me that power, clock, and at least part of the SDIO wiring were alive. But that was only the first milestone. The Wi-Fi chip still did not come up, no interface appeared, and the first driver I reused from the RK322x image was clearly not enough.

That changed the direction of the debugging. At that point, the problem no longer looked like “SDIO is dead”. It looked more like “SDIO enumeration works, but the chip never finishes its own bring-up”. That distinction mattered a lot, because it told me to stop randomly changing DTS files and start looking at the driver, reset sequence, and board-specific power hooks instead.

After digging around, I found the open driver source here: https://github.com/al177/esp8089. The notes there were the first thing that matched what I was seeing on the board. In particular, they made it clear that bring-up over SDIO is awkward and reset handling matters. The chip may enumerate once, reset during firmware loading, disappear, and then reappear. If you do not expect that sequence, it looks like the driver is failing when it is actually doing the normal dance.

That was exactly the progress point I had been missing. Once I followed that driver flow, the logs started to make more sense:

  • first, mmc1 saw a new SDIO card
  • then the ESP driver started its power-up path
  • then there was a temporary probe failure while the chip restarted
  • and finally the card re-enumerated and the station interface appeared

So the real breakthrough was not “finding a magic DTS”. It was understanding that the useful milestones were different:

  • if the SDIO card does not enumerate at all, look at power, pinmux, bus width, and DTS wiring
  • if the SDIO card enumerates but Wi-Fi never comes up, look at reset handling, firmware loading, and the driver itself
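
The two milestones above can be turned into a small triage helper. This is only a sketch: triage_wifi is a hypothetical name, it assumes the Wi-Fi host is mmc1 as on this board, and it assumes the interface is named wlanN. It works on saved command output, so it can also be run on logs copied off the box.

```shell
#!/bin/sh
# Hypothetical helper: classify Wi-Fi bring-up from saved command output.
#   $1 = file containing `dmesg` output
#   $2 = file containing `ip -o link` output
triage_wifi() {
    log_file=$1
    links_file=$2

    # Milestone 1: did the SDIO host (assumed to be mmc1) see a card at all?
    if ! grep -qE 'mmc1: new .*SDIO card' "$log_file"; then
        echo "no-sdio-card: check power, pinmux, bus width, DTS wiring"
        return 0
    fi

    # Milestone 2: did a wireless interface (assumed wlanN) show up?
    if ! grep -qE '^[0-9]+: wlan[0-9]+' "$links_file"; then
        echo "sdio-ok-no-wifi: check reset handling, firmware loading, driver"
        return 0
    fi

    echo "wifi-up"
}

# Usage on the board:
#   dmesg > /tmp/dmesg.txt
#   ip -o link > /tmp/links.txt
#   triage_wifi /tmp/dmesg.txt /tmp/links.txt
```

Splitting the check this way keeps the two failure classes apart, which is the whole point: the first answer sends you to the DTS, the second to the driver.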

There was another trap here, and it only became obvious after the SD card work was already done. Once mmc0 was switched to PIO to make the external SD slot stable, leaving the Wi-Fi SDIO host mmc1 on DMA caused a different failure mode: the board would hang during module loading, usually with a blocked task stuck in the esp8089 SDIO path. The stack traces no longer pointed at the external SD slot. They ended in mmc_wait_for_req, sdio_memcpy_toio, and esp_sdio_probe.

That mattered because it meant the system was not “randomly unstable”. It was specifically unhappy with the combination of a stable PIO SD card path on mmc0 and a still-active DMA SDIO path on mmc1. Once I switched the Wi-Fi host to PIO as well, the hangs stopped and the board became stable again. So the final result was the same lesson as the SD card section: PIO was not chosen because it is theoretically better, but because in this vendor 4.4.194 RK3128 kernel it is the mode that actually works reliably.
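
To tell this hang apart from other instability, it helps to check a saved kernel log for a hung-task report that involves the SDIO functions named above. The following is a rough sketch, not a precise stack parser: sdio_hang_in_log is a hypothetical name, and the match is deliberately loose (once a hung-task report has been seen, any later line naming one of those functions counts).

```shell
#!/bin/sh
# Hypothetical helper: reads a kernel log on stdin and succeeds (exit 0)
# if a hung-task report is followed by a line that mentions one of the
# SDIO functions seen in the esp8089 hang.
sdio_hang_in_log() {
    awk '
        # Standard hung-task line: "INFO: task X:PID blocked for more than N seconds."
        /blocked for more than [0-9]+ seconds/ { in_report = 1; next }
        in_report && /mmc_wait_for_req|sdio_memcpy_toio|esp_sdio_probe/ { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on the board:
#   dmesg | sdio_hang_in_log && echo "hang is in the SDIO path"
```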

For ESP8089, I also had to be careful not to mix board-specific Wi-Fi wiring into the generic base DTS. This board family shows up with multiple Wi-Fi chips, so the cleaner setup was to keep the main DTS generic and carry the ESP8089-specific power/reset details in an overlay. That kept the base board file usable while still allowing the correct GPIO wiring for boards that really have ESP8089.

The commands I kept around for checking the Wi-Fi state were:

# SDIO / ESP8089 related logs
dmesg | grep -iE 'WLAN_RFKILL|esp8089|mmc1|sdio'

# Check whether the network interface appeared
ip link

# Check loaded modules
lsmod | grep -iE 'esp|8189|ssv'

Once that part was sorted out, ESP8089 became usable too. At that point the RK3128 boards no longer felt random. The hardware variants were still annoying, but the debugging path was finally predictable enough to work through one board after another.

SSV6051P

SSV6051P turned into a different kind of problem from ESP8089. The SDIO side was easy enough to confirm: once the overlay was correct, the card enumerated and the driver could probe it. But the module did not load reliably during boot.

At first I handled it the same way as the other onboard Wi-Fi options in my rk3128-config script: write the selected module name into /etc/modules-load.d/rk3128-wifi.conf and let systemd-modules-load do the rest. That was fine for ESP8089, but for SSV6051P it was inconsistent. The useful clue was that modprobe ssv6051 worked when I ran it manually later, while the same module sometimes failed to come up when it was loaded very early in boot.

So I stopped using modules-load.d as the active mechanism for the managed Wi-Fi selection. Instead, rk3128-config now writes the chosen module into /etc/default/rk3128-wifi and installs a small rk3128-wifi-loader.service. That service runs later in boot, right before network-pre.target, and only then calls modprobe for the selected driver. In other words, the device tree overlay still describes the hardware, but the actual module loading is deferred until the SDIO host, RFKILL glue, and the rest of the board setup are already in place.
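
For illustration, here is roughly what such a deferred loader can look like. This is a sketch under assumptions, not the files that rk3128-config actually writes: install_wifi_loader is a hypothetical function, WIFI_MODULE is a hypothetical variable name, and the unit ordering is just the documented systemd pattern for running before network-pre.target.

```shell
#!/bin/sh
# Hypothetical sketch: stage the defaults file and the loader unit.
#   $1 = staging root (use "/" on the live system)
#   $2 = module name to load, e.g. ssv6051
install_wifi_loader() {
    root=${1%/}
    module=$2

    mkdir -p "$root/etc/default" "$root/etc/systemd/system"

    # WIFI_MODULE is an assumed variable name; the real file may differ.
    printf 'WIFI_MODULE=%s\n' "$module" > "$root/etc/default/rk3128-wifi"

    # Quoted heredoc delimiter: ${WIFI_MODULE} is expanded by systemd, not here.
    cat > "$root/etc/systemd/system/rk3128-wifi-loader.service" <<'EOF'
[Unit]
Description=Load the selected RK3128 Wi-Fi module (sketch)
After=systemd-modules-load.service
Before=network-pre.target
Wants=network-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
EnvironmentFile=/etc/default/rk3128-wifi
ExecStart=/sbin/modprobe ${WIFI_MODULE}

[Install]
WantedBy=multi-user.target
EOF
}

# Usage on the board (as root):
#   install_wifi_loader / ssv6051
#   systemctl daemon-reload
#   systemctl enable rk3128-wifi-loader.service
```

Ordering the unit after systemd-modules-load.service and before network-pre.target places the modprobe later than the early module load, but still ahead of anything that configures the network.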

This is one of those cases where a userspace service actually is the right tool. The hardware itself is already described by the DT overlay before the kernel boots, so delaying the module load in userspace does not break anything fundamental. It just avoids a bad load order. That is very different from the USB OTG case below, where trying to fix the mode from userspace would already be too late for early boot.

Realtek RTL8189FTV

RTL8189FTV was by far the easiest Wi-Fi chip in this group. Once the SDIO wiring and the overlay were correct, it worked cleanly from the first usable build. Unlike ESP8089, there was no awkward reset and re-enumeration sequence to debug. Unlike SSV6051P, it also did not need special treatment around boot-time module loading.

The one useful note here is that RTL8189FTV and RTL8189FS use the same driver in this tree. So from the software side, I treated them as the same family and only kept the board-specific differences in the overlay and module selection.

I also removed the p2p0 interface from the driver setup, so the board exposes only wlan0 at boot. That made the behavior simpler and avoided an extra virtual interface that was not useful on this box.

USB OTG

In theory, the RK3128 OTG port should automatically switch between host and peripheral mode. The decision is normally made from two hardware signals on the OTG PHY:

  • ID / IDDIG: decides which side becomes host. On a Micro-USB OTG port, grounding the ID pin means host; leaving it floating means peripheral.
  • VBUS / BVALID: indicates whether 5V from the USB bus is present.

The Linux OTG state machine uses these signals to select the role dynamically. On this cheap board, however, the OTG wiring does not seem to match what the Rockchip USB2 PHY driver expects. After debugging, both host mode and peripheral mode were confirmed to work, but automatic switching did not.

The first useful command was locating the runtime OTG mode file:

# Find the sysfs file exported by the Rockchip USB2 PHY driver.
find /sys -name otg_mode 2>/dev/null
# Example output:
# /sys/devices/platform/20008000.syscon/20008000.syscon:usb2-phy@17c/otg_mode

Then I used it to force each role manually:

# Force the OTG port to behave as a USB host.
echo host > /sys/devices/platform/20008000.syscon/20008000.syscon:usb2-phy@17c/otg_mode
# Force the OTG port to behave as a USB peripheral.
echo peripheral > /sys/devices/platform/20008000.syscon/20008000.syscon:usb2-phy@17c/otg_mode
# Hand control back to the automatic OTG state machine.
echo otg > /sys/devices/platform/20008000.syscon/20008000.syscon:usb2-phy@17c/otg_mode
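
Typing the full sysfs path gets tedious, so the find and echo steps can be wrapped in one function. This is a convenience sketch only: set_otg_mode is a hypothetical name, and it assumes there is a single otg_mode file on the system, as on this board. The optional second argument exists so the function can be tried against a scratch directory before touching /sys.

```shell
#!/bin/sh
# Hypothetical helper: locate the otg_mode file and write a role into it.
#   $1 = host | peripheral | otg
#   $2 = directory to search (defaults to /sys)
set_otg_mode() {
    mode=$1
    sys_root=${2:-/sys}

    # Reject anything other than the three values used above.
    case $mode in
        host|peripheral|otg) ;;
        *) echo "usage: set_otg_mode host|peripheral|otg [sys_root]" >&2
           return 2 ;;
    esac

    # Assumption: only one otg_mode file exists, so take the first match.
    node=$(find "$sys_root" -name otg_mode 2>/dev/null | head -n 1)
    if [ -z "$node" ]; then
        echo "otg_mode not found under $sys_root" >&2
        return 1
    fi

    echo "$mode" > "$node" && echo "$node -> $mode"
}

# Usage on the board (as root):
#   set_otg_mode host
```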

To check what was happening, these were the commands I ended up using most:

# Show the current OTG role selected by the PHY driver.
cat /sys/devices/platform/20008000.syscon/20008000.syscon:usb2-phy@17c/otg_mode
# Check whether the OTG PHY interrupts are actually firing.
cat /proc/interrupts | grep -i rockchip_usb2phy
# See whether the USB Device Controller exists.
ls /sys/class/udc
# Show the current USB gadget state when acting as peripheral.
cat /sys/class/udc/10180000.usb/state
# Show the USB host topology when acting as host.
lsusb -t
# Grep the kernel log for OTG/PHY related messages.
dmesg | grep -iE 'usb2phy|otg|bvalid|iddig|\bid\b'

The useful result was that forced host worked, forced peripheral worked, and live manual switching through otg_mode also worked. So the OTG port itself was fine. What was broken was only the automatic detection path in otg mode.

In practice, this leaves a few choices:

  • Keep otg_mode only for debugging. It is useful to test the port live, but writing to /sys/.../otg_mode is not persistent.
  • Use boot-time DT overlays for normal use. The base DTS stays in otg, and a small overlay applied by U-Boot overrides only the dr_mode to force either host or peripheral.
  • Keep host as the default overlay. That is the safer choice on this board, and it also keeps booting from a USB drive on the OTG port possible.

I also thought about using a small Linux service to write host or peripheral into /sys/.../otg_mode during boot. That would work after the kernel is already running, but it is too late for early boot. If the root filesystem is on a USB drive attached to the OTG port, the controller already needs to be in host mode before the kernel mounts root. A userspace service cannot help there, because userspace starts only after the root device has already been found. That is why I ended up preferring DT overlays instead of a Linux service.
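
To make the overlay approach concrete, here is a sketch of a generator for such an overlay source. It is not the overlay shipped in the image: write_otg_overlay is a hypothetical name, and the usb_otg label is an assumption about how the OTG controller is labelled in the base DTS. The base DTB also has to be built with symbols for the label to resolve when the overlay is applied.

```shell
#!/bin/sh
# Hypothetical helper: write a minimal overlay source that only forces dr_mode.
#   $1 = host | peripheral
#   $2 = output .dts path
write_otg_overlay() {
    mode=$1
    out=$2

    case $mode in
        host|peripheral) ;;
        *) echo "mode must be host or peripheral" >&2
           return 2 ;;
    esac

    # Unquoted heredoc delimiter so that $mode is expanded by the shell.
    cat > "$out" <<EOF
/dts-v1/;
/plugin/;

/ {
	fragment@0 {
		/* Assumption: the OTG controller is labelled usb_otg in the base DTS. */
		target = <&usb_otg>;
		__overlay__ {
			dr_mode = "$mode";
		};
	};
};
EOF
}

# Usage:
#   write_otg_overlay host otg-host.dts
#   dtc -I dts -O dtb -o otg-host.dtbo otg-host.dts
```

Because the overlay touches a single property, the base DTS can stay in otg mode and remain shared across boards.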

The result

Here is my fork of the Rockchip 4.4.194 kernel, with the changes needed to make it work on RK3128 TV boxes: https://github.com/chieunhatnang-personal/linux-kernel-4.4-rk3128-tvbox

My build environment was:

  • Ubuntu 20.04
  • Python 3.8.10 with the python-is-python3 package
  • libssl-dev
  • gcc-linaro-6.3.1-2017.05-x86_64_arm-linux-gnueabihf

You can follow the manual build steps from the Rockchip wiki, but I also wrote a small build script to make the process easier: https://github.com/chieunhatnang-personal/RK3128-Linux-SupportingScripts

The script is at /Kernel/4.4.194/build_kernel.sh. You may need to adjust TOOLCHAIN_DIR so it matches the actual path of your Linaro toolchain. After cloning the kernel source, place this script at the same directory level as the kernel tree and run it. When the build finishes, you will get zImage and the kernel modules under the out/ directory.

By default, the script compiles rk3128-linux.dts into rk3128-linux.dtb. You can override that with an environment variable. For example:

RK_DTS=rk3128-linux-esp8089 ./build_kernel.sh
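
The override follows the usual shell pattern of a default that an environment variable can replace. The snippet below only illustrates that pattern and is not copied from build_kernel.sh; resolve_dts is a hypothetical function name.

```shell
#!/bin/sh
# Hypothetical illustration of the default-with-override pattern.
# Uses RK_DTS when it is set and non-empty, otherwise rk3128-linux.
resolve_dts() {
    name=${RK_DTS:-rk3128-linux}
    echo "$name.dts -> $name.dtb"
}

# Usage:
#   resolve_dts
#   RK_DTS=rk3128-linux-esp8089 resolve_dts
```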

Building Armbian 22.02

As I mentioned above, I intentionally reused jock’s RK322x Armbian 22.02 build as the base. The goal here was not to redesign the whole userspace stack, only to replace the board-specific parts that were wrong for RK3128. In this image, I changed:

  • boot.cmd and the generated boot.scr so the boot flow can handle NAND and USB devices
  • the Wi-Fi drivers and the logic that loads the correct module at boot
  • the rk3128-config tool
  • wifi-driver-loader.service for deferred Wi-Fi module loading
  • the motd script so it shows the board-specific system information at login
  • and, of course, the kernel zImage, rk3128-linux.dtb, and the overlays described above

After all of that, this is what works:

  • custom U-Boot based on Rockchip U-Boot 2017.09
  • all four cores, up to 1.2 GHz
  • CPU frequency scaling and governors
  • DRAM frequency control, both dynamic and fixed
  • NAND, eMMC, SD card, and USB booting, including both OTG and EHCI/OHCI ports; OTG mode can be selected in rk3128-config
  • Ethernet
  • Wi-Fi (SSV6051P, SSV6256P, ESP8089, and several Realtek chips)
  • GPU acceleration
  • UART1 and UART2, configurable

What still does not work:

  • Bluetooth, because I do not have a board with it to test
  • VPU / video-processing support
  • the SD card DMA path; PIO mode is slower, but it is already stable enough to use as the default