<< >>
justin = { main feed , music , code , askjf , social , pubkey };
[ <<< last (older) article | view in index ]

May 12, 2017
linux hacking on an ASUS T100TA

I've been working on a REAPER linux port for a few years, on and off, but more intensely the last month or two. It's actually coming along nicely, and it's mostly lot of fun (except for getting clipboard/drag-drop working, ugh that sucked ;). Reinventing the world can be fun, surprisingly.

I've also been a bit frustrated with Windows (that crazy defender/antispyware exploit comes to mind, but also one of my Win10 laptops used to update when I didn't want it to, and now won't update when I do), so I decided to install linux on my T100TA. This is a nice little tablet/laptop hybrid which I got for $200, weighs something like 2 pounds, has a quad core Atom Bay Trail CPU, 64GB of MMC flash, 2GB of RAM, feels like a toy, and has a really outstanding battery life (8 hours easily, doing compiling and whatnot). It's not especially fast, I will concede. Also, I cracked my screen, which prevents me from using the multitouch, but other than that it still works well.

Anyway, linux isn't officially supported on this device, which boots via EFI, but following this guide worked on the first try, though I had to use the audio instructions from here. I installed Ubuntu 17.04 x86_64.

I did all of the workarounds listed, and everything seemed to be working well (lack of suspend/hibernate is an obvious shortcoming, but it booted pretty fast), until the random filesystem errors started happening. I figured out that the errors were occurring on read, the most obvious way to test would be to run:

debsums -c
which will check the md5sum for the various files installed by various packages. If I did this with the default configuration, I would get random files failing. Interestingly, I could md5sum huge files and get consistent (correct results). Strange. So I decided to dig through the kernel driver source, for the first time in many many years.

Workaround 1: boot with:
sdhci.debug_quirks=96
This disables DMA/ADMA transfers, forcing all transfers to use PIO. This solved the problem completely, but lowered the transfer rates down to about (a very painful) 5MB/sec. This allowed me to (slowly) compile kernels for testing (which, using the stock ubuntu kernel configuration, meant a few hours to compile the kernel and the tons and tons of drivers used by it, ouch. Also I forgot to turn off debug symbols so it was extra slow).

I tried a lot of things, disabling various features, getting little bits of progress, but what finally ended up fixing it was totally simple. I'm not sure if it's the correct fix, but since I've added it I've done hours of testing and haven't had any failures, so I'm hoping it's good enough. Workaround 2 (I was testing with 4.11.0):
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -2665,6 +2665,7 @@ static void sdhci_data_irq(struct sdhci_host *host, u32 intmask)
 				 */
 				host->data_early = 1;
 			} else {
+				mdelay(1); // TODO if (host->quirks2 & SDHCI_QUIRK2_SLEEP_AFTER_DMA)
 				sdhci_finish_data(host);
 			}
 		}
Delaying 1ms after each DMA transfer isn't ideal, but typically these transfers are 64k-256k, so it shouldn't cause too many performance issues (changing it to usleep(500) might be worth trying too, but I've recompiled kernel modules and regenerated initrd and rebooted way way too many times these last few days). I still get reads of over 50MB/sec which is fine for my uses.

To be properly added it would need some logic in sdhci-acpi.c to detect the exact chipset/version -- 80860F14:01, not sure how to more-uniquely identify it -- and a new SDHCI_QUIRK2_SLEEP_AFTER_DMA flag in sdhci.h). I'm not sure this is really worth including in the kernel (or indeed if it is even applicable to other T100TAs out there), but if you're finding your disk corrupting on a Bay Trail SDHCI/MMC device, it might help!






4 Comments:
Posted by Justin on Tue 16 May 2017 at 10:08 from 74.72.45.x
turns out udelay(333) also works, haven't tried anything slower yet. Also, to avoid having to recompile the whole kernel, grab the source, then: make -C /lib/modules/(current kernel)/build M=`pwd`/drivers/mmc/host modules


Posted by Justin on Tue 16 May 2017 at 10:09 from 74.72.45.x
My patch also re-enables DMA if it's disabled via the command line. here.


Posted by Justin on Tue 16 May 2017 at 10:09 from 74.72.45.x
Here: 1014.org/_/t100_sdhci_patch.txt


Posted by Justin on Tue 23 May 2017 at 21:53 from 74.72.45.x
tried another test, where I call clflush_cache_range() before and after the DMA, but it didn't seem to help. So it's probably not a cache-coherency issue, but maybe an interrupt-firing-before-the-sdhci-controller-really-has-the-ram-written-issue.


Add comment:
Name:
Human?: (no or yes, patented anti crap stuff here)
Comment:
search : rss : recent comments : Copyright © 2017 Justin Frankel