
December 27, 2021
realtime audio on macOS in the age of asymmetrical multicore CPUs

It's that time again, when I bitch about and document my experiences dealing with Apple's documentation/APIs/etc. For some reason I never feel the need to do this on other platforms -- maybe they tend to have better documentation or less fuss to deal with, I'm not sure why -- but if you search for "macOS" on this blog you'll find previous installments. Anyway, let's begin.

A bit over a year ago Apple started making computers with their own CPUs, the M1. These have 8 or more cores, a mix of slower and faster ones, the slower cores having lower power consumption (whether or not they are more efficient per unit of work done is unclear; it wouldn't surprise me if their efficiency was similar under full load, but now I'm just guessing).

The implications for realtime audio of these asymmetric cores are pretty complex and can produce all sorts of weird behavior. The biggest issue seems to be when your code ends up running on the efficiency cores even though you need the results ASAP, causing underruns. Counterintuitively, it seems that under very light load, things work well, and under very heavy load, things work well, but for medium loads, there is failure. Also counterintuitively, the newer M1 Pro and M1 Max CPUs, with more performance cores (6-8) and fewer efficiency cores (2), seem to have a larger "medium load" range where things don't work well.

The short summary:

  • Ignore the thread QoS APIs; for realtime audio they're apparently not applicable (and do not address these issues). This was the biggest timesink for me -- I spent a ton of time going "why doesn't this QoS setting do anything?" Also, Xcode has a CPU meter and for each thread it says "QoS unavailable"... so confusing.

  • If a normal thread yields via usleep() or pthread_cond_timedwait() for more than a few hundred microseconds, it'll likely end up running on an efficiency core when it resumes (and it takes an eternity in terms of audio blocks to get bumped back to a performance core, by which point there's been an underrun and the thread will probably go back to sleep anyway). Reducing all sleeps/waits to at most a few hundred microseconds is a way to avoid that fate (though Apple recommends against spinning, likely for good reason). It's not ideal, but you can effectively pin normal threads to performance cores using this method.
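
    The sleep-capping workaround above can be sketched roughly like this. The 200-microsecond cap, the flag-based handoff, and the names are all illustrative, not anything from Apple's APIs; the point is only that the worker never yields for longer than a couple hundred microseconds at a time:

    ```c
    /* Sketch: a worker that polls for work in <=200us slices,
       so (hopefully) the scheduler never demotes it to an
       efficiency core while it waits. Illustrative only. */
    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>
    #include <unistd.h>

    static atomic_int g_have_work;
    static atomic_int g_result;

    static void *worker(void *arg)
    {
        (void)arg;
        for (;;) {
            if (atomic_load(&g_have_work)) {
                atomic_store(&g_result, 42); /* "process an audio block" */
                return NULL;
            }
            usleep(200); /* never yield for more than ~200us at a time */
        }
    }

    int main(void)
    {
        pthread_t th;
        pthread_create(&th, NULL, worker, NULL);
        usleep(5000); /* pretend the next audio block arrives later */
        atomic_store(&g_have_work, 1);
        pthread_join(th, NULL);
        printf("result=%d\n", atomic_load(&g_result));
        return 0;
    }
    ```

    A real implementation would presumably use pthread_cond_timedwait() with the timeout clamped the same way, so the thread still wakes immediately when signaled.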

  • "Porting Your Audio Code to Apple Silicon" was the most helpful guide (I wish I had seen the link at the bottom of one of the other less-helpful guides sooner! so much time wasted...), though it assumes some knowledge which doesn't seem to be linked in the document:

    You want to get your realtime threads into the same thread workgroup as the CoreAudio device's (via kAudioDevicePropertyIOThreadOSWorkgroup), and to do that you first have to make your threads realtime threads using thread_policy_set(THREAD_TIME_CONSTRAINT_POLICY) (side note: we probably should have been doing this already, doh), ideally with parameters similar to what the CoreAudio thread uses, which seems to be a period and constraint of ((1000000000.0 * blocksize * mach_timebase_info().denom) / (mach_timebase_info().numer * srate)), and a computation of half that. If you don't set this policy, adding your thread to the workgroup will fail (EINVAL in that case means "thread is not realtime," not "workgroup is canceled" per the docs). Once you do that, you effectively get your threads locked to performance cores, and can start breathing again.

Perhaps this was all obvious and documented and I failed to read the right things, but I'm putting this here in case somebody like me finds it useful.

Posted by Tale on Tue 28 Dec 2021 at 03:03 from 77.170.68.x
Thanks for sharing, this will certainly come in useful at some point. BTW, Apple's (lack of) documentation is also one of my biggest frustrations.

Posted by Chris on Fri 23 Dec 2022 at 16:21 from 99.155.33.x

Thank you SO MUCH for writing this up, saved me a lot of effort...

I put together a gist that describes how to do it, seeing as their documentation really sucks

I was running into the EINVAL thing, and nothing was helping besides this article.

I had a couple questions:

1. How did you discover the period/constraint/computation of the coreaudio thread? Do you have a source anywhere?
2. In your calculation, what do `blocksize` and `srate` refer to, and in what units?

Again, thank you so much. Really useful

Posted by Justin on Mon 26 Dec 2022 at 09:52 from 174.247.17.x
1. I forget if that calculation was buried as code somewhere on the Apple documentation site, or if it was just described. Block size is in samples; srate is the sample rate in samples/second.

Posted by Chris on Mon 02 Jan 2023 at 19:10 from 76.77.180.x
Perfect, thank you very much.

Copyright © 2024 Justin Frankel