Coruna exploit kit - Further investigations and reverse engineering

Brief

Coruna is a nation-state exploit kit found in the wild and researched by GTIG and iVerify. The kit targets iOS versions 13.0 through 17.2.1 and some macOS versions.

All the exploits are 1-click and delivered through phishing sites impersonating crypto, AI and gambling services. This is an example of the source code of a phishing site delivering the first stage

You can clearly see a one-line obfuscated JavaScript tag inside the HTML source of the web page

When an operation is exposed by Google or anyone else, the first thing the operators should do is take down all C2 and payload hosting domains so the public is not left with free access. In this case the Chinese operators "forgot" to do that, giving the public the chance to get their hands on the kit too. So the idea, working with some friends, was to scrape all of the still-active domains and try to "recreate" the exploit kit (or just one full chain) for educational purposes.

Researching

The first evidence of a public release of the payloads came from matteyeux on his GitHub; all of those payloads were hosted on b27.icu. We then started looking at other domains known to deliver the first loader of the Coruna chain (see here)

As far as I understand, the whole web infrastructure was divided into these components:

Domains delivering modules and exploits to the loaders: called by the phishing sites
Domains delivering loaders: the phishing sites
Post-exploitation domains handling C2 connections: generated with a DGA and always using ".xyz" as the TLD

These are the active phishing domains (at the time of the research) that deliver the payloads:

51fl.shop
17cxa.com
iransupport.cyou
lmt0ken.com
lovescape.com
b27.icu
sadjd.mijieqi.cn

Working domains hosting malicious iframes:

remotexxxyyy.com/static/analytics.html
remotehealthcheck.com/static/analytics.html

Loaders and exploits were shared between sites with little to no difference in their logic and implementation. Here we can see the page source of lmt0ken loading the malicious analytics.html from remotehealthcheck

We tried generating ~4000 domains using a DGA that uses the seed "lazarus" and a maximum length of 15 characters (as described by GTIG), but none of them worked
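For defenders, the two DGA properties mentioned in this post (the ".xyz" TLD and the 15-character limit) are already enough for a rough triage filter over DNS logs. The sketch below is not the DGA itself. The function name, the lowercase alphanumeric character set, and the assumption that the 15-character limit applies to the label before the TLD are all mine.

```javascript
// Rough triage filter for DNS logs. This is NOT the Coruna DGA.
// Assumptions (not confirmed by the research): the 15-character limit
// applies to the label before ".xyz", and labels are lowercase alphanumeric.
function looksLikeCandidate(domain) {
  const parts = domain.toLowerCase().split(".");
  if (parts.length !== 2) return false; // only plain "label.xyz" shapes
  const [label, tld] = parts;
  return tld === "xyz" && /^[a-z0-9]{1,15}$/.test(label);
}

// Sample input, made up for illustration.
const seen = ["example.xyz", "b27.icu", "averyveryverylongname.xyz"];
console.log(seen.filter(looksLikeCandidate)); // [ 'example.xyz' ]
```

A filter this loose will produce plenty of false positives, so it only makes sense as a first pass before checking candidates against published indicators.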

Each JS payload was hosted on a single hardcoded domain, with a hex string as its file name (e.g. http://b27.icu/166411bd...js)

Breakdown of the 1-click exploit chain

This is a simplified breakdown of the Coruna exploit chain, from initial compromise to OS memory access. I hope it conveys the level of engineering behind Coruna and, in general, behind all modern browser exploits

From here things get tough, so I need to say this: I am NOT by any means an expert in iOS/browser exploitation. I usually do Windows/Linux and less complicated binary exploitation, so do not rely 100% on what I say and do your own research, thanks (:

Initial Loader and Device Fingerprinting

Every piece of code was part of a larger chain. The whole operation was very modular (basically a framework): each module and additional utility was delivered by the loaders.

Let's look at the deobfuscated analytics.html code to better understand what one loader does

The loader first checks the version of WebKit executing the payload by parsing the WebKit version number and comparing it to some hardcoded values

If the WebKit version is < 13000 the module aborts execution; if the version is > 16000, further setup is executed (runtime.zo())
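To make the boundaries of that check explicit, here is a minimal sketch of the gating logic. The two thresholds come from the deobfuscated loader as described above; the function name and the returned labels are mine and purely illustrative.

```javascript
// Thresholds are the ones described in the text; everything else is illustrative.
function classifyWebKitVersion(version) {
  if (version < 13000) return "abort";                      // too old, loader stops
  if (version > 16000) return "continue-with-extra-setup";  // the runtime.zo() path
  return "continue";                                        // 13000..16000 inclusive
}

console.log(classifyWebKitVersion(12999)); // abort
console.log(classifyWebKitVersion(16000)); // continue
console.log(classifyWebKitVersion(16001)); // continue-with-extra-setup
```

Note that with strict comparisons a version of exactly 13000 or 16000 falls into the plain "continue" branch.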

C2 communication works over HTTP using XMLHttpRequest, which is used to fetch further modules and execute them through stealth DOM injection.

The loader resolves URLs in a particular way: it calculates the SHA-256 of the remote resource that needs to be fetched and uses only the first 40 hex characters as the file name
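The naming scheme is easy to reproduce when you want to derive expected file names for indicator lists. This is a sketch for Node.js, assuming the hashed input is some string identifier of the resource (the exact input the loader hashes is not covered here) and that the ".js" extension seen in the payload URLs is appended afterwards. The function and parameter names are hypothetical.

```javascript
const { createHash } = require("node:crypto");

// Assumption: `resourceId` stands in for whatever string the loader hashes.
function payloadFileName(resourceId) {
  const digest = createHash("sha256").update(resourceId, "utf8").digest("hex");
  return digest.slice(0, 40) + ".js"; // keep 40 of the 64 hex characters
}

console.log(payloadFileName("example-module"));
```

A 40-character hex name has the same length as a SHA-1 digest, so the length of a file name alone does not tell you which hash function produced it.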

Stealth DOM injection works by creating a new div with opacity: 0.0 and executing the JS module through a newly created script element

WebKit RCE

Coruna uses 3 different RCE exploits against the WebKit renderer. Based on my understanding, none of them exploited 0-days but rather known n-days never seen exploited in the wild (still crazy)

1) NaN Boxing type confusion: On legacy devices running iOS 11-13 | CVE-2021-30952
2) JIT Type Confusion: iOS 15.2-15.5 | CVE-2023-48503
3) Heap corruption + SVG R/W: iOS 16-17 | CVE-2024-23222

Because I am not an expert in browser exploitation and 3 CVEs are a lot to go through, in the next paragraphs I will only analyze and rewrite the RCE + R/W primitive of the heap corruption. I will link to a repository with all of the exploits cited above (obfuscated and not), and you are welcome to do the same with all of them

As a quick brief on JS engine vulnerabilities: the majority of them arise from the natural mismatch between JavaScript, a dynamically typed language, and C++, the statically typed language used to write the engine.

Type confusions in particular are a well-known problem in the engine's JIT compiler, because the JIT relies on assumptions in order to generate fast machine code. JavaScript variables are dynamic: a variable x can be a double on one line and an Object on the next. To make execution faster the JIT assumes that x will stay a double; if the variable changes at runtime and the engine developer did not guard that assumption (and deoptimize), the CPU will interpret the memory bits of a double as an object
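The "same bits, different type" idea can be shown safely in plain JavaScript with typed arrays that share one buffer. This is only an illustration of reinterpreting 8 bytes, not a type confusion: here the engine knows exactly what is going on. The helper names are mine.

```javascript
// Two typed-array views over the same 8 bytes.
const shared = new ArrayBuffer(8);
const asDouble = new Float64Array(shared);
const asBits = new BigUint64Array(shared);

// Read the raw IEEE 754 bit pattern of a double.
function doubleToBits(d) {
  asDouble[0] = d;
  return asBits[0];
}

// Interpret a raw 64-bit pattern as a double.
function bitsToDouble(bits) {
  asBits[0] = bits;
  return asDouble[0];
}

console.log("0x" + doubleToBits(1.0).toString(16)); // 0x3ff0000000000000
console.log(bitsToDouble(0x4000000000000000n));     // 2
```

In a real type confusion the engine performs this kind of reinterpretation by mistake, treating the bits of a double as a pointer to an object, or the other way around.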

Memory R/W Primitive upgrade

Let's start with a fundamental distinction in exploit development: a vulnerability is the root cause, a primitive is what the bug allows you to do (often something limited).

Upgrading the primitive is critical: you need some kind of "privileged" and reliable primitive, preferably an arbitrary R/W. In all 3 exploits Coruna found a way to upgrade the initial vulnerability and gain this primitive. Later we will see an example and how it creates a "stable channel" for reading and writing

In the realm of browser/JS engine exploitation the 2 most powerful primitives are addrof and fakeobj.

addrof:

func addrof( obj ) { return address; }

Used to defeat ASLR, this primitive leaks the memory address of a JS object

fakeobj:

func fakeobj( addr ) { return object; }

Used to create JS objects at specific memory addresses

JIT Cage execution and PAC

After establishing the R/W primitive it is time to achieve arbitrary code execution. Generally this is obtained by hijacking already existing pointers/callbacks in order to invoke an attacker-controlled function pointer with a valid PAC signature. This phase has big constraints, one of them being PAC.

As a brief overview of Pointer Authentication: PAC is a mechanism by which certain pointers are signed. When a pointer is signed, a keyed cryptographic signature computed from its value and a context value is stored in the unused upper bits of that pointer

PAC is a mitigation against ROP and JOP attacks. The general ways to bypass this mitigation are leaking a valid signature or signing key, or signing your own pointer
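To make the "signature in the unused bits" idea concrete, here is a toy model using BigInt. It is deliberately not real PAC: the hardware uses a dedicated cipher and keys that software cannot read, while `toyMac` below is a throwaway mixing function I made up. The 48-bit address width, the 16-bit tag and all names are assumptions for illustration only.

```javascript
// Toy model only. Real PAC is implemented in hardware with secret keys.
const PTR_BITS = 48n;                    // assumed width of the real address
const PTR_MASK = (1n << PTR_BITS) - 1n;
const TAG_MASK = 0xffffn;                // 16 spare upper bits hold the tag
const U64 = 0xffffffffffffffffn;

// Hypothetical stand-in for the keyed MAC; NOT cryptographically secure.
function toyMac(addr, context, key) {
  let h = (addr ^ (context * 0x9e3779b97f4a7c15n) ^ key) & U64;
  h ^= h >> 33n;
  h = (h * 0xff51afd7ed558ccdn) & U64;
  h ^= h >> 29n;
  return h & TAG_MASK;
}

// Store the tag in the upper bits of the pointer.
function sign(ptr, context, key) {
  const addr = ptr & PTR_MASK;
  return (toyMac(addr, context, key) << PTR_BITS) | addr;
}

// Recompute the tag and refuse the pointer if it does not match.
function authenticate(signedPtr, context, key) {
  const addr = signedPtr & PTR_MASK;
  const tag = signedPtr >> PTR_BITS;
  if (tag !== toyMac(addr, context, key)) throw new Error("PAC check failed");
  return addr;
}

const key = 0xdeadbeefn;                 // made-up key
const signed = sign(0x123456789abcn, 7n, key);
console.log(authenticate(signed, 7n, key) === 0x123456789abcn); // true
```

The takeaway is that a pointer overwritten through a raw memory write will not carry a valid tag, which is why an attacker needs either a way to get pointers signed or a legitimately signed pointer to reuse.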

You can read more about PAC here

Coruna starts its PAC bypass by searching for ARM64 PAC trampoline gadgets, which it does by looking for 17 specific instructions. JSC includes a PAC trampoline for its own internal use (signing JIT code pointers), and the exploit locates it using pattern matching

The A64 architecture has dedicated instructions for Pointer Authentication, and for an effective PAC bypass it is important to control them.

Walking through the whole PAC bypass mechanism in depth is a lot of work and I still want to cover a lot of things. At the end I will link my GitHub repository with all the deobfuscated files I currently have, so if you are interested you can have a look yourself

This is a high-level overview of what should be happening:

1. Finds the signing gadgets - the PAC trampoline JSC uses for itself
2. Reuses the signing infrastructure - calls pacia1716 with controlled inputs to generate PAC signatures
3. Signs WASM table entries - variant 1 uses the WASM function table, which expects only PAC-signed entries, so the exploit signs the function pointers with the correct discriminator (0x24ad) before patching them into the table

To execute code from within the JIT cage Coruna abuses Intl.Segmenter, a JavaScript API for Unicode text segmentation. This API lets developers split strings into units according to Unicode standards or language-specific rules, and returns an iterable object via the segment() method

Intl.Segmenter( locale, { granularity: 'something' } )
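For readers who have never used the API, this is what ordinary, legitimate usage looks like. It is standard Intl.Segmenter behaviour and nothing here is specific to the exploit; the sample text is mine.

```javascript
// Plain, benign use of Intl.Segmenter with sentence granularity.
const segmenter = new Intl.Segmenter("en", { granularity: "sentence" });
const text = "Hello world. How are you?";

// segment() returns an iterable; each item has .segment and .index.
const sentences = Array.from(segmenter.segment(text), (s) => s.segment);
console.log(sentences);

// The same iterable can be walked manually through Symbol.iterator / next().
const iterator = segmenter.segment(text)[Symbol.iterator]();
const first = iterator.next();
console.log(first.value.index, first.done);
```

The iterator object obtained through Symbol.iterator and advanced with next() is the same kind of object the call phase described below operates on.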

When called with "sentence" as the granularity and used on a long string, under the hood JSC JIT-compiles the internal loop of the ICU break iterator (icu::BreakIterator). The exploit corrupts the iterator's internal data structures to redirect the JIT-compiled code path to an arbitrary native function. This works because the iterator's internal pointers and vtable entries are stored on the heap, which is accessible via the R/W primitive

This is the setup. What is happening is:

1. Create a segmenter
2. Build a long string (300 "a" joined with whitespace), which forces JIT compilation to native code
3. Warm up - .segment(longstring) triggers the JIT compilation
4. Allocate a buffer to hold a copy of the original JIT code

Now the actual call phase (pointer hijack) starts. This is the core logic of the exploit and it runs every time we want to call a function.

First we need to get a new iterator using the Symbol.iterator method and parse its internal pointers

We use the previous R/W primitive to read those pointers; what we are interested in is the code address

You can see strange character sequences like FVmvHQ or wKny90: those are the memory offsets (specific to the JSC version) where the function pointers reside

We now copy and modify the internal ICU data structure

The critical step is to set the flags of the newly allocated buffer to RWX, which bypasses the Write XOR Execute mitigation

We now have one buffer for the trampoline code and one buffer for the cloned ICU internal pointers

Linking the 2 buffers together, we swap pointers and point the iterator's internal codeBufferPtr to the cloned RWX buffer

The iterator's internal pointer codeBufferPtr is patched to point to the cloned buffer holding our modified ICU pointers. Now the engine thinks our buffer is the legitimate code buffer

We copy the legitimate JIT instructions into the trampoline buffer allocated before, then we patch the call instruction to target the native function we want to execute.

To finish, we trigger the JIT execution by calling iter.next on the modified iterator object

We are now executing native C++ functions, but note that we are still inside the renderer's memory space

As a summary: normally the JIT space is executable but not writable. This exploit creates a trusted RWX region that the JIT will execute, so you can write shellcode into it as if it were legitimate JS code. We can call this a "call anything" primitive

I hope this graph helps. If you find this difficult, it's because it is

After the JIT cage escape, which I am not going to cover here, Coruna exploits a kernel vulnerability (CVE-2023-41974) to gain LPE and install an in-memory implant inside the root daemon powerd

Everything I have described above is just the tip of the iceberg of browser exploitation. It's a huge world and I encourage you to look further; ret2.systems is an extraordinary resource

Heap corruption vulnerability - Heap Grooming

I took a deeper look at one of the WebKit vulnerabilities and at how Coruna upgrades the initial relative R/W to a complete arbitrary R/W

The vulnerability is a type confusion; what's crazy about it is how it gets triggered through heap grooming voodoo magic

Heap grooming is a technique where an attacker deliberately shapes the memory allocator's state so that 2 unrelated objects end up at overlapping heap addresses. Unlike heap spraying, grooming is more precise and surgical: it is a combination of frees and allocations designed to control, with greater probability, where the next allocation lands

The exploit abuses the Intl.NumberFormat("en-US") and OfflineAudioContext.decodeAudioData() JavaScript APIs

Looking at the internals of Intl.NumberFormat("en-US"): when this object is created JSC allocates an ICU UNumberFormat structure containing a float-formatting buffer. This buffer resides on the heap and holds intermediate results while a number is converted to a locale-formatted string (see Intl.NumberFormat doc)

When OfflineAudioContext.decodeAudioData() is called it allocates a temporary buffer for the decoded data

If the heap is groomed correctly, the buffer allocated by OfflineAudioContext.decodeAudioData() can land on top of an Intl.NumberFormat("en-US") float buffer

The grooming sprays 3 tiers of Intl.NumberFormat("en-US") objects:

formatterPool - 7000 objects, the primary ones, where we will look for the overlap
flankingFormatters - 3x primary (21000 objects), used to fill the gaps around the primary formatters and create precise holes around them
tempFormatters - 2x primary (14000 objects), used for additional heap density; they force the allocator to compact memory early

So at the beginning we have ~21 MB of heap space used just for float buffers. You can already see the concept of grooming: each allocation has its own purpose

Now the core grooming logic begins. First we release all the 14000 tempFormatters allocated before, creating holes in the heap

We then run cycles of heap pressure and exception spraying

240 allocations of a 4 MB ArrayBuffer each, for a total of 960 MB. Why are we doing that? => Allocating those chunks forces the JSC garbage collector to compact the heap, pushing smaller allocations (the NumberFormat internal buffers) closer together

This is the exception spraying logic

The locale "dowocjfjq[" (CANARY_LOCALE) is syntactically invalid because it contains a "[". ICU's locale parser rejects this locale, but not before allocating internal temporaries to parse it. This allocate -> immediately free cycle has 2 effects:

Free-list churning: ICU's internal allocator fragments the free list, breaking up contiguous blocks into smaller pieces that match the size of a formatter's float buffer (~512 bytes)
Garbage collector triggering: the repeated exception path may trigger the GC to compact the "young generation" and move surviving objects (primary formatters) into tighter arrangements
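The sizes quoted so far can be checked with simple arithmetic. The sketch below only multiplies the numbers given in the text, assuming one ~512 byte float buffer per formatter; it allocates nothing and the variable names are mine.

```javascript
// Counts taken from the three spray tiers described above.
const primary = 7000;
const flanking = 3 * primary;            // 21000
const temp = 2 * primary;                // 14000
const totalFormatters = primary + flanking + temp;

// Assumption: one float buffer of roughly 512 bytes per formatter.
const floatBufferBytes = 512;
const sprayBytes = totalFormatters * floatBufferBytes;

// Heap pressure phase: 240 ArrayBuffers of 4 MB each.
const pressureMB = 240 * 4;

console.log(totalFormatters);                  // 42000
console.log((sprayBytes / 1e6).toFixed(1));    // 21.5
console.log(pressureMB);                       // 960
```

So the "~21 MB" figure matches 42000 buffers of about 512 bytes, and the pressure allocations are more than 40 times larger than the spray they are meant to compact.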

All of this is still done with the goal of packing the primary formatters more and more tightly around the holes we created by freeing the temporary formatters

The next phase requires calling the format() method on the primary formatters

format(n) is a method that converts a number into a locale-specific string. Calling format 3 times with 3 different values forces:

1. format(1): forces allocation of the ~512 byte ICU float buffer
2. format(2): ensures the buffer is fully initialized
3. format(3): after this call the live formatting state produces deterministic output

After the warm-up each primary formatter owns a distinct ~512 byte heap allocation containing its ICU float buffer. These buffers are now interspersed with the holes left by the released temporary formatters, exactly where decodeAudioData will try to allocate

Now another cycle of release, heap pressure and exception spray is done; this time we release the flankingFormatters (creating 21000 new holes)

The exploit runs this cycle twice in order to refine the heap layout: the first round creates holes and compacts them, the warm-up then materializes buffers into those compacted regions, and the second round creates new holes right next to the now-materialized buffers

Now it is time to trigger the corruption/overlap by using the OfflineAudioContext.decodeAudioData() method to allocate ArrayBuffer chunks that land in the holes next to the primary formatters

OfflineAudioContext.decodeAudioData() takes as its argument an ArrayBuffer containing complete audio file data and returns an AudioBuffer containing the decoded PCM audio data

We perform 20 rounds, where each round performs 3 decodes:

1. Allocates a medium-size PCM buffer
2. Another allocation of the same size as 1; two consecutive allocations of the same size tend to land adjacent on the heap
3. Null-write decode: an intentionally malformed WAV. The decoder starts processing it, allocates a buffer, writes the initial decoded samples into it, then hits the malformed data and throws. By the time of the throw the decoded PCM has already been written to the heap

The carrier buffers are retained on the heap, so they occupy space permanently and shrink the effective free region, forcing subsequent allocations closer to the formatter buffers. The null-write buffers are the actual overlap triggers: they fail during decoding, but the partial write has already occurred

To finish we want to detect the overlapping

What is format(1.02)? => On an uncorrupted en-US formatter, format(1.02) returns the 4-character string "1.02", a deterministic output with a fixed length. If the formatter's internal float buffer has been overwritten by an overlapping ArrayBuffer, the ICU formatting engine reads garbage data from the corrupted buffer and interprets it as a different value, producing a string whose length is not 4
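The baseline is easy to confirm on any healthy engine. This snippet only shows the normal, documented behaviour of Intl.NumberFormat that the check relies on; the variable names are mine.

```javascript
// Normal behaviour of an untouched en-US formatter.
const healthyFormatter = new Intl.NumberFormat("en-US");
const formatted = healthyFormatter.format(1.02);

console.log(formatted);         // 1.02
console.log(formatted.length);  // 4
```

Because the output of a healthy formatter never changes, any other length is a cheap and reliable signal that the formatter's internal state is no longer what ICU wrote there.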

If the overlap has been triggered, format(1.02) leaks an address from the heap: the result contains bytes from the overlapping ArrayBuffer, and we just parse the string to extract the memory address

Apple being Apple, I had many problems testing the exploit live while debugging with lldb, so in the end I opted to just use extensive debug messages. I spun up a local web server delivering the heap grooming phase of the exploit and navigated to it with an iPhone XR running iOS 16. This is the result

A graph to better visualize each phase of the grooming

Upgrading to arbitrary memory R/W Primitive

This is the second stage of the exploit. Its job is to transform the limited heap corruption achieved during heap grooming into a reliable mechanism for reading and writing arbitrary memory

This stage of the exploit produces 2 primitives:

AudioPrimitive - format(NaN) for reading and decodeAudioData for writing
SvgPrimitive - orderX.baseVal

We will only look at the AudioPrimitive; it is enough to understand how the exploit goes from relative to arbitrary R/W

AudioPrimitive - Reading memory

Remember that the formatter's internal float buffer now overlaps with an AudioBuffer's decoded PCM buffer. When Intl.NumberFormat.format(NaN) is called, ICU serializes the internal float buffer as a string. Because the buffer is overlapped, the returned string contains raw heap memory content

This is the code for reading a 32-bit value

Calling format(NaN) is expensive, so to avoid redundant calls the exploit caches the result in _readBuffer

AudioPrimitive - Writing memory

A general rule in exploit development is that most of the time you achieve an arbitrary "something" by overwriting already existing pointers

As we said before, OfflineAudioContext.decodeAudioData() allocates an internal buffer for the decoded PCM output. These allocations go through the same allocator (bmalloc) as the Intl.NumberFormat internal buffers. By carefully constructing the input buffer it is possible to control the PCM values that get written to the overlapped memory region

This is how writing works:

1. Construct a TargetedWriteStrategy WAV buffer that encodes the target address and value

This builds a WAV buffer from a strategy object. TargetedWriteStrategy defines the strategy objects, which are used to compute the sequence of PCM sample "jumps" that position the decoder's output cursor at the target address and deposit the desired value

When decodeAudioData processes the resulting WAV, the audio decoder interprets the sample descriptors as PCM jumps. Each jump is simply a 16-bit offset that advances a cursor through the output buffer, so we can position the cursor by carefully selecting the sequence of jumps (this is the strategy)

2. Submit to decodeAudioData()

At this point we basically have a very limited R/W primitive: we can read memory using format(NaN) and write to memory by crafting special buffers whose decoded PCM output writes a specific value at a specific address through decodeAudioData

Writes are unreliable: unlike reads (which are deterministic once we know the memory address of the overlap), writes depend on where the allocator places the decoder's output buffer. The best thing would be to target a specific address

To read/write at arbitrary addresses we modify the overlapped formatter's internal structure fields (using the write primitive) to "slide" the window

At leakedAddress + INTERNAL_SIZE_OFFSET (968) there are several internal fields:

Capacity (INTERNAL_SIZE_OFFSET + 8) - how many bytes the formatter thinks the buffer holds
Length (INTERNAL_SIZE_OFFSET + 12) - data length
Data Pointer (INTERNAL_SIZE_OFFSET + 24) - the base address that format(NaN) reads from

We will write the target address inside the data pointer at offset +24

As you can see, the target address is written 32 bits at a time
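Splitting a 64-bit value into two 32-bit halves is plain bit arithmetic. The sketch below shows it with BigInt on a made-up value; the helper names are mine and nothing in it touches memory.

```javascript
// Split a 64-bit value into low and high 32-bit halves.
function splitAddress(addr) {
  const lo = Number(addr & 0xffffffffn);
  const hi = Number((addr >> 32n) & 0xffffffffn);
  return { lo, hi };
}

// Rebuild the 64-bit value from its two halves.
function joinAddress(lo, hi) {
  return (BigInt(hi) << 32n) | BigInt(lo);
}

const sample = 0x0000000123456789n;      // made-up value, not a real address
const { lo, hi } = splitAddress(sample);
console.log(lo.toString(16), hi.toString(16)); // 23456789 1
console.log(joinAddress(lo, hi) === sample);   // true
```

BigInt is used because a JavaScript Number cannot represent every 64-bit integer exactly, while each 32-bit half fits in a Number without loss.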

We can call format(NaN) to read from the new address

Now decodeAudioData will write to the new address stored in the data pointer

In this way we set a specific, arbitrary address to which the already existing R/W primitive is applied

Conclusion

The entire Coruna exploit kit is much bigger than what I have discussed in this post. You can find all the deobfuscated payloads I used here

Useful links:

https://cloud.google.com/blog/topics/threat-intelligence/coruna-powerful-ios-exploit-kit
https://www.validin.com/blog/aye_coruna_ios_exploit_kit_c2/
https://browser.training.ret2.systems/welcome
https://www.youtube.com/@OffByOneSecurity - Everything related to browser exploitation
https://www.nadsec.online/blog/coruna
https://github.com/matteyeux/coruna