Within the IT security industry, mobile security analysts frequently choose to analyse an Android application by looking only at its decompiled Java code, static resources and network communication with its servers. Given the difficulties in analysing the native layer of an apk, researchers tend to disregard the valuable information and insights into the internal functions of an app that this layer can provide. This post will show how to take advantage of the most common analysis strategies, shedding greater light on the workings of the native layer of an application.
Application developers working with the Android platform have the Java API Framework at their disposal. These APIs, written in Java, allow developers to access the components of the Android OS using a high-level language. They consolidate the building blocks used for creating apps, and include the following components: the View System, the Resource Manager, the Notification Manager, the Activity Manager and Content Providers.
Despite the breadth of the Java API Framework, many apps include native libraries written in C and C++ as part of their source code. To distinguish them from the system libraries, these are called "application native libraries". These are used for a variety of different reasons, for example video optimization, usually because code written in C/C++ provides better performance. In addition, application native libraries are increasingly used as a means of obfuscation (based on the controversial paradigm of security by obscurity). Decompiling an Android apk and retrieving a - more or less - accurate copy of the Java code used to build the app is trivial. However, reverse engineering the native library files and analysing the decompiled pseudo C/C++ code is a very different story.
Adopting the role of a bug hunter or a pentester, it can be assumed that there will be no access to the source code of the app to be analysed. Therefore, this post will describe in detail how to find the application native libraries that were precompiled with the app inside an apk, how to determine which functions of the C/C++ code are invoked from the Java layer, and how to leverage Frida's dynamic instrumentation capabilities to gather further information on the input and output of those native calls.
Some of the examples below will analyse the CyberTruck application from Eduardo Novella of NowSecure that was part of a CTF of r2con.
How to find the native libraries inside an app?
The first step in this journey is to decompile the app and peek into the lib/ folder to see if the app build included a native library.
$ apktool d app.apk
Or directly use the jadx-gui tool to decompile and navigate the source code. Inside Resources > lib one native library is found included in the app.
For the CyberTruck app, the library libnative-lib.so is found inside the apk.
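Since an apk is a ZIP archive, the same check can also be scripted. The following is a minimal sketch, assuming the standard lib/<abi>/<name>.so layout; the function name list_native_libs is illustrative and not part of any tool mentioned in this post.

```python
import zipfile
from collections import defaultdict


def list_native_libs(apk_path):
    """Return {abi: [library file names]} for every .so stored under lib/<abi>/."""
    libs = defaultdict(list)
    with zipfile.ZipFile(apk_path) as apk:
        for name in apk.namelist():
            parts = name.split("/")
            # Application native libraries are packaged as lib/<abi>/<name>.so
            if len(parts) == 3 and parts[0] == "lib" and parts[2].endswith(".so"):
                libs[parts[1]].append(parts[2])
    return dict(libs)


# Example use (the path is a placeholder):
#   print(list_native_libs("app.apk"))
```

Running it against an apk gives one entry per ABI folder, which also shows for which architectures the library was built.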
How to find the native functions that are being invoked by an app?
This raises two questions:
Where in the source code of the apk is this library being used?
Which methods of the native layer are being invoked by the app?
Static Analysis
Grepping the decompiled Java code for the keyword System.load (which also matches System.loadLibrary) shows where the library is loaded into memory by the app. Searching for the keyword native then reveals the native methods declared by the application.
In this case the native library libnative-lib.so is loaded (by its short name, without the lib prefix and the .so suffix) in the MainActivity of the application, and init() is the only native method declared. The method is invoked by the app inside the k() function of MainApplication.
As this is a native method, the implementation of init() can be found in the native library file. (Please refer to the project of the app on GitHub to retrieve this information from the binary yourself.)
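The two searches described above can be combined in a small script. This is a rough sketch, assuming the decompiled sources were exported to a directory of .java files; the regular expressions are simple heuristics rather than a Java parser, and the name scan_sources is illustrative.

```python
import re
from pathlib import Path

# Matches System.load(...) and System.loadLibrary(...), capturing the argument.
LOAD_RE = re.compile(r'System\.load(?:Library)?\s*\(\s*([^)]*)\)')
# Matches a declaration such as "public native void init();", capturing the method name.
NATIVE_RE = re.compile(r'\bnative\b[^;{(]*?(\w+)\s*\(([^)]*)\)\s*;')


def scan_sources(root):
    """Yield (file, line_no, kind, detail) for library loads and native declarations."""
    for path in sorted(Path(root).rglob("*.java")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for no, line in enumerate(text.splitlines(), start=1):
            m = LOAD_RE.search(line)
            if m:
                yield (str(path), no, "load", m.group(1).strip().strip('"'))
            m = NATIVE_RE.search(line)
            if m:
                yield (str(path), no, "native", m.group(1))


# Example use (the directory is a placeholder):
#   for hit in scan_sources("decompiled/sources"):
#       print(hit)
```

Each hit points at the file and line to open in the decompiler, which is usually faster than browsing the class tree by hand.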
Dynamic Analysis
A second strategy is to use dynamic methods to gather information regarding the native methods called by the app. Frida, a dynamic instrumentation framework, allows the behaviour of the application to be analysed at runtime by injecting instrumentation code. Among the capabilities that this methodology offers are bypassing client-side security checks, accessing process memory, hooking and intercepting functions, and changing arguments and return values.
To study the native layer without detailed knowledge of the implementation, the power of Frida can be leveraged with frida-trace, tracing every function whose name matches the pattern "Java_*".
In this case the name of the package of the app is used (easily retrievable from the AndroidManifest.xml file) and any function whose name starts with Java_ is traced. When the app is used in an emulator or on a rooted device, the native methods being invoked are printed on the command line. Using the CyberTruck app shows that the init() method is called immediately.
The name of the native method follows a convention in the way the Java layer links to the native layer. The specification states that a native method name is concatenated from the following components:
the prefix Java_
a mangled fully-qualified class name
an underscore (“_”) separator
a mangled method name
for overloaded native methods, two underscores ("__") followed by the mangled argument signature
Source: https://docs.oracle.com/javase/8/docs/technotes/guides/jni/spec/design.html#resolving_native_method_names
This is why the native init() method ends up with the name Java_org_nowsecure_cybertruck_MainActivity_init().
(For additional information on the subject, this amazing resource is recommended reading).
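The naming convention above can be reproduced in a few lines, which is handy for predicting the symbol to look for in a library. The following is a minimal sketch of the escaping rules in the JNI specification; the helper names mangle and jni_symbol are illustrative, and characters outside the Basic Multilingual Plane are not handled.

```python
def mangle(name):
    """Apply the JNI escaping rules to a class name, method name or argument signature."""
    out = []
    for ch in name:
        if ch in "./":
            out.append("_")       # package separator
        elif ch == "_":
            out.append("_1")      # literal underscore
        elif ch == ";":
            out.append("_2")      # ';' in signatures
        elif ch == "[":
            out.append("_3")      # '[' in signatures
        elif ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            # Any other character becomes _0xxxx (four lowercase hex digits)
            out.append("_0%04x" % ord(ch))
    return "".join(out)


def jni_symbol(class_name, method, arg_signature=None):
    """Build the native symbol name; arg_signature is only used for overloaded methods."""
    symbol = "Java_" + mangle(class_name) + "_" + mangle(method)
    if arg_signature is not None:
        symbol += "__" + mangle(arg_signature)
    return symbol


print(jni_symbol("org.nowsecure.cybertruck.MainActivity", "init"))
```

Note that an underscore inside a class or method name is escaped as _1, so it can be told apart from the underscores that replace the package separators.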
How can more information on the invocation of these native functions be gathered?
To explore the inner workings of the native layer of an app in more detail, further key information can be gathered using dynamic instrumentation. Understanding the arguments used to call a native function is always useful. As an example of this scenario, imagine that a more complex apk is decompiled. The image below shows its structure.
As before, searching for the string "native" in the Java code shows that the class app_sp.p067ai.Processing.FeaturesJni declares a native function named a_recognizeFromFile.
Checking the code shows that the native library libprocessing.so is loaded with a System.loadLibrary call (which takes the short name, without the lib prefix and the .so suffix).
At this point all the information needed to trace the arguments used to call the native function has been obtained:
1. The name of the library being loaded: libprocessing.so
2. The name of the method to be traced: a_recognizeFromFile, which ends up being invoked in the native layer as Java_com_app_sp_p067ai_Processing_FeaturesJni_a_recognizeFromFile
3. The parameters of the invoked function: here the String parameter of a_recognizeFromFile(long j, String str) is included in the trace.
The next step is to build a script that dynamically instruments the app at runtime with Frida.
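One possible shape for such a script is sketched below, wrapped in a small driver that uses Frida's Python bindings. It is not the original script from this post and rests on several assumptions: the library and symbol names are the ones listed above, the package name passed to main() is a placeholder, the device is reachable over USB, the process is 64-bit (so the long argument occupies a single slot and the String is the fourth argument), and the Frida release in use exposes the Java bridge to scripts as the global Java (newer releases may require bundling the bridge separately).

```python
import json
import sys

# Names taken from the analysis above.
LIBRARY = "libprocessing.so"
SYMBOL = "Java_com_app_sp_p067ai_Processing_FeaturesJni_a_recognizeFromFile"

# JavaScript agent that runs inside the app process.
JS_TEMPLATE = r"""
const LIB = %(lib)s;
const SYM = %(sym)s;

function hook() {
  const mod = Process.findModuleByName(LIB);
  if (mod === null) {          // library not loaded yet: retry shortly
    setTimeout(hook, 200);
    return;
  }
  send('loaded ' + LIB + ' at ' + mod.base);
  const addr = mod.findExportByName(SYM);
  if (addr === null) {
    send('export not found: ' + SYM);
    return;
  }
  Interceptor.attach(addr, {
    onEnter(args) {
      // args[0] = JNIEnv*, args[1] = jobject, args[2] = long j, args[3] = String str
      // (index 3 is an assumption that holds for 64-bit processes)
      const env = Java.vm.getEnv();
      const chars = env.getStringUtfChars(args[3], null);
      send('a_recognizeFromFile called with str = ' + chars.readCString());
    }
  });
}

hook();
"""


def build_script(library, symbol):
    """Fill the agent template; json.dumps yields valid JavaScript string literals."""
    return JS_TEMPLATE % {"lib": json.dumps(library), "sym": json.dumps(symbol)}


def main(package):
    import frida  # third-party package: pip install frida

    device = frida.get_usb_device()
    pid = device.spawn([package])
    session = device.attach(pid)
    script = session.create_script(build_script(LIBRARY, SYMBOL))
    script.on("message", lambda message, data: print(message.get("payload", message)))
    script.load()
    device.resume(pid)
    sys.stdin.read()  # keep the session alive until EOF (Ctrl+D)


# Example use (the package name is a placeholder):
#   main("<package name of the app>")
```

Polling for the module is a deliberately simple choice: the library is usually not mapped yet when the process is spawned, so the hook is retried until System.loadLibrary has run.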
Once this script is in the toolkit, the app can be run and instrumented from the command line.
Frida will then start a hassle-free retrieval of information about the library being loaded and the native method being called, and finally provide the string being used as a parameter. The following image shows a capture of the command line when the app is used in the emulator.
Finally, combining the static and dynamic analyses shows which native library is loaded by the app, which of its methods are invoked and what parameters are passed to them. This information is key when performing a security assessment of any mobile application.
Teresa Alberto
Security Researcher