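A Brief Look at Static-Analysis-Based Android Vulnerability Scanners

I spent some time getting familiar with several static-analysis-based Android scanning tools and fixed a few bugs along the way.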

Appshark

https://github.com/bytedance/appshark

Basic Usage

Clone the source and build it:

git clone https://github.com/bytedance/appshark.git
cd appshark
./gradlew build -x test
java -jar ./build/libs/AppShark-0.1.2-all.jar

Then edit ./config/config.json5: apkPath is the path of the APK to analyze, and rules selects the scan rules to use. By default, all *.json files under rulePath are used.
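For repeated scans it can be handy to generate the config from a script. The sketch below is only an illustration: it writes plain JSON (which is also valid JSON5) containing the two keys mentioned above. The comma-separated format of rules, the helper name write_appshark_config, and the file names are assumptions and should be checked against the sample config shipped with Appshark.

```python
import json
from pathlib import Path


def write_appshark_config(apk_path: str, rules: list[str], dest: str) -> dict:
    """Write a minimal Appshark-style config file and return it as a dict.

    Assumption: `rules` is stored as a comma-separated string of rule
    file names; verify this against config/config.json5 in the repo.
    """
    config = {
        "apkPath": apk_path,
        "rules": ",".join(rules),
    }
    # Plain JSON is a subset of JSON5, so json.dumps output is valid here.
    Path(dest).write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

Calling write_appshark_config("InsecureShop.apk", ["SomeRule.json"], "my_config.json5") would produce a file that can be passed to the jar in place of config/config.json5 (SomeRule.json is a placeholder name).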

Here I used InsecureShop.apk for testing:

java -jar ./build/libs/AppShark-0.1.2-all.jar config/config.json5

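By default the results are written to the out directory. There are two findings, one for each of the rules PendingIntentMutable and IntentRedirectionBabyVersion, and an HTML report is generated as well: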

cd out/vulnerability
python3 -m http.server 80

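Appshark's analysis is based on Jimple, so the code it shows is Jimple as well. The Intent redirection vulnerability it reports here is described at: https://docs.insecureshopapp.com/insecureshop-challenges/access-to-protected-components

Bug Fix

One more thing worth mentioning: when I analyzed an app I had written myself, the following error appeared: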

[main] ERROR soot.jimple.infoflow.android.resources.ARSCFileParser - Error when looking for XML resource files in apk /Users/leixiao/AndroidStudioProjects/JavaAppDemo/app/build/outputs/apk/debug/app-debug.apk
java.lang.RuntimeException: File format violation, res1 was not zero
at soot.jimple.infoflow.android.resources.ARSCFileParser.readTypeSpecTable(ARSCFileParser.java:2603)
at soot.jimple.infoflow.android.resources.ARSCFileParser.readResourceHeader(ARSCFileParser.java:2202)
at soot.jimple.infoflow.android.resources.ARSCFileParser.parse(ARSCFileParser.java:2084)
at soot.jimple.infoflow.android.resources.ARSCFileParser$1.handleResourceFile(ARSCFileParser.java:2074)
at soot.jimple.infoflow.android.resources.AbstractResourceParser.handleAndroidResourceFiles(AbstractResourceParser.java:54)
at soot.jimple.infoflow.android.resources.ARSCFileParser.parse(ARSCFileParser.java:2068)
at net.bytedance.security.app.android.AndroidUtils.parseApkInternal(AndroidUtils.kt:300)
at net.bytedance.security.app.android.AndroidUtils.parseApk(AndroidUtils.kt:252)
at net.bytedance.security.app.StaticAnalyzeMain.startAnalyze(StaticAnalyzeMain.kt:52)
at net.bytedance.security.app.StaticAnalyzeMainKt$main$2.invokeSuspend(StaticAnalyzeMain.kt:100)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106)
at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:284)
at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:85)
at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
at net.bytedance.security.app.StaticAnalyzeMainKt.main(StaticAnalyzeMain.kt:100)
at net.bytedance.security.app.KotlinEntry$Companion.callMain(KotlinEntry.kt:24)
at net.bytedance.security.app.KotlinEntry.callMain(KotlinEntry.kt)
at net.bytedance.security.app.JavaEntry.main(JavaEntry.java:6)

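I found a similar issue: https://github.com/secure-software-engineering/FlowDroid/issues/716 . It looks like the Android build toolchain changed recently. FlowDroid has fixed the problem in its latest version (see the fix at https://github.com/secure-software-engineering/FlowDroid/pull/717/files ), but Appshark depends on a fork, implementation("io.github.nkbai:soot-infoflow-android:2.10.4"), so fixing it there would mean porting FlowDroid's fix onto the forked code.

I later learned that the fork makes no functional changes, so simply upgrading soot-infoflow-android is enough. However, part of the API changed in the latest version, so the Appshark code has to be adapted to it.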
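To make the error message more concrete, here is a rough sketch of the kind of check that fails. It follows the ResTable_typeSpec layout from AOSP's ResourceTypes.h as I understand it: an 8-byte chunk header, then id, res0, res1, and entryCount. To my knowledge, newer AOSP headers reuse the formerly reserved res1 field as a count of type chunks, which would explain why APKs from recent build tools trip a strict parser; treat that as an assumption. The function name and the strict flag are made up for illustration and are not FlowDroid's API.

```python
import struct

RES_TABLE_TYPE_SPEC_TYPE = 0x0202  # chunk type constant from ResourceTypes.h


def parse_type_spec_header(data: bytes, strict: bool = True) -> dict:
    """Parse the fixed part of a ResTable_typeSpec chunk (little-endian).

    Layout assumed here:
      uint16 type, uint16 headerSize, uint32 size   (ResChunk_header)
      uint8  id,   uint8  res0,       uint16 res1,  uint32 entryCount
    """
    chunk_type, header_size, size, type_id, res0, res1, entry_count = (
        struct.unpack_from("<HHIBBHI", data, 0)
    )
    if chunk_type != RES_TABLE_TYPE_SPEC_TYPE:
        raise ValueError(f"not a typeSpec chunk: 0x{chunk_type:04x}")
    if strict and res1 != 0:
        # Same complaint as the older parser in the stack trace above.
        raise RuntimeError("File format violation, res1 was not zero")
    return {
        "header_size": header_size,
        "size": size,
        "id": type_id,
        "res0": res0,
        "res1": res1,  # tolerated as-is when strict=False
        "entry_count": entry_count,
    }
```

A tolerant parser (strict=False here) simply accepts a non-zero value instead of rejecting the whole resource table.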

I submitted a PR: https://github.com/bytedance/appshark/pull/78

Writing Rules

Following https://github.com/bytedance/appshark/blob/main/doc/zh/how_to_write_rules.md , I wrote a simple rule similar to Mariana Trench's "Third-party input flows into file resolver":

{
  "InputToFile": {
    "enable": true,
    "SliceMode": true,
    "traceDepth": 10,
    "desc": {
      "name": "InputToFile",
      "category": "File",
      "detail": "Values from third-party controlled source may eventually flow into sink file resolver"
    },
    "entry": {},
    "source": {
      "Return": [
        "<android.content.Intent: * get*(*)>"
      ]
    },
    "sink": {
      "<java.io.File: * <init>(*)>": {
        "TaintCheck": [
          "p*"
        ]
      },
      "<java.io.FileOutputStream: * <init>(*)>": {
        "TaintCheck": [
          "p*"
        ]
      }
    }
  }
}

It finds this vulnerability: https://docs.insecureshopapp.com/insecureshop-challenges/theft-of-arbitrary-files-from-localstorage
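To get a feel for which methods the wildcard patterns in the rule cover, the following sketch matches Soot-style method signatures against them using Python's fnmatch. This only approximates the idea; Appshark's own matcher may treat wildcards differently, and the helper name matches_any is hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns copied from the InputToFile rule above.
SOURCE_PATTERNS = ["<android.content.Intent: * get*(*)>"]
SINK_PATTERNS = [
    "<java.io.File: * <init>(*)>",
    "<java.io.FileOutputStream: * <init>(*)>",
]


def matches_any(signature: str, patterns: list[str]) -> bool:
    """Return True if a method signature matches any wildcard pattern."""
    return any(fnmatchcase(signature, pattern) for pattern in patterns)


if __name__ == "__main__":
    getter = "<android.content.Intent: java.lang.String getStringExtra(java.lang.String)>"
    setter = "<android.content.Intent: android.content.Intent setData(android.net.Uri)>"
    ctor = "<java.io.FileOutputStream: void <init>(java.lang.String,boolean)>"
    print(matches_any(getter, SOURCE_PATTERNS))
    print(matches_any(setter, SOURCE_PATTERNS))
    print(matches_any(ctor, SINK_PATTERNS))
```

The source pattern is deliberately broad: every Intent getter counts as attacker-controlled input, which favors recall over precision.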

Mariana-Trench

https://github.com/facebook/mariana-trench

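Basic Usage

On ARM macOS, installing via pip is not supported and you have to build from source (see https://github.com/facebook/mariana-trench/issues/49 ), which looks rather tedious. The version on PyPI also seems to lag behind, so I wrote a Dockerfile that builds the latest version (and hit quite a few pitfalls before the build succeeded):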

# need boost 1.75, ubuntu20.04 only has 1.71. change to homebrew/ubuntu24.04
FROM --platform=linux/amd64 homebrew/ubuntu24.04:latest

# install dependencies
# https://mariana-tren.ch/docs/build-from-source/
ENV HOMEBREW_NO_AUTO_UPDATE=true
RUN brew install git cmake
RUN brew install zlib boost jsoncpp re2 googletest openjdk@17
ENV CMAKE_PREFIX_PATH=/home/linuxbrew/.linuxbrew/opt/jsoncpp:/home/linuxbrew/.linuxbrew/opt/zlib

# https://github.com/facebook/mariana-trench/issues/175
RUN brew uninstall jsoncpp && \
sudo apt update && \
sudo apt install -y libjsoncpp-dev

# when building Redex, there are still errors. fix it
RUN brew uninstall boost && \
sudo apt install -y libboost-all-dev

# download mariana-trench
ENV MARIANA_TRENCH_DIRECTORY=/home/linuxbrew/mariana-trench
RUN git clone https://github.com/facebook/mariana-trench.git && \
cd $MARIANA_TRENCH_DIRECTORY && \
mkdir install && \
mkdir dependencies

# build fmt
RUN cd "$MARIANA_TRENCH_DIRECTORY/dependencies" && \
git clone -b 8.1.1 https://github.com/fmtlib/fmt.git && \
mkdir fmt/build && \
cd fmt/build && \
cmake -DCMAKE_INSTALL_PREFIX="$MARIANA_TRENCH_DIRECTORY/install" .. && \
make -j4 && \
make install

# build redex
RUN cd "$MARIANA_TRENCH_DIRECTORY/dependencies" && \
git clone https://github.com/facebook/redex.git && \
mkdir redex/build && \
cd redex/build && \
cmake -DCMAKE_INSTALL_PREFIX="$MARIANA_TRENCH_DIRECTORY/install" .. && \
make -j4 && \
make install

# build mariana-trench
RUN cd "$MARIANA_TRENCH_DIRECTORY" && \
mkdir build && \
cd build && \
cmake -DREDEX_ROOT="$MARIANA_TRENCH_DIRECTORY/install" -Dfmt_ROOT="$MARIANA_TRENCH_DIRECTORY/install" -DCMAKE_INSTALL_PREFIX="$MARIANA_TRENCH_DIRECTORY/install" .. && \
make -j4 && \
make install

# install mariana-trench
RUN sudo apt install -y rsync gcc-11 && \
cd "$MARIANA_TRENCH_DIRECTORY" && \
sed -i 's/"pip", "install"/"pip", "install", "--break-system-packages"/g' scripts/setup.py && \
python3 scripts/setup.py --binary "$MARIANA_TRENCH_DIRECTORY/install/bin/mariana-trench-binary" --pyredex "$MARIANA_TRENCH_DIRECTORY/install/bin/pyredex" install

# modify the sapp listening address
RUN sed -i 's/host="localhost"/host="0.0.0.0"/g' /home/linuxbrew/.linuxbrew/lib/python3*/site-packages/sapp/ui/server.py

Build the image and check that it runs:

docker build -f Dockerfile . -t l3yx/mariana-trench:250105
docker run --rm l3yx/mariana-trench:250105 mariana-trench --help

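I tested with InsecureShop.apk again. The source code first has to be decompiled with Jadx. (For an obfuscated APK, I recommend decompiling on Linux, because the output contains many files and directories whose names differ only in case, and the default Windows and macOS file systems are case-insensitive.) Mariana Trench indexes the source path before analysis. It can also run without source code, but then the results have no code preview.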

jadx --show-bad-code --rename-flags none --no-imports --export-gradle --output-dir output InsecureShop.apk
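Whether a given decompilation is affected by the case-sensitivity problem mentioned above can be checked with a few lines of Python. This is a minimal sketch, and find_case_collisions is a hypothetical helper: it takes a list of relative paths (for example collected on a Linux machine with os.walk over the Jadx output) and reports groups that would collapse into one file on a case-insensitive file system.

```python
from collections import defaultdict


def find_case_collisions(paths: list[str]) -> list[list[str]]:
    """Group paths that differ only by letter case.

    Each returned group would map to a single file on a
    case-insensitive file system, so all but one would be lost.
    """
    groups = defaultdict(list)
    for path in paths:
        groups[path.lower()].append(path)
    return [sorted(names) for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    sample = ["a/b.java", "a/B.java", "a/c.java", "A/c.java", "d/e.java"]
    for group in find_case_collisions(sample):
        print(group)
```

An empty result suggests the sources can be decompiled on macOS or Windows without files silently overwriting each other.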

Mount the files, map the port, and open a shell in the container:

docker run --rm -ti -v ~/Library/Android/sdk/platforms/android-34/android.jar:/work/android.jar -v ./InsecureShop.apk:/work/app.apk -v ./output/app/src/main/java:/work/src -p 13337:13337 l3yx/mariana-trench:250105 bash

Then, inside the container:

mkdir results
mariana-trench --system-jar-configuration-path=/work/android.jar --source-root-directory=/work/src --apk-path=/work/app.apk --output-directory=results
sapp --tool=mariana-trench analyze results
sapp --database-name=sapp.db server --source-directory=/work/src

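The default rules report 11 issues, falling mainly into three categories: "Third-party input flows into file resolver", "Third-party input flows into WebView Javascript execution API", and "Exported component calls setResult".

For how to read the results, see the official documentation: https://mariana-tren.ch/docs/getting-started/#exploring-results

Traces in Mariana Trench results have three parts: the source trace (where the data comes from), the trace root (where the source trace and the sink trace meet), and the sink trace (how the data flows to the final sink).

Take the following issue as an example:

The source trace is:

The code location is slightly off, probably because the line numbers in the Jadx-decompiled output do not match. When viewed in the Jadx GUI, the line numbers are correct:

The source trace is short: the source is user input to the Activity, and Activity.getIntent is called directly in ChooserActivity.onCreate.

Later I fed the real source code to Mariana Trench and scanned again, with the following result:

The trace root is:

The sink trace is:

The data flows from onCreate to makeTempCopy and finally into the file resolver.

This corresponds to the vulnerability: https://docs.insecureshopapp.com/insecureshop-challenges/theft-of-arbitrary-files-from-localstorage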

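Analyzing a Real Vulnerable APK

Basecamp File Disclosure Analysis

There is a vulnerability report on HackerOne: https://hackerone.com/reports/2553411 . The app writes a sensitive file to a path that the user passes in through an Activity, without processing or validating that path, so a directory traversal can make it write to a public directory. APK download: https://www.apkmirror.com/apk/basecamp/basecamp-3/basecamp-3-4-8-6-release/

To reproduce (replace the numeric id with the one for your own account):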

adb shell am start -a android.intent.action.VIEW -n com.basecamp.bc3/com.basecamp.bc4.app.main.MainActivity -d 'https://3.basecamp.com/5884861/reports/progress?filename=/../../../../../../../../../../../sdcard/Download/disclosure.txt'
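The traversal in the reproduction command can be illustrated with a small Python sketch. It pulls the filename parameter out of the URL, shows where a naive join ends up, and adds the kind of containment check the app was missing. The base directory used in the usage note is hypothetical (the real one inside the app is not known to me), and the function names are made up for illustration.

```python
import posixpath
from urllib.parse import parse_qs, urlparse


def extract_filename(url: str) -> str:
    """Return the value of the `filename` query parameter."""
    return parse_qs(urlparse(url).query)["filename"][0]


def resolve_unsafely(base_dir: str, filename: str) -> str:
    """Join and normalize without validation, as a vulnerable app might."""
    return posixpath.normpath(base_dir + "/" + filename)


def is_inside(base_dir: str, path: str) -> bool:
    """Check that the normalized path stays within base_dir."""
    base = posixpath.normpath(base_dir)
    return posixpath.commonpath([base, posixpath.normpath(path)]) == base
```

With a hypothetical base directory such as /data/user/0/com.basecamp.bc3/cache, the payload from the adb command resolves to /sdcard/Download/disclosure.txt, and is_inside rejects it because the result no longer lies under the base directory.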

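I tried to reverse-engineer the vulnerable call chain by hand, but the app is obfuscated, and tracing it manually seemed practically infeasible.

However, hooking a few key functions with Frida gives a rough picture of the call chain: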

Java.perform(() => {
    const Log = Java.use('android.util.Log');
    const Throwable = Java.use('java.lang.Throwable');

    // Log a label, a value, and the current Java stack trace.
    function report(label, value) {
        console.log(label);
        console.log(value);
        console.log(Log.getStackTraceString(Throwable.$new()));
        console.log("");
    }

    const FileOutputStream = Java.use('java.io.FileOutputStream');
    const write = FileOutputStream.write.overload('[B', 'int', 'int');
    write.implementation = function (bytes, off, len) {
        // path can be null when the stream wraps a FileDescriptor.
        const path = this.path.value;
        if (path && path.includes('disclosure.txt')) {
            report("FileOutputStream.write:", path);
        }
        return write.call(this, bytes, off, len);
    };
    FileOutputStream.$init.overload('java.io.File', 'boolean').implementation = function (file, append) {
        if (file.toString().includes('disclosure.txt')) {
            report("FileOutputStream.$init:", file);
        }
        this.$init(file, append);
    };

    const Intent = Java.use('android.content.Intent');
    Intent.getData.implementation = function () {
        const url = Intent.getData.call(this);
        if (url) {
            report("Intent.getData:", url);
        }
        return url;
    };

    const Uri = Java.use('android.net.Uri');
    Uri.getQueryParameter.implementation = function (key) {
        const ret = Uri.getQueryParameter.call(this, key);
        if (key == "filename") {
            report("Uri.getQueryParameter:", ret);
        }
        return ret;
    };
});
console.log("\n");

// Usage: frida -U -f com.basecamp.bc3 -l basecamp.js

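Static Analysis

When scanning with the container built earlier, the analysis always stalled for a long time at one point. I don't know whether that is due to the performance cost of translating AMD64 software on an ARM system, an issue in the latest Mariana Trench code, or some newly added rule being expensive. So I reinstalled mariana-trench 1.0.6 with pip on a Python image for testing: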

FROM python:3.12

RUN pip install mariana-trench==1.0.6
RUN sed -i 's/host="localhost"/host="0.0.0.0"/g' /usr/local/lib/python3.12/site-packages/sapp/ui/server.py

Build the image and check that it runs:

docker build -f Dockerfile . -t l3yx/mariana-trench:1.0.6
docker run --rm l3yx/mariana-trench:1.0.6 mariana-trench --help

Because I set --maximum-source-sink-distance fairly high, the scan reported 549 issues (48 with the default parameters):

docker run --rm -ti -v ./:/work -p 13337:13337 l3yx/mariana-trench:1.0.6 bash
cd /work/
mkdir results
mariana-trench --maximum-source-sink-distance 50 --system-jar-configuration-path=android.jar --apk-path=app.apk --output-directory=results
sapp --tool=mariana-trench analyze results
sapp --database-name=sapp.db server

Unfortunately, it did not find the Basecamp vulnerability described above. Perhaps obfuscation breaks the trace?

Jandroid

https://github.com/WithSecureLabs/Jandroid

Jandroid is a static analysis tool based on Androguard that bundles many vulnerability templates for security checks.

Basic Usage

Put InsecureShop.apk in the apps directory, then:

pip install -r requirements.txt
docker run -d -p 7474:7474 -p 7687:7687 -e NEO4J_AUTH=neo4j/n3o4jn3o4j neo4j:5.23
python3 src/jandroid.py -g neo4j

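Jandroid uses neo4j to display the nodes or call graphs it finds, for example all exported Activities:

Data that an exported Activity obtains from Intent.getData() is eventually passed to WebView.loadUrl():

This corresponds to the vulnerability: https://docs.insecureshopapp.com/insecureshop-challenges/unprotected-data-uris

Intent.getParcelableExtra() to startActivity():

Corresponding vulnerability: https://docs.insecureshopapp.com/insecureshop-challenges/access-to-protected-components

Bug

While using it, I found that the official template getStringExtraFileOutputStream never returns a result for the test code shown after it: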

{
  "METADATA": {
    "NAME": "getStringExtraFileOutputStream"
  },
  "CODEPARAMS": {
    "TRACE": {
      "TRACETYPE": "ADVANCED",
      "TRACEFROM": "ARGTO Ljava/io/FileOutputStream;-><init>(Ljava/lang/String;Z)V ARGINDEX1",
      "TRACELENGTHMAX": 10,
      "TRACETO": "RESULTOF Landroid/content/Intent;->getStringExtra(Ljava/lang/String;)Ljava/lang/String;",
      "RETURN": "<tracepath> AS @tracepath_getStringExtra_to_FileOutputStream"
    }
  },
  "GRAPH": "@tracepath_getStringExtra_to_FileOutputStream WITH <method>:<desc>:<class> AS attribute=nodename"
}

The code being analyzed:

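import android.content.Intent;
import android.os.Bundle;
import androidx.appcompat.app.AppCompatActivity;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;

class Test {
    // Sink: the path reaches the FileOutputStream(String, boolean) constructor.
    public static void test(String path) {
        try {
            new FileOutputStream(path, false);
        } catch (FileNotFoundException e) {
            throw new RuntimeException(e);
        }
    }
}

public class MainActivity extends AppCompatActivity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

        // Source: attacker-controlled extra from the launching Intent.
        Intent intent = getIntent();
        String path = intent.getStringExtra("path");
        Test.test(path);
    }
}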

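Testing showed that changing "TRACEFROM": "ARGTO Ljava/io/FileOutputStream;-><init>(Ljava/lang/String;Z)V ARGINDEX1" to "TRACEFROM": "ARGTO Ljava/io/FileOutputStream;-><init> ARGINDEX1" makes the template work. The results then show that the parameter-signature part of the FileOutputStream constructor has been lost.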
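One plausible reading of this behavior, not confirmed against Jandroid's source, is that the recorded call target has lost its descriptor, so a template that spells out the full descriptor can never match it. The sketch below models that with a simple smali method-reference parser; the regular expression and both function names are my own illustration, not Jandroid code.

```python
import re

# class ("Lpkg/Name;"), "->", method name, optional "(args)ret" descriptor
SMALI_METHOD = re.compile(
    r"^(?P<cls>L[^;]+;)->(?P<name>[^(\s]+)(?P<desc>\(.*\).+)?$"
)


def parse_smali_method(ref: str) -> dict:
    """Split a smali method reference into class, name, and descriptor."""
    match = SMALI_METHOD.match(ref)
    if match is None:
        raise ValueError(f"not a smali method reference: {ref!r}")
    return match.groupdict()


def template_matches(template_ref: str, recorded_ref: str) -> bool:
    """A template that names a descriptor only matches a call that kept one."""
    template = parse_smali_method(template_ref)
    recorded = parse_smali_method(recorded_ref)
    if (template["cls"], template["name"]) != (recorded["cls"], recorded["name"]):
        return False
    return template["desc"] is None or template["desc"] == recorded["desc"]
```

Under this model, the original template fails against a recorded call of Ljava/io/FileOutputStream;-><init> with no descriptor, while the shortened template matches it, which is consistent with what the test showed.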

Writing Templates

Following https://github.com/WithSecureLabs/Jandroid/wiki/3.-Templates and https://labs.withsecure.com/publications/automating-pwn2own-with-jandroid , I wrote a simple template for user input flowing into file output:

{
  "METADATA": {
    "NAME": "IntentToFile"
  },
  "MANIFESTPARAMS": {
    "BASEPATH": "manifest->application->activity OR manifest->application->activity-alias",
    "SEARCHPATH": {
      "LOOKFOR": {
        "TAGVALUEMATCH": "<NAMESPACE>:exported=true"
      }
    },
    "RETURN": ["<smali>:<NAMESPACE>:name AS @exported_activity_name"]
  },
  "CODEPARAMS": {
    "TRACE": {
      "TRACETYPE": "ADVANCED",
      "TRACEFROM": "ARGTO Ljava/io/FileOutputStream;-><init> ARGINDEX 1",
      "TRACELENGTHMAX": 100,
      "TRACETO": "RESULTOF ->getIntent",
      "RETURN": "<tracepath> AS @tracepath_Intent_to_FileOutputStream",
      "TRACELOCATION": "@exported_activity_name"
    }
  },
  "GRAPH": "@tracepath_Intent_to_FileOutputStream WITH <method>:<desc>:<class> AS attribute=nodename"
}

However, it could not find the Basecamp vulnerability either.