PRACTICAL IOT CRYPTOGRAPHY ON THE ESPRESSIF ESP8266
The Espressif ESP8266 chipset makes three-dollar ‘Internet of Things’ development boards an economic reality. According to the popular automatic firmware-building site nodeMCU-builds, in the last 60 days there have been 13,341 custom firmware builds for that platform. Of those, only 19% have SSL support, and 10% include the cryptography module.
We’re often critical of the lack of security in the IoT sector, and frequently cover botnets and other attacks, but will we hold our projects to the same standards we demand? Will we stop at identifying the problem, or can we be part of the solution?
This article will focus on applying AES encryption and hash authorization functions to the MQTT protocol using the popular ESP8266 chip running NodeMCU firmware. Our purpose is not to provide a copy/paste panacea, but to go through the process step by step, identifying challenges and solutions along the way. The result is a system that’s end-to-end encrypted and authenticated, preventing eavesdropping along the way, and spoofing of valid data, without relying on SSL.
We’re aware that there are also more powerful platforms that can easily support SSL (e.g. Raspberry Pi, Orange Pi, FriendlyARM), but let’s start with the cheapest hardware most of us have lying around, and a protocol suitable for many of our projects. AES is something you could implement on an AVR if you needed to.
Theory
MQTT is a lightweight messaging protocol that runs on top of TCP/IP and is frequently used for IoT projects. Client devices subscribe or publish to topics (e.g. sensors/temperature/kitchen), and these messages are relayed by an MQTT broker. More information on MQTT is available on their webpage or in our own getting-started series.
The MQTT protocol doesn’t have any built-in security features beyond username/password authentication, so it’s common to encrypt and authenticate across a network with SSL. However, SSL can be rather demanding for the ESP8266 and when enabled, you’re left with much less memory for your application. As a lightweight alternative, you can encrypt only the data payload being sent, and use a session ID and hash function for authentication.
A straightforward way to do this is using Lua and the NodeMCU Crypto module, which includes support for the AES algorithm in CBC mode as well as the HMAC hash function. Using AES encryption correctly requires three things to produce ciphertext: a message, a key, and an initialization vector (IV). Messages and keys are straightforward concepts, but the initialization vector is worth some discussion.
When you encrypt a message in AES with a static key and nothing else that varies, it will always produce the same output. For example, the message “usernamepassword” encrypted with key “1234567890ABCDEF” might produce a result like “E40D86C04D723AFF”. If you run the encryption again with the same key and message, you will get the same result. This opens you to several common types of attack, especially pattern analysis and replay attacks.
In a pattern analysis attack, you use the knowledge that a given piece of data will always produce the same ciphertext to guess the purpose or content of different messages without actually knowing the secret key. For example, if the message “E40D86C04D723AFF” is sent prior to all other communications, one might quickly guess it is a login. And if the login system is simplistic, resending that packet (a replay attack) might be enough to identify yourself as an authorized user, and chaos ensues.
IVs make pattern analysis more difficult. An IV is a piece of data sent along with the key that modifies the end ciphertext result. As the name suggests, it initializes the state of the encryption algorithm before the data enters. The IV needs to be different for each message sent so that repeated data encrypts into different ciphertext, and some ciphers (like AES-CBC) require it to be unpredictable – a practical way to accomplish this is just to randomize it each time. IVs do not have to be kept secret, but it’s typical to obfuscate them in some way.
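To make the “initializes the state” idea concrete, here is a minimal Python sketch of just the XOR step that CBC mode applies to the first plaintext block before it reaches the block cipher. It is not an encryption routine; the helper name xor_with_iv and the variable names are our own, chosen for illustration.

```python
import os

BLOCK_SIZE = 16  # AES block size in bytes


def xor_with_iv(block: bytes, iv: bytes) -> bytes:
    """XOR one 16-byte block with the IV, as CBC does before encrypting block one."""
    if len(block) != BLOCK_SIZE or len(iv) != BLOCK_SIZE:
        raise ValueError("block and IV must both be 16 bytes")
    return bytes(b ^ v for b, v in zip(block, iv))


message = b"usernamepassword"   # exactly one 16-byte block
iv_a = os.urandom(BLOCK_SIZE)   # fresh random IV for message one
iv_b = os.urandom(BLOCK_SIZE)   # fresh random IV for message two

# What the block cipher actually sees differs whenever the IVs differ,
# even though the plaintext is identical.
cipher_input_a = xor_with_iv(message, iv_a)
cipher_input_b = xor_with_iv(message, iv_b)
```

Because the cipher never sees the same input twice, the same login message no longer maps to the same ciphertext.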
While this protects against pattern analysis, it doesn’t help with replay attacks. For example, retransmitting a given set of encrypted data will still duplicate the result. To prevent that, we need to authenticate the sender. We will use a public, pseudorandomly generated session ID for each message. This session ID can be generated by the receiving device by posting to an MQTT topic.
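As a rough sketch of how the receiving side might manage those session IDs, the Python below issues a random ID and accepts it exactly once. The class name SessionIssuer and the 16-byte ID length are assumptions for illustration; publishing the ID to an MQTT topic is left out.

```python
import binascii
import os


class SessionIssuer:
    """Hands out one-time session IDs and refuses to accept any ID twice."""

    def __init__(self, id_bytes: int = 16):
        self.id_bytes = id_bytes
        self.current = None

    def issue(self) -> str:
        # os.urandom draws from the operating system's random source
        self.current = binascii.hexlify(os.urandom(self.id_bytes)).decode("ascii")
        return self.current

    def consume(self, session_id: str) -> bool:
        # Valid only if it matches the outstanding ID; either way, retire it
        ok = self.current is not None and session_id == self.current
        self.current = None
        return ok


issuer = SessionIssuer()
sid = issuer.issue()
first = issuer.consume(sid)    # the genuine message arrives
replay = issuer.consume(sid)   # an attacker resends the same message
```

The replayed message fails because the ID it was bound to has already been retired.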
Preventing these types of attacks is important in a couple of common use cases. Internet controlled stoves exist, and questionable utility aside, it would be nice if they didn’t use insecure commands. Secondly, if I’m datalogging from a hundred sensors, I don’t want anyone filling my database with garbage.
Practical Encryption
Implementing the above on the NodeMCU requires some effort. You will need firmware compiled to include the ‘crypto’ module in addition to any others you require for your application. SSL support is not required.
First, let’s assume you’re connected to an MQTT broker with something like the following. You can implement this as a separate function from the cryptography to keep things clean. The client subscribes to a sessionID channel, which publishes suitably long, pseudorandom session IDs. You could encrypt them, but it’s not necessary.
m = mqtt.Client("clientid", 120)
m:connect("myserver.com", 1883, 0,
  function(client)
    print("connected")
    client:subscribe("mytopic/sessionID", 0,
      function(client) print("subscribe success") end
    )
  end,
  function(client, reason)
    print("failed reason: " .. reason)
  end
)
-- keep the most recent session ID in a global so the code below can use it
m:on("message", function(client, topic, message) sessionID = message end)
Moving on, the node ID is a convenient way to help identify data sources. You can use any string you wish though: nodeid = node.chipid().
Then, we set up a static initialization vector and a key. This is only used to obfuscate the randomized initialization vector sent with each message, NOT used for any data. We also choose a separate key for the data. Each of these is 16 hex characters long (16 bytes, so a 128-bit key or IV as far as AES is concerned); just replace them with yours.
Finally we’ll need a passphrase for a hash function we’ll be using later. A string of reasonable length is fine.
staticiv = "abcdef2345678901"
ivkey = "2345678901abcdef"
datakey = "0123456789abcdef"
passphrase = "mypassphrase"
We’ll also assume you have some source of data. For this example it will be a value read from the ADC. data = adc.read(0)
Now, we generate a pseudorandom initialization vector. A 16-digit hex number is too large for the pseudorandom number function, so we generate it in two halves (16^8 minus 1) and concatenate them.
half1 = node.random(4294967295)
half2 = node.random(4294967295)
-- %08x zero-pads each half to 8 hex digits, so the IV is always 16 hex characters
I = string.format("%08x", half1)
V = string.format("%08x", half2)
iv = I .. V
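For readers following along on the Python side, here is a small sketch of the same two-halves approach. It also shows why the zero flag in the format specifier matters: width-8 hex formatting pads with spaces unless told otherwise, so a small random value would yield leading spaces rather than hex digits. The function name iv_from_halves is our own.

```python
import random


def iv_from_halves(half1: int, half2: int) -> str:
    """Join two 32-bit values into a 16-character, zero-padded hex string."""
    return "%08x" % half1 + "%08x" % half2


space_padded = "%8x" % 255    # width 8, padded with spaces
zero_padded = "%08x" % 255    # width 8, padded with zeros

rng = random.SystemRandom()   # backed by the operating system's random source
iv = iv_from_halves(rng.randrange(2**32), rng.randrange(2**32))
```

Either form gives 16 characters, but only the zero-padded one is guaranteed to be 16 hex digits.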
We can now run the actual encryption. Here we are encrypting the current initialization vector, the node ID, and one piece of sensor data.
encrypted_iv = crypto.encrypt("AES-CBC", ivkey, iv, staticiv)
encrypted_nodeid = crypto.encrypt("AES-CBC", datakey, nodeid, iv)
encrypted_data = crypto.encrypt("AES-CBC", datakey, data, iv)
Now we apply the hash function for authentication. First we combine the nodeid, iv, data, and session ID into a single message, then compute an HMAC-SHA1 hash using the passphrase we defined earlier. We convert it to hex to make it a bit more human-readable for any debugging.
fullmessage = nodeid .. iv .. data .. sessionID
hmac = crypto.toHex(crypto.hmac("sha1", fullmessage, passphrase))
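The same computation can be sketched with Python’s standard hmac and hashlib modules. The function name message_hmac and all of the sample values below are placeholders for illustration, not values produced by a real node.

```python
import hashlib
import hmac


def message_hmac(passphrase: bytes, nodeid: bytes, iv: bytes,
                 data: bytes, session_id: bytes) -> str:
    """Hex HMAC-SHA1 over nodeid + iv + data + session_id, keyed by the passphrase."""
    fullmessage = nodeid + iv + data + session_id
    return hmac.new(passphrase, fullmessage, hashlib.sha1).hexdigest()


# Placeholder values: same node, IV and reading, but two different session IDs
tag = message_hmac(b"mypassphrase", b"1234567", b"0123456789abcdef",
                   b"512", b"session-1")
tag_other_session = message_hmac(b"mypassphrase", b"1234567", b"0123456789abcdef",
                                 b"512", b"session-2")
```

Because the session ID is part of the hashed message, a tag captured in one session does not validate in the next, which is what defeats a replay.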
Now that both encryption and authentication checks are in place, we can place all this information in some structure and send it. Here, we’ll use comma separated values as it’s convenient:
payload = table.concat({encrypted_iv, encrypted_nodeid, encrypted_data, hmac}, ",")
m:publish("yourMQTTtopic", payload, 2, 1, function(client) print("Sent") end)
When we run the above code on an actual NodeMCU, we would get output something like this:
1d54dd1af0f75a91a00d4dcd8f4ad28d,
d1a0b14d187c5adfc948dfd77c2b2ee5,
564633a4a053153bcbd6ed25370346d5,
c66697df7e7d467112757c841bfb6bce051d6289
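On the receiving end, a payload of this shape can be taken apart with a few lines of Python. This sketch assumes the four fields are hex encoded, as in the sample output above; the function name parse_payload is our own.

```python
import binascii


def parse_payload(payload: str):
    """Split the comma-separated hex payload into four raw byte fields."""
    fields = [part.strip() for part in payload.split(",")]
    if len(fields) != 4:
        raise ValueError("expected 4 fields, got %d" % len(fields))
    encrypted_iv, encrypted_nodeid, encrypted_data, received_hmac = (
        binascii.unhexlify(f) for f in fields
    )
    return encrypted_iv, encrypted_nodeid, encrypted_data, received_hmac


sample = ("1d54dd1af0f75a91a00d4dcd8f4ad28d,"
          "d1a0b14d187c5adfc948dfd77c2b2ee5,"
          "564633a4a053153bcbd6ed25370346d5,"
          "c66697df7e7d467112757c841bfb6bce051d6289")
enc_iv, enc_nodeid, enc_data, mac = parse_payload(sample)
```

The three encrypted fields each come out as one 16-byte AES block, and the HMAC-SHA1 tag as 20 bytes.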
All together, the encryption program is as follows (MQTT sections excluded for clarity):
nodeid = node.chipid()
staticiv = "abcdef2345678901"
ivkey = "2345678901abcdef"
datakey = "0123456789abcdef"
passphrase = "mypassphrase"
data = adc.read(0)
half1 = node.random(4294967295)
half2 = node.random(4294967295)
I = string.format("%08x", half1)
V = string.format("%08x", half2)
iv = I .. V
encrypted_iv = crypto.encrypt("AES-CBC", ivkey, iv, staticiv)
encrypted_nodeid = crypto.encrypt("AES-CBC", datakey, nodeid, iv)
encrypted_data = crypto.encrypt("AES-CBC", datakey, data, iv)
fullmessage = nodeid .. iv .. data .. sessionID
hmac = crypto.toHex(crypto.hmac("sha1", fullmessage, passphrase))
payload = table.concat({encrypted_iv, encrypted_nodeid, encrypted_data, hmac}, ",")
Decryption
Now, your MQTT broker doesn’t know or care that the data is encrypted; it just passes it on. So, your other MQTT clients subscribed to the topic will need to know how to decrypt the data. On NodeMCU this is rather easy. Just split the received data into strings at the commas, and do something like the below. Note that this end will have generated the session ID, so it already knows it.
staticiv = "abcdef2345678901"
ivkey = "2345678901abcdef"
datakey = "0123456789abcdef"
passphrase = "mypassphrase"
iv = crypto.decrypt("AES-CBC", ivkey, encrypted_iv, staticiv)
nodeid = crypto.decrypt("AES-CBC", datakey, encrypted_nodeid, iv)
data = crypto.decrypt("AES-CBC", datakey, encrypted_data, iv)
fullmessage = nodeid .. iv .. data .. sessionID
hmac = crypto.toHex(crypto.hmac("sha1", fullmessage, passphrase))
Then compare the received and computed HMAC, and regardless of the result, invalidate that session ID by generating a new one.
Once More, In Python
For a little variety, consider how we would handle decryption in Python, if we had an MQTT client on the same virtual machine as the broker that was analysing the data or storing it in a database. Let’s assume you’ve received the data as a string ‘payload’, from something like the excellent Paho MQTT Client for Python.
In this case it’s convenient to hex encode the encrypted data on the NodeMCU before transmitting. So on the NodeMCU we convert all encrypted data to hex, for example: encrypted_iv = crypto.toHex(crypto.encrypt("AES-CBC", ivkey, iv, staticiv))
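One reason hex encoding is convenient with a comma-separated payload: raw ciphertext can contain any byte value, including 0x2c, the ASCII comma. The short sketch below uses made-up bytes to show how that would break the field split, and how hex avoids it.

```python
import binascii

# Made-up "ciphertext" that happens to contain the comma byte 0x2c twice
raw_ciphertext = bytes([0x1d, 0x2c, 0x54, 0x2c])
hex_ciphertext = binascii.hexlify(raw_ciphertext)

# Two fields joined by a comma, then split again on the receiving side
raw_fields = (raw_ciphertext + b"," + raw_ciphertext).split(b",")
hex_fields = (hex_ciphertext + b"," + hex_ciphertext).split(b",")
```

The raw version splits into more than the two fields that were sent, while the hex version splits cleanly and decodes back to the original bytes.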
Publishing a randomized sessionID is not discussed below, but is easy enough using os.urandom() and the Paho MQTT Client. The decryption is handled as follows:
from Crypto.Cipher import AES
import binascii
from Crypto.Hash import SHA, HMAC
# define all keys (as bytes, which is what the cipher and HMAC operate on)
ivkey = b'2345678901abcdef'
datakey = b'0123456789abcdef'
staticiv = b'abcdef2345678901'
passphrase = b'mypassphrase'
# convert the received string to a list
data = payload.split(",")
# extract list items
encrypted_iv = binascii.unhexlify(data[0])
encrypted_nodeid = binascii.unhexlify(data[1])
encrypted_data = binascii.unhexlify(data[2])
received_hash = data[3]  # left as hex so it can be compared with hexdigest() below
# decrypt the initialization vector
iv_decryption_suite = AES.new(ivkey, AES.MODE_CBC, staticiv)
iv = iv_decryption_suite.decrypt(encrypted_iv)
# decrypt the data using the initialization vector
id_decryption_suite = AES.new(datakey, AES.MODE_CBC, iv)
nodeid = id_decryption_suite.decrypt(encrypted_nodeid)
data_decryption_suite = AES.new(datakey, AES.MODE_CBC, iv)
sensordata = data_decryption_suite.decrypt(encrypted_data)
# compute the hash function to compare with received_hash
# (sessionID is the ID this client published earlier, as bytes)
fullmessage = b"".join([nodeid, iv, sensordata, sessionID])
hmac = HMAC.new(passphrase, fullmessage, SHA)
computed_hash = hmac.hexdigest()
# see docs.python.org/2/library/hmac.html for how to compare hashes securely
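For that final comparison, a minimal sketch using only the standard library might look like the following. It swaps in Python’s built-in hmac and hashlib modules for the HMAC step, and the function name verify_hmac and the sample message are our own placeholders.

```python
import hashlib
import hmac


def verify_hmac(passphrase: bytes, fullmessage: bytes, received_hex: str) -> bool:
    """Recompute HMAC-SHA1 and compare it with the received hex digest."""
    computed_hex = hmac.new(passphrase, fullmessage, hashlib.sha1).hexdigest()
    # compare_digest takes time independent of where the strings first differ
    return hmac.compare_digest(computed_hex, received_hex.lower())


# A genuine tag, and a copy with its first hex digit altered
good = hmac.new(b"mypassphrase", b"example message", hashlib.sha1).hexdigest()
tampered = ("0" if good[0] != "0" else "1") + good[1:]

accepted = verify_hmac(b"mypassphrase", b"example message", good)
rejected = verify_hmac(b"mypassphrase", b"example message", tampered)
```

A plain == comparison would also give the right answer, but it can leak timing information about how many leading characters matched, which is why the Python documentation recommends compare_digest.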
The End, The Beginning
Now we have a system that sends encrypted, authenticated messages through an MQTT server to either another ESP8266 client or a larger system running Python. There are still important loose ends for you to tie up if you implement this yourself. The keys are all stored in the ESP8266s’ flash memory, so you will want to control access to these devices to prevent reverse engineering. The keys are also stored in the code on the computer receiving the data, here running Python. Further, you probably want each client to have a different key and passphrase. That’s a lot of secret material to keep safe and potentially update when necessary. Solving the key distribution problem is left as an exercise for the motivated reader.
And on a closing note, one of the dreadful things about writing an article involving cryptography is the possibility of being wrong on the Internet. This is a fairly straightforward application of the tested-and-true AES-CBC mode with HMAC, so it should be pretty solid. Nonetheless, if you find any interesting shortcomings in the above, please let us know in the comments.