SmartConfig 3

From: http://depletionregion.blogspot.com/2013/10/cc3000-smart-config-and-keyphrase.html

intro	Capturing WiFi Packets	Filtering Relevant Packets	SSID Keyphrase Recovery App
Implementation Issues	Channel Hopping	Setting up Aircrack Tshark on Linux	extra-features



CC3000 Smart Config and keyphrase recovery 

Having previously described how the SSID and keyphrase are transmitted to a 
CC3000 enabled device I thought I should put my money where my mouth is and 
prove that it's possible to create an application capable of recovering such 
information.

This proved a bit more difficult than expected but ultimately the code required
turned out to be fairly simple.



 Capturing wifi packets 
While sniffing ethernet packets on a wired network is something pretty much any
computer can do, the same is not true for wifi packets. To be able to look at
all packets, not just the ones involving the machine doing the sniffing, one has
to enable what's called monitor mode. How easily this can be done seems to
depend on the wifi chipset, the OS and other factors. Even with monitor mode
enabled one may only be able to see the packet headers and not the data
portion; again this seems to depend on the chipset and other factors.

After much unsuccessful experimentation on Linux I eventually found it was 
actually easier to get things working without any issues or special tricks on my
Mac. I did eventually get things to work on my Linux box and this is described 
later, but as it's rather more involved I'll stick with describing the Mac setup
initially.

I just downloaded and installed the latest Mac version of Wireshark (the 
de-facto standard packet analysis tool). After installation the command line 
version, called tshark, and other tools could be found in /usr/local/bin. Note:
when I installed Wireshark it created /usr/local/bin such that it belonged to a
userid that did not exist and with 0700 permissions, so I did:

$ sudo chmod 755 /usr/local/bin

First I found the wifi device like so:

$ tshark -D

It was en0. I then tested that I could capture packets, including the data 
portion, like so:

$ tshark -i en0 -I -V

The options tell tshark to capture packets from en0 (-i en0), using monitor 
mode (-I) and to produce verbose output (-V). Verbose output will show the 
binary contents of the data portions of any packets that have data. The fact 
that the data is encrypted isn't important.



 Filtering for relevant packets and outputting relevant information 

The -V option shows far more detail than is actually needed and without filters
one sees information about many packets that aren't of interest.

After some experimentation I came up with the following:

$ tshark -o 'wlan.enable_decryption:FALSE' \
    -i en0 -I -f 'subtype qos-data' \
    -Y 'wlan.fc.retry==0' -T fields \
    -e wlan.bssid -e radiotap.channel.freq -e wlan.sa -e wlan.da -e data.len

As we can't decrypt the packet data we can't look at the higher level protocol 
information, so we can't simply filter for UDP traffic. But we can ignore all 
packets that can't possibly contain the UDP traffic we're interested in. We do 
this by excluding all packets that are not of subtype QoS data (-f 'subtype 
qos-data'). We also ignore all retransmitted packets (-Y 'wlan.fc.retry==0'). 
This may seem counterintuitive, but handling them in a meaningful manner is 
difficult and on the whole they tend to duplicate data that we already have 
rather than providing data that has somehow been missed (which was my initial 
assumption).

The -T fields and subsequent -e arguments are our replacement for -V and only 
output the very limited set of field values that we are interested in:

    BSSID - the numeric address behind the human readable SSID - see Basic 
		service set identification.
    Channel frequency - see WLAN channels (and more on channel hopping later).
    Source address - the address of the sender of a given packet.
    Destination address - the destination address of a given packet.
    Data length - the length of the encrypted data portion of a given packet.

BSSID and channel frequency are just output for reference - they are not 
actually required by any of the SSID or keyphrase recovery logic.
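As a rough illustration of how output in this form can be consumed, here is a
minimal Python sketch (it is not the Java code from the repository described
later). It relies on tab being tshark's default separator for -T fields output.
The Packet type and the parse_line function are names made up for this example,
and silently dropping lines with a missing or non-numeric field is an
assumption about how one might want to handle them.

```python
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    """One line of tshark output, in the order of the -e options above."""
    bssid: str
    freq_mhz: int
    source: str
    destination: str
    data_len: int


def parse_line(line: str) -> Optional[Packet]:
    """Parse one tab-separated line; return None if it cannot be used."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5 or not all(fields):
        return None  # assumption: drop lines where any field is absent
    bssid, freq, source, destination, data_len = fields
    try:
        return Packet(bssid, int(freq), source, destination, int(data_len))
    except ValueError:
        return None  # assumption: drop lines with non-numeric values
```

Each usable line becomes a small record that later steps can group by source
and destination address.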

Note: wireshark and tshark can actually decrypt wifi packets if you provide the
necessary information. If you've already configured and enabled decryption then
tshark will pick this up from your ~/.wireshark settings and automatically 
decrypt packets. Above I've actively disabled this behavior with the 
-o 'wlan.enable_decryption:FALSE' option. If you don't have decryption already 
configured you don't need this.



 SSID and keyphrase recovery application 

I've written an application in Java that parses the output of tshark and 
recovers the SSID and keyphrase information from this data. If you've got git 
installed you can just clone the relevant repository from GitHub like so:

$ git clone https://github.com/george-hawkins/betaengine

If you don't have git you can just download the repository contents as a zip 
file from here:


https://github.com/george-hawkins/betaengine

The code is fairly simple and short (700 lines in total) and just consists of 
the following classes:

    Consumer - contains the main method and reads and parses the output from 
		tshark.
    Analyzer - maintains a LinkManager per source/destination pair seen.
    LinkManager - looks for data length differences that might indicate Smart 
		Config data.
    LengthDecoder - finds SSID and keyphrase sequences.
    Solver - attempts to combine partial SSID and keyphrase sequences to 
		generate and decode complete sequences.
    EncodedData and Link - trivial support classes.
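To make the division of labour between Analyzer and LinkManager a little more
concrete, here is a minimal Python sketch of the general idea. It is not a
translation of the Java classes. The function names are invented for this
example, SEPARATOR_DELTA is an illustrative placeholder rather than the real
difference between the two separator lengths, and checking only adjacent
packets is a simplification.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Link = Tuple[str, str]  # (source address, destination address)

# Illustrative value only: the real figure is the difference between the
# two Smart Config separator lengths, which is not restated in this post.
SEPARATOR_DELTA = 20


def lengths_by_link(
    packets: Iterable[Tuple[str, str, int]]
) -> Dict[Link, List[int]]:
    """Collect data lengths per (source, destination) pair, in arrival order."""
    links: Dict[Link, List[int]] = defaultdict(list)
    for source, destination, data_len in packets:
        links[(source, destination)].append(data_len)
    return dict(links)


def candidate_links(
    links: Dict[Link, List[int]], delta: int = SEPARATOR_DELTA
) -> List[Link]:
    """Return links where two consecutive packets differ in length by delta."""
    return [
        link
        for link, lengths in links.items()
        if any(abs(b - a) == delta for a, b in zip(lengths, lengths[1:]))
    ]
```

Working with differences rather than absolute lengths sidesteps whatever
overhead the encryption adds, provided that overhead is the same for every
packet on a link.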

The files come with a README.md that briefly outlines how to compile and run the
application (for the run instructions look at the "Decoder" section). Basically
you just run tshark and pipe its output directly into the application. If you 
then use a Smart Config application to communicate an SSID and keyphrase to a 
CC3000 enabled device you should soon see something like:

Solved SSID: [MyPlace]
Solved keyphrase: [LetMeIn]
Scan succeeded

This shows that we succeeded in recovering the SSID, in this case "MyPlace", and
the keyphrase, in this case "LetMeIn". Note that it may find the SSID or 
keyphrase long before the other.

Any characters that are not printable characters in the Unicode range 0x20 to 
0xFF are printed as Unicode escapes, e.g. the € symbol would appear as "\u20AC".
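The escaping rule can be sketched in a few lines of Python. This is only an
approximation of the behaviour described above, not the application's own code.
The function name is made up, Python's str.isprintable() stands in for whatever
printability test the Java code uses, and code points above 0xFFFF are not
given any special treatment.

```python
def escape_unprintable(text: str) -> str:
    """Keep printable characters in 0x20..0xFF, escape everything else."""
    out = []
    for ch in text:
        code = ord(ch)
        if 0x20 <= code <= 0xFF and ch.isprintable():
            out.append(ch)
        else:
            # Assumption: four upper-case hex digits, as in "\u20AC".
            out.append("\\u%04X" % code)
    return "".join(out)
```

For example, escape_unprintable("LetMeIn") returns the string unchanged, while
a string containing only the € symbol comes back as \u20AC.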

If you don't succeed in recovering the SSID or keyphrase then it may be that 
you are not listening on the right wifi channel - see the channel hopping 
section later. However generally you will be on the right channel already as a 
result of having been previously connected to the relevant wifi access point 
(AP).

Note that while tshark is running in monitor mode your machine will be 
disassociated from your AP and other applications on the machine will not be 
able to access the network.

If you used AES encryption then the keyphrase displayed will be the still 
encrypted version and will probably appear largely as Unicode escapes due to 
non-printing characters etc. I haven't added AES decryption logic; this is left
as an exercise for the reader. It's actually simple if you create a cipher using
the AES Electronic Codebook (ECB) transformation with no padding, as described 
briefly in the middle of this post. Obviously any such logic will need the 
relevant AES key to decrypt a given keyphrase.

Important update Dec 8th, 2014: please see this comment from Mark and my reply.
I have not updated my code to reflect any recent changes such as this and do not
plan to do so.



 Implementation issues 

So what were the main difficulties encountered in creating the application?

When I started I thought it would be easier to filter the packets I was 
interested in from all the other packets and I assumed I would see cleaner runs
of packets corresponding to the SSID and keyphrase.

However while one can group packets by source and destination, when one cannot 
decrypt the packets one can only do so much to distinguish between Smart Config
related packets and other similar traffic between a given source and destination.

Using what we know about Smart Config it's possible to filter out many packets 
but we still end up with a combination of extra invalid values and missing 
values between the packets that delimit the SSID and keyphrase. I refer to the 
invalid extra values as spam and the missing values as holes in my code. The 
holes are presumably the product of packet collision, the spam the product of 
unrelated traffic that can't be distinguished from Smart Config traffic due to 
encryption. And remember that the packets that have an appropriate data length 
such that they appear to be Smart Config tags, separators etc. may themselves 
just be the result of unrelated traffic that coincidentally involves packets 
that have the lengths being looked for.

Note: because wifi networks have to use CSMA/CA, packet collision shouldn't be 
the issue that it is on wired ethernet networks.

The Smart Config application transmits the same sequences over and over again. 
The Solver class takes multiple received sequences, each probably containing 
spam and holes, and tries to construct a clean sequence of the required length 
that obeys the upper nibble rules etc., described in my previous post, that we 
know have to apply.

The current Solver is just one possible implementation. One could imagine 
taking completely different approaches with different pros and cons. It should 
certainly be possible to come up with more complex logic that can recover the 
SSID and keyphrase from fewer repeats of the underlying sequences in the face 
of greater amounts of spam and collisions.

Note: the current solver tries hard to patch pieces from multiple sequences 
together to create a complete clean sequence. Sometimes it will actually produce
multiple valid solutions and you'll see output like this:

Solved SSID: [MyPlacf]
Solved SSID: [MyPlace]

Obviously only one solution is the right one. With a little extra effort it 
would be possible to generate statistics for each solution, e.g. on how much 
patching was involved, to give some indication as to how likely a given 
solution is to be the right one. If an SSID or keyphrase tag gets lost in 
transmission the current solver can occasionally produce a largish number of 
very poor solutions.
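The statistics idea could look something like the following Python sketch. It
is purely hypothetical as the current solver records nothing of the sort. The
Solution type, its fields and the rank function are all invented for this
example, and the fraction of patched positions is just one plausible measure.

```python
from typing import List, NamedTuple


class Solution(NamedTuple):
    text: str
    patched: int  # positions filled in from a different received sequence
    length: int   # total number of positions in the sequence


def rank(solutions: List[Solution]) -> List[Solution]:
    """Order solutions so that the least patched ones come first."""
    # max(..., 1) only guards against division by zero for empty sequences.
    return sorted(solutions, key=lambda s: s.patched / max(s.length, 1))
```

With made-up figures of three patched positions for "MyPlacf" and one for
"MyPlace", the latter would be listed first.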



 Channel hopping 

The tshark logic described above will only listen on whatever channel your wifi
device is currently configured for, typically channel 1, 6 or 11. The CC3000 
must presumably do channel hopping to find the relevant channel. Tshark and 
related utilities don't directly support channel hopping, but it's relatively 
easy to set up. See the channel hopping section of the Wireshark wiki page on 
capture setup.
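The radiotap.channel.freq field output by the earlier tshark command reports a
frequency in MHz rather than a channel number. To check which channel a capture
is actually on, the standard 2.4 GHz channel plan can be applied as in this
small Python sketch. The function name is made up for this example and only
the 2.4 GHz band is covered.

```python
def channel_for_frequency(freq_mhz: int) -> int:
    """Map a 2.4 GHz centre frequency in MHz to its channel number."""
    if freq_mhz == 2484:
        return 14  # channel 14 sits outside the regular 5 MHz spacing
    if 2412 <= freq_mhz <= 2472 and (freq_mhz - 2407) % 5 == 0:
        return (freq_mhz - 2407) // 5
    raise ValueError(f"not a 2.4 GHz channel frequency: {freq_mhz}")
```

So 2412 maps to channel 1, 2437 to channel 6 and 2462 to channel 11.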

On Mac things are even easier - one can use the standard, but well hidden, 
airport application that can be found here:

/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport

With this command you can scan for nearby networks and see what channel they're
using, you can disassociate from your current network and change the current 
channel of your wifi device. See e.g. CNET's overview of various Mac network 
related CLI commands for more details.



 Setting up aircrack and tshark on Linux 

As outlined above it proved to be easier getting tshark working on Mac. I did 
eventually get it working on my Ubuntu 12.04 machine. The main issue was 
enabling monitor mode. To do this I required airmon-ng, a tool that's part of 
Aircrack-ng. Aircrack wasn't available via apt-get so I had to download and 
compile it. On doing make install it installed airmon-ng to /usr/local/sbin.

Then I was able to enable monitor mode like so, where wlan0 is the name of my 
wifi device as reported by ifconfig (it may have a different name on your system):

$ sudo /usr/local/sbin/airmon-ng start wlan0 11

It output "monitor mode enabled on mon0" - mon0 is a pseudo device created by 
airmon-ng that tshark will listen to rather than wlan0. However the command also
outputs a warning about processes that may interfere with its operation. They do
indeed interfere but it's not as simple as killing the suggested PIDs as some of
them are related to services that will simply restart them if they're seen to 
die. So I had to stop the relevant services like so:

$ sudo service network-manager stop
$ sudo service avahi-daemon stop
$ sudo service upstart-udev-bridge stop

I then stopped monitor mode - note that this needs to be done on mon0 rather 
than wlan0:

$ sudo /usr/local/sbin/airmon-ng stop mon0

Then I started monitor mode, as above, again and this time it only warned about
one process and I killed the listed PID with a normal kill (using sudo).

Note that stopping the above services will disconnect you from your wifi 
network, even before you use tshark with the monitor mode enabled pseudo device 
mon0. If you need to reconnect, e.g. if you find initially as I did that you 
don't have tshark installed and need to apt-get it, then just redo the service 
commands above with start instead of stop.

OK - now we're ready to start tshark almost as above on the Mac:

$ tshark -o 'wlan.enable_decryption:FALSE' \
    -i mon0 -f 'subtype qos-data' \
    -R 'wlan.fc.retry==0' -T fields \
    -e wlan.bssid -e radiotap.channel.freq -e wlan.sa -e wlan.da -e data.len

Note that I use the mon0 pseudo device and use -R rather than -Y as the version
of tshark available via apt-get for Ubuntu 12.04 is older than the version I 
have on my Mac and doesn't support the -Y flag. And I don't use -I as mon0 is 
already in monitor mode (and trying to use -I will cause tshark to fail).

On the Mac no special steps need to be taken once you've finished capturing 
packets with tshark. On Linux, however, one should stop the pseudo device as 
shown above and restart the various services (also as described above).

Note that in the airmon-ng start command above we explicitly specify what 
channel we want to monitor, in the example it's channel 11. If you wanted to do
channel hopping see the Wireshark wiki page (also mentioned above).



 Extra features of the Smart Config library 

In a previous post I covered details of the TI Smart Config library that may 
(most likely) be historical leftovers from TI's development process or may 
possibly be usable in combination with some non-default configuration of a 
CC3000 device. The only one of these that could affect the ability of the code 
I've written to recover SSIDs and keyphrases is being able to set the lengths of
the two separator values. Currently my code looks for packets that differ in 
length by the difference between these two values, so this logic would no longer
work if these values were changed. However it would be simple to adjust the 
logic to look for values that recur frequently and deduce that they were the 
separator values being used in this particular situation.
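A minimal Python sketch of that adjustment follows. It is untested against real
captures. The function name is invented for this example, and treating the two
most common data lengths on a link as the separators is an assumption that
unrelated traffic could easily upset.

```python
from collections import Counter
from typing import List, Sequence


def likely_separators(lengths: Sequence[int], count: int = 2) -> List[int]:
    """Return the `count` most frequently seen data lengths, most common first."""
    return [length for length, _ in Counter(lengths).most_common(count)]
```

The difference between the two returned values could then take the place of
the fixed difference that the current code looks for.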