Originally posted by Justin Seitz on the Hunchly blog (https://hunch.ly//osint-articles), and used with permission.
URL previews are a nice feature found in most messaging applications. They allow you to paste a URL to a friend or colleague and get a handy miniature view of the website you are about to visit. The downside is that a lot of applications generate these previews behind the scenes, without you knowing that it is happening. In some cases this means disclosing your public IP address in a manner that you likely wouldn’t want. Don’t forget: when you browse to a website your public IP address is exposed. This is just how the Internet works unless you’re using Tor or a VPN to hide it. The difference with URL previews in messaging applications is that you are broadcasting to the website owner that you are discussing the website, as opposed to just browsing to it. This small and subtle change in context is actually quite an important distinction. You’ll see why very shortly…
A Little History
A few years ago I was on a penetration test where I was attempting to spearphish executives at a well-known corporation in Europe. They had one of the most brilliant CISOs I had ever met and an absolutely amazing incident response team on staff. After I sent the initial round of phishing emails I was monitoring my command and control server to look for connections from users, anti-virus, or anything else that might indicate that I was either having some success or was about to be caught. For a few hours there was not a lot of activity, until my web server received a connection from an IP address that resolved back to Skype. This was a WTF moment for me since my phishing server was brand new and there didn’t seem to be a good reason why a Skype server would be touching it. A few minutes later there was another hit from a different Skype server. Now I was really pondering what was going on. Then it dawned on me: someone was discussing my command and control system during a Skype chat, and Skype was generating previews of the phishing site I had set up. I performed a couple of quick tests using my own Skype account, and sure enough, I could reproduce the issue easily. I now knew that the incident response team was on to me, and it was time to switch tactics. But this also raised a much larger issue in my mind when it came to online investigations, incident response and running covert online operations.
How Does This Apply to Online Investigations?
There are two viewpoints here: one is from an investigative standpoint and the second is from the standpoint of you running a covert operation through a website. From the investigative standpoint, if you are passing URLs back and forth with a fellow investigator you may end up notifying your target that you are talking about them. This is exactly how I figured out that the incident response team was on to me during my penetration test. You likely don’t want this to happen. The second standpoint is where you are running a website for a covert online operation. You can monitor for these URL previews and determine that someone is discussing your site, potentially letting you know that your ruse is working or that you might be caught out (again, context is important and mission-dependent here). Either way, it is a unique set of behaviours that can be observed that is not general browsing activity.
Test Results from Various Platforms
I did some quick testing of various messaging clients and services. The test was simply to set up a Python web server on a Digital Ocean droplet (the $5/month plan is sufficient). The Python web server just printed out the IP address and headers of the connecting client. I also set up a DNS record specifically for this testing so that I could compare IP addresses against domain names. WhatsApp was the only service tested that responded differently for IP addresses vs. domain names. Every other service was happy to generate previews for an IP address. There was also no difference between using an HTTP vs. HTTPS URL. Here is a summary of findings:
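A logging server of the kind described above can be sketched with nothing but the Python standard library. This is a minimal illustration, not the script used for the original tests: the names `describe_request`, `PreviewLogger` and `make_server`, the port 8080, and the placeholder HTML page are all assumptions, and it only speaks plain HTTP.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def describe_request(client_ip, method, path, headers):
    """Build a readable, multi-line summary of one incoming request."""
    lines = [f"{client_ip} {method} {path}"]
    lines += [f"    {name}: {value}" for name, value in headers.items()]
    return "\n".join(lines)


class PreviewLogger(BaseHTTPRequestHandler):
    """Prints the client IP and every request header, then serves a tiny page."""

    def _log_and_reply(self, send_body):
        print(describe_request(self.client_address[0], self.command,
                               self.path, self.headers), flush=True)
        # Placeholder page so preview generators have something to fetch.
        body = b"<html><head><title>test</title></head><body>ok</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if send_body:
            self.wfile.write(body)

    def do_GET(self):
        self._log_and_reply(send_body=True)

    def do_HEAD(self):
        self._log_and_reply(send_body=False)

    def log_message(self, format, *args):
        # Silence the default access log; the summary above replaces it.
        pass


def make_server(host="0.0.0.0", port=8080):
    """Create (but do not start) the logging server."""
    return HTTPServer((host, port), PreviewLogger)
```

To run it, call `make_server().serve_forever()` on the test machine and paste the server's URL into the messaging client you want to check. Each preview request then shows up on the console with its source IP address and headers.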
We, like many other companies, live on Slack so this was the first test I performed. Slack was happy to generate URL previews and identified itself with the following User-Agent:
User-Agent: Slackbot-LinkExpanding 1.0 (+https://api.slack.com/robots)
The request came from my public IP address (my office connection) in both the mobile and desktop versions of Slack.
So Messages was an interesting test that had some pretty unique behaviours. If you post a link from Messages on your desktop/laptop it will generate the preview directly from your public IP address as can be expected. The user agent shows:
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.2.4 (KHTML, like Gecko) Version/9.0.1 Safari/601.2.4 facebookexternalhit/1.1 Facebot Twitterbot/1.0
Pretty interesting that you see the Facebot Twitterbot pieces in there but this was actually picked up by a Reddit user as well. Here is where things can be a bit more interesting: if you are sending an SMS phish to a target you can enhance the URL preview experience a little by ensuring you have a file named:
The Messages app will attempt to retrieve this file once it determines that it can successfully reach the target web page. This file will be used in the preview that is generated and could help to entice your target to click the link. It can also be a way of acknowledging the fact that Messages was the application doing the URL preview in the first place.
Wire is pretty interesting. When you post a URL from the app both on desktop and on your mobile phone your public IP address will show up in the logs. However, there are no User-Agent headers that show up. In fact the only header that Wire sends is:
So this in itself is interesting because many of your HTTP clients (browsers, crawlers, bots, etc.) will send additional headers. By Wire stomping out all information this does become a “tell” that perhaps someone is discussing a target site in the Wire application. Further tracking of how often you see this limited set of client headers would have to be done in order to come up with something more statistically relevant than my single observation. Note that in Wire there is a setting in Preferences -> Options called “Create previews for links you send.” If you disable this it will prevent Wire from doing these URL previews. I recommend you do this. Thanks to Michael Bazzell for assistance with this one.
Facebook also announces itself, but it uses Facebook-owned infrastructure to hit the site for a preview. You will see a User-Agent header of:
User-Agent: facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
It doesn’t use your public IP address, but it does indicate that someone has posted a link to the target site on their Facebook profile or has sent it via Facebook Messenger. The IP address that shows up will be registered to Facebook, so you can use a site like ipintel.io to look it up.
WhatsApp behaves somewhat differently than the other services. It will not honor IP addresses directly, but if you type in a domain (and any port) it will attempt to generate URL previews. It also sends continuous requests as you type the URI of the target page, which generates a lot of traffic. The User-Agent looks like this:
User-Agent: WhatsApp/0.3.1649 N
The request comes from your public IP address.
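If you are the one watching the web server logs, the User-Agent strings observed above can be turned into a rough classifier. The sketch below is only a starting point built from these single observations: the function name `guess_preview_source` and its return labels are hypothetical, the matching is plain substring checks, and real clients may change their headers at any time.

```python
def guess_preview_source(headers):
    """Guess which messaging service fetched a URL preview from its headers.

    `headers` is any mapping of header name to value. The substrings checked
    here come from the User-Agent values recorded in this post; the labels
    returned are made up for this example.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent", "")

    if not user_agent:
        # Wire sent no User-Agent in these tests, but so might other clients.
        return "no-user-agent"
    # Messages embeds both bot names, so it must be checked before Facebook.
    if "facebookexternalhit" in user_agent and "Twitterbot" in user_agent:
        return "apple-messages"
    if "Slackbot-LinkExpanding" in user_agent:
        return "slack"
    if "facebookexternalhit" in user_agent:
        return "facebook"
    if user_agent.startswith("WhatsApp/"):
        return "whatsapp"
    return "unknown"
```

A missing User-Agent is reported separately rather than as Wire, since one observation is not enough to attribute every header-less request to a single application.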
Services That Didn’t Generate Previews
There were some services that didn’t generate any previews or traffic when pasting links, or typing URLs. Of course you should test this yourself to verify.
Twitter DM (Mobile/Web)
All of the mobile testing was done on an iPhone X so there may be differences with Android that aren’t covered here. There are probably a ton of other messaging apps out there that you could test, and you absolutely should. Feel free to let me know and I can update this post with your results.
There are a few things you can do to help mitigate the risk:
Defang your URLs: this is simply the method where you replace the dots and colons with other characters, or use brackets. An example could be:
Use a VPN: this is a secondary suggestion really, as it isn’t mitigating the original problem, but for the services that are spitting out your public IP address this will at least obscure it.
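Defanging is easy to automate before you paste a link into a chat. The sketch below assumes one common convention (`http` becomes `hxxp`, and dots and colons are wrapped in square brackets); the helper names `defang` and `refang` are hypothetical, and your team may prefer a different substitution scheme.

```python
def defang(url):
    """Rewrite a URL so messaging clients no longer recognise it as a link."""
    defanged = url.replace(".", "[.]").replace(":", "[:]")
    # Break the scheme as well, so "http"/"https" become "hxxp"/"hxxps".
    if defanged.lower().startswith("http"):
        defanged = "hxxp" + defanged[4:]
    return defanged


def refang(text):
    """Reverse defang() so the recipient can restore the original URL."""
    restored = text.replace("[.]", ".").replace("[:]", ":")
    if restored.lower().startswith("hxxp"):
        restored = "http" + restored[4:]
    return restored
```

The recipient has to restore the URL by hand or with `refang` before visiting it, and that extra step is the point: nothing fetches the page until a person decides to.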