I have not said much about this election, but the recent article by Slate, stating as a fact that there is a “connection” between a Trump server and the bank Alfa in Russia and thus Trump must be in collusion with Putin and Russia, caught my attention because it touches on what I have done for a living for nearly a quarter of a century. It bandies about computer and networking terms and then does a poor job of explaining how speculative its conclusions are. Let us go back to the article and go down the points made one by one.
One – DNS as computer forensics.
The first few paragraphs go over the use of DNS logs as forensics. To truly understand the advantages and disadvantages of DNS in computer forensics you have to understand what DNS is. Basically, when you type an email address into an email or a web address into the address bar of your browser and then press send or go, your PC sends a request to your internet provider’s DNS server. This server might already know the IP you need because it is a popular site such as Facebook.com, google.com, etc. and if so, would immediately return that IP. Or it could be a more obscure domain such as mine, jphogan.net, in which case the DNS server likely will not have it and will then go up the “chain” looking for the information. Typically it looks up the domain at a next level server which its next level provider supplies. This continues all the way up until it gets an IP or it actually queries the registry in which the entity owning the domain registered it. In the case of my domain jphogan.net, which I registered through GoDaddy, that would be the registry servers for .net. They would promptly tell the DNS server of your provider that my information can be found on my DNS servers: NS01.DOMAINCONTROL.COM and NS02.DOMAINCONTROL.COM. These servers are two of the default servers provided by GoDaddy to its hosting and registry customers.
If you parsed the resulting entries in a database of DNS logs, you might come to the conclusion that CenturyLink (my ISP) has a special relationship with GoDaddy.com, when in fact they are competitors: CenturyLink also sells registration and hosting services, and I, as a customer of both, cause their servers to talk regularly.
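To make the chain concrete, here is a minimal sketch of a resolver walking up the referral chain and logging every server it asks along the way. It is a toy model, not how any real DNS server is implemented: the server names, the domain example.net and the address (taken from a documentation-only range) are all made up for illustration.

```python
# Toy model of a provider's resolver. Every name and address below is
# hypothetical; 192.0.2.0/24 is reserved for documentation.
REFERRALS = {
    "root": {"net": "net-registry"},
    "net-registry": {"example.net": "ns01.hosting.example"},
}
ANSWERS = {
    "ns01.hosting.example": {"example.net": "192.0.2.10"},
}


def resolve(domain, cache, log):
    """Return the IP for domain, recording every server the resolver asks."""
    if domain in cache:
        return cache[domain]              # answered locally, nothing is logged
    tld = domain.rsplit(".", 1)[-1]
    log.append(("isp-resolver", "root"))
    registry = REFERRALS["root"][tld]
    log.append(("isp-resolver", registry))
    nameserver = REFERRALS[registry][domain]
    log.append(("isp-resolver", nameserver))
    ip = ANSWERS[nameserver][domain]
    cache[domain] = ip                    # remember the answer for next time
    return ip


cache, log = {}, []
first = resolve("example.net", cache, log)
second = resolve("example.net", cache, log)   # served from the cache
print(first, second)
print(log)
```

The log ends up holding only resolver-to-name-server entries. Nothing in it says which user asked, why they asked, or whether anything was sent to the address afterwards.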
But, let us go a little further in the DNS as forensics and look at how you set up DNS servers. One reason to set up DNS servers is that you host a domain and wish to give people wishing to visit, email or otherwise communicate a way to find you. Another reason is that you host many internet connected devices and you wish to curb your bandwidth usage. In plain English, you have dozens, hundreds or thousands of users and you want to stop every request for google.com from going across the internet to find the IP for Google. Instead, you create a “speed dial” list that includes the most common domains visited by your users. To do this you configure your DNS server to remember the most frequently visited domains’ information in toto or in part on your server and update it on a regular basis. This way you limit a very significant usage of your internet bandwidth that is very redundant, as this information does not change very frequently. But you also increase it in part because even when no one is there surfing the web, your DNS server will check on Google’s servers and get its information to see if it changed. This setting can be set to 2, 4, 8, 24, 48 hours and pretty much any setting in between and more depending on the server. You can also have it not update but just keep the current information and only inquire if it gets a failure. Now, take note, the addresses it keeps and stores are decided by it based on traffic. No IT person sits there and decides whether it’s appropriate for a company to have “bigbreastedwomen.com” or “kkk.org” as frequently visited sites. The users connecting to the sites decide that. What the IT person can decide is how big the “speed dial” list will be. It could be as small as 1. It could be as large as millions. It depends on the storage, memory space, and bandwidth availability of the server as to what that upper limit is. I have seen small DNS servers with “speed dials” as small as 100 or fewer sites. I have seen DNS server farms with “speed dials” in the tens and hundreds of thousands.
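The “speed dial” idea can be sketched in a few lines. This is a simplified illustration and not the code of any real DNS server: the class name SpeedDialCache, its parameters and the example domains are my own inventions, and I use a least-recently-used eviction rule as a stand-in for whatever policy a given server actually applies.

```python
from collections import OrderedDict


class SpeedDialCache:
    """Keeps at most max_entries answers; evicts the least recently used."""

    def __init__(self, max_entries, ttl_seconds):
        self.max_entries = max_entries      # the size the IT person picks
        self.ttl_seconds = ttl_seconds      # how long an answer is trusted
        self.entries = OrderedDict()        # domain -> (ip, time stored)

    def get(self, domain, now):
        item = self.entries.get(domain)
        if item is None or now - item[1] >= self.ttl_seconds:
            self.entries.pop(domain, None)  # missing or stale: must ask again
            return None
        self.entries.move_to_end(domain)    # mark as recently used
        return item[0]

    def put(self, domain, ip, now):
        self.entries[domain] = (ip, now)
        self.entries.move_to_end(domain)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # drop the least recently used


cache = SpeedDialCache(max_entries=2, ttl_seconds=4 * 3600)
cache.put("a.example", "192.0.2.1", now=0)
cache.put("b.example", "192.0.2.2", now=10)
cache.get("a.example", now=20)               # a.example is now the freshest
cache.put("c.example", "192.0.2.3", now=30)  # pushes out b.example
print(list(cache.entries))
```

Notice that nobody chose which domain got dropped. The traffic did. Shrink or grow max_entries and a rarely used domain will fall off the list or climb back on, and each time it falls off, the next request for it goes back out across the internet.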
What does this mean for DNS forensics? It is a useful tool for establishing trends and links between domains. It is also very useful for showing that a new connection was created right before a malware or other malicious attack. This allows you to narrow the search area for a perpetrator. As for establishing relationships, it is really not as useful. Since much of DNS is automated, and communications between servers that last days, months and even years can be established with as little as a single email or web query, the existence of such a link or communications means very little in and of itself. Additional data is required to form any conclusions. In other words, to prove a relationship you would need to access one or both sides of the conversation directly. The experts in the article state as much themselves but it is downplayed by the author. Most illuminating to me was the name the source of the article chose, “Tea Leaves”. I understand the process of reading DNS logs to be not that dissimilar to the process of reading tea leaves to divine one’s future.
Two – Alfa pings server at Trump-Email.com.
The article states: “But what he saw was a bank in Moscow that kept irregularly pinging a server registered to the Trump Organization on Fifth Avenue.” One server “regularly” communicated with another, on the internet. Stop the presses! Not much else is provided here other than a link to the registration of the Trump server and not even the domain name was explicitly stated in the article. No time period was provided. Every four hours? Precisely regular like an automated event? Almost as if they don’t want anyone to do their own research. But, I did.
First, the domain that was being pinged by the presumed Alfa server (which they did NOT identify) is registered to the Trump Organization but managed by Cendyn. This company is basically a Customer Relationship Management company that specializes in hotels. Surprise! Trump manages hotels. My little bit of searching and googling the company revealed little other than that they seem to specialize in marketing over social media and email to prior customers for big chains and high end hotels. They likely do things like send customer surveys, thank you for your business emails, follow up offers and things in this vein. In plain English, they spam your existing customers trying to generate more business for your hotels. OK, maybe not spam, but more “targeted communications”.
OK, all this means that Cendyn, a company the Trump Organization apparently contracts, uses a server set up in Trump’s name. This server is regularly inquired upon by a Russian bank named Alfa. That’s it. Any other conclusion about the facts presented so far is pure unadulterated speculation. It might be “educated” speculation by virtue of the experience and skills of the speculators, but it is still speculation. And it is speculation that they or the author do not want you to be able to duplicate or critique, judging by the several significant facts left out of the article: How regular? Who registered the “Alfa” servers? The parent company? A subsidiary? Did they register at the same time as the Trump server? (A two way link would require both servers to be set up at around the same time.)
Three – Irregular lookups mean human intervention.
Finally, they inform us that the traffic is “irregular”. They provide us the first hard data in the form of a graph of the traffic. Alright! Something I can sink my teeth into. Let’s see what our fortune teller has for us in the way of real data. First, there are two servers communicating with the Trump server. Interesting. This indicates a server and a backup server perhaps? The servers’ traffic does appear to flip at points like a backup and main server. But, at other points along the graph, they both have significant traffic at the same time. Hmmm. Perhaps a load balancing pair? Perhaps a pair of servers that cover different areas of a building? Hard to say.
Does the irregular nature of the inquiries indicate human intervention? Probably. They do appear to spike on certain days and things like that. It is however possible it could be a DNS glitch where the domain trump-email is somehow falling off and getting back on the “speed dial” of their server. This could be by human influence of someone sending email or browsing to a server. This could also be someone tweaking their DNS server from 1000 “speed dial” entries to 900, then back up to 1500, etc. These kinds of tweaks can occur regularly in most large organizations as they try to adjust for traffic and maximize bandwidth. My conclusion, based on the data provided and with little more to go on: yes, there is probably human intervention here.
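If the raw timestamps had been shared, one simple way to put a number on “regular” versus “irregular” would be to look at how much the gaps between lookups vary. The sketch below is only an illustration of that idea: the function name is mine and both series of lookup times are invented, since the article published a graph and not the data.

```python
from statistics import mean, pstdev


def interval_variation(timestamps):
    """Coefficient of variation of the gaps between consecutive lookups."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return pstdev(gaps) / mean(gaps)


# Hypothetical lookup times in hours, invented for illustration.
automated = [0, 4, 8, 12, 16, 20]   # a refresh every four hours
bursty = [0, 1, 2, 30, 31, 90]      # clusters with long silences between

print(interval_variation(automated))
print(interval_variation(bursty))
```

The evenly spaced series scores 0 and the bursty one scores above 1. A low score points toward a timer. A high score is what you would expect from people, but also from a cache entry that keeps dropping off and coming back, so it narrows things down without settling them.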
But, far more significant is that now that I finally get to see the “traffic” graphed, I can see something else. The entire basis for saying there is traffic between the servers is DNS lookups. There is no indication of whether any information was actually ever exchanged. A DNS lookup is the telephonic equivalent of calling 411 and getting the number for someone. You would never see a police officer or law enforcement official trying to establish that you know a person based on that person having called information for your phone number. No, he would know that means only that person got your number. Maybe they called, maybe they didn’t. Maybe they called and realized you are not the “John Hogan” they wanted to speak with. To determine a relationship you would need to know about the call itself. Did it last 10 seconds or more like an hour?
This graph contradicts almost every previous fact they have posited. Regular communications are now “irregular”. The “communications” are now “DNS lookups”. No evidence of actual information flowing between these servers is presented.
There are many reasons you might make a DNS inquiry and not actually connect. For instance, you might type in whitehouse.com trying to connect to the White House. If you did this in the early 2000s, boy would you be clicking stop and clearing your browser cache. It was a porn site then. Now it is vacant and for sale. My main point here is that there are thousands of possibilities, and none of them indicate actual traffic to the trump-email domain, much less this specific server.
Four – Server only accepts traffic from limited range of IPs.
To wit, they state in the article: “That wasn’t the only oddity. When the researchers pinged the server, they received error messages. They concluded that the server was set to accept only incoming communication from a very small handful of IP addresses.”
Oddity? Seriously? If you have a server, and you have it open to all IP address ranges, please reach your hand up behind your head and give yourself a Gibbs slap. You are why we have to have internet security and malware removal software. Most servers can be, and professionally managed servers are, limited in which IPs can connect. Some do it by a “white list” which bans every IP except those coming from people you know. Some do it by a “black list” which bans specific IPs from your server. There are black and white lists you can subscribe to: lists of known spammers, lists of known hackers, lists of rich people, lists of poor people, etc. You name who you want to include or exclude and someone probably maintains a list. However, these lists can be imperfect.
Normally I would not even dignify such an asinine statement with a rebuttal, but it is important for you to understand blocking to understand one of my theories about what might be going on here. As I stated above, there are lists. Another thing you can do is ranges. For instance, a corporation might use addresses that all start with 172.172. They might conveniently limit access by using that range. But, what if they don’t own all the 172.172 IPs? They might still use it as a limit if they check and none of their competitors use the other IPs in the range. Yes, they can get precise but might not always. To exclude everyone but them might take 20 or 50 more entries in a range list to tighten it down. And if you mess up just one of these statements you might end up locking someone who is in your company out. Someone significant, say your CEO, and that might adversely affect your career. But by limiting it to 172.172 they have significantly lowered their exposure. Also, this “list” typically is not maintained in one place but has to be replicated across every router and server and firewall device your company owns. So replicating one broad statement is much easier to manage than replicating a series of 10, 20 or 50 very precise statements. IT people can be lazy. Shhh! Don’t tell the managers.
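The trade-off between one broad range and many precise ones is easy to show with Python’s standard ipaddress module. This is a minimal sketch under my own assumptions: the 172.172 prefix is the example from the paragraph above, while the two precise sub-ranges and the test addresses are made up.

```python
import ipaddress

# Hypothetical allow lists built on the 172.172 example above.
broad_rule = [ipaddress.ip_network("172.172.0.0/16")]
precise_rules = [
    ipaddress.ip_network("172.172.10.0/24"),
    ipaddress.ip_network("172.172.11.0/24"),
]


def allowed(address, rules):
    """True if the address falls inside any network on the allow list."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in rules)


# An address the company really uses, and one elsewhere in 172.172.
print(allowed("172.172.10.5", broad_rule), allowed("172.172.10.5", precise_rules))
print(allowed("172.172.200.9", broad_rule), allowed("172.172.200.9", precise_rules))
```

The single broad rule lets both addresses in. The precise list keeps the stranger out, at the cost of more entries to write, replicate and get wrong.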
Five – Paul Vixie conclusion of “secretive” communications.
I don’t know Paul, but I do respect his work. But, as much as I respect him, I think even he would have to admit that there is a myriad of other possibilities based only on the data that was shared in the article. The author seems to emphasize this statement: “After studying the logs, he concluded, ‘The parties were communicating in a secretive fashion. The operative word is secretive. This is more akin to what criminal syndicates do if they are putting together a project.’” Yet, a few paragraphs down: “And we can’t even say with complete certitude that the servers exchanged email. One scientist, who wasn’t involved in the effort to compile and analyze the logs, ticked off a list of other possibilities: an errant piece of spam caroming between servers, a misdirected email that kept trying to reach its destination, which created the impression of sustained communication. ‘I’m seeing a preponderance of the evidence, but not a smoking gun,’ he said.”
The one thing that most makes me disbelieve or question Paul’s conclusion is that no evidence is presented that the bank’s servers are similarly locked down. To secure a communications link one has to lock down both sides. So either Trump’s people are so inept as to allow a non-secured device to be a part of a secure link, or there was never an intent of a secure link. So, either there is more data which is not being shared, or the conclusion is based on scant evidence. It would be like saying I know Paul personally because I have googled him and read some of his work. Sorry, nope. Buddy Paul ain’t inviting me to his backyard barbecue this weekend. I am so sad.
Six – No other inquiries.
They point out that there were no other inquiries for the server than from this Russian bank. But then they state that Spectrum Health servers inquired for a brief period. This seems to support my theory of an accidental public exposure of the server. It would be interesting to know if the inquiries from Spectrum Health came from IPs similar to the Trump server or the Russian bank’s servers or both. No other inquiries means either it was an accidentally exposed private server or the relationship is so significant as to warrant a dedicated server. This is very rare and expensive. Granted Trump has the money, but he doesn’t look to me to be one to pay extra for a dedicated server and then not insist the bank on the other side make the same investment.
Kind of hard to draw any conclusions because the data is not shared. Were there truly no other inquiries or did they dismiss them as insignificant?
Conclusions and my own theories
I think the author sums it up himself but buries the statement in his article of innuendos and suppositions: “DNS logs reside in the realm of metadata. We can see a trail of transmissions, but we can’t see the actual substance of the communications.”
Let me take that a step further: we can’t even be sure there were communications. This complex thing we call the internet is a series of interconnected servers, PCs, routers, switches, and a myriad of other devices. They are constantly talking. Some of these conversations have purpose. Some of them are little more than “I am still here! Look at me!” DNS conversations often fall into the realm of the latter. One person querying Trump’s domain from the bank could have been enough to start this whole sequence.
But, what about the IP blocking? Why did Trump’s server allow the traffic? It could be it ended up on a white list because someone from that bank once stayed at one of his resorts. It could be a coincidental allowance. Let me explain. Let’s say that Trump’s servers had prefixes of 190.190.192 and 190.190.193 but he only owned half of 190.190.193. His IT employees might consider allowing those two ranges an acceptable risk. Then this Russian bank got the rest or part of 190.190.193. They were allowed in. Then someone there looks up that email domain or somehow sends an email to a bad address at it, maybe a typo while looking for crump-email or drump-email. The possibilities are endless there.
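To test for this kind of coincidence you would ask how narrow a single range has to be to contain both parties’ addresses. Here is a rough sketch using the hypothetical prefixes from my example. The function name and the two host addresses are made up, and none of this reflects the real addresses of either organization.

```python
import ipaddress


def smallest_common_network(first, second):
    """Return the narrowest single network containing both IPv4 addresses."""
    a = int(ipaddress.IPv4Address(first))
    b = int(ipaddress.IPv4Address(second))
    differing_bits = (a ^ b).bit_length()   # low-order bits where they differ
    prefix_length = 32 - differing_bits
    base = (a >> differing_bits) << differing_bits   # clear the differing bits
    return ipaddress.ip_network(f"{ipaddress.IPv4Address(base)}/{prefix_length}")


# Hypothetical hosts: one in the "owned" half, one in the half someone else got.
shared = smallest_common_network("190.190.192.10", "190.190.193.200")
print(shared, shared.num_addresses)
```

For these two made-up hosts the answer is 190.190.192.0/23, a block of only 512 addresses. That is exactly the size of rule a busy administrator might allow in one line. If the real addresses only met inside some enormous range, this theory would be a lot weaker.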
There is also the possibility that Trump’s subcontractor, Cendyn, does have a relationship with that bank and allowed its IPs in. Perhaps they are using some of that good old fashioned spammer knowledge that is prevalent in Russia. Or their CEO owns stock in the Russian bank, or the Russian bank owns stock in Cendyn or Trump’s organization. Or Trump owns stock in the Russian bank to keep abreast of developments there that might make Russia a possibility for expansion. Keeping a certain amount of stock in a competitor is a common tactic to receive communications you might not otherwise get. For instance, I bet if you checked the holdings of any Fortune 500 company, you might find that almost every one of them has a token 50 or 100 shares in major competitors and potential competitors to get such info the minute it becomes public. It’s prudent, and quite truthfully it would be malpractice not to.
So what are my theories and how could we prove them:
- The Russian bank server simply happens to sit in the IP hole for the Trump Organization and a typo or other inquiry led to this. How could we prove it? Look for the communications to end due to a tightening of the IP allow lists on that server. Which has happened. But that is hardly proof. Check the IP ranges of Trump’s server against that of the Russian bank. Specifically those servers involved. Do they overlap or exist within a range that might be conveniently used as a white list?
- Cendyn has some kind of relationship with the Russian bank that provides technical services in relation to its mass emailing. We could look to see if any Cendyn servers have such links to the same bank. Or Cendyn managed servers for other companies having that link.
- That this was due to an errant email for another domain similarly named as the trump-email domain. We could search the logs of the Russian bank’s servers for inquiries for similar sounding or spelled domains.
- That the traffic patterns are indicative of server maintenance or not. We could determine this by seeing if the spikes seen in the graph occur with many domains queried by the Russian bank.
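The third check in the list, looking for similarly spelled domains in the bank’s logs, could be roughed out with the standard difflib module. This is only a sketch: the list of queried domains is invented, the 0.85 cutoff is an arbitrary choice of mine, and a real investigation would run it over the actual query logs.

```python
from difflib import SequenceMatcher


def lookalikes(target, queried_domains, threshold=0.85):
    """Return queried domains spelled almost, but not exactly, like target."""
    matches = []
    for domain in queried_domains:
        if domain == target:
            continue
        similarity = SequenceMatcher(None, target, domain).ratio()
        if similarity >= threshold:
            matches.append(domain)
    return matches


# Hypothetical sample of domains found in a resolver log.
queried = ["google.com", "crump-email.com", "trump-email.com", "drump-email.com"]
print(lookalikes("trump-email.com", queried))
```

Any hits would show that someone at the bank was corresponding with a near namesake, which would make the typo explanation a good deal more believable. No hits would count against it.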
Now some other things I left alone from the Slate article were Alfa Bank’s relationship with Putin and the fact that the owner of the bank is also a technology entrepreneur. To the first, I would say most banks in Russia have some kind of relationship with Putin, especially successful ones. With the rumored corruption in the country, it would seem to be a prerequisite to being a successful Russian bank to jump into bed with those in power. The fact that Mikhail Fridman also invests in technology companies tends to give credence to the possibility of a relationship between Cendyn and the Russian bank. But really neither of these is in my sphere of expertise, so I leave them as the author left them, to stand on their own.
My conclusion: this is all a lot of smoke and mirrors designed to sway voters. If they had disproved alternative theories such as the ones I posited here, why did they not say that in the article? One paragraph saying we also looked at x, y and z as possibilities but the data ruled them out would have done it. I can understand the data might expose the sources, but on something of this import, you expect us to take your word? Much less accept it when you don’t even tie yourself to the conclusions? When every fact you present has 10,000 other interpretations? When many of your “facts” contradict one another? The other thing to remember is that one person can establish a relationship between two companies just as deep as this so-called relationship between Trump and Alfa Bank. One typed address in the address bar of your browser, one typo-ed email address. And boom, relationship. Barring any new information or data to back up these claims, I rate this Slate article as utter balderdash.