Curb spear phishing? Separate bots from browsers
A Sandia researcher has developed algorithms that separate robotic Web crawlers from people using browsers, a first step toward identifying spear-phishing sources and targets.
Sandia National Laboratories, like many government agencies, receives thousands of visits to its websites each day. Some are human-generated traffic coming through browsers; others come from Web crawlers, or bots, that could be up to no good.
To protect the network, analysts have to separate bot traffic, which can carry various threats, from legitimate human-directed browser traffic.
But even the best security system can be defeated by a gullible user taken in by a spear-phishing attack, one that targets specific e-mail addresses whose owners have something the sender wants.
Sandia computer science researcher Jeremy Wendt wants to reduce the number of visitors that cyberanalysts have to check by identifying the bots. He has developed algorithms that separate robotic Web crawlers from people using browsers, according to the lab. Wendt said he believes his work will improve security because it allows analysts to look at the two groups separately and then identify the possible sources of spear phishing.
According to Sandia cybersecurity's Roger Suppona, the ability to identify the possible intent to send malicious content might enable security experts to alert a potential target. “More importantly, we might be able to provide specifics that would be far more helpful in elevating awareness than would a generic admonition to be suspicious of incoming e-mail or other messages,” he said.
The lab said its Web logs show that site traffic is about evenly divided between Web crawlers and browsers. Wendt is looking for a computer that doesn’t identify itself, or that says it’s one thing but behaves like another, and that trawls websites in which the average visitor shows little interest.
Some of the differences between bots and browsers include:
Range: Crawlers tend to go all over; browsers concentrate on one place, such as jobs.
Volume: When bots try to index a site, they pull down HTML files far more often than browsers do.
Identification: Browsers often give their browser name and operating system information. Crawlers identify themselves by program name and version number.
Behavior: Browsers typically request one page at a time but want all of its images, code and layout files at once, a pattern Wendt calls "bursty." Bot requests, on the other hand, are not bursty; none of the bots identified had a high burst ratio.
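To make the four differences concrete, here is a minimal Python sketch of how they might be turned into a rough classifier over Web-log entries. It is not Wendt's algorithm, which the lab has not described in detail. The Request record, the definition of burst ratio (the share of a client's requests that arrive within one second of its previous request), the operating-system tokens and every threshold are assumptions chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# All names and thresholds here are hypothetical, chosen only for illustration.
BURST_WINDOW_S = 1.0  # a gap this short (seconds) counts as part of one burst
OS_TOKENS = ("Windows", "Macintosh", "Linux", "Android", "iPhone")


@dataclass
class Request:
    client: str       # e.g. source IP address
    timestamp: float  # seconds since some epoch
    path: str         # requested URL path, assumed to start with "/"
    user_agent: str   # raw User-Agent header


def is_html(path: str) -> bool:
    """Treat directory URLs and .htm/.html files as HTML pages."""
    return path.endswith(("/", ".htm", ".html"))


def section(path: str) -> str:
    """Top-level directory of a path, or "(root)" for top-level files."""
    dirs = path.split("/")[1:-1]
    return dirs[0] if dirs else "(root)"


def burst_ratio(timestamps: list[float]) -> float:
    """Assumed definition: share of gaps between requests that are short."""
    ts = sorted(timestamps)
    if len(ts) < 2:
        return 0.0
    close = sum(1 for a, b in zip(ts, ts[1:]) if b - a <= BURST_WINDOW_S)
    return close / (len(ts) - 1)


def classify(requests: list[Request]) -> dict[str, str]:
    """Label each client "crawler" or "browser" by a simple majority vote."""
    by_client = defaultdict(list)
    for r in requests:
        by_client[r.client].append(r)

    labels = {}
    for client, reqs in by_client.items():
        html_share = sum(is_html(r.path) for r in reqs) / len(reqs)
        sections = {section(r.path) for r in reqs}
        names_os = any(tok in r.user_agent for r in reqs for tok in OS_TOKENS)
        votes = [
            len(sections) > 3,                               # range
            html_share > 0.5,                                # volume
            not names_os,                                    # identification
            burst_ratio([r.timestamp for r in reqs]) < 0.5,  # behavior
        ]
        # Three of four crawler-like signals are required (arbitrary cutoff).
        labels[client] = "crawler" if sum(votes) >= 3 else "browser"
    return labels
```

Calling classify on a list of parsed log entries returns a label per client. A vote across several weak signals is used here because any single one is easy to fake; a bot can claim to be a browser, for example, which is the mismatch between stated identity and behavior that Wendt is looking for.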
Now Wendt needs to bridge the gap between splitting groups and identifying targets of ill-intentioned e-mails. He has submitted proposals to further his research after the current funding ends this spring.
“The problem is significant,” he said. “Humans are one of the best avenues for entering a secure network.”