

I have a website that is protected by Distil, an anti-bot product. I can open it manually on my local machine without automation, but when I automate it, after two or three runs the application returns HTTP Error 400 (Bad Request).

Please advise.

1 Answer

by The go-to Tester (181 points)
 
Best answer

Distil is mainly used to provide protection against scraping bots. According to an interview with the company's CEO:

Even though they can create new bots, we figured out a way to identify Selenium, the tool they’re using, so we’re blocking Selenium no matter how many times they iterate on that bot. We’re doing that now with Python and a lot of different technologies. Once we see a pattern emerge from one type of bot, then we work to reverse engineer the technology they use and identify it as malicious.

It will take time and further experiments to understand exactly how they detect Selenium, but here is what we can say for sure at the moment:

  • it's not related to the actions you take with Selenium: as soon as you navigate to the site, you are detected and banned. I tried adding artificial random delays between actions and pausing after the page loaded, and nothing helped
  • it's not about the browser fingerprint either: I tried multiple browsers, with and without clean profiles, and incognito modes, and nothing helped
  • since, according to the hint in the interview, this involved "reverse engineering", I suspect it is done by JS code that runs in the browser and reveals that the browser is being automated via Selenium WebDriver
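To make the last point concrete, here is a minimal sketch of the kind of in-page check such a script could run. It is not Distil's actual code, which is not public. The `navigator.webdriver` flag comes from the W3C WebDriver specification; the key prefixes in `PROBE_JS` and the helper names `automation_signals` and `probe_browser` are assumptions made up for illustration.

```python
# JavaScript probe executed inside the page under test.
# The prefixes are illustrative guesses, not a list any vendor has published.
PROBE_JS = """
const prefixes = ['$cdc_', '__webdriver', '__selenium'];
const leaked = [];
for (const obj of [window, document]) {
  for (const key of Object.getOwnPropertyNames(obj)) {
    if (prefixes.some(p => key.startsWith(p))) leaked.push(key);
  }
}
return {webdriver: navigator.webdriver === true, leakedKeys: leaked};
"""


def automation_signals(probe_result):
    """Turn the probe's result dict into a list of readable findings."""
    findings = []
    if probe_result.get("webdriver"):
        findings.append("navigator.webdriver is true")
    for key in probe_result.get("leakedKeys", []):
        findings.append(f"suspicious global: {key}")
    return findings


def probe_browser(driver):
    """Run the probe in a live Selenium session and summarize the result.

    `driver` is expected to be a Selenium WebDriver instance.
    """
    return automation_signals(driver.execute_script(PROBE_JS))
```

Calling `probe_browser(driver)` from your own Selenium session shows which of these signals your setup exposes; an empty list only means that this particular sketch found nothing, not that the browser is undetectable.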

I decided to post this as an answer, since it clearly settles the question:

Can a website detect when you are using selenium?

Yes.


One thing I haven't experimented with is older Selenium and older browser versions. In theory, something may have been added to Selenium at a certain point that the Distil Networks bot detector now relies on. If that is the case, we could detect (yes, let's detect the detector) the version at which the relevant change was made, then look into the changelog and changesets. That might tell us where to look and what they use to detect a WebDriver-powered browser. It's just a theory that needs to be tested.
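The version hunt described above can be organized as a binary search. The following is a sketch under two assumptions: the versions are sorted from oldest to newest, and once detection starts, every later version is detected as well. The function name `first_detected` and the `is_detected` callback are hypothetical; in practice the callback would install that version, open the site, and report whether the session got blocked.

```python
def first_detected(versions, is_detected):
    """Return the earliest version for which is_detected(version) is true.

    `versions` must be sorted oldest to newest. Assumes detection is
    monotonic: once a version is detected, all newer ones are too.
    Returns None when no version is detected.
    """
    lo, hi = 0, len(versions)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_detected(versions[mid]):
            hi = mid          # the change is at mid or earlier
        else:
            lo = mid + 1      # the change is after mid
    return versions[lo] if lo < len(versions) else None


if __name__ == "__main__":
    # Placeholder labels and a made-up stub, standing in for real test runs.
    candidates = ["v1", "v2", "v3", "v4", "v5"]
    print(first_detected(candidates, lambda v: v in {"v4", "v5"}))
```

With N candidate versions this needs roughly log2(N) manual runs instead of N, which matters when every run risks getting the test IP banned.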

Even if you are sending all the right data (e.g. Selenium doesn't show up as an extension, you have a reasonable resolution and bit depth, etc.), a number of services and tools profile visitor behaviour to determine whether the actor is a human user or an automated system.

For example, visiting a site and then immediately performing some action by moving the mouse directly to the relevant button, all in less than a second, is something no real user would do.
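A behavioural check of this kind could look roughly like the sketch below. It is only an illustration of the idea, not how any particular product works; the function names, the one-second threshold, and the straight-line test are all assumptions.

```python
def is_straight_path(points, tolerance=1e-6):
    """True if all (x, y) pointer samples lie on one line, i.e. no jitter.

    Paths with fewer than three samples carry no evidence of jitter and
    are treated as straight.
    """
    if len(points) < 3:
        return True
    (x0, y0), (x1, y1) = points[0], points[-1]
    for x, y in points[1:-1]:
        # Cross product is zero when the sample sits on the start-end line.
        cross = (x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)
        if abs(cross) > tolerance:
            return False
    return True


def looks_automated(page_loaded_at, first_action_at, pointer_path,
                    min_human_delay=1.0):
    """Flag a visit that acts too quickly AND moves the pointer too directly.

    Timestamps are in seconds; min_human_delay is an assumed threshold.
    """
    too_fast = (first_action_at - page_loaded_at) < min_human_delay
    return too_fast and is_straight_path(pointer_path)
```

For instance, a click 0.2 seconds after load along the samples `[(0, 0), (5, 5), (10, 10)]` would be flagged, while the same path after a 3-second pause would not.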

It might also be useful as a debugging tool to use a site such as https://panopticlick.eff.org/ to check how unique your browser is; it'll also help you verify whether there are any specific parameters that indicate you're running in Selenium.

