Do Robots Dream of Internet Explorer? (alt. Can We Still Blame Clippy?)


DaveKenny

New member
Sep 30, 2020
In a recent Security Now episode, Steve cited studies and stats showing that there are still a lot of Internet Explorer browsers in use. I wonder whether some of that traffic might be generated by automated "users" -- older programs still used for searching, scraping well-known URLs, and other automated tasks.

If someone wrote a program in VB, Microsoft C# or C++, or VBA, and the program made an HTTP or FTP request, how would this query appear to the server? Would the program use instructions or data from an IE binary or DLL to identify itself?
Could such a program be ported to another operating system, or be executed in a container or a legacy Windows compatibility mode, and yet retain this property?
Would such a program, presumably using only a small portion of an older version of IE, necessarily make the new platform vulnerable in the same way as running an older Windows instance with an IE web browser?
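
To make the question concrete, here is roughly what I imagine the server might see from such a program -- just my own guess, with an illustrative User-Agent string borrowed from the well-known IE format:

Code:
GET /index.html HTTP/1.1
Host: www.example.com
Accept: */*
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)
Connection: keep-alive

The User-Agent line is really what I'm asking about: does text like that come from the program itself, or from an IE component the program happens to load?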

I ask only out of curiosity. I used to work in IT, but am retired, so my knowledge of this subject is obsolete and was, frankly, pretty shallow. I retain an interest in internet security, and have observed that older, simple, fundamental but low-impact systems that work are seldom replaced -- even when not secure -- so long as they continue to operate. I know of many private servers and small business systems that still run on old PC or Mac software, sometimes from a corner of a forgotten closet, and there is no motivation (short of a large fire) to replace them.
 

JulioHM

Active member
Oct 25, 2020
If such robots are still scraping the internet, I would assume most of them are useless by now. So much has changed in browser compatibility that they would likely be stuck in a sea of errors and unusably rendered pages.

I think the most likely explanation is that some custom-designed corporate IT services will never be updated. The developers who wrote server code specifically for IE6 are long gone, and no one with enough sanity will ever touch that code again. Firewalled inside the corporate intranet, some users are stuck with that ancient Windows desktop, which can't be upgraded because nothing else works as a client to those services.
 

DaveKenny

New member
Sep 30, 2020
JulioHM said:
If such robots are still scraping the internet, I would assume most of them are useless by now. So much has changed in browser compatibility that they would likely be stuck in a sea of errors and unusably rendered pages.

I think the most likely explanation is that some custom-designed corporate IT services will never be updated. The developers who wrote server code specifically for IE6 are long gone, and no one with enough sanity will ever touch that code again. Firewalled inside the corporate intranet, some users are stuck with that ancient Windows desktop, which can't be upgraded because nothing else works as a client to those services.

Thanks, JulioHM

Perhaps "scraping" was the wrong term to use. I know of a lot of individual setups that haven't been changed in a long time, even when their specific functions are no longer used. They usually are not officially administered or maintained, in that no one is actually responsible for them.

I'm reminded of the Y2K bug, which was slightly different, but which revealed a great deal of very old software, and some hardware, that might have been affected by the two-digit-year rollover. The perceived vulnerability spawned a substantial business for otherwise-obsolete COBOL and Fortran programmers (much of which may have been unnecessary). Steve may have mentioned this in the same context.

I am curious whether I remember correctly -- that an HTTP request can self-identify the browser and version, and that the identity text can be set arbitrarily.
 

JulioHM

Active member
Oct 25, 2020
Yes, it can. HTTP headers usually identify the client or browser, with User-Agent being the most prominent.

But headers are included only at the client's discretion, so they can easily be stripped or changed before sending. While it's possible to get a good analytics overview from a large sample, individually they are not reliable at all.
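
As a sketch of how easy that is (assuming Python and its standard urllib module; the URL and the IE 6 string are just placeholders), a client can claim to be ancient IE in a couple of lines:

Code:
import urllib.request

# Claim to be IE 6 on Windows XP; the server has no way to verify this.
req = urllib.request.Request(
    "https://www.example.com/",
    headers={"User-Agent": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.headers.get("Content-Type"))

To the server, that request is indistinguishable from one sent by a real IE 6 install, which is why per-request User-Agent data proves very little.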
 

PHolder

Well-known member
Sep 16, 2020
Ontario, Canada
JulioHM said:
So much has changed in browser compatibility that they would likely be stuck in a sea of errors and unusably rendered pages.
Actually no. Basic HTML is all they really want. They dare not run any scripts, because that would be dangerous (at a minimum they'd risk getting stuck in infinite loops, or running coin miners, or something similar). Since most of them are just scraping human-readable data, that is all right there in the basic HTML.
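
A minimal sketch of that kind of scraper (Python here, with a placeholder URL): it reads the raw HTML and collects the readable text, and there is simply no JavaScript engine anywhere in it to run a script:

Code:
import urllib.request
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect human-readable text; <script> and <style> contents are skipped, never executed."""
    def __init__(self):
        super().__init__()
        self.skipping = False
        self.chunks = []
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skipping = True
    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skipping = False
    def handle_data(self, data):
        if not self.skipping and data.strip():
            self.chunks.append(data.strip())

with urllib.request.urlopen("https://www.example.com/") as resp:
    html = resp.read().decode(resp.headers.get_content_charset() or "utf-8")

parser = TextExtractor()
parser.feed(html)
print("\n".join(parser.chunks))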
 

AlanD

Well-known member
Sep 18, 2020
Rutland UK
I have a number of scripts running on customers' servers, but all they do is an HTTP GET and check the response. If it is "200 OK" the script knows the server is up and responding; if it is anything else, or nothing, it sends an alert.
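
In spirit it's no more than this (a minimal Python sketch; the URL and the alert step are placeholders for whatever the real scripts do):

Code:
import urllib.request
import urllib.error

def send_alert(message):
    # Placeholder: the real scripts would mail or page someone here.
    print("ALERT:", message)

def check_server(url):
    """Alert unless the server answers a GET with 200 OK."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            if resp.status == 200:
                return True
            send_alert(f"{url} answered {resp.status}")
    except urllib.error.HTTPError as exc:            # got a response, but not 200
        send_alert(f"{url} answered {exc.code}")
    except (urllib.error.URLError, OSError) as exc:  # no usable response at all
        send_alert(f"{url} did not respond: {exc}")
    return False

check_server("https://www.example.com/")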