Do Robots Dream of Internet Explorer? (alt. Can We Still Blame Clippy?)


DaveKenny

In a recent Security Now episode, Steve cited studies and stats showing that there are still a lot of Internet Explorer browsers in use. I wonder whether this traffic might be generated by automated "users" -- that is, older programs still used for searching and scraping well-known URLs and other automated tasks.

If someone wrote a program in VB, Microsoft C# or C++, or VBA, and the program made an HTTP or FTP request, how would that request appear to the server? Would the program use instructions or data from an IE binary or DLL to identify itself?
Could such a program be ported to another operating system, or be executed in a container or a legacy Windows compatibility mode, and yet retain this property?
Would such a program, using only a (presumably) small portion of an older version of IE, necessarily make the new platform vulnerable in the same way as running an older Windows instance with an IE web browser?

I only ask out of curiosity. I used to work in IT, but am retired, so my knowledge of this subject is obsolete and was, frankly, pretty shallow. I retain an interest in internet security, and have observed that older, simple, fundamental but low-impact systems that work are seldom replaced -- even when not secure -- so long as they continue to operate. I know of many private servers and small business systems that still run on old PC or Mac software, sometimes from a corner of a forgotten closet, and there is no motivation (short of a large fire) to replace them.
 
If such robots are scraping the internet, I would assume most of them are useless by now. So much has changed in browser compatibility that they would likely be stuck in a sea of errors and unusably rendered pages.

I think the most likely explanation is that some custom-designed corporate IT services will never be updated. The developers who wrote server code specifically for IE6 are long gone, and no one with enough sanity will ever touch that code again. Firewalled inside a corporate intranet, some of its users are stuck with an ancient Windows desktop that can't be upgraded, because nothing else works as a client to those services.
 

Thanks, JulioHM

Perhaps "scraping" was the wrong term to use. I know of a lot of individual set-ups that haven't been changed in a long time, even if the specific functions aren't used. They usually are not officially administered or maintained, in that no one is actually responsible for them.

I'm reminded of the Y2K bug, which was slightly different, but which revealed a great deal of very old software, and some hardware, that might have been affected by the rollover of the two-digit year. The perceived vulnerability spawned a substantial business for otherwise obsolete COBOL and Fortran programmers (much of which may have been unnecessary). Steve may have mentioned this in the same context.

I am curious whether I remember correctly -- that an HTTP request can self-identify the browser and version, and that the identifying text can be set arbitrarily.
 
Yes, it can. HTTP headers usually identify the client or browser, with User-Agent being the most prominent.

But headers are included entirely at the client's discretion, so they can easily be stripped or changed before sending. While it's possible to get a good analytics overview from a large sample, individually they are not reliable at all.
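
For anyone curious what that looks like in practice, here is a minimal sketch (Python, standard library only; the URL and User-Agent string are placeholders chosen for illustration) showing that a client can claim to be any browser it likes:

```python
import urllib.request

# The User-Agent string is whatever the client chooses to send.
# This one mimics IE 6 on Windows XP, purely as an example.
FAKE_IE_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

req = urllib.request.Request(
    "https://example.com/",                    # placeholder URL
    headers={"User-Agent": FAKE_IE_UA},
)

with urllib.request.urlopen(req, timeout=10) as resp:
    # The server's logs and analytics will record an "IE 6" visitor,
    # even though no Internet Explorer code was involved in this request.
    print(resp.status, resp.reason)
```

So a VB, C#, or VBA program making its own requests can present any identity string at all, and that claimed identity travels with the request no matter what platform the program runs on.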
 
So much has changed in browser compatibility that they would likely be stuck in a sea of errors and unusably rendered pages.

Actually, no. Basic HTML is all they really want. They dare not run any scripts because that would be dangerous (at a minimum they'd risk getting stuck in infinite loops, or running coin miners or something). Since most of them are just scraping human-readable data, that is all right there in the basic HTML.
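
To illustrate the point, a scraper of that sort needs nothing more than the raw markup. This is a rough sketch (again Python standard library only, with a placeholder URL) that pulls the visible text out of a page without ever executing a line of JavaScript:

```python
import urllib.request
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text while deliberately ignoring <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

with urllib.request.urlopen("https://example.com/") as resp:   # placeholder URL
    charset = resp.headers.get_content_charset() or "utf-8"
    html = resp.read().decode(charset, errors="replace")

parser = TextExtractor()
parser.feed(html)
print("\n".join(parser.chunks))
```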
 
I have a number of scripts running on customers' servers, but all they do is an HTTP GET and a check of the response. If it is "200 OK", the script knows the server is up and responding; if it is anything else, or nothing at all, it sends an alert.
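
A check like that can be only a few lines. Here is a hypothetical sketch along the same lines (the URL and the alert action are placeholders, not anyone's actual monitoring script):

```python
import urllib.error
import urllib.request

def send_alert(url: str, reason: str) -> None:
    # Placeholder: a real script might email, page, or log to a central system.
    print(f"ALERT: {url} -- {reason}")

def check_server(url: str, timeout: float = 10.0) -> None:
    """Fetch the URL and raise an alert unless the server answers 200 OK."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            if resp.status == 200:
                print(f"{url} is up (200 OK)")
            else:
                send_alert(url, f"unexpected status {resp.status}")
    except (urllib.error.URLError, OSError) as exc:
        # Covers refused connections, DNS failures, timeouts, 4xx/5xx errors, etc.
        send_alert(url, f"no usable response: {exc}")

check_server("https://example.com/")   # placeholder URL
```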