Learning ASM


I would try a few things at the same time. 1) Write down on a piece of paper what you want and why you want it. 2) Try to contact the original organization behind what you want, if it still exists. 3) Think about whether anyone besides yourself is aware, or could become aware, of what you want to do and why. 4) Pick a few different search engines and try searching for what you want; if you think you have found it, make sure it is correct if it is truly important.



As far as Microsoft products and services go, if you listen to SN or TWiT, or have any insight into the commercialization of computer science, then I think it is basically impossible to ignore Microsoft’s influence.
 
I am sorry, but I don't even understand this response! Can you explain?
 
The MASM32 Project is probably the best maintained group around: http://www.masm32.com/
There are many terrific resources there, and I believe that the MASM32 downloads contain Microsoft's Macro Assembler itself. It also includes a really terrific collection of useful code libraries and an extensive set of forums.
 
What are the differences between MASM and NASM?

Since I didn't know (and thus a warning I can't vouch for this answer) I asked ChatGPT 3.5 on a lark:

MASM (Microsoft Macro Assembler) and NASM (Netwide Assembler) are both popular assemblers used to convert assembly language code into machine code. Here are some key differences between the two:

Syntax:
MASM uses Intel syntax, which is more closely aligned with the syntax used in Intel's official documentation.
NASM also uses an Intel-style syntax, though it differs from MASM in details such as how memory operands and directives are written. The "AT&T syntax" traditional on Unix is used by the GNU assembler (GAS), not by NASM.

Platform Support:
MASM is primarily associated with the Windows platform and is often used for developing Windows applications and drivers.
NASM is cross-platform and can generate code for a variety of operating systems, including Windows, Linux, macOS, and others.

Licensing:
MASM is proprietary software owned by Microsoft and is typically distributed as part of the Microsoft Visual Studio suite.
NASM is open-source software distributed under the 2-clause BSD license, which means it is free to use and modify.

Features:
MASM has been around for a long time and has extensive documentation and support for Windows-specific features and APIs.
NASM is known for its simplicity, speed, and portability. It may lack some of the advanced features found in MASM, but it is highly customizable and suitable for a wide range of applications.

Assembler Directives:
Both MASM and NASM support assembler directives, but the syntax and set of directives may differ between the two assemblers.

Community and Resources:
MASM has a large user base and extensive documentation provided by Microsoft. There are also many online resources and tutorials available for learning MASM programming.
NASM also has a strong community of users and contributors, with documentation and resources available on its official website and various online forums and communities.

Overall, the choice between MASM and NASM often depends on factors such as the target platform, licensing preferences, and familiarity with the syntax and features of each assembler.
 
You know, I think I touched on this before. I like assembly, but I am not very good at it, because I have not figured out a way to give it the time and attention it would require. I also like C++, although when I watch the trainsec.net videos I notice C++ features that I don’t understand well. If you really want to learn x86 assembler, I would recommend the Intel Software Developer's Manual. It covers a lot of the concepts you need to learn assembly. When comparing x86 assemblers, I think there are two different syntaxes: Intel, and the AT&T syntax used by gas. Both NASM and MASM are Intel syntax, I think. Kip Irvine’s work is also very practical, and if you read it slowly and carefully while well rested you will learn important assembly concepts. In my mind, people who pay attention to details would be best at assembly (or really any computer programming).

Another resource, of which I read about half, is “Computer Systems” by Stanley Warford. He has also posted his college lectures on YouTube. The assembly language in “Computer Systems” is not based on a real chip; it is a simplified theoretical instruction set.


 
It's not the assembly language that one needs to focus on. That's actually a fallacy. It's not a language, per se. Though some assemblers are languages, i.e. IBM Assembler F and especially Assembler H, for their macro languages. One needs to wrap one's head around the machine architecture.

Learning your first machine architecture will be a steep learning curve. Mine was IBM S/360 (and the S/370 series of machines) back in the day. When I picked up my first book on 8086 architecture, the idea of certain registers with only one function felt so foreign to me. The amd64 (x86_64) architecture includes a good number of general registers. Learning your second architecture will feel less steep than your first, i.e. if you already understand Intel x86 picking up ARM will be easier because all hardware generally works similarly to each other. It's a way of "thinking" about it.

IMO AMD architecture docs are an easier read than the Intel documentation. But in the beginning I found The 8086 Book by Russell Rector and George Alexy was a good Intel primer for me at the time. It's a bit dated but has excellent diagrams outlining the operation of each instruction. The diagrams are worth a thousand words. Moving from that to AMD or Intel documentation should make it less daunting.

Hope this helps.
 
When I picked up my first book on 8086 architecture, the idea of certain registers with only one function felt so foreign to me.
Right. This is very typical of microprocessor design, since it allows for a surprisingly significant reduction in chip complexity and code size by allowing for "implied" registers for many opcodes. Mainframe architectures tend to be far more “orthogonal.” One of the biggest problems created by non-orthogonal (microprocessor) architectures is that they tend to be hostile to higher-level compilation. When I'm coding for x86, I'm able to design my implementation around the specific needs (strong biases) created by the processor's design. The result is insanely efficient code, but it is not forward-looking. It's not the future.
 
Except for the M68K, though it did have separate general data registers and general address registers. This was not nearly as restrictive as the 8086 was. The Intel 64-bit architecture does have eight additional general registers (R8 through R15) on top of the legacy registers. That was welcome.

z/Architecture is hostile to higher level languages such as C, C++, and the like, because these languages use the stack to pass arguments. On an architecture with no stack, the stack must be emulated using multiple instructions.

Agreed. The most efficient code I've ever written was in assembly.
 
photo_2024-05-02_22-45-38.jpg


This example of ASM code looks super clean. Love how it starts with [BITS 16] and [ORG 0100H].
 
Yes, the few times I have followed an assembly program in a debugger it was much harder than in a high-level language. A few days ago I did get a FreeDOS VM working and used MASM's CodeView to debug a program. It was a little bit difficult because I had to switch from CodeView out to the DOS program and then back to CodeView, and everything was crowded on the screen in CodeView. It must have been much more difficult to write software back in the days of a single CRT monitor. I do think it is possible to have two DOS VMs, with one running the program and one running the debugger. What would be best would be to use a graphical environment like Windows and have a debugger connect into the DOS VM. I am not sure how that would be possible. But in any case, people still use assembly in modern operating systems like Windows and Linux. There is a program called WinASM that has a DOS template for assembler. I loaded it once, but it was not obvious how to use a debugger with it. For writing assembly programs in Windows, you can use WinDbg; it is not too hard to use. What do you have for an operating system to test these assembly programs? Modern Windows or Linux? A DOS virtual machine? Or maybe an old Pentium 1 from 1996?
 
One thing about assembly programs on Windows: when you write a console program in C or C++, the main function is started by the language's runtime (the C or C++ runtime). In assembler I think it is a little different; if I understand correctly, there really would not be any runtime like that. In Windows, I think that when a program is started, it is NTDLL.dll that starts it, and I think it does things like setting up the process and loading the required DLLs, and maybe starting the threads.

The program in the picture you posted is a DOS .com file. I think those were limited to 64K, though I am not sure I know the exact reason for that. The parts with int 21H are how I think the DOS API works: the exact function is specified in AH and then you call int 21h. The int instruction is used for some specific reason, which I am not sure I fully understand, but I do know that interrupt service routines are important in understanding operating systems. For example, I think some exceptions are handled by interrupt service routines. Take divide by zero: I think that when a processor divides by zero, it looks up the interrupt service routine for that and runs it. Therefore, in theory, you could write your own ISR for dividing by zero. You may have noticed I said “I think” many times here. That is because I picked up much of this by reading and not testing, so I am not sure if my understanding is correct.
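The dispatch described above can be modelled in a few lines of Python. This is only a toy sketch, not an emulator: the ToyCPU class, its dictionary-based memory, and the helper names are hypothetical and exist just for illustration. The parts taken from real DOS and x86 behaviour are that INT 21h selects its function from AH, that function 09h prints a '$'-terminated string addressed by DX, that function 4Ch terminates with a return code in AL, and that a divide error raises interrupt 0.

```python
# Toy model of real-mode interrupt dispatch (hypothetical names throughout).

class ToyCPU:
    def __init__(self):
        self.regs = {"AH": 0, "AL": 0, "DX": 0}
        self.memory = {}       # assumption: offset -> string, not real bytes
        self.vectors = {}      # interrupt number -> handler (stands in for the vector table)
        self.output = []       # what would have appeared on screen
        self.running = True
        self.exit_code = None

    def interrupt(self, number):
        """Like the INT instruction: look up the handler and run it."""
        self.vectors[number](self)


def dos_services(cpu):
    """Stand-in for the INT 21h handler: AH selects the function."""
    ah = cpu.regs["AH"]
    if ah == 0x09:             # print the '$'-terminated string at DX
        text = cpu.memory[cpu.regs["DX"]]
        cpu.output.append(text.split("$", 1)[0])
    elif ah == 0x4C:           # terminate with the return code in AL
        cpu.running = False
        cpu.exit_code = cpu.regs["AL"]
    else:
        raise NotImplementedError(f"function {ah:02X}h is not modelled")


def divide_error(cpu):
    """A user-supplied ISR for interrupt 0 (divide error)."""
    cpu.output.append("divide error")


def divide(cpu, dividend, divisor):
    """Like DIV: a zero divisor raises interrupt 0 instead of producing a result."""
    if divisor == 0:
        cpu.interrupt(0x00)
        return None
    return dividend // divisor


def demo():
    cpu = ToyCPU()
    cpu.vectors[0x21] = dos_services
    cpu.vectors[0x00] = divide_error      # install our own divide-by-zero ISR
    cpu.memory[0x0110] = "Hello, DOS$"

    cpu.regs["AH"], cpu.regs["DX"] = 0x09, 0x0110
    cpu.interrupt(0x21)                   # print the string
    divide(cpu, 10, 0)                    # triggers the divide-error ISR
    cpu.regs["AH"], cpu.regs["AL"] = 0x4C, 0
    cpu.interrupt(0x21)                   # terminate
    return cpu
```

Calling demo() leaves "Hello, DOS" and "divide error" in cpu.output and an exit code of 0. The point of the sketch is that a DOS call and a processor exception go through the same lookup, which is why replacing one table entry is enough to install your own handler.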
 
That is interesting; it is not the first time that I have heard something like that. I am not sure of the reasoning behind it. I do recall that you (Steve) have pointed out that you like x86 because it is a complex instruction set, and that writing assembly for a RISC is not all that rewarding and too difficult, I suppose. Maybe I will do a Google search comparing the Motorola 68000 to Intel. This is the reason I’ve said before that Steve should write a book (in all his spare time). I will have to check around on Amazon for microprocessor design books; maybe somebody covers these ideas that Steve and @cschuber are discussing. I will look for that 8086 Book by Russell Rector, but I am guessing it is an 80s book. I wish they would put all those 80s and 90s programming books on Kindle. Thanks for your input @cschuber, keep it coming. Sometimes learning things is easier in small chunks over time, and sometimes it is good to set aside some time and do a “deep dive”. I will think about that: learning the machine architecture and less the assembly “language”.

I guess with high level languages, I often suggest people focus on things like learning to use a debugger, the call stack, control structures, types and data structures, and threads.
 
Having worked on both CISC and RISC, I find both rewarding. Though IMO working on IBM S/360, S/370, and z/Series mainframes was the most rewarding, simply because I could do things like branch tables, i.e.,

Code:
         EX    R4,LOCSTART          LOCATE START OF NEXT TOKEN
         BC    7,*+4(R2)            SOMETHING FOUND
         B     MAINNFND             00 - NOTHING FOUND
         B     MAINRBRK             04 - RIGHT BRACKET FOUND
         B     MAINEQAL             08 - EQUAL SIGN FOUND
         B     MAINLBRK             0C - LEFT BRACKET FOUND
         B     MAINCOMA             10 - COMMA FOUND
         B     MAINSTRT             14 - START OF TOKEN FOUND
         B     MAINSPEC             18 - SPECIAL CHARACTER...
...
LOCSTART TRT   0(*-*,R3),0(R5)        LOCATE START OF NEXT TOKEN
...
#NXTTOKN HTRTAB (C')',4),(C'=',8),(C'(',X'0C'),(C',',X'10'),           X
               (C'A'-C'I',C'J'-C'R',C'S'-C'Z',C'0'-C'9',X'4A',C'.',    X
               C'<',C'|',C'&&',C'!',C'$',C'*',C';',X'5F',C'/',C'%',    X
               C'_',C'>',C'?',C':',C'#',C'@',C'''',C'"',X'14')

Here an instruction located a character and, if it matched, translated it into an integer that is a multiple of 4, placing it into register 2. This in turn was an index into the branch table immediately following, thereby avoiding multiple compare and branch (jump) instructions. What would normally have taken many more than 14 instructions to implement as a case construct in x86 (also a CISC architecture) took only 8 instructions on S/370. (Note that HTRTAB is not a real instruction but a macro that defines a 256-byte table.) Try doing this with so few instructions on Intel or any other microprocessor today.
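For readers who do not know the S/370 instruction set, the idea can be sketched in Python. This is a rough model under stated assumptions, not a translation of the code above: it uses ASCII rather than EBCDIC, it covers only the codes 00 through 14, it treats only uppercase letters and digits as token starts, and the names TOKEN_TABLE, HANDLERS, scan and classify are made up for the example.

```python
# Toy model of "translate and test, then index a branch table".
# Assumptions (not from the post): ASCII input, hypothetical names.

def build_table(entries):
    """Build a 256-entry lookup table; 0 means "keep scanning"."""
    table = [0] * 256
    for chars, code in entries:
        for byte in chars:          # iterating bytes yields integers
            table[byte] = code
    return table

# Each code is a multiple of 4 because a branch instruction in the original
# is 4 bytes long, so the code doubles as a byte offset into the branch table.
TOKEN_TABLE = build_table([
    (b")", 0x04),
    (b"=", 0x08),
    (b"(", 0x0C),
    (b",", 0x10),
    (b"ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789", 0x14),
])

# Stand-in for the run of B instructions: slot n handles code 4 * n.
HANDLERS = [
    "nothing found",
    "right bracket",
    "equal sign",
    "left bracket",
    "comma",
    "start of token",
]

def scan(data, start=0):
    """Like TRT: return (position, code) of the first byte with a non-zero code."""
    for pos in range(start, len(data)):
        code = TOKEN_TABLE[data[pos]]
        if code:
            return pos, code
    return len(data), 0

def classify(data, start=0):
    """Scan, then select the handler by indexing instead of comparing."""
    pos, code = scan(data, start)
    return pos, HANDLERS[code // 4]
```

For example, classify(b"  =x") returns (2, "equal sign"). The design point is the same as in the assembler version: one table lookup per byte replaces a chain of compare-and-branch tests, and the lookup result is used directly as the index of the code to run.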

No, you can't compare CISC with RISC. Each architecture is different. Not all CISC architectures are the same.

The most satisfying thing about writing code in assembler is reducing the code to the minimum number of instructions needed to do the job.

Macro assemblers are certainly more satisfying than your average simple assembler.

I think I bought the book after 1986.

I should mention that Steve did say in the last podcast that he uses exclusive or (XOR on Intel) to zero out registers because it's faster. Indeed, on many architectures it is. But I do recall IBM releasing notes about some models of mainframes saying that on those models subtract register (SR) took fewer cycles than exclusive or (XR). So although using exclusive or is the typically accepted practice, one must keep in mind that it is model dependent: some machines may take fewer cycles to execute subtract register than exclusive or. The takeaway is that one must have intimate knowledge of the machine architecture one is working on.

Learning an assembler is just the tip of the iceberg. It is an exciting world to explore.
 
I have a couple of recommendations based on my own experiences and across a couple different learning styles, I hope they may be helpful to you.

Books

If you prefer the classic/academic approach (reading books, etc.), I started learning x86 assembly in the mid-2000s using this book:

1716638425316.png


Its focus is more on introducing you to programming with x86 assembly as the language, which I think is part of what makes it a particularly good book (and I imagine Steve would appreciate such an approach?).

You can buy it on Amazon in its most recent edition:

https://www.amazon.com/Programming-Ground-Up-Jonathan-Bartlett/dp/1616100648

(but unfortunately that version dropped the iconic purple cover)

The author was also very generous to release the original version under the GNU Free Documentation License, so you can find it for free on gnu.org:

https://download-mirror.savannah.gnu.org/releases/pgubook/

It's a bit old, but scanning over it again today I have a sense that it didn't age too poorly.

---

In recent years I also purchased and read this book:

1716639010344.png


(https://www.amazon.com/Introduction-Assembly-Programming-RISC-V/dp/6500158113)

Admittedly this one is short, not the absolute best quality, and ultimately focused on RISC-V assembly, but I got a lot out of this book during the pandemic because I was looking for something to do and had purchased some RISC-V SoCs for some projects. If you're like me and you think RISC-V is the future of computing, you might enjoy this book. The cost of the book is helpfully very low.

Practical

If you prefer a more practical or "hands on" approach, there is a game called "Turing Complete" which I found to be an absolute joy:

1716639583993.png


It sells itself as a "game" (which is partly true because it provides levels and goals in a game-like fashion) but it's really just a robust simulator for building logic gates and components to build your own computer essentially from scratch. The game will step you up from the basics, to building your own Arithmetic Logic Unit (ALU), to ultimately having a fully working computer with an assembly language you define yourself!

I love this game, and I keep coming back to it every few months. It's available DRM-free on GOG.com:

https://www.gog.com/game/turing_complete

(it runs great under wine, if you're on Linux)

Or if you can tolerate DRM and don't mind not really truly owning a copy of the game, it's available on Steam as well:

https://store.steampowered.com/app/1444480/Turing_Complete/

---

I noticed mention here about the 6502, and if you want to get truly hands-on I think this video series on building your own 6502 computer is amazing:

1716640875691.png


https://www.youtube.com/playlist?list=PLowKtXNTBypFbtuVMUVXNR0z1mu7dp7eH

This (like the game above) is a bit more focused on the fundamentals behind assembly programming than on syntax and practices, but you mentioned that this is something you mostly wanted to do for fun, and I remember having a blast building this kit myself. The creator sells kits too, if you don't have components lying around or don't want to find everything on Mouser or something like that (but you can otherwise source all the parts yourself if you prefer).

AI

I never got formal training on assembly (when I was in college it was touched on, but we mostly learned C and Java), so I'm self-taught. However (and while this may be contentious), I have found modern Large Language Models (LLMs) to be reasonably useful as tutors or guides when learning or refreshing yourself on programming topics such as x86 assembly, for instance using Mistral AI:

1716641868739.png


You can go on to ask it how to build the code, and it will do a pretty good job. Like humans, however, LLMs can produce inaccuracies, but generation over generation they get better and better. For instance, all I had to do was give it some of my compiler output (I was using the GNU assembler, while it was providing NASM):

1716642393412.png


Again, not "perfect", and you definitely have to remain skeptical while using them, but they do a good job the bulk of the time and, importantly, you can dig into very specific subjects and syntax very quickly (e.g. "explain what `.section` does").

I personally use Mistral AI and recommend it because it's open-source (Apache 2.0 License) and I can run it locally, but many of the prominent providers will also give good results.

---

So those were some highlights from my own experiences; I hope they'll help you with your own!
 