Learning data structures and algorithms in different programming languages

  • Thread starter: elias001
  • Tags: Algorithms
AI Thread Summary
Learning data structures and algorithms in C provides a solid foundation, but transitioning to other languages like Java or Lisp may present challenges due to differences in paradigms and memory management. While C requires manual pointer management, modern languages often abstract these complexities, allowing for a focus on core concepts. Data structures are universal across languages, though terminology and implementation details may vary. Understanding recursion and the specific features of each language, such as object-oriented programming in Java, is crucial for successful implementation. Familiarity with graph theory can aid in understanding graph data structures, but its direct application in programming may be limited.
  • #51
@sysprog1, thanks. I stand corrected. And I see in the link you gave that the author has several books on parallel algorithms and computations. I can see the connection to graph theory in that subject.
 
  • #53
elias001 said:
I have a quick question. I am going through a book on C programming on my own. Afterwards, I plan to go through something called data structures and algorithms on my own, also in C.
Regardless of what language you use to implement your data structures, I suggest you brief yourself on database normalization. The one thing that is seldom covered in texts on this topic is that normalization is a design process. If, during your system design, you describe the data you will be working with in normal form, then you will be driven to ask design-critical questions about that data - questions that you would otherwise fail to run into until you were deep into the implementation. Once you have a fully normalized (or at least 3NF) description of your data structures, then you might continue the design process by opting to selectively denormalize parts of those database relations to optimize certain DB transactions over others.
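
To make the design-process point concrete, here is a minimal sketch in plain C structs (the field names are made up for illustration): a flat record that repeats customer data in every order, next to the same data split into roughly 3NF relations. Writing the second form is what forces the design-critical questions - what identifies a customer, whether an order can have more than one line, and so on.

C:
/* Denormalized: customer facts repeated in every order record. */
struct OrderFlat {
    int  order_id;
    char customer_name[64];
    char customer_address[128];
    char item_sku[16];
    int  quantity;
};

/* Roughly 3NF: each fact stored once, relations linked by keys. */
struct Customer {
    int  customer_id;          /* key */
    char name[64];
    char address[128];
};

struct Order {
    int order_id;              /* key */
    int customer_id;           /* references Customer */
};

struct OrderLine {
    int  order_id;             /* references Order */
    char item_sku[16];
    int  quantity;
};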
 
  • Likes: FactChecker
  • #54
@.Scott, that's a very good point. One treacherous thing about databases is that the need for normalization may not be obvious until a database is implemented and the data keeps changing, requiring consistent updating. The only thing I would disagree about is that the texts I learned from (long ago) discussed this a great deal.
 
  • #55
@.Scott Isn't database normalization covered if students take a university course on databases and are taught how to actually build one from scratch? I briefly looked up the term, and it is discussed in books on database systems and management.
 
  • #56
.Scott said:
[…] Almost anything has structures.
Javascript, C++, and some other languages (but not C) allow you to put functions within structures (or classes). […]

I’m just teasing here, but technically you can have function pointers in C structs, although that doesn’t help towards proper encapsulation. Also, I can’t, off the top of my head, find a good reason for doing so.

C:
typedef int (*PFooFunc)(int, int);
typedef struct _foo {
    int bar;
    char baz;
    PFooFunc func;
} FOO, *PFOO;

EDIT: hmm the typedefs are artifacts from my Windows days. Bad habit.
 
  • #57
sbrothy said:
technically you can have function pointers in C structs, although that doesn’t help towards proper encapsulation. Also, I can’t, off the top of my head, find a good reason for doing so.
It has been used quite a lot for adding polymorphism and similar constructions to allow object-oriented design. For example, I recall that the X11/Xlib client/server system, written (originally) in C, made extensive use of this to extend behavior.
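
For illustration, a minimal sketch of that idiom (not the actual Xlib code; the Shape/Circle types are made up): a "base" struct carries a function pointer, and each "subtype" fills it in differently.

C:
#include <stdio.h>

#define PI 3.14159265358979

/* The "base" type: just a table of behavior, here a single function pointer. */
typedef struct Shape {
    double (*area)(const struct Shape *self);
} Shape;

/* A "subtype": the base must come first so a Circle * can be used as a Shape *. */
typedef struct {
    Shape  base;
    double radius;
} Circle;

static double circle_area(const Shape *self)
{
    const Circle *c = (const Circle *)self;
    return PI * c->radius * c->radius;
}

int main(void)
{
    Circle c = { { circle_area }, 2.0 };
    Shape *s = (Shape *)&c;              /* polymorphic use through the base pointer */
    printf("area = %f\n", s->area(s));
    return 0;
}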
 
  • #58
FactChecker said:
@.Scott, that's a very good point. One treacherous thing about databases is that the need for normalization may not be obvious until a database is implemented and the data keeps changing, requiring consistent updating. The only thing I would disagree about is that the texts I learned from (long ago) discussed this a great deal.
In reading Codd's original work, wiki articles, and quite a few more recent articles, I see very good technical descriptions of what a normalized DB looks like; what some of the technical procedures are to achieve that normalization description; and what the "selling points" are of a normalized DB over one that is not.

But that last item is the problem. There is a selling point to the normalization exercise that is unrelated to the characteristics of a normalized database. Even if you know that you will denormalize your DB (or even if you have a DB that you will not be normalizing), a normalization should be performed (or should have been performed) and documented on that data.

The "new" information that a normalization document will describe could include keys or fields that are not atomic with parts that do not belong in the same relation or which require checks or other processing.

More generally, the minimal amount of information required to determine whether a DB design is fully normalized requires a crystal-clear understanding of what that data is and how it is used - something that is quite beyond the technical skills of your "customers" to unambiguously describe. When the designers are confronted with the highly technical requirements for how normalization applies to a particular data field, they will find themselves carefully recasting these technical questions into "telling" questions digestible to their customers.

In the mid-1980's, I was tasked to lead a small team to design a relational DB for the Air Force procurement center - which, at the time, was an extensive part of the Wright-Patterson AFB in Ohio. I was eventually able to identify (all numbers are approximated from memory) 4200 originally-identified fields representing 1700 actual fields in 180 record types. The process involved interviews with over a hundred users and managers of the procurement process. The final document became a kind of "procurement bible", with Wright-Patterson immediately requesting 200 copies for distribution across the procurement center. It allowed anyone to find their datasets and trace out how they were connected to the overall procurement process.

That document did eventually result in both changes to the procurement process and further automation of that process - although the DB was so large and active that, with 1980's technology, it was several years before they could break away from a "batch"-like process.
 
  • Likes: FactChecker
  • #59
@Filip Larsen @.Scott @FactChecker There was this post about learning C++. What I want to know is: I know there is something called managed/smart pointers in C++ and null pointers in C. Is the former better than the latter? Suppose deciding between C and C++ based on memory-safety issues is like buying the best armor before going on your fantasy dragon/griffin/wyvern-slaying or goblin/troll/orc/ogre/mega-slime-hunting adventure. Then choosing C because of its null-pointer feature is like choosing the best armor because of the material it is made of, which will max out your defense points. But choosing C++ because of its smart or managed pointers is like picking the armor that gives you the biggest defense-point increase but also raises your other attributes, like your health and your wisdom stat for magic casting, and gives you increased immunity to various status effects that kicks in automatically should you experience any of those nasty status effects.
 
  • #60
elias001 said:
@.Scott Isn't database normalization covered if students take a university course on databases and are taught how to actually build one from scratch? I briefly looked up the term, and it is discussed in books on database systems and management.
The last time I took any kind of data processing course was in 1973 - so I can't address your question directly.

There is certainly a limit on what can be presented in a classroom. In that sense, I was very fortunate to have the entire US Air Force procurement system available for interrogation - and had a paycheck coming in so I could spend my time on it.
 
  • #61
@.Scott I think I will keep your advice in mind when I start learning about databases on my own.
 
  • #62
I have deleted this post.
I didn't read the question correctly - and my response was off target.
See below for @sbrothy's better response.
 
Last edited:
  • #63
@.Scott I am coming from the point of view of writing code while respecting memory management and safety issues. I did a quick Google search, and apparently smart pointers are not in C, so does that mean C doesn't get the buff treatment compared to C++?
 
  • #64
elias001 said:
@.Scott I am coming from the point of view of writing code while respecting memory management and safety issues. I did a quick Google search, and apparently smart pointers are not in C, so does that mean C doesn't get the buff treatment compared to C++?
Does C get the "buff treatment"? I can interpret that phrase in a few ways - in all cases the answer is "No, it doesn't".
Smart pointers are one way of avoiding memory leaks. It's not the only way. There are two ways that structures can be automatically cleaned up. One is with smart pointers that keep track of how many pointers exist for a given object. As soon as that count reaches zero, the object is deleted. Since deleting an object may involve more than just deallocating its memory, destructors (a C++ feature) are one of the things that make smart pointers easy and effective in C++.
The other way is when a structure falls out of scope (outside the enclosing curly { brackets }). This happens automatically in C and C++ - but it does not apply to "malloc"ed objects.
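
For illustration, a minimal C++ sketch of both mechanisms (the Widget type is made up): the reference count drops as each shared_ptr goes out of scope or is reset, and the destructor runs exactly when the count hits zero.

C++:
#include <cstdio>
#include <memory>

struct Widget {                                   // made-up type, just for illustration
    ~Widget() { std::puts("Widget destroyed"); }  // destructor runs when the count hits zero
};

int main() {
    std::shared_ptr<Widget> a = std::make_shared<Widget>();  // count = 1
    {
        std::shared_ptr<Widget> b = a;                        // count = 2
    }                                                         // b leaves scope: count = 1
    a.reset();                        // count = 0 -> "Widget destroyed" printed here
    std::puts("done");
    return 0;
}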

When you say "buff treatment", you may be referring to what is considered "best practices" or "highest standards". "Best practices" means using tools appropriately - not going for overkill - and matching coding standards to the requirements. I have never seen smart pointers discussed in coding standards - but I could imagine certain situations where some kind of pro-smart-pointer standard could work.

As for "highest standards", these are used in life critical or mission critical applications - think transportation, medical, and military. In those situations, it's more important to get things right all the time without getting into mechanisms like smart pointers that could fix a memory leak at the expense of hiding a more fundamental coding problem. When the highest standards are needed, code reviews and static analysis tools (such as Coverity) are specified.
 
  • #65
elias001 said:
@.Scott I am coming from the point of view of writing code while respecting memory management and safety issues. I did a quick Google search, and apparently smart pointers are not in C, so does that mean C doesn't get the buff treatment compared to C++?
Although C++ is more powerful, in the development of automotive radar at a former employer, C was used. Many of the C++ features were seen as violating MISRA requirements and were not allowed in the actual product code. So, in that case, "C" was viewed as "more buff".
 
  • #66
@.Scott As I learn more about pointers, I am sure I will revisit this topic later. I am glad I have folks here I can ask. So far I am learning what pointers do and to be careful about the issue of buffer overflows. I keep hearing from various YouTubers on cyber security about C's problems with memory-safety issues and pointers, and that C is not taught anymore as a first programming language for beginners. Also how buffer overflow is always associated with C. I am not at the point where I can figure out how pointers, if at all, can be used to protect against buffer overflows. I know what a buffer overflow is, and I do know that the concern is that it is an exploitable vulnerability.
 
  • #67
.Scott said:
Null pointers are available in both C and C++. You are the first person I have heard describe them as "safe" or "armor". They are more commonly associated with terms like "Achilles heel", vulnerable, dangerous. Of course, they are easy to use. Malloc some memory for a structure and cast the null pointer to a pointer for that structure.
If your primary programming interest is dragons, etc., then safety really isn't an issue. The worst that can happen is that the program will crash because of a bad pointer or memory leak and ruin the game.
With more serious applications, there is a tool called "Coverity" that can do a very decent job of tracking through how you are using null pointers (which it will complain about) and other pointers - calling out bad indices, memory leaks, and scores of other issues.

Must admit that message also gave me pause. A null pointer is not a “feature” of C or C++. It’s good practice to initialize a pointer to NULL when creating it and to check its value often, to the point of paranoia. Either of these is OK:

C:
void *p = 0;
void *p = NULL;

and in C++

C++:
void *p = nullptr;

assert(p); // Both C and C++ have this macro as <assert.h> and <cassert> respectively. As it’s a macro you don’t have the std:: in front in C++.

and if in C:

C:
assert(p);
if(!p) {
    perror("Unexpected NULL pointer (p): ");
    exit(EXIT_FAILURE);
}


and in C++:

C++:
assert(p);
if(!p)
    throw std::invalid_argument("Unexpected NULL pointer (p)."); // or any other exception of your choice.

In C++ you have the

C++:
std::shared_ptr<BespokeObject> smart_ptr = std::make_shared<BespokeObject>(/* optional args to object constructor */);

smart_ptr->bespoke_object_func();
smart_ptr.reset(new BespokeObject()); // or just plain reset() if you’re the last client, else let it go out of scope.

Just my quick 2 cents. There’s much, much more to the smart pointer! Consult the reference.
 
Last edited:
  • Likes: Filip Larsen
  • #68
.Scott said:
design a relational DB for the Air Force procurement center - which, at the time, was an extensive part of the Wright-Patterson AFB in Ohio
Wow! That sounds like a mind boggling task! My hat's off to you!
 
  • #69
On the subject of language comparisons regarding speed, pointer safety, etc., I found this piece about Rust (Why Everyone's Switching to Rust (And Why You Shouldn't)) interesting.
Rust seems to be the latest fad. It checks a lot at compile time. That brings back my PTSD, created by programming in Ada. ;-)
 
  • #70
@FactChecker I saw a YouTube video where it was shown that the creator of C++ publicly rallied the C++ community to defend C++ because it is under threat, or some such, due to memory-safety issues. I am getting a bunch of books on C pointers. By the way, there should be a special pointer in C called either 'the middle finger' or 'the bird'. 😉

Also, @Filip Larsen, @.Scott, @sbrothy and @jedishrfu, can I ask you folks to comment or give your opinion about this post on some CS books? Thank you in advance.
 
  • #71
elias001 said:
@FactChecker I saw a YouTube video where it was shown that the creator of C++ publicly rallied the C++ community to defend C++ because it is under threat, or some such, due to memory-safety issues. I am getting a bunch of books on C pointers. By the way, there should be a special pointer in C called either 'the middle finger' or 'the bird'. 😉

Also, @Filip Larsen, @.Scott, @sbrothy and @jedishrfu, can I ask you folks to comment or give your opinion about this post on some CS books? Thank you in advance.
There are many complaints about pointers - and all pointers used in C and C++ have a pro and con constituency - with the 'con' group finding them "too dangerous".
In the extreme, an integer such as 0xFF804500u might be cast to a pointer to a large structure, and then that pointer could be used to read and write from that "random" memory location. That's about as dangerous an ability as you can create. And yet, that is exactly what is done when you control a device using memory-mapped registers - a very common method.
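
For illustration, a minimal sketch of that memory-mapped-register idiom in C (the base address and register layout here are made up):

C:
#include <stdint.h>

/* Hypothetical register block for some device; the layout is invented. */
typedef struct {
    volatile uint32_t control;   /* written to configure/start the device */
    volatile uint32_t status;    /* read to poll the device               */
    volatile uint32_t data;      /* data in/out register                  */
} DeviceRegs;

/* Cast a fixed integer address to a structure pointer (made-up address). */
#define DEVICE ((DeviceRegs *)0xFF804500u)

static void device_start(void)
{
    DEVICE->control = 0x1u;                      /* write goes straight to hardware */
    while ((DEVICE->status & 0x1u) == 0u) {
        /* spin until the device reports ready */
    }
}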

So, it's like a car or a shovel - potentially dangerous tools. So what do you do? Ban shovels and cars?

Hopefully you can help the users dig and drive safely.

It's the same with pointers. Any language that does not allow free use of pointers will be unable to support some applications. Or maybe you say that for certain applications you stick to Python or some other language that tightly controls pointers?

I would stick to the "driving a car" example. You might say that if the trip is less than a kilometer, you should do the "safe thing" and walk. But is walking all that safe - and in particular, is it safer than driving? It depends on the weather/health/neighborhood situation - and I would say it is best left to the traveler.

Similarly, I think the coding standards (including the selection of the coding language) should be left to the system and software designers/developers.
 
  • #72
.Scott said:
There are many complaints about pointers - and all pointers used in C and C++ have a pro and con constituency - with the 'con' group finding them "too dangerous".
In the extreme, an integer such as 0xFF804500u might be cast to a pointer to a large structure, and then that pointer could be used to read and write from that "random" memory location. That's about as dangerous an ability as you can create. And yet, that is exactly what is done when you control a device using memory-mapped registers - a very common method.

So, it's like a car or a shovel - potentially dangerous tools. So what do you do? Ban shovels and cars?

Hopefully you can help the users dig and drive safely.

It's the same with pointers. Any language that does not allow free use of pointers will be unable to support some applications. Or maybe you say that for certain applications you stick to Python or some other language that tightly controls pointers?

I would stick to the "driving a car" example. You might say that if the trip is less than a kilometer, you should do the "safe thing" and walk. But is walking all that safe - and in particular, is it safer than driving? It depends on the weather/health/neighborhood situation - and I would say it is best left to the traveler.

Similarly, I think the coding standards (including the selection of the coding language) should be left to the system and software designers/developers.
Exactly, choosing the right tool for the job at hand. Procedural, functional, OO. Whatever fits.
 
  • #73
@.Scott and @sbrothy Well, I am not sure if it is the media or whoever, but they make it sound like programs and software written in languages that use manual memory management are more unsafe than Bowser's castle in a typical Mario Brothers game, and that hackers can exploit vulnerabilities and steal data the way the Nintendo Kirby character sucks up bad guys.
 
  • #74
elias001 said:
@.Scott and @sbrothy Well, I am not sure if it is the media or whoever, but they make it sound like programs and software written in languages that use manual memory management are more unsafe than Bowser's castle in a typical Mario Brothers game, and that hackers can exploit vulnerabilities and steal data the way the Nintendo Kirby character sucks up bad guys.
Hackers exploit careless coding - or simplistic passwords - or users that will hand over control of the computer in exchange for a "Thank you". Unfortunately, under some conditions - as when Windows was being developed - there is a huge incentive to code carelessly.

If you drive into a tree, don't blame the tree, and don't blame the car. And don't say you didn't know the risks - or that only trains should be able to travel as fast as you did.
 
  • #75
elias001 said:
@.Scott and @sbrothy Well, I am not sure if it is the media or whoever, but they make it sound like programs and software written in languages that use manual memory management are more unsafe than Bowser's castle in a typical Mario Brothers game, and that hackers can exploit vulnerabilities and steal data the way the Nintendo Kirby character sucks up bad guys.

Handling pointers takes care and discipline. Perhaps agreed-upon procedures. But that is true of so many things. Hackers (or rather “penetration testers”, which is the proper term - especially if you want to search and learn about it seriously) will exploit any weakness they can. Open ports, buffer overflows, SQL injection - in short, any lazy coding or environment/hardware weakness, e.g. Rowhammer, Heartbleed, Spectre.

Handling things yourself gives you more control but also more responsibility.

EDIT: or “pentesting” to use the jargon.
 
Last edited:
  • #76
@.Scott When you say careless coding: did many software developers/coders, from when Windows was first released until now, actually know how Windows was exploited, or have they even tried to do it themselves? If Windows was written in C, so was Linux. Did the latter have fewer vulnerability issues? When I ask whether software developers had seen or tried hacking Windows themselves, does that involve decompiling Windows back into its source code and manually editing that source? I don't think that can be done. I think I am like the general public: when we learn about coding and vulnerabilities as beginners, unless we have seen how careless management of pointers can lead to problems, all the talk about this being unsafe feels like talking about it in the abstract.

It is less real than talking about fatalities and casualties in war as a result of the execution of a certain military strategy.
 
  • #77
Linux had less of a problem because it was less of a target and because it had better review.
I believe the number one code vulnerability exploited by hackers is buffer overruns that are triggered by trusting that incoming data will be properly formed.

In general, trusting that any major interface will be used properly needs to be thought through. Even if the party on the other side of a library function call is internal and trusted (even if it is you), doing validation checks on the calls through that interface allows you to fence in any problems that you run into later. For example, if you write a trig library for yourself, you don't want arcsine(2) to result in a core dump, a memory leak, or any memory overwrite. It's not that you might turn evil on yourself - but if you make a mistake, you want to be able to track it down routinely.
Of course, if you have no reason to trust the other side of an interface, you need to be paranoid. If you receive a record over the internet that's 30 bytes long containing a field that claims to be 3000 bytes long, don't malloc 30 bytes and then copy 3000 bytes into it. That may sound bizarrely obvious, but that's exactly what hackers look for and find.
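For illustration, a minimal sketch in C of that kind of length check (the record layout and the upper bound are made up): the claimed field length is validated against both a sanity limit and the number of bytes actually received before anything is allocated or copied.

C:
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_FIELD_LEN 4096u   /* made-up upper bound for this field */

/* Copy a length-prefixed field out of a received packet.
   Returns NULL if the claimed length is implausible or doesn't fit
   in what was actually received. */
static uint8_t *copy_field(const uint8_t *packet, size_t packet_len)
{
    uint32_t claimed;
    if (packet_len < sizeof claimed)
        return NULL;
    memcpy(&claimed, packet, sizeof claimed);       /* length claimed by the sender */

    if (claimed > MAX_FIELD_LEN ||                  /* sanity-check the claim...          */
        claimed > packet_len - sizeof claimed)      /* ...and check what actually arrived */
        return NULL;

    uint8_t *field = malloc(claimed ? claimed : 1);
    if (field != NULL)
        memcpy(field, packet + sizeof claimed, claimed);
    return field;
}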
In a lot of cases, it's not the core operating system itself but software provided by app vendors or hardware vendors. In many cases, that software needs special OS access privileges - but the code is completely trusting - in fact, in a lot of cases it will only work under near-perfect conditions. And it didn't help that Microsoft Windows started out with convoluted and poorly documented hardware driver rules. MSDOS was simple enough. But from there through XP, the best source of documentation was often hunting for examples in the Microsoft source code to find code that did something similar to what you needed. That source was (and likely still is) available in special SDK packages.

But things are getting better. Static analysis tools can track through the code and flag any path where memory leaks, memory overwrites, and such are unchecked. So, if you want to write bullet-proof code, you can. All you need to do is take those static-analysis reports seriously and spend the time to determine exactly what they are reporting.

Of course, if you don't want to do static analysis - hackers can do it later.
 
  • #78
@.Scott I think a good analogy is the one you use for driving. People learn to drive, but suppose nobody has ever seen what a car crash looks like or experienced one; of course, they have all heard about it, but that is for other daredevils to try. All they are told in their driving lessons is that they should drive safely.

Also, how did hackers get access to MS source code? I thought the decompiling process would never give you back the original source code exactly as the developers wrote it.
 
  • #79
.Scott said:
Linux had less of a problem because it was less of a target and because it had better review.
I believe the number one code vulnerability exploited by hackers is buffer overruns that are triggered by trusting that incoming data will be properly formed.

In general, trusting that any major interface will be used properly needs to be thought through. Even if the party on the other side of a library function call is internal and trusted (even if it is you), doing validation checks on the calls through that interface allows you to fence in any problems that you run into later. For example, if you write a trig library for yourself, you don't want arcsine(2) to result in a core dump, a memory leak, or any memory overwrite. It's not that you might turn evil on yourself - but if you make a mistake, you want to be able to track it down routinely.
Of course, if you have no reason to trust the other side of an interface, you need to be paranoid. If you receive a record over the internet that's 30 bytes long containing a field that claims to be 3000 bytes long, don't malloc 30 bytes and then copy 3000 bytes into it. That may sound bizarrely obvious, but that's exactly what hackers look for and find.
In a lot of cases, it's not the core operating system itself but software provided by app vendors or hardware vendors. In many cases, that software needs special OS access privileges - but the code is completely trusting - in fact, in a lot of cases it will only work under near-perfect conditions. And it didn't help that Microsoft Windows started out with convoluted and poorly documented hardware driver rules. MSDOS was simple enough. But from there through XP, the best source of documentation was often hunting for examples in the Microsoft source code to find code that did something similar to what you needed. That source was (and likely still is) available in special SDK packages.

But things are getting better. Static analysis tools can track through the code and flag any path where memory leaks, memory overwrites, and such are unchecked. So, if you want to write bullet-proof code, you can. All you need to do is take those static-analysis reports seriously and spend the time to determine exactly what they are reporting.

Of course, if you don't want to do static analysis - hackers can do it later.

One of Linux’s strengths and weaknesses is the open-source concept. Black-hat pentesters are able to search for weaknesses, but on the other hand the other side is constantly closing them. The real danger is probably the human element making configuration errors.
 
  • #80
I still don't understand how hackers got the MS Windows source code. Also, there are thousands of files that make up Windows. How did they know which ones to decompile and try to read? Sorry if I am asking an incorrect question, in the sense that what I am describing is probably not how it is done by hackers.
 
  • #81
elias001 said:
I still don't understand how hackers got the MS Windows source code. Also, there are thousands of files that make up Windows. How did they know which ones to decompile and try to read? Sorry if I am asking an incorrect question, in the sense that what I am describing is probably not how it is done by hackers.

I don’t know the particular case if there is one. Disassembling is one possibility, social manipulation or inside info are others.
.Scott said:
Linux had less of a problem because it was less of a target and because it had better review.
I believe the number one code vulnerability exploited by hackers is buffer overruns that are triggered by trusting that incoming data will be properly formed.

In general, trusting that any major interface will be used properly needs to be thought through. Even if the party on the other side of a library function call is internal and trusted (even if it is you), doing validation checks on the calls through that interface allows you to fence in any problems that you run into later. For example, if you write a trig library for yourself, you don't want arcsine(2) to result in a core dump, a memory leak, or any memory overwrite. It's not that you might turn evil on yourself - but if you make a mistake, you want to be able to track it down routinely.
Of course, if you have no reason to trust the other side of an interface, you need to be paranoid. If you receive a record over the internet that's 30 bytes long containing a field that claims to be 3000 bytes long, don't malloc 30 bytes and then copy 3000 bytes into it. That may sound bizarrely obvious, but that's exactly what hackers look for and find.
In a lot of cases, it's not the core operating system itself but software provided by app vendors or hardware vendors. In many cases, that software needs special OS access privileges - but the code is completely trusting - in fact, in a lot of cases it will only work under near-perfect conditions. And it didn't help that Microsoft Windows started out with convoluted and poorly documented hardware driver rules. MSDOS was simple enough. But from there through XP, the best source of documentation was often hunting for examples in the Microsoft source code to find code that did something similar to what you needed. That source was (and likely still is) available in special SDK packages.

But things are getting better. Static analysis tools can track through the code and flag any path where memory leaks, memory overwrites, and such are unchecked. So, if you want to write bullet-proof code, you can. All you need to do is take those static-analysis reports seriously and spend the time to determine exactly what they are reporting.

Of course, if you don't want to do static analysis - hackers can do it later.
Indeed; and you can be sure they will. But yeah, frequent checking and validating - to the point of paranoia (remember: you’re not paranoid if they really are out to get you!) - is good practice.
 
  • #82
sbrothy said:
I don’t know the particular case if there is one. Disassembling is one possibility
Back in the late 80s or so, the college where I worked had a class that covered the basics of computing on PCs. One part of the class dealt with common uses of DOS (either MSDOS or PCDOS) for dealing with files and directories; e.g., copy files, delete files, etc. This was before Windows really started to take off. One skill that was taught was how to rename a directory.

The procedure was as follows:
  1. Create a new directory with the desired name using MD (short for make directory).
  2. Copy the files from the old directory to the new one using COPY.
  3. Delete the files from the old directory using DEL (short for delete).
  4. Delete the old directory using RD (short for remove directory).
I had a copy of Norton Utilities, one utility of which was the capability of renaming directories. It seemed unrealistic to me that Norton would go through the procedure listed above, so I used a disassembler I had to look at a 32KB executable file that contained the directory rename code. At the time, DOS consisted of a bunch of external commands including the ones I listed above, as well as a lot of very low-level functionality that was available only to assembly language code. All of the low-level DOS commands used a specific assembly interrupt instruction, INT 21h in combination with certain register values to indicate which DOS functionality to execute.

With the disassembler I identified about 30 different places with INT 21h instructions, and looked at the register values just prior to where the interrupts were executed. At one place I found that a low-level DOS instruction was used to rename a file, and that was the place where Norton Utilities was renaming a directory. It hadn't occurred to me before then that, as far as DOS was concerned, the only difference between a file and a directory was a single bit set or not in the file attribute byte.

After discovering how Norton did things I was able to write my own utility, part in assembly, and part in C, that prompted the user for the names of the directory to change and the new name for the directory.

That access went out the window (pun intended) with Windows 95 and the change to a 32-bit code base. Programmers no longer had access to the INT 21h functionality.
 
  • #83
Mark44 said:
Back in the late 80s or so, the college where I worked had a class that covered the basics of computing on PCs. One part of the class dealt with common uses of DOS (either MSDOS or PCDOS) for dealing with files and directories; e.g., copy files, delete files, etc. This was before Windows really started to take off. One skill that was taught was how to rename a directory.

The procedure was as follows:
  1. Create a new directory with the desired name using MD (short for make directory).
  2. Copy the files from the old directory to the new one using COPY.
  3. Delete the files from the old directory using DEL (short for delete).
  4. Delete the old directory using RD (short for remove directory).
I had a copy of Norton Utilities, one utility of which was the capability of renaming directories. It seemed unrealistic to me that Norton would go through the procedure listed above, so I used a disassembler I had to look at a 32KB executable file that contained the directory rename code. At the time, DOS consisted of a bunch of external commands including the ones I listed above, as well as a lot of very low-level functionality that was available only to assembly language code. All of the low-level DOS commands used a specific assembly interrupt instruction, INT 21h in combination with certain register values to indicate which DOS functionality to execute.

With the disassembler I identified about 30 different places with INT 21h instructions, and looked at the register values just prior to where the interrupts were executed. At one place I found that a low-level DOS instruction was used to rename a file, and that was the place where Norton Utilities was renaming a directory. It hadn't occurred to me before then that, as far as DOS was concerned, the only difference between a file and a directory was a single bit set or not in the file attribute byte.

After discovering how Norton did things I was able to write my own utility, part in assembly, and part in C, that prompted the user for the names of the directory to change and the new name for the directory.

That access went out the window (pun intended) with Windows 95 and the change to a 32-bit code base. Programmers no longer had access to the INT 21h functionality.
In fact, breaking into a Windows box is as simple as moving the HDD to another computer and accessing it as a slave drive. No big deal (unless it’s encrypted, of course).
 
Last edited:
  • #84
Also “deleting” files in Windows doesn’t necessarily physically erase a file from the HDD. Even “emptying the trashcan” isn’t a guarantee.

Linux at least has the shred command although it may be obsolete by now.
 