Learning data structures and algorithms in different programming languages

  • Thread starter: elias001
  • Tags: Algorithms
Summary
Learning data structures and algorithms in C provides a solid foundation, but transitioning to other languages like Java or Lisp may present challenges due to differences in paradigms and memory management. While C requires manual pointer management, modern languages often abstract these complexities, allowing for a focus on core concepts. Data structures are universal across languages, though terminology and implementation details may vary. Understanding recursion and the specific features of each language, such as object-oriented programming in Java, is crucial for successful implementation. Familiarity with graph theory can aid in understanding graph data structures, but its direct application in programming may be limited.
  • #61
@.Scott I think I will keep your advice in mind when I start learning about databases on my own.
 
  • #62
I have deleted this post.
I didn't read the question correctly - and my response was off target.
See below for @sbrothy's better response.
 
Last edited:
  • #63
@.Scott I am coming from the point of view of writing code while respecting memory management and safety issues. I did a quick Google search, and apparently smart pointers are not in C, so does that mean C doesn't get the buff treatment compared to C++?
 
  • #64
elias001 said:
@.Scott I am coming from the point of view of writing code while respecting memory management and safety issues. I did a quick Google search, and apparently smart pointers are not in C, so does that mean C doesn't get the buff treatment compared to C++?
Does C get the "buff treatment"? I can interpret that phrase in a few ways - in all cases the answer is "No, it doesn't".
Smart pointers are one way of avoiding memory leaks. It's not the only way. There are two ways that structures can be automatically cleaned up. One is with smart pointers that keep track of how many pointers exist for a given object. As soon as that count reaches zero, the object is deleted. Since deleting an object may involve more than just deallocating its memory, destructors (a C++ feature) are one of the things that make smart pointers easy and effective in C++.
The other way is when a structure falls out of scope (outside the enclosing curly braces { }). This happens automatically in C and C++ - but it does not apply to "malloc"ed objects.

When you say "buff treatment", you may be referring to what is considered "best practices" or "highest standards". "Best practices" means using tools appropriately - not going for overkill - and matching coding standards to the requirements. I have never seen smart pointers discussed in coding standards - but I could imagine certain situations where some kind of pro-smart-pointer standard could work.

As for "highest standards", these are used in life critical or mission critical applications - think transportation, medical, and military. In those situations, it's more important to get things right all the time without getting into mechanisms like smart pointers that could fix a memory leak at the expense of hiding a more fundamental coding problem. When the highest standards are needed, code reviews and static analysis tools (such as Coverity) are specified.
 
  • #65
elias001 said:
@.Scott I am coming from the point of view of writing code while respecting memory management and safety issues. I did a quick Google search, and apparently smart pointers are not in C, so does that mean C doesn't get the buff treatment compared to C++?
Although C++ is more powerful, in the development of automotive radar at a former employer, C was used. Many of the C++ features were seen as violating MISRA requirements and were not allowed in the actual product code. So, in that case, "C" was viewed as "more buff".
 
  • #66
@.Scott as I learn more about pointers, I am sure I will revisit this topic later. I am glad I have folks here I can ask. So far I am learning what pointers do and being careful about the issue of buffer overflows. I keep hearing from various YouTubers on cybersecurity about C's problems with memory-safety issues, that C and pointers are not taught anymore as a first programming language for beginners, and that buffer overflow is always associated with C. I am not at the point where I can find out how pointers, if at all, can be used to protect against buffer overflows, but I know what a buffer overflow is, and I do know that the concern is that it is an exploitable vulnerability.
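
From the videos, my understanding is that the textbook pattern looks something like this minimal sketch (my own invented names) - the unchecked copy is what attackers feed oversized input to, and the bounded copy is the usual fix. Please correct me if I have it wrong:

C++:
#include <cstdio>
#include <cstring>

void vulnerable(const char* input) {
    char buf[16];
    std::strcpy(buf, input); // no length check: input longer than 15 chars writes past buf
}

void safer(const char* input) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "%s", input); // bounded copy, always NUL-terminated
}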
 
  • #67
.Scott said:
Null pointers are available in both C and C++. You are the first person I have heard describe them as "safe" or "armor". They are more commonly associated with terms like "Achilles heel", vulnerable, dangerous. Of course, they are easy to use. Malloc some memory for a structure and cast the null pointer to a pointer for that structure.
If your primary programming interests are dragons, etc., then safety really isn't an issue. The worst that can happen is that the program will crash because of a bad pointer or memory leak and ruin the game.
With more serious applications, there is a tool called "Coverity" that can do a very decent job of tracking through how you are using null pointers (which it will complain about) and other pointers - calling out bad indices, memory leaks, and scores of other issues.

Must admit that message also gave me pause. A null pointer is not a “feature” of C or C++. It’s good practice to initialize a pointer to NULL when creating it and to check its value often, even paranoidly. Either of these is OK:

C:
void *p = 0;    /* the integer literal 0 is a valid null pointer constant */
void *p = NULL; /* or the NULL macro, from <stddef.h> (and <stdio.h>, <stdlib.h>, ...) */

and in C++

C++:
void *p = nullptr; // nullptr is the type-safe C++11 null pointer literal

// ...later, once p should actually point at something:
assert(p); // Both C and C++ have this macro, in <assert.h> and <cassert> respectively. As it's a macro you don't put std:: in front in C++.

and if in C:

C:
assert(p);
if(!p) {
    fprintf(stderr, "Unexpected NULL pointer (p)\n"); /* perror() would append an unrelated errno message here, so print directly */
    exit(EXIT_FAILURE);
}


and in C++:

C++:
assert(p);
if(!p)
    throw std::invalid_argument("Unexpected NULL pointer (p)."); // from <stdexcept>; or any other exception of your choice

In C++ you have the

C++:
std::shared_ptr<BespokeObject> smart_ptr = std::make_shared<BespokeObject>(/* optional args to object constructor */);

smart_ptr->bespoke_object_func();
smart_ptr.reset(new BespokeObject()); // replaces the managed object - or call reset() with no argument if you're the last client, else just let it go out of scope

Just my quick 2 cents. There’s much, much more to std::shared_ptr! Consult the reference.
 
Last edited:
  • Likes: Filip Larsen
  • #68
.Scott said:
design a relational DB for the Air Force procurement center - which, at the time, was an extensive part of the Wright-Patterson AFB in Ohio
Wow! That sounds like a mind-boggling task! My hat's off to you!
 
  • #69
On the subject of language comparisons regarding speed, pointer safety, etc., I found this about Rust (Why Everyone's Switching to Rust (And Why You Shouldn't)) interesting.
Rust seems to be the latest fad. It checks a lot at compile time. That brings back my PTSD, created by programming in Ada. ;-)
 
  • #70
@FactChecker I saw a YouTube video where it was shown that the creator of C++ publicly rallied the C++ community to defend C++ because it is under threat, or some such, due to memory-safety issues. I am getting a bunch of books on C pointers. By the way, there should be a special pointer in C called either 'the middle finger' or 'the bird'. 😉

Also, @Filip Larsen @.Scott @sbrothy and @jedishrfu, can I ask you folks to comment or give your opinion about this post on some CS books? Thank you in advance.
 
  • #71
elias001 said:
@FactChecker I saw a YouTube video where it was shown that the creator of C++ publicly rallied the C++ community to defend C++ because it is under threat, or some such, due to memory-safety issues. I am getting a bunch of books on C pointers. By the way, there should be a special pointer in C called either 'the middle finger' or 'the bird'. 😉

Also, @Filip Larsen @.Scott @sbrothy and @jedishrfu, can I ask you folks to comment or give your opinion about this post on some CS books? Thank you in advance.
There are many complaints about pointers - and all pointers used in C and C++ have a pro and con constituency - with the 'con' group finding them "too dangerous".
In the extreme, an integer such as 0xFF804500u might be cast to a pointer to a large structure, and then that pointer could be used to read and write from that "random" memory location. That's about as dangerous an ability as you can create. And yet, that is exactly what is done when you control a device using memory-mapped registers - a very common method.

So, it's like a car or a shovel - potentially dangerous tools. So what do you do? Ban shovels and cars?

Hopefully you can help the users dig and drive safely.

It's the same with pointers. Any language that does not allow free use of pointers will be unable to support some applications. Or maybe you say that for certain applications, you stick to Python or some other language where pointers are tightly controlled?

I would stick to the "driving a car" example. You might say that if the trip is less than a kilometer, you should do the "safe thing" and walk. But is walking all that safe - and in particular, is it safer than driving? It depends on the weather/health/neighborhood situation - and I would say it is best left to the traveler.

Similarly, I think the coding standards (including the selection of the coding language) should be left to the system and software designers/developers.
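
To illustrate, here is a minimal C++ sketch of the memory-mapped-register pattern. The register layout, the address, and the status bit are invented for the example - on real hardware they come from the device's datasheet:

C++:
#include <cstdint>

// Hypothetical UART register block; layout and address are made up for illustration.
struct UartRegs {
    volatile std::uint32_t data;    // write: byte to transmit
    volatile std::uint32_t status;  // read: status flags
    volatile std::uint32_t control; // read/write: control bits
};

// The cast described above: a plain integer becomes a structure pointer.
UartRegs* const uart = reinterpret_cast<UartRegs*>(0xFF804500u);

void send_byte(std::uint8_t b) {
    while ((uart->status & 0x1u) == 0) {} // spin until the (assumed) TX-ready bit is set
    uart->data = b;
}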
 
  • #72
.Scott said:
There are many complaints about pointers - and all pointers used in C and C++ have a pro and con constituency - with the 'con' group finding them "too dangerous".
In the extreme, an integer such as 0xFF804500u might be cast to a pointer to a large structure, and then that pointer could be used to read and write from that "random" memory location. That's about as dangerous an ability as you can create. And yet, that is exactly what is done when you control a device using memory-mapped registers - a very common method.

So, it's like a car or a shovel - potentially dangerous tools. So what do you do? Ban shovels and cars?

Hopefully you can help the users dig and drive safely.

It's the same with pointers. Any language that does not allow free use of pointers will be unable to support some applications. Or maybe you say that for certain applications, you stick to Python or some other language where pointers are tightly controlled?

I would stick to the "driving a car" example. You might say that if the trip is less than a kilometer, you should do the "safe thing" and walk. But is walking all that safe - and in particular, is it safer than driving? It depends on the weather/health/neighborhood situation - and I would say it is best left to the traveler.

Similarly, I think the coding standards (including the selection of the coding language) should be left to the system and software designers/developers.
Exactly, choosing the right tool for the job at hand. Procedural, functional, OO. Whatever fits.
 
  • #73
@.Scott and @sbrothy well I am not sure if it is the media or whoever, but they make it sound like programs and software written in languages with manual memory management are less safe than Bowser's castle in a typical Mario Brothers game, and that hackers can exploit vulnerabilities and steal data the way the Nintendo Kirby character sucks up bad guys.
 
  • #74
elias001 said:
@.Scott and @sbrothy well I am not sure if it is the media or whoever, but they make it sound like programs and software written in languages with manual memory management are less safe than Bowser's castle in a typical Mario Brothers game, and that hackers can exploit vulnerabilities and steal data the way the Nintendo Kirby character sucks up bad guys.
Hackers exploit careless coding - or simplistic passwords - or users that will hand over control of the computer in exchange for a "Thank you". Unfortunately, under some conditions - as when Windows was being developed - there is a huge incentive to code carelessly.

If you drive into a tree, don't blame the tree, and don't blame the car. And don't say you didn't know the risks - or that only trains should be able to travel as fast as you did.
 
  • #75
elias001 said:
@.Scott and @sbrothy well I am not sure if it is the media or whoever, but they make it sound like programs and software written in languages with manual memory management are less safe than Bowser's castle in a typical Mario Brothers game, and that hackers can exploit vulnerabilities and steal data the way the Nintendo Kirby character sucks up bad guys.

Handling pointers takes care and discipline. Perhaps agreed-upon procedures. But that is true of so many things. Hackers (or rather “penetration testers” - the proper term, especially if you want to search and learn about it seriously) will exploit any weakness they can: open ports, buffer overflows, SQL injection - in short, any lazy coding or environment/hardware weakness, e.g. rowhammer, heartbleed, spectre.

Handling things yourself gives you more control but also more responsibility.

EDIT: or “pentesting” to use the jargon.
 
Last edited:
  • #76
@.Scott when you say careless coding: did many software developers/coders, from when Windows was first released until now, know how Windows was exploited, or have they even tried to do it themselves? If Windows was written in C, so was Linux; did the latter have fewer vulnerability issues? When I ask whether software developers have seen or tried hacking Windows themselves, does that involve decompiling Windows into its source code and manually editing the source? I don't think that can be done. I think I am like the general public: when we learn about coding and vulnerabilities as beginners, unless we have seen how careless management of pointers can lead to problems, all the talk about this being unsafe feels like talking about it in the abstract.

It is less real than talking about fatalities and casualties in war as a result of the execution of a certain military strategy.
 
  • #77
Linux had less of a problem because it was less of a target and because it had better review.
I believe the number one code vulnerability exploited by hackers is buffer overruns that are triggered by trusting that incoming data will be properly formed.

In general, trusting that any major interface will be used properly needs to be thought through. Even if the party on the other side of a library function call is internal and trusted (even if it is you), doing validation checks on the calls through that interface allows you to fence in any problems that you run into later. For example, if you write a trig library for yourself, you don't want arcsine(2) to result in a core dump, a memory leak, or any memory overwrite. It's not that you might turn evil on yourself - but if you make a mistake, you want to be able to track it down routinely.
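
For the trig-library case, a sketch of what that self-defensive validation might look like (all names are invented for illustration):

C++:
#include <cassert>
#include <cmath>
#include <optional>

// Even a trusted, internal caller gets a loud, fenced-in failure
// instead of a silent NaN propagating through the system.
std::optional<double> checked_arcsine(double x) {
    assert(x >= -1.0 && x <= 1.0);  // catch the mistake immediately in debug builds
    if (x < -1.0 || x > 1.0)
        return std::nullopt;        // contained failure in release builds
    return std::asin(x);
}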
Of course, if you have no reason to trust the other side of an interface, you need to be paranoid. If you receive a record over the internet that's 30 bytes containing a field that claims to be 3000 bytes long, don't malloc 30 bytes and then copy 3000 bytes into it. That may sound bizarrely obvious, but that's exactly what hackers look for and find.
In a lot of cases, it's not the core operating system itself but software provided by app vendors or hardware vendors. In many cases, that software needs special OS access privileges - but the code is completely trusting - in fact, in a lot of cases it will only work under near-perfect conditions. And it didn't help that Microsoft Windows started out with convoluted and poorly documented hardware driver rules. MSDOS was simple enough. But from there through XP, the best source of documentation was often hunting for examples in the Microsoft source code to find code that did something similar to what you needed. That source was (and likely still is) available in special SDK packages.

But things are getting better. Static analysis tools can track through the code and flag any path where memory leaks, memory overwrites, and such are unchecked. So, if you want to write bullet-proof code, you can. All you need to do is take those static analysis reports seriously and spend the time to determine exactly what they are reporting.

Of course, if you don't want to do static analysis - hackers can do it later.
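
A minimal C++ sketch of that length-field check (the record layout and names are invented for illustration). The point is simply that the claimed length must be validated against the bytes actually received before any copy happens:

C++:
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical wire format: a 4-byte length header (host byte order assumed), then the payload.
std::vector<std::uint8_t> parse_field(const std::uint8_t* buf, std::size_t buf_len) {
    if (buf_len < 4)
        throw std::runtime_error("record too short for a length header");
    std::uint32_t claimed;
    std::memcpy(&claimed, buf, 4);  // the length the sender *claims*
    if (claimed > buf_len - 4)      // the critical check: never trust the claim
        throw std::runtime_error("claimed length exceeds received bytes");
    return std::vector<std::uint8_t>(buf + 4, buf + 4 + claimed);
}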
 
  • #78
@.Scott I think a good analogy is the one you use for driving. People learn to drive, but suppose nobody had ever seen what a car crash looks like or had ever experienced one; of course, they have all heard about it, but that is for other daredevils to try. All they are told in driving lessons is that they should drive safely.

Also, how did hackers get access to MS source code? I thought the decompiling process would never give you back the original source code exactly as the developers wrote it.
 
  • #79
.Scott said:
Linux had less of a problem because it was less of a target and because it had better review.
I believe the number one code vulnerability exploited by hackers is buffer overruns that are triggered by trusting that incoming data will be properly formed.

In general, trusting that any major interface will be used properly needs to be thought through. Even if the party on the other side of a library function call is internal and trusted (even if it is you), doing validation checks on the calls through that interface allows you to fence in any problems that you run into later. For example, if you write a trig library for yourself, you don't want arcsine(2) to result in a core dump, a memory leak, or any memory overwrite. It's not that you might turn evil on yourself - but if you make a mistake, you want to be able to track it down routinely.
Of course, if you have no reason to trust the other side of an interface, you need to be paranoid. If you receive a record over the internet that's 30 bytes containing a field that claims to be 3000 bytes long, don't malloc 30 bytes and then copy 3000 bytes into it. That may sound bizarrely obvious, but that's exactly what hackers look for and find.
In a lot of cases, it's not the core operating system itself but software provided by app vendors or hardware vendors. In many cases, that software needs special OS access privileges - but the code is completely trusting - in fact, in a lot of cases it will only work under near-perfect conditions. And it didn't help that Microsoft Windows started out with convoluted and poorly documented hardware driver rules. MSDOS was simple enough. But from there through XP, the best source of documentation was often hunting for examples in the Microsoft source code to find code that did something similar to what you needed. That source was (and likely still is) available in special SDK packages.

But things are getting better. Static analysis tools can track through the code and flag any path where memory leaks, memory overwrites, and such are unchecked. So, if you want to write bullet-proof code, you can. All you need to do is take those static analysis reports seriously and spend the time to determine exactly what they are reporting.

Of course, if you don't want to do static analysis - hackers can do it later.

One of Linux’s strengths and weaknesses is the open-source concept. Black-hat pentesters are able to search for weaknesses, but on the other hand the other side is constantly closing them. The real danger is probably the human element making configuration errors.
 
  • #80
I still don't understand how hackers got the MS Windows source code. Also, there are thousands of files that make up Windows; how did they know which one to decompile and try to read? Sorry if I am asking an incorrect question, in the sense that what I am describing is probably not how it is done by hackers.
 
  • #81
elias001 said:
I still don't understand how hackers got the MS Windows source code. Also, there are thousands of files that make up Windows; how did they know which one to decompile and try to read? Sorry if I am asking an incorrect question, in the sense that what I am describing is probably not how it is done by hackers.

I don’t know the particular case, if there is one. Disassembling is one possibility; social manipulation or inside info are others.
.Scott said:
Linux had less of a problem because it was less of a target and because it had better review.
I believe the number one code vulnerability exploited by hackers is buffer overruns that are triggered by trusting that incoming data will be properly formed.

In general, trusting that any major interface will be used properly needs to be thought through. Even if the party on the other side of a library function call is internal and trusted (even if it is you), doing validation checks on the calls through that interface allows you to fence in any problems that you run into later. For example, if you write a trig library for yourself, you don't want arcsine(2) to result in a core dump, a memory leak, or any memory overwrite. It's not that you might turn evil on yourself - but if you make a mistake, you want to be able to track it down routinely.
Of course, if you have no reason to trust the other side of an interface, you need to be paranoid. If you receive a record over the internet that's 30 bytes containing a field that claims to be 3000 bytes long, don't malloc 30 bytes and then copy 3000 bytes into it. That may sound bizarrely obvious, but that's exactly what hackers look for and find.
In a lot of cases, it's not the core operating system itself but software provided by app vendors or hardware vendors. In many cases, that software needs special OS access privileges - but the code is completely trusting - in fact, in a lot of cases it will only work under near-perfect conditions. And it didn't help that Microsoft Windows started out with convoluted and poorly documented hardware driver rules. MSDOS was simple enough. But from there through XP, the best source of documentation was often hunting for examples in the Microsoft source code to find code that did something similar to what you needed. That source was (and likely still is) available in special SDK packages.

But things are getting better. Static analysis tools can track through the code and flag any path where memory leaks, memory overwrites, and such are unchecked. So, if you want to write bullet-proof code, you can. All you need to do is take those static analysis reports seriously and spend the time to determine exactly what they are reporting.

Of course, if you don't want to do static analysis - hackers can do it later.
Indeed; and you can be sure they will. But yeah, frequent checking and validating - to the point of paranoia (remember: you’re not paranoid if they really are out to get you!) - is good practice.
 
  • #82
sbrothy said:
I don’t know the particular case, if there is one. Disassembling is one possibility
Back in the late 80s or so, the college where I worked had a class that covered the basics of computing on PCs. One part of the class dealt with common uses of DOS (either MSDOS or PCDOS) for dealing with files and directories; e.g., copy files, delete files, etc. This was before Windows really started to take off. One skill that was taught was how to rename a directory.

The procedure was as follows:
  1. Create a new directory with the desired name using MD (short for make directory).
  2. Copy the files from the old directory to the new one using COPY.
  3. Delete the files from the old directory using DEL (short for delete).
  4. Delete the old directory using RD (short for remove directory).
I had a copy of Norton Utilities, one utility of which was capable of renaming directories. It seemed unrealistic to me that Norton would go through the procedure listed above, so I used a disassembler I had to look at a 32KB executable file that contained the directory rename code. At the time, DOS consisted of a bunch of external commands, including the ones listed above, as well as a lot of very low-level functionality that was available only to assembly language code. All of the low-level DOS commands used a specific assembly interrupt instruction, INT 21h, in combination with certain register values to indicate which DOS functionality to execute.

With the disassembler I identified about 30 different places with INT 21h instructions, and looked at the register values just prior to where the interrupts were executed. At one place I found that a low-level DOS instruction was used to rename a file, and that was where Norton Utilities was renaming a directory. It hadn't occurred to me before then that, as far as DOS was concerned, the only difference between a file and a directory was a single bit set or not in the file attribute byte.

After discovering how Norton did things I was able to write my own utility, part in assembly, and part in C, that prompted the user for the names of the directory to change and the new name for the directory.

That access went out the window (pun intended) with Windows 95 and the change to a 32-bit code base. Programmers no longer had access to the INT 21h functionality.
 
  • #83
Mark44 said:
Back in the late 80s or so, the college where I worked had a class that covered the basics of computing on PCs. One part of the class dealt with common uses of DOS (either MSDOS or PCDOS) for dealing with files and directories; e.g., copy files, delete files, etc. This was before Windows really started to take off. One skill that was taught was how to rename a directory.

The procedure was as follows:
  1. Create a new directory with the desired name using MD (short for make directory).
  2. Copy the files from the old directory to the new one using COPY.
  3. Delete the files from the old directory using DEL (short for delete).
  4. Delete the old directory using RD (short for remove directory).
I had a copy of Norton Utilities, one utility of which was capable of renaming directories. It seemed unrealistic to me that Norton would go through the procedure listed above, so I used a disassembler I had to look at a 32KB executable file that contained the directory rename code. At the time, DOS consisted of a bunch of external commands, including the ones listed above, as well as a lot of very low-level functionality that was available only to assembly language code. All of the low-level DOS commands used a specific assembly interrupt instruction, INT 21h, in combination with certain register values to indicate which DOS functionality to execute.

With the disassembler I identified about 30 different places with INT 21h instructions, and looked at the register values just prior to where the interrupts were executed. At one place I found that a low-level DOS instruction was used to rename a file, and that was where Norton Utilities was renaming a directory. It hadn't occurred to me before then that, as far as DOS was concerned, the only difference between a file and a directory was a single bit set or not in the file attribute byte.

After discovering how Norton did things I was able to write my own utility, part in assembly, and part in C, that prompted the user for the names of the directory to change and the new name for the directory.

That access went out the window (pun intended) with Windows 95 and the change to a 32-bit code base. Programmers no longer had access to the INT 21h functionality.
In fact, breaking into a Windows box is as simple as moving the HDD to another computer and accessing it as a slave drive. No big deal (unless it’s encrypted, of course).
 
Last edited:
  • #84
Also, “deleting” files in Windows doesn’t necessarily physically erase a file from the HDD. Even “emptying the trashcan” isn’t a guarantee.

Linux at least has the shred command, although it may be obsolete by now.
 
  • #86
Old person here who wrote code in a gazillion languages including close to the metal. Elias001 is ahead of the game by learning C and assembler to understand precisely how memory is being used in a program. However, as programs historically grew more complex and larger, people found that C and C++ were unsafe to use because the coder can (deliberately if a hacker, inadvertently if tired or distracted) destructively overwrite memory in active use. It was also very difficult to adapt C and assembler programs for different OS's and instruction set architectures.

Newer languages tend to run on a virtual machine. Virtual machines have been tuned to a high level these days and have huge advantages over actual hardware. The VM also is a standard that remains stable over time just like an instruction set architecture. If you don't understand why VMs are important, I suggest you divert your questions about specific languages into that direction for a time and research it instead.

And yes, despite its popularity, Python has several disadvantages as a programming language. You generally won't find people talking about them because the authors of books and Wikipedia pages about Python are generally fans and not always looking with completely clear eyes. To name some drawbacks: it is not standardized, so new releases can break existing code; it is not suitable for high-performance applications; and coders are given so much freedom that it is not always precisely clear how a program is using memory, so programs can behave unpredictably as a result (duck typing).
 
  • Likes: berkeman and sbrothy
  • #87
harborsparrow said:
Old person here who wrote code in a gazillion languages including close to the metal. Elias001 is ahead of the game by learning C and assembler to understand precisely how memory is being used in a program. However, as programs historically grew more complex and larger, people found that C and C++ were unsafe to use because the coder can (deliberately if a hacker, inadvertently if tired or distracted) destructively overwrite memory in active use. It was also very difficult to adapt C and assembler programs for different OS's and instruction set architectures.

Newer languages tend to run on a virtual machine. Virtual machines have been tuned to a high level these days and have huge advantages over actual hardware. The VM also is a standard that remains stable over time just like an instruction set architecture. If you don't understand why VMs are important, I suggest you divert your questions about specific languages into that direction for a time and research it instead.

And yes, despite its popularity, Python has several disadvantages as a programming language. You generally won't find people talking about them because the authors of books and Wikipedia pages about Python are generally fans and not always looking with completely clear eyes. To name some drawbacks: it is not standardized, so new releases can break existing code; it is not suitable for high-performance applications; and coders are given so much freedom that it is not always precisely clear how a program is using memory, so programs can behave unpredictably as a result (duck typing).
Heh. “Close to the metal.” :smile:
 
