Reading open source software code

AI Thread Summary
Open source software, like Emacs, allows for collaborative modification, but understanding the source code can be challenging, especially for newcomers. To effectively navigate and contribute to such projects, it's essential to engage with existing developers and familiarize oneself with the codebase gradually. New programmers should start by reviewing header files to identify functions and their inputs/outputs, taking detailed notes to track the code's flow. Understanding a large codebase can take considerable time, ranging from days for experienced programmers to months for novices. Utilizing tools such as profilers and debuggers can help identify critical functions and their interactions, allowing for a focused approach to learning specific subsystems without needing to grasp the entire codebase. Engaging with mailing lists can also be beneficial, but it's important to conduct preliminary research to formulate specific questions, as vague inquiries may lead to unhelpful responses.
Eus
Hi Ho! ^^v

I know that open source software can be reworked by many people.
Because of that, it grows rapidly.
Frankly, I wonder how people can rework it if they don't have a good technique for reading the source code.
When reading the Emacs source code, I am really confused about where I should start.

What if I want to know what the original source code looked like before any modification? Is there any way to find out?

Are there any suggestions regarding this problem?

Thank you very much! ^_^
 
I think the "code development cycle" and "coding practices" are the things a project keeps consistent so that all developers can understand each other.
 
What does "reworked" mean?
 

I think you are looking at this wrong. You seem to expect professional programmers to be able to open up a project like Emacs and immediately know what is going on. That never happens. Generally, a programmer new to a project will talk to the programmers currently involved in it. The new programmer will sit down with a pad of paper and a pencil and read through the header files (the .h files) to learn which functions are declared there, what inputs each function takes, and what it returns. The new programmer will take a lot of notes while tracing the flow of the project. The new programmer will not simply sit down, open up Emacs, and understand what is going on.

This is true for large projects and small projects alike. If I were to read through a new project, I'd do as I described above. Even if it were a small project, or I were only working on a small portion of a larger project, the same steps would still apply. No one can sit down and immediately know where to begin on a new project.

Reading through Emacs and understanding everything could take anywhere from a few days or even weeks for a professional to a few months for a novice.
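
The header-reading step described above can be partly mechanized as a first pass before the pencil-and-paper notes. Below is a minimal sketch in Python, assuming simple C prototypes of the form "return-type name(args);". The regular expression is a heuristic, not a C parser, and the sample header and its function names are made up for illustration; they are not taken from Emacs.

```python
import re

# Heuristic pattern for a simple C prototype: return type, name, (args);
# It is NOT a C parser: it misses macros and function pointers, and it can
# be fooled by statements such as "return foo(x);" inside inline functions.
PROTOTYPE = re.compile(
    r"^\s*(?P<ret>[A-Za-z_][\w\s\*]*?)\s*\b(?P<name>[A-Za-z_]\w*)"
    r"\s*\((?P<args>[^()]*)\)\s*;",
    re.MULTILINE,
)


def list_prototypes(header_text):
    """Return (name, return_type, arguments) for each prototype found."""
    found = []
    for match in PROTOTYPE.finditer(header_text):
        # Collapse runs of whitespace so the notes are easy to read.
        ret = " ".join(match.group("ret").split())
        args = " ".join(match.group("args").split())
        found.append((match.group("name"), ret, args))
    return found


# Hypothetical header text, invented for this example.
SAMPLE_HEADER = """
/* hypothetical header, not taken from Emacs */
#ifndef SAMPLE_H
#define SAMPLE_H

struct buffer;

extern int switch_buffer (struct buffer *target);
char *buffer_name(const struct buffer *b);
void redraw(void);

#endif
"""

if __name__ == "__main__":
    for name, ret, args in list_prototypes(SAMPLE_HEADER):
        print(f"{name}: takes ({args}), returns {ret}")
```

To use it on a real project, read each .h file's text and pass it to list_prototypes. The output is only a starting list of names to annotate by hand; it does not replace reading the headers.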
 
This is my technique. Your mileage may vary.

I use profilers and breakpoints. Additionally, most software trees have a mechanism for generating trace messages (you just have to look through the docs to find the right switch or compile-time constant). Use the profiler and the trace mechanism to run the program you are interested in under a couple of typical scenarios. Find out which functions are most heavily involved (they take the most time or are called most often) and understand them first. You can use this technique to find out which functions make up which subsystem and which functions call each other.
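
As a rough illustration of the "find the most heavily involved functions" idea, here is a small sketch using Python's built-in cProfile and pstats modules. The three toy functions are hypothetical stand-ins for a real program; for a C program such as Emacs you would use a native profiler instead, but the workflow is the same: run a typical scenario, then rank functions by call count.

```python
import cProfile
import pstats


# Toy "editor" functions, invented for this example.
def insert_char(buf, ch):
    buf.append(ch)


def redisplay(buf):
    return "".join(buf)


def type_text(text):
    buf = []
    for ch in text:
        insert_char(buf, ch)
    return redisplay(buf)


def call_counts(func, *args):
    """Run func under the profiler and return {function name: call count}."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    counts = {}
    # stats.stats maps (file, line, name) -> (primitive calls, total calls,
    # own time, cumulative time, callers).
    for (_file, _line, name), (_cc, ncalls, _tt, _ct, _callers) in stats.stats.items():
        counts[name] = ncalls
    return counts


if __name__ == "__main__":
    counts = call_counts(type_text, "hello")
    # Most frequently called functions first: these are the ones to read first.
    for name, ncalls in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        print(ncalls, name)
```

The callers field that this sketch ignores is what tells you which functions call each other; pstats can also print it with print_callers().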
Next, if the system is really large, run the process under a debugger, or attach a debugger to a running process, and put some breakpoints in the section of code you are most interested in (say, for instance, Emacs's buffer switching mechanism). When a breakpoint is hit, view the stack to understand how and why the function was called. Step through the function to help understand what it does.
Then you should be reasonably familiar with whatever subsystem you need to modify to add/enhance your feature. NOTE: you do not need to understand the entire source tree!
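
To show what "view the stack at a breakpoint" buys you, here is a minimal sketch in Python that records the chain of callers each time a function of interest runs. The function names are toy examples invented for this sketch, not Emacs internals. In a real C program you would get the same information by setting a breakpoint in a debugger and asking for a backtrace (for example, the bt command in gdb).

```python
import traceback

# Each entry is the list of function names on the stack, outermost first,
# at the moment switch_buffer was entered.
CALL_CHAINS = []


def switch_buffer(name):
    # Stand-in for a breakpoint: capture who called us before doing any work.
    chain = [frame.name for frame in traceback.extract_stack()]
    CALL_CHAINS.append(chain)
    return name


def find_file(path):
    return switch_buffer(path)


def handle_key(key):
    # Two different routes lead to the same function of interest.
    if key == "open":
        return find_file("notes.txt")
    return switch_buffer("scratch")


if __name__ == "__main__":
    handle_key("open")
    handle_key("other")
    for chain in CALL_CHAINS:
        print(" -> ".join(chain))
```

Comparing the recorded chains shows that the same function is reached by different routes, which is the "how/why is this function called" question a debugger's stack view answers.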

As faust9 suggests, take notes. Mailing lists are also valuable resources, but only if you do your homework beforehand. If you ask a general question, you will most likely receive only unhelpful or insulting replies.
 