Storing program source as relations in a database instead of text file

Storing program source as relations in a database rather than text files offers several advantages, including enforcing strong typing rules, simplifying refactoring, and reducing redundancy in naming conventions. This approach allows for detailed tracking of dependencies between objects, which can enhance insights into program structure and behavior. While traditional programming languages primarily use text files, representing programs as relations could facilitate more efficient compilation and error detection during the editing process. However, the implementation of such a system may be complex and is likely more beneficial for large coding teams rather than individual programmers. Overall, the concept remains largely theoretical, with potential for innovation in programming environments.
  • #31
elcaro said:
It could be simply implemented as a text field as part of the object.
In other words, not as a relation in a database, just as source code text stored in a database instead of in a text file. Which is not what you originally described.

elcaro said:
there are different ways of representing or storing the actual code of a function body
So far the only way you have described is the way we all already know about: text. Storing the text in a database doesn't change the fact that it's text.

What you originally described, and what the title of this thread says, is storing program source as relations in a database. Relations in a database are not text.
 
  • #32
elcaro said:
Both PostgreSQL and Oracle databases have stored procedures, which store the actual code of the procedure or function in the database.
These aren't relations either, so they aren't what you were originally describing.
 
  • #33
PeterDonis said:
These aren't relations either, so they aren't what you were originally describing.
It depends, of course, on how far you are willing to go in breaking down the source text into relations. You could break it down further, below the level of the function body, into nested blocks, each block a sequence of statements, and so on. But then you in fact get an abstract syntax tree, and maybe you would leave it at that: store the original source text as text and the pre-parsed representation as an AST. Otherwise, supposedly, it would get too complicated to represent as relations.
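To make the "break it all the way down" option concrete, here is a minimal sketch that stores a parsed syntax tree as rows. It uses Python's standard `ast` and `sqlite3` modules purely for illustration; the `node` and `attr` tables and the `store_ast` helper are hypothetical names, not something proposed in this thread.

```python
import ast
import sqlite3


def store_ast(conn, source):
    """Store every AST node of `source` as a row and return the root node id."""
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "field TEXT, pos INTEGER, type TEXT)"
    )
    conn.execute("CREATE TABLE attr (node_id INTEGER, name TEXT, value TEXT)")

    def visit(tree_node, parent_id, field, pos):
        cursor = conn.execute(
            "INSERT INTO node (parent_id, field, pos, type) VALUES (?, ?, ?, ?)",
            (parent_id, field, pos, type(tree_node).__name__),
        )
        node_id = cursor.lastrowid
        for name, value in ast.iter_fields(tree_node):
            children = value if isinstance(value, list) else [value]
            for index, child in enumerate(children):
                if isinstance(child, ast.AST):
                    # Child syntax node: one more row, linked to its parent.
                    visit(child, node_id, name, index)
                elif child is not None:
                    # Scalar detail (identifier, constant, ...) kept as text.
                    conn.execute(
                        "INSERT INTO attr VALUES (?, ?, ?)",
                        (node_id, name, repr(child)),
                    )
        return node_id

    return visit(ast.parse(source), None, None, 0)


conn = sqlite3.connect(":memory:")
store_ast(conn, "x = y + z")

# The children of the addition, read back with a plain query.
binop_children = [
    row[0]
    for row in conn.execute(
        "SELECT type FROM node "
        "WHERE parent_id = (SELECT id FROM node WHERE type = 'BinOp') "
        "ORDER BY id"
    )
]
identifiers = [
    row[0]
    for row in conn.execute("SELECT value FROM attr WHERE name = 'id' ORDER BY value")
]
```

Even the one-line statement `x = y + z` turns into ten `node` rows here, which gives a feel for the complexity cost being discussed.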
 
  • #34
PeterDonis said:
These aren't relations either, so they aren't what you were originally describing.
Partly they are. What gets stored, for example, is that proc A depends on proc B.
 
  • #35
elcaro said:
It depends, of course, on how far you are willing to go in breaking down the source text into relations
Your OP and thread title imply that everything is expressed in terms of relations. So if there is anything that isn't, you would need to break things down further.

elcaro said:
What gets stored, for example, is that proc A depends on proc B.
That's storing the function call, not what either function actually does.
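A minimal sketch of what "storing the function call" looks like as a relation, assuming a hypothetical `calls(caller, callee)` table filled from Python source via the standard `ast` module. It records who calls whom and nothing about what the bodies compute, which is exactly the limitation pointed out above.

```python
import ast
import sqlite3

SOURCE = """
def c(): return 1
def b(): return c() + 1
def a(): return b() * 2
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (caller TEXT, callee TEXT)")

# One row per direct call made inside each function definition.
for func in ast.walk(ast.parse(SOURCE)):
    if isinstance(func, ast.FunctionDef):
        for node in ast.walk(func):
            if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                conn.execute(
                    "INSERT INTO calls VALUES (?, ?)", (func.name, node.func.id)
                )

# Transitive dependencies via a recursive query; UNION removes duplicates,
# so the query also terminates when the call graph contains cycles.
TRANSITIVE = """
WITH RECURSIVE deps(name) AS (
    SELECT callee FROM calls WHERE caller = ?
    UNION
    SELECT calls.callee FROM calls JOIN deps ON calls.caller = deps.name
)
SELECT name FROM deps ORDER BY name
"""
deps_of_a = [row[0] for row in conn.execute(TRANSITIVE, ("a",))]
```

Here `deps_of_a` lists `b` and `c`, even though `a` never mentions `c` directly.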
 
  • #36
elcaro said:
Otherwise, supposedly, it would get too complicated to represent as relations.
In other words, what you originally claimed in the OP and in the title of this thread is actually impossible?
 
  • #37
PeterDonis said:
In other words, what you originally claimed in the OP and in the title of this thread is actually impossible?
It would not be impossible in theory, I guess. But it would imply your database is able to store the exact details of the AST of the grammar of the programming language (and specific version) you use. The issue is not so much whether that is possible to implement, but what the benefit of that would be relative to the costs, etc.
 
  • #38
elcaro said:
it would imply your database is able to store the exact details of the AST of the grammar of the programming language (and specific version) you use.
You would be storing them as relations in the database. That's what your OP and thread title said. Every database can store relations. Are you now doubting that the exact details of an AST could be stored as relations in a database?
 
  • #39
PeterDonis said:
You would be storing them as relations in the database. That's what your OP and thread title said. Every database can store relations. Are you now doubting that the exact details of an AST could be stored as relations in a database?
Didn't say that. Theoretically it is doable, but quite complicated. The only issue is if that is worth doing and needed for the purpose of the tool.
 
  • #40
PeterDonis said:
You would be storing them as relations in the database. That's what your OP and thread title said. Every database can store relations. Are you now doubting that the exact details of an AST could be stored as relations in a database?
It is a similar problem to, let's say, storing address information in a database. You could store it in specific tables, a table for country, city, street, postal code, etc., but that requires you to have a complete table of all countries, all cities and all streets and postal codes that exist around the planet (and to update them every time this information changes) in order to store your address information as relations. A simpler storage format is to use text fields to represent the address.

Would you then also argue that unless the address info is also stored in relational form, there is no use in using a database for storing customer information?
 
  • #41
elcaro said:
Theoretically it is doable, but quite complicated. The only issue is if that is worth doing and needed for the purpose of the tool.
Since the stated purpose of the tool is to store program source as relations in a database, it would seem to me that if anything is not stored as relations, the purpose of the tool is not being met.

elcaro said:
It is a similar problem to, let's say, storing address information in a database.
You didn't say "store program source in a database". You said "store program source as relations in a database". Big difference.
 
  • #42
PeterDonis said:
Since the stated purpose of the tool is to store program source as relations in a database, it would seem to me that if anything is not stored as relations, the purpose of the tool is not being met. You didn't say "store program source in a database". You said "store program source as relations in a database". Big difference.
Yes, but neither storing programs as text nor storing them as relations in a database is a goal in itself. The real goal is of course to be able to develop and maintain programs in a better way, increasing productivity, lowering costs, etc.
 
  • #43
elcaro said:
neither storing programs as text nor storing them as relations in a database is a goal in itself. The real goal is of course to be able to develop and maintain programs in a better way, increasing productivity, lowering costs, etc.
Then what is the point of this thread? Are we discussing storing program source as relations in a database, as the thread title says? Or are we just waving our hands about stuff that might or might not improve various things about programming?
 
  • #44
PeterDonis said:
Then what is the point of this thread? Are we discussing storing program source as relations in a database, as the thread title says? Or are we just waving our hands about stuff that might or might not improve various things about programming?
Databases are used by definition for storing relations. If the only thing we did was use the database to store text as text, we would not need a database.
But there is not per se a requirement that ALL relations are stored in the database (see the example I gave in a previous post about storing address information in a customer system) for the use of a database to still be useful.

The only thing we can argue about is whether or not that is the case in this particular case of storing program source.

I gave some arguments why it still could be useful (for generating automatic build/make scripts for example) for that purpose, without the requirement of breaking the program source down to relations at the deepest level.
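As a rough sketch of the build-script use case, assuming a hypothetical `depends_on(unit, dependency)` table: a build order falls out of the stored dependency rows with a topological sort. `graphlib` is in the Python standard library from version 3.9; the unit names are made up.

```python
import sqlite3
from graphlib import TopologicalSorter

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE depends_on (unit TEXT, dependency TEXT)")
conn.executemany(
    "INSERT INTO depends_on VALUES (?, ?)",
    [("app", "parser"), ("app", "util"), ("parser", "util")],
)

# Map each unit to the set of units that must be built before it.
graph = {}
for unit, dependency in conn.execute("SELECT unit, dependency FROM depends_on"):
    graph.setdefault(unit, set()).add(dependency)

# static_order() yields every unit after all of its dependencies.
build_order = list(TopologicalSorter(graph).static_order())
```

Nothing here needs the function bodies broken down into relations; only the dependency edges are relational.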
 
  • #45
It sounds to me like you are talking about using a database to hold the information that is in a "set/use" table. There are problems with using that to represent the entire program code. Here are a couple of thoughts that I have on the subject:
1) As others have said, it is hard to imagine any other way than code to represent the details of what a function actually does. Something like "x=y+z" is already simply represented in the code.
2) A library is a database in that a calling program can look up the functions it needs by name and execute them. There is a lot of work that has been done along those lines (including OOD) to allow the correct functions to be retrieved from the correct library.
 
  • #46
I think the only fundamental issue in answering this question is whether modern SQL is Turing complete. I'll go with "maybe", but I think it would be painful and terribly inefficient to try to create a complex program with this approach.
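For a feel of what computing inside SQL looks like, here is a small sketch using a recursive common table expression in SQLite (driven from Python's `sqlite3` module). It only shows that iteration with carried state can be expressed in SQL; it is not a proof of Turing completeness.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each recursive step carries the state (n, a, b) forward, like a loop body.
rows = conn.execute(
    """
    WITH RECURSIVE fib(n, a, b) AS (
        SELECT 0, 0, 1
        UNION ALL
        SELECT n + 1, b, a + b FROM fib WHERE n < 10
    )
    SELECT a FROM fib ORDER BY n
    """
).fetchall()
fibonacci = [row[0] for row in rows]
```

The result is the first eleven Fibonacci numbers, 0 through 55. Anything with branching or mutable data structures gets awkward quickly in this style, which fits the "painful and terribly inefficient" expectation.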
 
  • #47
The OP mentions (and dismisses?) the Smalltalk language, but also consider Lisp and offshoots such as CLOS. (I mentally substituted lists for relations while attempting to understand the original question.) C and its derivatives and FORTRAN were designed to be sparse, particularly considering hardware limitations, then and now.

Very large software systems are commonly databased including templates for coding applications, platform dependent system code, test cases, and documentation.
 
  • #48
PeterDonis said:
Obviously you've never programmed in C and had to write a header file. I wish I could say the same, as it would mean I would have avoided much pain and frustration. :wink:
GADS. It's been so long since I've done C/C++ that I forgot all about it.
o:)
 
  • #49
Mark44 said:
This is common C and C++ parlance. It's basically a declaration of the function, including the number and types of parameters, and the return value type.
See post #48 directly above
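The declaration data described in the quote (number and types of parameters, return type) is the part of a program that maps most directly onto relations. A minimal sketch, assuming hypothetical `func` and `parameter` tables, that generates C-style prototypes from rows instead of maintaining a header file by hand:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE func (name TEXT PRIMARY KEY, return_type TEXT);
    CREATE TABLE parameter (func_name TEXT, pos INTEGER, name TEXT, type TEXT);
    INSERT INTO func VALUES ('add', 'int'), ('reset', 'void');
    INSERT INTO parameter VALUES ('add', 0, 'a', 'int'), ('add', 1, 'b', 'int');
    """
)


def prototype(conn, name):
    """Build a C-style declaration for `name` from the stored rows."""
    (return_type,) = conn.execute(
        "SELECT return_type FROM func WHERE name = ?", (name,)
    ).fetchone()
    params = conn.execute(
        "SELECT type, name FROM parameter WHERE func_name = ? ORDER BY pos",
        (name,),
    ).fetchall()
    # A function without parameter rows is declared as taking void.
    param_list = ", ".join(f"{ptype} {pname}" for ptype, pname in params) or "void"
    return f"{return_type} {name}({param_list});"


add_prototype = prototype(conn, "add")      # int add(int a, int b);
reset_prototype = prototype(conn, "reset")  # void reset(void);
```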
 
  • #50
phinds said:
GADS. It's been so long since I've done C/C++ that I forgot all about it.
o:)
C++ is evolving rapidly in good ways.
 
  • #51
elcaro said:
Databases are used by definition for storing relations.
They can store relations, but that is not the only thing they can store. The term "relational database" does not mean that the only thing the database can store is relations. It just means the database engine is optimized for dealing with relations.

elcaro said:
The only thing we can argue about is whether or not that is the case in this particular case of storing program source.
I'm still confused about the point of this thread. By your OP and the title of the thread, I had understood you to be proposing that program source be stored as relations in a database. Is that what you're proposing, or not? Does the title of the thread need to be changed?
 
  • #52
PeterDonis said:
They can store relations, but that is not the only thing they can store. The term "relational database" does not mean that the only thing the database can store is relations. It just means the database engine is optimized for dealing with relations. I'm still confused about the point of this thread. By your OP and the title of the thread, I had understood you to be proposing that program source be stored as relations in a database. Is that what you're proposing, or not? Does the title of the thread need to be changed?
I was under the impression that the question is whether a DBMS can execute an arbitrary program. Otherwise, we have things like UML to model a lot of the concepts being referenced, which certainly can be stored in a database.
 
  • #53
valenumr said:
I was under the impression that the question is whether a DBMS can execute an arbitrary program.
I don't know where you're getting that impression, since it is not what either the thread title or the thread OP says.
 
  • #54
PeterDonis said:
I don't know where you're getting that impression, since it is not what either the thread title or the thread OP says.
Maybe a leap of logic. If the system can fully describe a program, it isn't a huge stretch to think that it can interpret it, but perhaps that's a stretch too far.
 
  • #55
elcaro said:
I gave some arguments why it still could be useful (for generating automatic build/make scripts for example) for that purpose, without the requirement of breaking the program source down to relations at the deepest level.
But the thread title is "Storing program source as relations in a database instead of text file." Since you seem to have given up on that idea, it would probably be best to give this thread a different title.
 
  • #56
Here is where it gets a little weird to me. If one can store a program description, I think it follows that one should also be able to retrieve the program and all the information it contains. And if one can extract all information of the program as stored, one should be able to interpret its intent.
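A small sketch of the retrieval half of that argument, assuming a hypothetical `expr` table that holds one statement as a tree of rows. Reading the rows back is enough to regenerate text, and the same rows can be rendered in more than one surface syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE expr (
        id INTEGER PRIMARY KEY, kind TEXT, value TEXT,
        left_id INTEGER, right_id INTEGER
    );
    -- rows for the statement: x = y + z
    INSERT INTO expr VALUES
        (1, 'name',   'x', NULL, NULL),
        (2, 'name',   'y', NULL, NULL),
        (3, 'name',   'z', NULL, NULL),
        (4, 'binop',  '+', 2, 3),
        (5, 'assign', '=', 1, 4);
    """
)


def render(conn, node_id, style="infix"):
    """Rebuild source text for the subtree rooted at `node_id`."""
    kind, value, left_id, right_id = conn.execute(
        "SELECT kind, value, left_id, right_id FROM expr WHERE id = ?",
        (node_id,),
    ).fetchone()
    if kind == "name":
        return value
    left = render(conn, left_id, style)
    right = render(conn, right_id, style)
    if style == "prefix":
        return f"({value} {left} {right})"
    # Infix output ignores operator precedence; a real tool would
    # need to insert parentheses where required.
    return f"{left} {value} {right}"


infix_text = render(conn, 5)             # x = y + z
prefix_text = render(conn, 5, "prefix")  # (= x (+ y z))
```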
 
  • #57
valenumr said:
Here is where it gets a little weird to me. If one can store a program description, I think it follows that one should also be able to retrieve the program and all the information it contains. And if one can extract all information of the program as stored, one should be able to interpret its intent.
IMO, it is certainly impractical to store all the information in a relational database. I won't say it is impossible, but it is definitely beyond my ability to envision.
 
  • #58
Rhetorical Question:
How does the initial proposal vary from a Flow Chart?
 
  • #59
elcaro said:
...
But intrinsically a program can also be seen as a collection of relations between different objects, which can be stored as tuples in a relational database system.
...
The relations between parts of the program's structure could no doubt be stored as tuples in a relational database using SQL. If I remember correctly, the relational model insists that the mechanisms for database maintenance are also implemented and stored relationally, so I guess the usual RDBMS is a sort of example already. BUT... most other tasks and programs have to use extensive sequential and procedural logic, which is not a database strength. In fact, even storing derived detail is frowned on by RDBMS analysis unless it is a speed optimisation of some sort.
I can see the logic in the idea, but practically it would be a nightmare. Programs would have to be written in SQL. E.g., to add two variables - the SQL program would have to apply read locks, access the variables' current value in tables and use the + operator in the call, yada yada yada ... the SQL would be horrendous after a while and it would run very slowly, with most RDBMS not exactly optimised for non-standard use. And that is avoiding the question of defining operators' actions in what contexts and so on.
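To illustrate the point about adding two variables, here is a minimal sketch of `x = y + z` executed through SQL, assuming a hypothetical `variable` table (SQLite via Python's `sqlite3`). The transaction block stands in for the locking mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE variable (name TEXT PRIMARY KEY, value INTEGER)")
conn.executemany(
    "INSERT INTO variable VALUES (?, ?)", [("x", 0), ("y", 2), ("z", 3)]
)

# The statement x = y + z, run as one transaction against the table.
with conn:
    conn.execute(
        """
        UPDATE variable
        SET value = (SELECT value FROM variable WHERE name = 'y')
                  + (SELECT value FROM variable WHERE name = 'z')
        WHERE name = 'x'
        """
    )

x = conn.execute("SELECT value FROM variable WHERE name = 'x'").fetchone()[0]
```

One assignment already needs two subqueries and an update, so a whole program written this way would indeed get unwieldy.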

However the logic certainly has value - I have found Codd's excellent rules for RDBMS are also surprisingly good at promoting good program structure, when (loosely) applied to writing program code not anywhere near RDBMS :) !
E.g., thinking of a line of code like a row of fields in a table, and of transactions (as far as realistic): a line should not have dependencies within the line, a line of code should achieve one action only, sequential lines of a composite action should be in a defined block that can be tried as a single action or rolled back, and so on. (Not surprising when you think about it - just IMHO)
 
  • #60
synch said:
the SQL program would have to apply read locks, access the variables' current value in tables and use the + operator in the call, yada yada yada ...
I must admit that I was interested in the OP idea and I had problems following this discussion as I wasn't sure I understood other people's criticisms. But I never got from the OP that the program would be run by SQL. Just like no program is run directly from a text file.

What I understood was that instead of saving a text file, you saved everything as objects stored in tables - with names like 'variables', 'operator', 'function', 'control_structure', 'classes', 'namespace', 'expression' - all of them having some attributes or requirements. Of course, it will still have to be compiled before being executed. The compiler will have to already understand that, for example, using the operator with operator_ID = 1 means that it must perform an addition (or whatever). That is no different from a compiler reading a text file and understanding that a '+' means it must perform an addition.
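A minimal sketch of that reading, assuming hypothetical `operator` and `expression` tables. The consumer of the rows (here a tiny evaluator standing in for a compiler) has a built-in mapping from `operator_id` to behaviour, just as a conventional compiler has a built-in meaning for '+'.

```python
import operator
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE operator (operator_id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE expression (
        expression_id INTEGER PRIMARY KEY,
        operator_id INTEGER,   -- NULL for a literal leaf
        left_id INTEGER,
        right_id INTEGER,
        literal INTEGER        -- set only for a literal leaf
    );
    INSERT INTO operator VALUES (1, 'addition'), (2, 'multiplication');
    -- rows for the expression (2 + 3) * 4
    INSERT INTO expression VALUES
        (1, NULL, NULL, NULL, 2),
        (2, NULL, NULL, NULL, 3),
        (3, 1, 1, 2, NULL),
        (4, NULL, NULL, NULL, 4),
        (5, 2, 3, 4, NULL);
    """
)

# What the tool "already understands": operator_id 1 means addition, etc.
IMPLEMENTATION = {1: operator.add, 2: operator.mul}


def evaluate(conn, expression_id):
    """Evaluate the stored expression tree rooted at `expression_id`."""
    op_id, left_id, right_id, literal = conn.execute(
        "SELECT operator_id, left_id, right_id, literal "
        "FROM expression WHERE expression_id = ?",
        (expression_id,),
    ).fetchone()
    if op_id is None:
        return literal
    return IMPLEMENTATION[op_id](evaluate(conn, left_id), evaluate(conn, right_id))


result = evaluate(conn, 5)
```

The database only holds the structure; the arithmetic runs in the host program, not in SQL.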

The interesting advantages cited in the OP are that you can:
  • follow the relations and enforce the rules before compiling
  • automate certain processes (e.g., if function A is used, call library X)
  • study relations in complex programs more easily
  • let programmers choose their own language (or maybe even their own grammar; you could choose that an expression is ended by a semicolon or a new line) when converting from/to a human-readable text file (maybe even reading/writing flow charts)
  • etc... (as I'm basically restating the OP)
It doesn't affect the speed of the program as, in the end, the stored relations must still be compiled to a binary file to be executed.
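As a sketch of the first advantage in the list (enforcing rules before compiling), assume hypothetical `parameter` and `call_argument` tables that record declared parameter types and the argument types at each call site. A rule violation is then simply a row returned by a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parameter (func TEXT, pos INTEGER, type TEXT);
    CREATE TABLE call_argument (call_id INTEGER, func TEXT, pos INTEGER, type TEXT);
    INSERT INTO parameter VALUES ('area', 0, 'float'), ('area', 1, 'float');
    INSERT INTO call_argument VALUES
        (1, 'area', 0, 'float'), (1, 'area', 1, 'float'),
        (2, 'area', 0, 'float'), (2, 'area', 1, 'string');
    """
)

# Every call argument whose type differs from the declared parameter type.
violations = conn.execute(
    """
    SELECT a.call_id, a.pos, p.type AS expected, a.type AS actual
    FROM call_argument AS a
    JOIN parameter AS p ON p.func = a.func AND p.pos = a.pos
    WHERE a.type <> p.type
    ORDER BY a.call_id, a.pos
    """
).fetchall()
```

An editor built on such tables could run this kind of query on every save, flagging call 2 before any compilation step.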

I guess the basic concept is that it would be storing the code in the state it has just before code optimization in a compiler, maybe even just before machine code generation. So preprocessing, lexical analysis, parsing, and semantic analysis are already done prior to saving the file. This would reduce file size and also reduce compilation time. Think of JavaScript programs that are compiled (and maybe even downloaded) every time a web page is opened.
 
