Question about input format file (for creating a net)


Discussion Overview

The discussion revolves around the best format for an input text file to create a net in a C++ program. Participants explore various file formats including JSON, XML, CSV, and others, considering their advantages and disadvantages for defining nodes and connections in the net.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Exploratory

Main Points Raised

  • ORF suggests three formats: JSON, XML, and CSV, expressing a preference for JSON but noting potential issues with runtime net creation and input errors.
  • Another participant clarifies that CSV is not a primitive version of JSON but simply comma-separated values, and shares a preference for JSON based on personal experience.
  • A different viewpoint states that a net can be structured or tabular data, recommending CSV for specific node edits but JSON for ease of parsing when connections are complex.
  • XML is noted for its ability to label nodes with names and IDs, although parsing can be sensitive to errors in formatting.
  • Property files are proposed as an alternative, using key-value pairs to define node attributes and connections.
  • One participant mentions the potential for using an SQL database for larger datasets, suggesting it could alleviate size limitations of text-based formats.
  • A serialization scheme is also mentioned as a way to store node information in binary, which would be managed by the program without user interaction.
  • Concerns are raised about the difficulty of parsing JSON files compared to XML and CSV, questioning the availability of effective JSON libraries.

Areas of Agreement / Disagreement

Participants express differing opinions on the best format for the input file, with no clear consensus reached. Some favor JSON for its flexibility, while others prefer CSV or XML for their simplicity and ease of use.

Contextual Notes

Participants highlight limitations such as the potential for input errors in JSON and XML, the cumbersome nature of large text files, and the challenges of parsing JSON compared to other formats.

ORF
Hello

I am not sure what the best format for an input text file would be. Before I start shooting in the dark, I think it's better to ask experienced people.

The (C++) program has to create a net. The nodes will be defined in an input text file.

For convenience, the net will be a bunch of std::vector (so each path inside the net, starting from the first node, will correspond to a single std::vector). I know a bunch of std::vector is not the best way of storing a net, but it's the best way for the function which will use the net.
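As a rough sketch of the "one std::vector per path" idea, assuming the net has already been read into an adjacency list, that node 0 is the first node, and that the net has no cycles (all three are assumptions, as are the names `Adjacency`, `Path`, `collectPaths` and `allPaths`):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical representation: adjacency[i] lists the nodes that follow node i.
using Adjacency = std::vector<std::vector<std::size_t>>;
using Path = std::vector<std::size_t>;

// Depth-first walk that stores one Path for every route from `node`
// to a node without successors. Assumes the net is acyclic; with a
// cycle this recursion would never terminate.
void collectPaths(const Adjacency& adjacency, std::size_t node,
                  Path& current, std::vector<Path>& out) {
    current.push_back(node);
    if (adjacency[node].empty()) {
        out.push_back(current);  // end node reached: keep a copy of the path
    } else {
        for (std::size_t next : adjacency[node]) {
            collectPaths(adjacency, next, current, out);
        }
    }
    current.pop_back();  // backtrack before trying the next branch
}

// Returns every path that starts at node 0 (taken here as the first node).
std::vector<Path> allPaths(const Adjacency& adjacency) {
    std::vector<Path> out;
    if (!adjacency.empty()) {
        Path current;
        collectPaths(adjacency, 0, current, out);
    }
    return out;
}
```

With this layout the input file only has to describe each node and its next nodes; the vectors themselves are derived after loading.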

Each node may be connected to up to 4 other nodes.

I have thought of three different input formats, but I don't know which would be the best option, or whether the input format can be optimized:

-JSON: the nodes can be defined one by one. In addition to the node information, each node's next nodes have to be listed. The problem with defining the nodes individually is that the net has to be built at runtime. The net is also not defined in a visual/intuitive way (which means the probability of writing a wrong input is high).

-XML: implementing a hierarchy is relatively easy, but with this many connections the problem is similar to the previous option: the net is not defined in a visual/intuitive way.

-CSV: as a primitive version of the JSON format.

To sum up, I think JSON would be the best option, but the input would need strict checking. I don't know whether there are other ways of defining a net/tree from a text file.

Thank you for your time.

Regards,
ORF
 
CSV is not a primitive JSON, it's just comma-separated values.

I would use JSON personally but that may be because I simply have the most experience with it. I use it all the time for nets, I have hundreds of neural networks whose state is saved in JSON files. You can create a net by adding a list of linked items in your object:

Code:
{
   "obj1": {
      "links": []
   },
   "obj2": {
      "links": [ "obj1" ]
   }
}
I use libjson to work with JSON in C++. It's fast and has an easy interface. I also wrote it.
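Whichever library does the parsing, the named links (such as "obj1") still have to be turned into actual connections once all nodes are loaded. The sketch below shows one way to do that in two passes, with some basic input checking. It does not use any JSON library: `RawNode`, `resolveLinks` and `kMaxLinks` are hypothetical names, and the limit of 4 links is taken from the original question.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical result of parsing one entry: a node name plus the names it links to.
struct RawNode {
    std::string name;
    std::vector<std::string> links;
};

constexpr std::size_t kMaxLinks = 4;  // limit stated in the original question

// Converts named links into index-based adjacency lists.
// Throws std::runtime_error on duplicate names, unknown targets, or too many links.
std::vector<std::vector<std::size_t>> resolveLinks(const std::vector<RawNode>& nodes) {
    // Pass 1: remember the index of every node name.
    std::unordered_map<std::string, std::size_t> indexOf;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        if (!indexOf.emplace(nodes[i].name, i).second) {
            throw std::runtime_error("duplicate node name: " + nodes[i].name);
        }
    }
    // Pass 2: look up every link, which also works for forward references.
    std::vector<std::vector<std::size_t>> adjacency(nodes.size());
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        if (nodes[i].links.size() > kMaxLinks) {
            throw std::runtime_error("too many links on node: " + nodes[i].name);
        }
        for (const std::string& target : nodes[i].links) {
            const auto it = indexOf.find(target);
            if (it == indexOf.end()) {
                throw std::runtime_error("unknown link target: " + target);
            }
            adjacency[i].push_back(it->second);
        }
    }
    return adjacency;
}
```

Resolving links only after every node is known means the order of entries in the file does not matter, and a mistyped name is reported instead of silently producing a broken net.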
 
A net can be defined as structured data or as tabular data.

JSON and XML are pretty much two ways to store structured data in a text file format.

CSV format is good for tabular data.

I would choose CSV if I only needed to edit specific node contents but not their connections. I would also only use CSV if each node had no more than n connections, with n being 6 or less, like the north, south, east, west, up and down directions.
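To make the tabular idea concrete, here is a minimal sketch that reads one CSV row per node, assuming a made-up column layout of name, north, south, east, west, up, down, where an empty field means "no connection". The names `CsvNode`, `splitCsvLine` and `parseCsvNode` are hypothetical, and quoted or escaped fields are deliberately not handled.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// One row of the assumed layout: name,north,south,east,west,up,down
struct CsvNode {
    std::string name;
    std::array<std::string, 6> neighbors;  // empty string = no connection
};

// Splits a line on commas. No support for quoted or escaped fields.
std::vector<std::string> splitCsvLine(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, ',')) {
        fields.push_back(field);
    }
    // std::getline drops an empty field after a trailing comma, so restore it.
    if (!line.empty() && line.back() == ',') {
        fields.emplace_back();
    }
    return fields;
}

// Parses one row; throws std::runtime_error unless it has exactly 7 fields.
CsvNode parseCsvNode(const std::string& line) {
    const std::vector<std::string> fields = splitCsvLine(line);
    if (fields.size() != 7) {
        throw std::runtime_error("expected 7 fields in row: " + line);
    }
    CsvNode node;
    node.name = fields[0];
    for (std::size_t i = 0; i < node.neighbors.size(); ++i) {
        node.neighbors[i] = fields[i + 1];
    }
    return node;
}
```

The fixed number of columns is what keeps this simple; once nodes can have an arbitrary number of connections, the rows stop being truly tabular.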

Otherwise I’d go with JSON as it is less of a hassle to parse when reading in.

One advantage XML brings is that you can label each node with a name and ID that you can use to locate its connecting nodes in your file. XML parsing is sensitive to proper begin and end tagging and may fail when you make a mistake during an edit.

Property files are another option you could consider, where you use property keys that identify the node and its attributes.

Node1.text=xxxxxxxxx
Node1.east=Node5

...
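A file like this is easy to read with the standard library alone. The following is a minimal sketch, assuming every line has the form `Node.attribute=value`, that the node name contains no dot, and that blank lines and lines starting with `#` can be skipped. `Properties` and `parseProperties` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// node name -> (attribute name -> value)
using Properties = std::map<std::string, std::map<std::string, std::string>>;

// Parses text made of lines such as "Node1.east=Node5".
// Throws std::runtime_error on a line without "name.attribute=" structure.
Properties parseProperties(const std::string& text) {
    Properties nodes;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') {
            continue;  // assumed convention: skip blanks and comments
        }
        const std::size_t eq = line.find('=');
        const std::size_t dot = line.find('.');
        if (eq == std::string::npos || dot == std::string::npos || dot > eq) {
            throw std::runtime_error("malformed property line: " + line);
        }
        const std::string node = line.substr(0, dot);
        const std::string attribute = line.substr(dot + 1, eq - dot - 1);
        nodes[node][attribute] = line.substr(eq + 1);
    }
    return nodes;
}
```

After parsing, `nodes["Node1"]["east"]` holds the name of the connected node, which still has to be looked up among the other keys to build the net.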

In each of these cases you don’t really want to store more than 10,000 lines of text as that becomes cumbersome for a text editor.

Another option would be an application-specific SQL database, using a table to hold your node information and SQL to traverse your net. This could free you from the potential size limitations that the other schemes impose for in-memory storage.

The last option would be to use a serialization scheme like in Java and store node info in binary. Your program builds and maintains the net and you never look at the binary serialization. You also need to mark nodes as you save them, to be sure you don't get caught in a loop.
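The marking step can be sketched as follows. This is not a full serializer: it only computes the order in which nodes would be written, assuming the net is held as index-based adjacency lists with all indices in range. `saveOrder` is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Returns the order in which nodes reachable from `start` would be saved.
// Each node is marked when it is first queued, so cycles cannot cause a loop.
std::vector<std::size_t> saveOrder(
        const std::vector<std::vector<std::size_t>>& adjacency, std::size_t start) {
    std::vector<std::size_t> order;
    if (start >= adjacency.size()) {
        return order;
    }
    std::vector<bool> marked(adjacency.size(), false);
    std::vector<std::size_t> pending{start};  // stack of nodes still to write
    marked[start] = true;
    while (!pending.empty()) {
        const std::size_t node = pending.back();
        pending.pop_back();
        order.push_back(node);  // a real serializer would write the node here
        for (std::size_t next : adjacency[node]) {
            if (!marked[next]) {
                marked[next] = true;
                pending.push_back(next);
            }
        }
    }
    return order;
}
```

For a three-node ring 0 -> 1 -> 2 -> 0, each node appears exactly once in the result even though following the links alone would go around forever.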
 
Personally, I haven't found an easy way to parse JSON input files, whereas there are generally built-in libraries to handle either of the other two options. If there is a good JSON input library, then it is fine.
 