Why do programming languages usually not implement number types with units?

Vanadium 50 · Sep 26, 2021

Jarvis323 said:

There is actually a boost library to do this.

My understanding from reading the description is that this would be perfectly happy adding an energy to a torque.

Jarvis323 · Sep 26, 2021

Vanadium 50 said:

My understanding from reading the description is that this would be perfectly happy adding an energy to a torque.

I think it allows you to define what can be added to what. But it will give a compile-time error if the evaluated expression type cannot be converted to the return type. So if energy and torque have a conversion defined, so that one can be converted to the other, then you could add them; otherwise you would get a compiler error.

Vanadium 50 · Sep 26, 2021

Energy and torque have the same dimensions, but are not the same thing, so one cannot convert between them.

As I understand it, that Boost library will throw an error if I attempt to add an energy to a position, but not when I attempt to add an energy to a torque. Or a Reynolds number to a Mach number.

Baluncore · Sep 26, 2021

Vanadium 50 said:

My understanding from reading the description is that this would be perfectly happy adding an energy to a torque.

That is true. It would also be happy to add your height to your chest measurement, and to subtract that from the distance to Paris. Dimensional analysis cannot be relied upon to trap all silly errors.
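To illustrate the point, here is a minimal Python sketch of a purely dimensional check. It is not the Boost library's API, and it checks at run time rather than at compile time; the `Quantity` class and the `ENERGY`, `TORQUE` and `LENGTH` tuples are hypothetical names invented for this example.

```python
from dataclasses import dataclass

# Dimension exponents over (mass, length, time); illustrative names only.
ENERGY = (1, 2, -2)   # kg·m²/s², i.e. joule
TORQUE = (1, 2, -2)   # kg·m²/s², i.e. newton-metre: same exponents as energy
LENGTH = (0, 1, 0)    # metre


@dataclass(frozen=True)
class Quantity:
    value: float
    dim: tuple

    def __add__(self, other):
        # Addition is allowed only when the dimension exponents match.
        if self.dim != other.dim:
            raise TypeError(f"cannot add dimensions {self.dim} and {other.dim}")
        return Quantity(self.value + other.value, self.dim)


energy = Quantity(10.0, ENERGY)
torque = Quantity(3.0, TORQUE)
height = Quantity(1.8, LENGTH)

# Accepted, although physically meaningless: the dimensions agree.
nonsense = energy + torque

# Rejected: energy and length differ in dimension.
try:
    energy + height
    rejected = False
except TypeError:
    rejected = True
```

A check based only on dimension exponents has no way to tell the energy from the torque, which is exactly the gap described above.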

elcaro · Sep 26, 2021

Vanadium 50 said:

Energy and torque have the same dimensions, but are not the same thing, so one cannot convert between them.

As I understand it, that Boost library will throw an error if I attempt to add an energy to a position, but not when I attempt to add an energy to a torque. Or a Reynolds number to a Mach number.

The type system could then treat them as distinct types despite their having the same physical dimensions (units). The drawback is that for any calculation that produces something with the dimension of energy or torque, you would have to specify which type you intend it to be.
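A rough Python sketch of that idea, assuming a hypothetical `kind` label stored next to the dimension (the names `Tagged` and `force_times_length` are made up for illustration, and the check again happens at run time):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    value: float
    dim: tuple   # exponents over (mass, length, time)
    kind: str    # hypothetical label, e.g. "energy" or "torque"

    def __add__(self, other):
        # Both the dimension and the declared kind must agree.
        if (self.dim, self.kind) != (other.dim, other.kind):
            raise TypeError(f"cannot add {self.kind} to {other.kind}")
        return Tagged(self.value + other.value, self.dim, self.kind)


def force_times_length(force_n: float, length_m: float, kind: str) -> Tagged:
    # N·m could be work or torque; the caller has to say which one is meant.
    return Tagged(force_n * length_m, (1, 2, -2), kind)


work = force_times_length(5.0, 2.0, kind="energy")
torque = force_times_length(5.0, 2.0, kind="torque")

total_work = work + work   # fine: same dimension and same kind

try:
    work + torque
    mixed_allowed = True
except TypeError:
    mixed_allowed = False
```

The extra `kind` argument on `force_times_length` is the cost mentioned above: the product of a force and a length carries no hint of which interpretation is intended.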

Baluncore · Sep 26, 2021

elcaro said:

The drawback is that for any calculation that produces something with the dimension of energy or torque, you would have to specify which type you intend it to be.

The solution to the problem is forever expanding. It is actually better to verify that the dimensions are correct before you waste time writing code. Simply plugging numbers into equations without careful thought is not something to be encouraged.

Jarvis323 · Sep 26, 2021

Vanadium 50 said:

Energy and torque have the same dimensions, but are not the same thing, so one cannot convert between them.

As I understand it, that Boost library will throw an error if I attempt to add an energy to a position, but not when I attempt to add an energy to a torque. Or a Reynolds number to a Mach number.

You could differentiate those types in a special way such that no new conversion functions are required, but you have compiler flags to give either an error or a warning if the compiler tries to convert one to the other.

e.g. something like this

delineate unit torque as Nm from joule warn
torque theTorque = calculateTorque()
joule energy = theTorque

//compiler says warning: implicit conversion of delineated unit torque as Nm to joule.
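Something in this spirit can be approximated today with Python's standard `warnings` module. This is only a sketch: Python has no implicit conversions, so the `Joule` constructor stands in for the assignment, and the `Torque` and `Joule` classes are hypothetical.

```python
import warnings


class Torque:
    def __init__(self, newton_metres: float):
        self.newton_metres = newton_metres


class Joule:
    def __init__(self, value):
        if isinstance(value, Torque):
            # Same dimensions, different meaning: allow it, but flag it.
            warnings.warn(
                "implicit conversion of torque (N·m) to joule",
                UserWarning,
                stacklevel=2,
            )
            value = value.newton_metres
        self.value = float(value)


# Record warnings instead of printing them, so the result can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    energy = Joule(Torque(12.0))
```

Switching the filter to `warnings.simplefilter("error")` turns the same warning into an exception, which plays the role of the error-or-warning flag.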

elcaro · Sep 26, 2021

pbuk said:

Precisely. So we deal with the unit conversions in the presentation layer, not the implementation layer (and certainly not embedded in the language).

The point is that you only want these constraints enforced at the compilation phase without any runtime performance penalty.

Jarvis323 · Sep 26, 2021

One thing that would be pretty cool is if a tool could use the units to generate latex in the autodocs. It would be useful probably for research papers where the code is coupled with the paper, and you have to describe the code mathematically and algorithmically and link it to the theory.

pbuk · Sep 26, 2021

elcaro said:

The point is that you only want these constraints enforced at the compilation phase without any runtime performance penalty.

No, my point was not about performance (which is irrelevant when talking about the user interface, humans have a pathetic clock rate), it was about the Separation of Concerns. The business logic of my program should not have to worry about the UI.

Every language is a balance between 'enforcing' things and being easy to code in. Because you can only prevent a subset of coding errors at compile time I believe that the balance should be low on enforcement and high on ease of coding (and unit testing). This belief comes partly from experience with Ada (including a project abandoned with USD20m in today's money on the clock), which as has been mentioned upthread is so strict in its persecution of errors that can be caught at compile time that it is almost impossible to create a system of any size that ever gets to run and exhibit the algorithmic errors that can't be caught!

Anyway it has been established above that among general purpose languages Measure types exist in at least F# and Haskell (edit: and in C++ via a metaprogramming library): isn't that enough for you?

elcaro · Sep 26, 2021

No, my point was not about performance (which is irrelevant when talking about the user interface, humans have a pathetic clock rate), it was about the Separation of Concerns. The business logic of my program should not have to worry about the UI.

The idea is that also outside the UI you can use numbers with measure, but this won't generate code, just checks that your unit usage is consistent.

Every language is a balance between 'enforcing' things and being easy to code in. Because you can only prevent a subset of coding errors at compile time I believe that the balance should be low on enforcement and high on ease of coding (and unit testing). This belief comes partly from experience with Ada (including a project abandoned with USD20m in today's money on the clock), which as has been mentioned upthread is so strict in its persecution of errors that can be caught at compile time that it is almost impossible to create a system of any size that ever gets to run and exhibit the algorithmic errors that can't be caught!

Errors in the usage of units should be caught; they are programming errors.

Anyway it has been established above that among general purpose languages Measure types exist in at least F# and Haskell (edit: and in C++ via a metaprogramming library): isn't that enough for you?

For me that is OK, as I am not an active programmer anymore; I meant the physics community.

Vanadium 50 · Sep 26, 2021

I agree that performance is often way overemphasized. There is no benefit to getting an incorrect answer faster than everybody else.

The idea that sometimes certain errors can be caught by the compiler, and sometimes the programmer has to think about what she intends seems to me not so helpful. The programmer doesn't need to know any less. The programmer doesn't need to think any less. If the compiler gives the code a clean bill of health do we know it's OK? No, we don't.

Further, this adds to the complexity. You want fewer programmer errors? Then you want the code less complex and not more.

C++ (and many other languages) provide features to do this, and in a more flexible way than fiddling around with the intrinsic types.

Rive · Sep 27, 2021

Vanadium 50 said:

I agree that performance is often way overemphasized.

Maybe, but not necessarily. Somewhere above (life) critical systems were mentioned.
Those usually do not have the newest CPU and top-notch hardware. Just something reliable.

Vanadium 50 said:

You want fewer programmer errors? Then you want the code less complex and not more.

Yeah. Less code, more documentation and engineering :doh:

Especially in case of 'critical' systems.
But if you have that, then why is this whole thing needed?

The only place I can think it would be slightly useful is in some specialized physics- or engineering-oriented language (for beginners/non-programmers).

jbergman · Sep 30, 2021

Vanadium 50 said:

I agree with this.

First, it's not entirely clear what is being proposed. If it is that "length in meters" is an internal type and "length in centimeters" is a different internal type such that they cannot be added without explicit conversion, that means that the only units that can ever be used are the ones built-in to the language.

Now, if one says, "no, this can be extended in the language to go beyond these intrinsic types", well, we're there now. I can do this in C++ today, where length_in_meters is an instance of the length class, which has two members: the value, and the units. (And if you like, length and area are derived classes from a base class)

This is completely wrong.

Look at the F# example I posted earlier. In F# you can define your own unit-of-measure types. Then you annotate numeric types with the unit of measure they carry, and you also define conversion functions.

See this blog for a more detailed exposition.

jbergman · Sep 30, 2021

elcaro said:

Perhaps these cases could be handled by performance optimization, which can be done automatically as a last step without losing the constraints on units enforced by the compiler.

Agree. I believe that F# removes the unit of measure checks as part of the compilation process.

jbergman · Sep 30, 2021

Vanadium 50 said:

My understanding from reading the description is that this would be perfectly happy adding an energy to a torque.

In F# units of measure you get an error if you add types which are not of the same measure or convertible to the same measure.

https://fsharpforfunandprofit.com/posts/units-of-measure/

jbergman · Sep 30, 2021

Rive said:

Maybe, but not necessarily. Somewhere above (life) critical systems were mentioned.
Those usually do not have the newest CPU and top-notch hardware. Just something reliable.
Yeah. Less code, more documentation and engineering
Especially in case of 'critical' systems.
But if you have that, then why is this whole thing needed?

The only place I can think it would be slightly useful is in some specialized physics- or engineering-oriented language (for beginners/non-programmers).

This comment is ludicrous.

Vanadium 50 · Sep 30, 2021

jbergman said:

This is completely wrong.

Which part?

Is it that things are completely clear? Not to me!

Is it that checking dimensions is not the same as checking units? I think that's self-evident, but in any event examples have been provided where this does not work.

Is it that we can achieve this today without changing intrinsic variables? Again the C++ examples have been discussed.

"You're just wrong" is not so helpful.

Rive · Sep 30, 2021

jbergman said:

This comment is ludicrous.

I can imagine that some would see it so.
Care to elaborate?

jack action · Sep 30, 2021

I don't know if I'm hijacking the thread, but I would prefer to have a language that can carry the error throughout the calculations before one that takes care of units. Something like the inputs are 12.34 (±0.01) and 1.234 × 10⁶ (±1000), then [calculations, calculations, calculations], and the output turns out to be 5.32846895031784 (±0.1). This way, I know that the final answer is 5.3 and all other decimals are superfluous, except if used in other calculations. You change the accuracy of the inputs and your answer might become 5 or 5.3285.
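The display half of this wish is easy to sketch in Python. The helper below (a hypothetical `format_with_error`, not part of any library) only rounds a result to the decimal place of its error's leading digit; it does not compute or propagate the error itself.

```python
from decimal import Decimal


def format_with_error(value: float, err: float) -> str:
    """Round value to the decimal place of the leading digit of err."""
    # adjusted() is the base-10 exponent of the most significant digit,
    # e.g. -1 for 0.1 and 3 for 1000.
    exponent = Decimal(str(err)).adjusted()
    if exponent < 0:
        digits = -exponent
        return f"{value:.{digits}f}"
    # Error of 1 or more: round to the nearest 10**exponent.
    return str(int(round(value, -exponent)))


result = 5.32846895031784
coarse = format_with_error(result, 1.0)      # "5"
medium = format_with_error(result, 0.1)      # "5.3"
fine = format_with_error(result, 0.0001)     # "5.3285"
```

The full-precision value is kept in `result`, so later calculations still see all the digits and only the presentation changes.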

elcaro · Sep 30, 2021

jack action said:

I don't know if I'm hijacking the thread, but I would prefer to have a language that can carry the error throughout the calculations before one that takes care of units. Something like the inputs are 12.34 (±0.01) and 1.234 × 10⁶ (±1000), then [calculations, calculations, calculations], and the output turns out to be 5.32846895031784 (±0.1). This way, I know that the final answer is 5.3 and all other decimals are superfluous, except if used in other calculations. You change the accuracy of the inputs and your answer might become 5 or 5.3285.

That seems a useful addition. It is part of extended numeric types that carry the information one needs when doing calculations in, for example, physical models, including the unit of measure and the error.

Vanadium 50 · Sep 30, 2021

jack action said:

I would prefer to have a language that can carry the error throughout the calculations before one that takes care of units.

Do you need it to be a change in how intrinsics behave or can it be a class?

Given that error propagation can be non-trivial, it is a lot easier to do as a class.
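As a minimal sketch of the class approach in Python, assuming independent inputs and first-order (Gaussian) propagation; the `Measured` class is hypothetical and far simpler than a real uncertainty library:

```python
import math


class Measured:
    """A value with a standard uncertainty; inputs are assumed independent."""

    def __init__(self, value: float, err: float):
        self.value = value
        self.err = abs(err)

    def __add__(self, other):
        # Absolute errors add in quadrature.
        return Measured(self.value + other.value, math.hypot(self.err, other.err))

    def __sub__(self, other):
        return Measured(self.value - other.value, math.hypot(self.err, other.err))

    def __mul__(self, other):
        # d(xy) = y dx + x dy, combined in quadrature.
        err = math.hypot(other.value * self.err, self.value * other.err)
        return Measured(self.value * other.value, err)

    def __truediv__(self, other):
        # d(x/y) = dx / y - x dy / y**2, combined in quadrature.
        err = math.hypot(self.err / other.value,
                         self.value * other.err / other.value ** 2)
        return Measured(self.value / other.value, err)


a = Measured(3.0, 0.3)
b = Measured(4.0, 0.4)
total = a + b      # error is hypot(0.3, 0.4)
product = a * b    # error is hypot(4.0 * 0.3, 3.0 * 0.4)
same = a - a       # value 0.0, but the reported error is not 0
```

The last line shows why this is non-trivial: `a - a` is exactly zero, yet the sketch reports a non-zero error because it ignores the correlation between its operands.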

Jarvis323 · Sep 30, 2021

Vanadium 50 said:

Given that error propagation can be non-trivial, it is a lot easier to do as a class.

I agree it is not trivial. Usually it is the work of mathematicians to apply theory based on the numerical algorithms, to get error bounds on things like matrix operations.

But then you also have the issue that high level code may not compile to what you expect.

If you have error propagation in a class, then the class has to know the hardware it will run on, and the code the compiler will generate, in addition to being able to apply complex global analysis of the dataflow and algorithms.

Vanadium 50 · Sep 30, 2021

Jarvis323 said:

the class has to know the hardware it will run on

Why?

Why is arithmetic used in error propagation different from any other use of arithmetic?

jack action · Sep 30, 2021

Vanadium 50 said:

Do you need it to be a change in how intrinsics behave

I don't know if I need it, but it would be nice if, when I write an equation with basic expressions, the program identifies the error based on how the number is written (say ±1 on the last significant digit) and gives the final answer rounded up. How idiotic is it when you get a float set to 3.00000000000000008 as an answer? It is literally a wrong answer. I think basic computing could correct that very easily. And getting 3.0 or 3.0000, instead of 3, would add meaning to the number.

I've done the unit conversion thing because it is something that bothers me as well. I've tried to do the error propagation, but it is much more complex to do (in a general way) without replacing all expressions (say, 'a + b' becomes 'add(a, b)') and all your programs become much harder to read (and write).

Jarvis323 · Sep 30, 2021

Vanadium 50 said:

Why?

Why is arithmetic used in error propagation different from any other use of arithmetic?

As an example, some compilers/platforms have compiler flags to do reduced precision floating point operations or approximations.

Vanadium 50 · Sep 30, 2021

You're talking about significant figures, which is a quick-and-dirty reflection of real error propagation. If you build that into your intrinsics, won't it make it that much harder to program in the more correct procedure?

I confess that I find the logic of this thread puzzling. What is proposed can be done in various flexible languages. But no, that's not good enough. It has to be by changing how the language deals with intrinsic types, often in incompatible ways.

Rive · Oct 1, 2021

Jarvis323 said:

I agree it is not trivial. Usually it is the work of mathematicians to apply theory based on the numerical algorithms, to get error bounds on things like matrix operations.

Yes. Things (again!) boil down to the 'engineering' part of software development.

Jarvis323 said:

But then you also have the issue that high level code may not compile to what you expect.

Quite a bullseye. All these fancy additions/modifications would be a nightmare to validate.

jack action · Oct 1, 2021

Vanadium 50 said:

won't it make it that much harder to program in the more correct procedure?

I guess I'm talking more about what a floating-point represents versus an integer.

Math with integers is easy: 1 is exactly 1, 3 is exactly 3, thus ##\frac{1}{3} = 0.\bar{3}##. It works very well in theoretical work.

But with more practical problems, you rarely work with such certainty. You deal with some level of precision and you have 2 limits: the significand length that the program can handle and the precision of the inputs. You cannot escape the former but, somehow, nobody cares about the latter. To me, ##\frac{1.0}{3.0} \neq \frac{1.00}{3.00}## and neither of them is equal to ##0.\bar{3}##. I would at least expect an answer that doesn't have more than 2 or 3 significant figures. But to be able to perform other calculations, having the error following the number is crucial.

It would be nice if a computer program could keep track of this throughout its calculations. So a floating-point would store a significand, an exponent, and an error.

pbuk · Oct 1, 2021

So in summary, we have the following requirements for floating point numerics:

they should have unit information attached so we can't add feet to metres
they should have dimensional information attached so we know that a velocity times a time is a distance
they should have physical information attached so we can't add a torque to an energy
they should have measurement error attached (according to what standard?) so we can present results appropriately
they should have round-off error attached so we can track its propagation
they should have truncation error attached so we can track this without having to understand the underlying algorithm (nobody mentioned this but I thought I'd add it for good measure)

I think it should now be obvious why these requirements are not implemented in general purpose languages, however subsets of these requirements are implemented in certain special purpose languages and also in modules for certain general purpose languages. Also, where a system is highly specialised and safety-critical, the general features of many special purpose languages can be used to implement numeric classes implementing whatever subset of these features is appropriate.

Mods, are we done?

Mark44 · Oct 1, 2021

pbuk said:

Mods, are we done?

Seems like it to me ...
Thread closed
