How do I eliminate the 2 strlens from this?

  • Thread starter: Jamin2112

Discussion Overview

The discussion revolves around optimizing a C code snippet that concatenates strings representing derivatives and an operation. Participants explore ways to eliminate redundant calls to the strlen function and improve the overall efficiency of the string manipulation process.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants express a desire to eliminate the use of strlen for efficiency, suggesting that it is redundant since strcat iterates over the strings.
  • One participant proposes rewriting strcat to handle the operation and two character pointers, potentially reducing the number of calls to strlen.
  • Another participant suggests using sprintf to simplify the concatenation process, although they acknowledge the need to determine the buffer size for dfdx.
  • A different approach is introduced, advocating for the use of C++ and std::string, which allows for simpler string concatenation using the "+" operator.
  • Some participants argue that making code more compact may not always be beneficial, especially if it reduces readability.
  • One participant emphasizes that in C, there is no way to avoid calling strlen twice due to the nature of C-style strings, while suggesting a method to allocate memory based on known lengths.
  • Concerns are raised about the need for parentheses in certain operations to maintain correct mathematical expressions.
  • Another participant mentions the potential need for a symbolic mathematics library for further simplification of expressions.

Areas of Agreement / Disagreement

Participants express differing views on the necessity of calling strlen multiple times and the implications of code compactness versus readability. There is no consensus on the best approach to optimize the code.

Contextual Notes

Participants note that C-style strings do not store their length, necessitating the use of strlen for memory allocation. There are unresolved questions regarding the handling of specific operations and the potential need for additional tools for symbolic differentiation.

Jamin2112
I have a chunk of code that is like

Code:
  if (rt->op) // if rt is of the form rt = gx op hx
  {
      char * dgdx = deriveFromTree(rt->gx); // g'(x)
      char * dhdx = deriveFromTree(rt->hx); // h'(x)
      char thisop = *rt->op;
      if (thisop == '+' || thisop == '-')
      {
          //  ADDITION/SUBTRACTION RULE:
          //  dfdx = dgdx + thisop + dhdx
          long n = strlen(dgdx) + strlen(dhdx) + 2; // both strings, the operator, and '\0'
          dfdx = malloc(sizeof(char) * n);
          dfdx[0] = '\0'; // strcat appends to an existing string, so start from an empty one
          dfdx = strcat(dfdx, dgdx);
          dfdx = strcat(dfdx, charToString(thisop));
          dfdx = strcat(dfdx, dhdx);
      }

and I want to do it without the strlen(dgdx) and strlen(dhdx) because they seem redundant considering that the implementation of strcat iterates over all the characters of the string again. How do I redo this overall crappy procedure?
 
So you're trying to add or subtract two numbers by building the operation into a string that takes up the amount of memory defined by the numbers and the operation, so that you can then solve it, I'm guessing? Or what's the end goal of this supposed to be? There are also vastly better, infinitely easier languages to do this in than C.

I suppose you could always write your own strcat-like function that takes a character (for the operation) and two character pointers, counts how many characters are in the strings, and calls malloc with that result. That would let you avoid calling strlen twice and call the modified strcat only once, making your code much cleaner and more efficient. But does that small amount of efficiency really even matter? Are you writing for a microcontroller or a mega database or something?
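
A minimal sketch of that single-helper idea, assuming a made-up name join_with_op (nothing in the thread defines it). Each input is still measured once, because a C string does not know its own length, but the repeated scans that strcat performs over the destination go away since the copies use the lengths already in hand.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the thread): returns a newly allocated
   string "g op h", or NULL if allocation fails. The caller frees it. */
char *join_with_op(const char *g, char op, const char *h)
{
    assert(g != NULL && h != NULL);

    size_t g_len = strlen(g);   /* each input is measured exactly once */
    size_t h_len = strlen(h);

    char *out = malloc(g_len + 1 + h_len + 1);  /* g, op, h, '\0' */
    if (out == NULL)
        return NULL;

    memcpy(out, g, g_len);                  /* no rescan of out, unlike strcat */
    out[g_len] = op;
    memcpy(out + g_len + 1, h, h_len + 1);  /* +1 also copies h's terminator */
    return out;
}
```

In the original snippet this would be used as dfdx = join_with_op(dgdx, thisop, dhdx); followed by a NULL check.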
 
How about a very short expression to do everything below // ADDITION...? (except the malloc part)

n = sprintf(dfdx, "%s %c %s", dgdx, thisop, dhdx);
 
An alternative is to use C++ and std::string rather than C and C-style strings. Then you can just add the strings, using the "+" operator:

Code:
std::string dgdx = deriveFromTree(rt.gx); // g'(x)
std::string dhdx = deriveFromTree(rt.hx); // h'(x)
std::string dfdx;
if (rt.op == '+' || rt.op == '-')
{
    //  ADDITION/SUBTRACTION RULE:
    dfdx = dgdx + rt.op + dhdx;
}
 
serp777 said:
There are also vastly better, infinitely easier languages to do this in than C.

The point is to make things harder
 
Svein said:
How about a very short expression to do everything below // ADDITION...? (except the malloc part)

n = sprintf(dfdx, "%s %c %s", dgdx, thisop, dhdx);

That greatly helps make my procedure more compact, but I still need a way to know how large to make buffer dfdx.
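
One possible way to size the buffer, sketched under the assumption of a C99-or-later library: calling snprintf with a NULL buffer and a size of 0 writes nothing and returns the number of characters the formatted output would need. The helper name format_sum is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats "g op h" into a buffer of exactly the
   right size. Returns NULL on failure; the caller frees the result. */
char *format_sum(const char *g, char op, const char *h)
{
    assert(g != NULL && h != NULL);

    /* Dry run: nothing is written, but the return value is the length
       the output would have, not counting the terminating '\0'. */
    int needed = snprintf(NULL, 0, "%s %c %s", g, op, h);
    if (needed < 0)
        return NULL;

    char *out = malloc((size_t)needed + 1);
    if (out == NULL)
        return NULL;

    /* Real run, bounded by the size that was just allocated. */
    snprintf(out, (size_t)needed + 1, "%s %c %s", g, op, h);
    return out;
}
```

This hides the length computation inside snprintf rather than removing it; the strings are still walked once to measure and once to copy.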
 
Jamin2112 said:
That greatly helps make my procedure more compact, but I still need a way to know how large to make buffer dfdx.

There was a long thread recently about the advantages of making code more compact (https://www.physicsforums.com/threads/code-readability-for-higher-level-languages.816168/). The upshot was that there really isn't much advantage in making the source code more compact, especially if it makes the code more opaque to the reader.
Jamin2112 said:
I want to do it without the strlen(dgdx) and strlen(dhdx) because they seem redundant considering that the implementation of strcat iterates over all the characters of the string again.
I don't see the calls to strlen() being redundant, as you say. You need to know the lengths of the two strings that you are going to concatenate. The fact that strcat copies character-by-character doesn't obviate the need to know how large the buffer for dfdx needs to be.
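
If the lengths have to be known anyway, one option is to carry them along with the pointer so that nobody has to recompute them. The sketch below assumes a hypothetical LStr type and assumes the code producing the derivative strings could be changed to return it; neither is part of the original program.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type (not from the thread): a heap string that
   remembers its own length, so callers never need strlen. */
typedef struct {
    char  *data;  /* NUL-terminated, owned by this struct */
    size_t len;   /* number of characters before the '\0' */
} LStr;

/* Wraps a C string; this is the only place strlen runs.
   On allocation failure, data is NULL and len is 0. */
LStr lstr_from(const char *s)
{
    LStr r = { NULL, 0 };
    assert(s != NULL);
    size_t n = strlen(s);
    r.data = malloc(n + 1);
    if (r.data != NULL) {
        memcpy(r.data, s, n + 1);
        r.len = n;
    }
    return r;
}

/* Builds "g op h" from the stored lengths alone. */
LStr lstr_join(LStr g, char op, LStr h)
{
    LStr r = { NULL, 0 };
    assert(g.data != NULL && h.data != NULL);
    r.data = malloc(g.len + 1 + h.len + 1);
    if (r.data != NULL) {
        memcpy(r.data, g.data, g.len);
        r.data[g.len] = op;
        memcpy(r.data + g.len + 1, h.data, h.len + 1);
        r.len = g.len + 1 + h.len;
    }
    return r;
}
```

Because every join records the length of its result, a recursive derivative builder written this way would call strlen only on the leaf strings.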
 
Jamin2112 said:
The point is to make things harder
There is little point in doing that in professional programming. A major aspect of professional programming is making things easier. From other threads, you are an aspiring C++ programmer. If that's still the case, you should stop thinking like a C programmer, or like a Java programmer. Well-written C++ has zero calls to malloc and free, and very few calls to new and delete.

Given that, ...
Jamin2112 said:
That greatly helps make my procedure more compact, but I still need a way to know how large to make buffer dfdx.

Unlike strings in many other languages, C-style strings don't store the length as a separate property. That, coupled with the need to allocate storage, means there's no way around calling strlen twice for this problem. There is however something you can do since you know the lengths of the strings:
Code:
char * dgdx = deriveFromTree(rt->gx); // g'(x)
char * dhdx = deriveFromTree(rt->hx); // h'(x)
char thisop = *rt->op;
std::size_t dgdx_len = std::strlen(dgdx);  // Omit std:: if you are using C.
std::size_t dhdx_len = std::strlen(dhdx);
char *dfdx = nullptr;                      // Use NULL in C. Keeps the delete[] below safe if no branch assigns dfdx.
if ((thisop == '+') || (thisop == '-'))
{
   dfdx = new char[dgdx_len+1+dhdx_len+1]; // Use malloc instead of new in C.
   std::strcpy (dfdx, dgdx);
   dfdx[dgdx_len] = thisop;                // There's no need for calling strcat here.
   std::strcpy (dfdx+dgdx_len+1, dhdx);    // Nor here.
}
...
delete[] dfdx; // Replace with free if you are using C, but never omit this.

There are some issues with the above. What if thisop is '-' and dhdx is "a+b*x+c*x^2"? You probably want to put parentheses around the embedded dhdx. I'll leave that as an exercise for the OP. You may also want to simplify the resultant expression. That is not an exercise for the OP. It means you need a symbolic mathematics library or a symbolic mathematics tool.
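
A rough sketch of the parenthesization point, assuming the simple rule "wrap the right-hand side when the operator is '-' and the right-hand side has a '+' or '-' outside any parentheses". The names has_top_level_sum and join_terms are made up for illustration, and the rule does not attempt any simplification.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if expr contains a '+' or '-' outside any parentheses. */
int has_top_level_sum(const char *expr)
{
    int depth = 0;
    assert(expr != NULL);
    for (const char *p = expr; *p != '\0'; ++p) {
        if (*p == '(')
            ++depth;
        else if (*p == ')')
            --depth;
        else if (depth == 0 && (*p == '+' || *p == '-'))
            return 1;
    }
    return 0;
}

/* Builds "g op h", wrapping h in parentheses when op is '-' and h is
   itself a sum or difference. The caller frees the result. */
char *join_terms(const char *g, char op, const char *h)
{
    assert(g != NULL && h != NULL);

    int wrap = (op == '-') && has_top_level_sum(h);
    const char *lparen = wrap ? "(" : "";
    const char *rparen = wrap ? ")" : "";

    /* Dry run to learn the required size, then the real formatting. */
    int needed = snprintf(NULL, 0, "%s%c%s%s%s", g, op, lparen, h, rparen);
    if (needed < 0)
        return NULL;

    char *out = malloc((size_t)needed + 1);
    if (out == NULL)
        return NULL;

    snprintf(out, (size_t)needed + 1, "%s%c%s%s%s", g, op, lparen, h, rparen);
    return out;
}
```

With the example from this post, subtracting "a+b*x+c*x^2" from "1" gives "1-(a+b*x+c*x^2)" rather than the incorrect "1-a+b*x+c*x^2".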

You might want to rethink your use of C/C++ here.
 
If you still want to use C:

Declare a local character array with space for the longest possible string plus some extra headroom: char temp[MAX_OP_SIZE];
Then do what I said above (and introduce the parentheses suggested by D H):
n = sprintf(temp, "(%s) %c (%s)", dgdx, thisop, dhdx);
Then store the result:
dfdx = strdup(temp);
And you want to check for errors:
if (dfdx==NULL)
// Run in circles, scream and shout...
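
The steps above could be put together roughly as follows. This is only a sketch: it swaps sprintf for snprintf so that an oversized result is detected instead of overrunning temp, it uses malloc plus memcpy as a portable stand-in for strdup, and the value of MAX_OP_SIZE and the function name build_derivative are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_OP_SIZE 256  /* assumed value; the thread does not give one */

/* Hypothetical wrapper: returns a heap copy of "(dgdx) op (dhdx)",
   or NULL if the result does not fit in temp or allocation fails. */
char *build_derivative(const char *dgdx, char thisop, const char *dhdx)
{
    char temp[MAX_OP_SIZE];
    assert(dgdx != NULL && dhdx != NULL);

    /* snprintf never writes past sizeof temp; a return value of
       sizeof temp or more means the output was truncated. */
    int n = snprintf(temp, sizeof temp, "(%s) %c (%s)", dgdx, thisop, dhdx);
    if (n < 0 || (size_t)n >= sizeof temp)
        return NULL;

    char *dfdx = malloc((size_t)n + 1);  /* does the job of strdup(temp) */
    if (dfdx != NULL)
        memcpy(dfdx, temp, (size_t)n + 1);
    return dfdx;
}
```

The trade-off is a hard upper limit on the expression length, which may be too restrictive for derivatives that grow with every level of the tree.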
 
D H said:
There are some issues with the above. What if thisop is '-' and dhdx is "a+b*x+c*x^2"? You probably want to put parentheses around the embedded dhdx.


Good call. I can't believe I didn't think of that.

You may also want to simplify the resultant expression. That is not an exercise for the OP. It means you need a symbolic mathematics library or a symbolic mathematics tool.

I'm handrolling a tool for symbolic differentiation and don't have any plan to simplify. Right now it looks atrocious (http://codepad.org/mRv44sr1), but is close to working. The insertInTree function is what's giving me trouble (I asked a question about it here: http://stackoverflow.com/questions/...algorithm-to-insert-a-node-in-a-function-tree).