I want to better write this JavaScript procedure

  • Context: Java 
  • Thread starter: Jamin2112
  • Tags: Javascript Procedure

Discussion Overview

The discussion revolves around optimizing a JavaScript function designed to convert text into an array of words. Participants explore various optimization techniques, the efficiency of string manipulation, and the relevance of such optimizations in modern programming contexts.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant seeks help to reduce the number of comparisons in the provided text2words function, emphasizing the importance of optimization.
  • Another participant argues that unless the code is executed on a massive scale, the time spent on optimization may not be justified given modern computing speeds.
  • A different viewpoint suggests that writing reasonably optimal code from the start is more beneficial than excessive optimization later, highlighting potential inefficiencies in the current string manipulation approach.
  • Concerns are raised about the performance implications of adding characters to a string one at a time, with suggestions to handle substrings instead.
  • One participant proposes using an array of flags for character checks as a faster alternative to the current is_letter function.
  • Another participant questions the efficiency of string addition versus using the concat() method, seeking clarification on their performance differences.
  • Discussion includes the immutability of JavaScript strings and how it affects performance when manipulating strings.

Areas of Agreement / Disagreement

Participants express differing opinions on the necessity and practicality of code optimization, with some advocating for it and others suggesting it may be unnecessary in many cases. There is no consensus on the best approach to optimize the function.

Contextual Notes

Participants mention potential limitations of the current implementation, including the handling of Unicode characters and the efficiency of dynamic memory allocation in string manipulation.

Jamin2112
As you may have figured out, I'm obsessed with spending exorbitant amounts of time making optimizations to my code. Can anyone help me cut down on the number of comparisons in the following text2words function? The intent of the function should be easy to figure out from the comments.

Code:
    function is_letter(c)
    {
        /* Returns whether a character c is in the set
           {'a', 'b', ..., 'z', 'A', 'B', ..., 'Z'}.
        */
        return ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'));
    }

    function text2words(S)
    {
        /* Given a string S, return an array A of all the words
           in S, a word being defined as a sequence of consecutive
           characters in the set {'a', 'b', ..., 'z', 'A', 'B', ..., 'Z'}.
        */
        var A = [];
        var thisword = "";
        var i = 0, j = S.length;
        while (i < j)
        {
            if (is_letter(S[i]))
            {
                // Collect the whole run of letters, one character at a time.
                while (i < j && is_letter(S[i]))
                    thisword += S[i++];
                A.push(thisword);
                thisword = "";
            }
            else
            {
                ++i;
            }
        }
        return A;
    }
 
Intellectually, this kind of optimization can be interesting, but in practical terms, unless this code is going to run on ZILLIONS of records every night and needs to be done by 6am, you are wasting time optimizing it, given the speed of modern computers. I used to do exactly the same thing just on general principles 'cause I grew up with computers when they were slow, but I gave it up years ago as a waste of time.
 
I agree optimizing code can be rather pointless, but the best trick is to learn to write reasonably optimal code first time around. Most programmers of a certain age learned to do that from necessity!

Adding characters to the string "thisword" one at a time could be slow, and hammer dynamic memory allocation if the implementation isn't very clever. It doesn't complicate the code much to search for the end of the word and then deal with the substring of S[] just once.
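To make the substring idea concrete, here is a minimal sketch. It keeps the same definition of a word as the original post; the names isLetter and text2wordsSlice are made up for illustration and are not from the thread. It records where a word starts, scans to its end, and copies the word out once with slice(). Stepping past the first letter right away also avoids testing it a second time, which trims one comparison per word.

```javascript
// Sketch only: same letter test as the original is_letter().
function isLetter(c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

// Hypothetical variant of text2words that copies each word once.
function text2wordsSlice(S) {
    const words = [];
    const n = S.length;
    let i = 0;
    while (i < n) {
        if (!isLetter(S[i])) {
            ++i;                 // not in a word: skip this character
            continue;
        }
        const start = i++;       // S[start] is already known to be a letter
        while (i < n && isLetter(S[i])) {
            ++i;                 // advance to one past the end of the word
        }
        words.push(S.slice(start, i));   // one copy per word
    }
    return words;
}

console.log(text2wordsSlice("Hello, world! It's 2 late"));
// [ 'Hello', 'world', 'It', 's', 'late' ]
```

Whether this is measurably faster than repeated += depends on the JavaScript engine and the input, so it is worth timing both on real data before committing to either.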

If your is_letter() function is only working on 8-bit characters, the quickest way is to set up an array of true/false flags with 256 elements, instead of doing a function call, up to 4 comparisons, and some logical operations.
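A rough sketch of the flag-array idea, assuming the text only needs the A-Z and a-z test from the original post. The names IS_LETTER, isLetterCode and text2wordsTable are invented for this example. The table is indexed by character code, so the scan uses charCodeAt() rather than comparing one-character strings.

```javascript
// 256-entry table: IS_LETTER[code] is 1 for 'A'..'Z' and 'a'..'z', else 0.
const IS_LETTER = new Uint8Array(256);
for (let c = 65; c <= 90; ++c) IS_LETTER[c] = 1;    // 'A'..'Z'
for (let c = 97; c <= 122; ++c) IS_LETTER[c] = 1;   // 'a'..'z'

function isLetterCode(code) {
    // Codes of 256 or more read past the end of the table and yield
    // undefined, so they are treated as non-letters.
    return IS_LETTER[code] === 1;
}

// Hypothetical variant of text2words driven by the lookup table.
function text2wordsTable(S) {
    const words = [];
    const n = S.length;
    let i = 0;
    while (i < n) {
        if (!isLetterCode(S.charCodeAt(i))) {
            ++i;
            continue;
        }
        const start = i++;
        while (i < n && isLetterCode(S.charCodeAt(i))) {
            ++i;
        }
        words.push(S.slice(start, i));
    }
    return words;
}

console.log(text2wordsTable("one2two three"));
// [ 'one', 'two', 'three' ]
```

This replaces up to four comparisons with a single indexed read. Whether that wins in practice is an open question in a JIT-compiled engine, so treat it as something to measure rather than a guaranteed speedup.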

If you are reading Unicode text, your is_letter function is completely broken anyway. There are lots of "letters" that are not encoded in 7-bit ASCII, even in European languages.
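For what it's worth, one way to handle letters outside ASCII is a different technique from anything proposed in this thread: a regular expression with a Unicode property escape. This assumes an engine that supports \p{L} with the u flag (ES2018 or later). The function name unicodeWords is made up for the example.

```javascript
// Sketch: treat any run of Unicode letters (general category L) as a word.
function unicodeWords(S) {
    // match() returns null when nothing matches, so fall back to [].
    return S.match(/\p{L}+/gu) || [];
}

console.log(unicodeWords("na\u00EFve caf\u00E9, d\u00E9j\u00E0 vu!"));
// [ 'naïve', 'café', 'déjà', 'vu' ]
```

Note that this changes the definition of a word compared with the original post, so it is only appropriate if accented and non-Latin letters are meant to count.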
 
AlephZero said:
Adding characters to the string "thisword" one at a time could be slow

What faster way is there of adding characters to the end of a string? JavaScript's string class doesn't have any kind of push() function.
 
Jamin2112 said:
That accomplishes the same task, but why is it more efficient than string addition? Is the implementation different?

JavaScript strings are effectively immutable (I say "effectively" because pedants might jump on a definition of "immutable" which doesn't quite apply to JavaScript.)

So every time you add one character to a string, you actually create a new string. For short strings, you lose because allocating the new memory area is expensive compared with copying a few bytes of data. For long strings, you lose because you are repeatedly copying large amounts of data, which not only eats up CPU clock cycles, but also eats up the CPU's fast memory cache.
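A small illustration of the "new string every time" point, using made-up variable names. Appending with += rebinds the variable to a different string value; anything else that held the old value still sees the old text.

```javascript
let word = "cat";
const before = word;    // a second reference to the same string value

word += "s";            // produces a new string; "cat" itself is untouched

console.log(before);    // cat
console.log(word);      // cats

// String methods behave the same way: they return new strings.
const shout = before.toUpperCase();
console.log(before);    // cat
console.log(shout);     // CAT
```

This shows only the language-level behaviour. Engines are free to optimize concatenation internally, so the copying cost described above may or may not show up in a given benchmark.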
 
