in Search
Welcome to Neopoleon - Sign in | Join | Help
Navigation: Home | Forums | Galleries

Optimization (not always a good thing)

There was a good thread in Chris Sells' OT list this morning.

Began as a list of questions that might be asked during an interview (.NET job). People were chiming in with their opinions, and someone eventually mentioned asking an interviewee to flesh out a string reversal function.

Stuart was quick off the block with a VB.NET answer:

Private Function Reverse(ByVal str As String) As String
    Dim rev As New StringBuilder(str.Length)
    For x As Integer = str.Length - 1 To 0 Step -1
        rev.Append(str.Substring(x, 1))
    Next
    Return rev.ToString
End Function

Starting with Stuart's version, the conversation quickly turned to providing potential implementations, most of which (if not all) were in C#.

It began with a couple improvements to Stuart's function, but spiraled into this obscure academic exercise chock full of unsafe blocks and pointer mumbo-jumbo.

Arguments ensued as disagreements arose over the finer points of unsafe coding. Stuart's example was discarded, and the resulting code snippets started looking more and more as though they had been run through obfuscators.

Now, I don't know about you, but if I were interviewing someone for a .NET position, and if I asked that person to implement a string reversal function, I'd want something more like Stuart's answer.

Was Stuart's the best? The fastest? The most compact?

No, no, and no.

But it probably took him about 12 seconds. The arguments over the unsafe pointer-based methods, though, continued. Some of the versions offered wouldn't even compile, while others totally ignored some important feature, and so on.

All the while, Stuart's could have been chunking away on some giant string. By the time the dust settled, Stuart's code could have reversed The Brothers Karamazov and restored it to its original orientation.

Before knowing what the string reversal function was going to be used for, and before knowing how often it was going to be used, these guys were pumping out more and more complex (and fragile in some cases) versions of the function.

As someone interviewing for the position, this would worry me to no end.

Take my day for example. I wrote a log parsing app a while back that took a while to do its thing. It was processing about 30 1 gig files, and sometimes took as much as an hour to complete. This involved file IO as well as some database mucky-muck.

If you can believe it, I wrote the app with no other intention than getting it to work. Goal #1 was for the thing to function. It wasn't until after I had finished that I decided to optimize.

Running a few tests, I found one block of code that was taking much longer than the others. An hour later, I had fixed the trouble spot, and the parser was down to about fifteen minutes to do the same job.

Now, what if I had approached the app with the intention of squeezing every last bit of performance out of it from the get go? Without being able to test the app, and without any profiling, I'd be coding blind - maybe spending hours to save milliseconds, and ignoring the real bottleneck which could very well be elsewhere.

That would suck.

If the app needed a string reversal function, and if I had to write it, then I would have cranked out something like Stuart's version. If it turned out that it was the source of some major performance issue, then I would have gone back and tweaked it.

I would have hired Stuart because I would have seen that Stuart wasn't going to waste my money by optimizing code that might have been just fine.

It seems to me that people expend a lot of effort on a regular basis for little reward. It's like these people who will circle a building for 20 minutes, waiting for a parking spot to open up. Instead of parking three blocks away and getting inside 15 minutes sooner, they just circle, and circle, and circle, and circle...

Published Thursday, February 05, 2004 12:36 AM by Rory

Filed Under:

Comments

 

GuyIncognito said:

Why any modern day String class doesn't have it's own Reverse() method is baffling to me...

It should have been something like this:

string Reverse(string value)
{
char[] chars = value.ToCharArray();
Array.Reverse(chars);
return new String(chars);
}

February 5, 2004 12:53 AM
 

GuyIncognito said:

Or rather if it were a member of the actual String class:

string Reverse()
{
char[] chars = ToCharArray();
Array.Reverse(chars);
return new String(chars);
}
February 5, 2004 12:55 AM
 

Ian Leff said:

Imports Microsoft.VisualBasic

...

str = StrReverse(str)
February 5, 2004 12:56 AM
 

GuyIncognito said:

Of course if you're using SQL Server you just bust out the REVERSE() function.
February 5, 2004 12:56 AM
 

Kevin Dente said:

Computer geeks run amok...
:)

I wonder what the collective productivity cost of that message thread was.
February 5, 2004 12:56 AM
 

Rory said:

Guy -

That was one of the options suggested.

Makes sense, and for .NET, a hell of a lot more sense than dropping blocks of unsafe code.

Either way, you and Stuart came up with sane string reversal functions that didn't try any complex mumbo-jumbo to save a few milliseconds where they might not even matter.

Which is good :)
February 5, 2004 12:57 AM
 

Rory said:

Ian -

"str = StrReverse(str)"

That's certainly a fine way :)

The idea, though, is to see how far someone's going to go if they *have* to write the function, and to see what kind of optimization they're going to do before knowing where the problems are.
February 5, 2004 12:58 AM
 

GuyIncognito said:

As for optimization, I've always found that the fastest way of doing something, is not doing it at all...
February 5, 2004 1:08 AM
 

Eric said:

As for the parking... it's not about the speed you get inside the facility, it's the speed you get outside to your car. :)
February 5, 2004 1:18 AM
 

TJ said:

Bud Light Presents "Real Men of Genius".

This bud is for you Mr. Write Complex and obscure code man.

You will write a 200 line method that could have been done in 20, just so you can say "Hey look at my code its complex". While your other coworkers go out for drinks after 5pm you stay behind just so you can pump out some more complex code.

So here is to you master of the obvious.

http://www.mizzou.edu/~npvytc/Budlight/Budlight.html
February 5, 2004 1:24 AM
 

Rory said:

Eric -

"As for the parking... it's not about the speed you get inside the facility, it's the speed you get outside to your car."

I take it you've been known to circle buildings in your car, waiting for the perfect spot :)
February 5, 2004 1:36 AM
 

Rory said:

TJ -

"You will write a 200 line method that could have been done in 20, just so you can say "Hey look at my code its complex". While your other coworkers go out for drinks after 5pm you stay behind just so you can pump out some more complex code."

*Exactly*.

Complexity for the sake of complexity is something I can't stand. It reminds me of people I knew when I was a teen who wrote "poetry" just to write poetry. The thought it had to be complex for it to be good - every other word had to be at least 19 syllables long :) Coders are guilty of the same thing.

Also, not only can your coworkers go out for drinks at 5pm, but so can the poor SOB who's going to have to maintain the code someday...
February 5, 2004 1:38 AM
 

John said:

Erm, if the Yahoo mail server wasn't playing up, everyone would have got my post informing them that the VB.NET version was in fact the _most performant_ solution!

http://groups.yahoo.com/group/win_tech_off_topic/message/26078
February 5, 2004 2:04 AM
 

John said:

Amen to Rory's post. Its a pet peeve of mine.

I once worked at a company where the manager of another group was giving me a hard time about a solution I was creating. He assumed everyone in my group were script kiddies. He asked things like would it be apartment threaded, why are you going to make it a dual interface etc. His focus was on me making sure it didn't slow down the production system.

I created the solution with no mind for performance. I went to the guy with the performance numbers. I said to him, "Is 70 transactions a second good enough?" (bear in mind the production system could not break single digits). He looked kinda shocked. I then added, "Cause if its not I can compile it in release mode!"

I have found that a simple well thought out design usually outperforms an optimized ill thought out one.

Rory hit the nail on the head too with his comment "Before knowing what the string reversal function was going to be used for, and before knowing how often it was going to be used..."

Did I mention its a pet peeve of mine?
February 5, 2004 3:14 AM
 

Scott said:

Jeebus Freist, was THAT why I got 3 digest versions of that mailing list in my inbox today?

I'm all for mental masturbation but c'mon.
February 5, 2004 3:19 AM
 

Scott said:

I think a better question would be "Give a real world example that requires you to reverse a string."
February 5, 2004 3:23 AM
 

Andy said:

I couldn't agree more. Design then refine. I sketch out a flow of the potential solution on a white board, kick it around a bit, then write it out, then go back through look for places to refine it, then one final time to comment and annotate. It's like writting a paper you don't write one paragraph in your thesis in the middle first then refine it until it's the best paragraph in the world. You have have an outline, then a rough draft, then it get's edited, then you redraft and refine and then you have a paper. Code is language, language is code, the rules of good writing apply to both.
February 5, 2004 3:29 AM
 

Stuart said:

Hey, thanks for the props, mister!

"Was Stuart's the best? The fastest? The most compact?

No, no, and no."

...um...I think...? :)

February 5, 2004 3:41 AM
 

John said:

Um, this is the part where I have to admit to making a blatent mistake in my tests. I blame a temporary brain explosion. The VB.NET version is in fact the _slowest_ (34 seconds) and I made a mistake when testing it. Sorry to those who don't get this message. (Mental note: don't post assertions to forums ever again).

Sorry. :(
February 5, 2004 4:09 AM
 

Rory said:

Scott -

"I think a better question would be 'Give a real world example that requires you to reverse a string.'"

RSE - Really Shitty Encryption :)
February 5, 2004 4:23 AM
 

Rory said:

Stuart -

"Hey, thanks for the props, mister!"

I've been a serious fan of yours lately <shrug>. You're one of the sanest coders I know.

I still get a warm glow when I think about your comment in the ADO.NET post...

"...um...I think...?"

Trust me - I knew that part would come off sounding a little harsh, but it was supposed to. I wanted to point out that the best *solution* might not be the most incredible implementation.

In other words, I thought your *approach* was best, if not your function :)
February 5, 2004 4:27 AM
 

Rory said:

Andy -

"Code is language, language is code, the rules of good writing apply to both."

I have some similar feelings:

http://neopoleon.com/blog/posts/404.aspx
February 5, 2004 4:28 AM
 

john said:

well, everyone has already said what I have to say. make it work, then refactor it if you need it to run faster, better, stronger.


February 5, 2004 8:04 AM
 

chrootstrap said:

Instead of making the function complex, is can be more amusing to make it as simple as possible. Quite often (depending on if 'simple' means 'using the secretly complex convenience method') this also yields highly optimized code. Short and simple is not a bad design philosophy.

An underlying question with a reversing strategy is 'do you want to do it in place?' The style and conventions of most HLLs would suggest that one would return a new, unique string. The memory allocation and object instantiation is likely to take more time than the iteration, under x length.

What comes to mind in a dynamic object oriented language is to merely change the underlying character generator of the string to produce reversed results. Assuming that there is a fundamental (e.g.) getChars method in the string, if that is over ridden with one that reverses the iteration, problem solved. You'd just create that method once and hook it into the string. Or, even more fun make the getChars() use an internal Parity member to determine which way to iterate and toggle that. A new class, ReversableString, for example.

Now, is that utterly ridiculous or what? ;-)
February 5, 2004 8:31 AM
 

Jeroen Frijters said:

I'm sorry but I cannot agree with you that Stuart's version is the best.

He did:
> rev.Append(str.Substring(x, 1))

That should have been:
> rev.Append(str.Chars(x))

There is no excuse for using Substring in this case.

BTW, the point of the unsafe version was not to show a "better" version, but just to have some fun. I hope some people learned something from it (and I pray that the lesson wasn't "I'd better use unsafe code for everything, because it is marginally faster").
February 5, 2004 8:52 AM
 

Mike Parsons said:

February 5, 2004 2:25 PM
 

Stuart said:

Just so y'all know, I understand that strings are allocated in heap memory, immutable, etc.., and I'm well aware that my use of Substring this case was bad, bad, bad.

I've been doing a bunch of VB.NET, C#, and C++ coding lately (VB.NET is my lingua franca), and when I wrote the VB function (which I truly did spend about 12 seconds doing) the first thing I typed was rev.Append(str(x)) in an attempt to grab the char (this is the thing to do in C# and C++). VB didn't like it so I scanned the Intellisense list for what I wanted, didn't see it, cursed VB for its inefficiency, and used Substring.

So anyway, you're absolutely correct, Jeroen. If I would have been writing "real" code, I would have taken the extra few seconds to find what I was looking for and used it.

The other thing that I would have included in real code was short-circuiting for zero-length and one-length strings, which was also mentioned on the OT list.

Kind of odd, though, that Jeroen says he can't agree with Rory that my code was the best when Rory stated, "Was Stuart's the best? ... No..."

Oh well, it was interesting to see how other people would handle such a problem and fascinating that there can be so much controversy over a simple string reversal function! :)
February 5, 2004 3:33 PM
 

Nils said:

“Premature optimization is the root of all evil.” —Donald Knuth
February 5, 2004 3:34 PM
 

John said:

Reminds me of the company that optimized their processor. They did the right thing by profiling to see where the processor spent most of its time. They then spent all their effort on that piece of code. At the end of their optimizations they had a very fast idle loop!!!

From "The Practice of Programming - Brian W. Kernighan and Rob Pike"
February 5, 2004 4:33 PM
 

Ian said:

Yep , I 100% agree!

See my ramble on .NET and charting.
I'm going through the same kind of log file hell, and my first thought was 'lets get it working!'
Its running at an OK speed, given it has to dump a ton of records into arrays, sort by various things and run some maths'y bits.
At some stage I'll figure out how to make it quicker.
As a side note,I threw up a quick progress dialog and updated its window with each logline read and performance tanked, so I stopped doing that! I guess I need to look at my "please wait" dialog code as I can't believe its C# being super slow across threads.
For now they'll have to live with a 'I'm still working.." window!

I had to write a string rev function for a job interview years ago with Dr Solomons (they offered and I turned down). I was very tempted to just use the built in because quite frankly who wouldn't?
I guess the question has some value but pretty much anyone can write it so I think the only real value it adds to the interview is to check if they can think quick or want to go off for hours on the theoretical.
February 5, 2004 4:36 PM
 

Rory said:

Ian -

"As a side note,I threw up a quick progress dialog and updated its window with each logline read and performance tanked"

I did the same thing, except that total time spent on updating the progress bar for each file never exceeded one second.

I wonder why it would have hurt performance for you so much... I *expected* it to for my app, but it didn't at all.

Hm...
February 5, 2004 4:44 PM
 

Ian said:

Yeah, it killed my performance. it took it down to reading maybe 500 records a second instead of 2 or 3k (I have to do a lot of parsing of each line to figure out if it's one we care about, if its valid etc)

its based on the same threading as the sample on code project (http://www.codeproject.com/cs/miscctrl/progressdialog.asp) but without any of the progress etc. it does update the text of a label though and I'm wondering if thats an issue. I don't have time to fiddle with it for now but I'll come back to it soon and take a closer look
February 5, 2004 4:56 PM
 

chrootstrap said:

An easy way to do multithreaded progress feedback is to have the feedback thread check the current progress at 't' interval. By sleeping between checks, the effect on the work thread is tunable and likely very low, particularly if it simply takes a ratio of a member to the end value.
February 5, 2004 5:41 PM
 

Steve Hiner said:

I wrote a popup progress dialog that I use a lot and I've found that I can rescue the performance of a tight loop by only updating the bar even 10 or 100 iterations. I think the performance killer in mine is that I call DoEvents after updating the text and progress bar so the form will update. Throw a DoEvents into any tight loop and you'll kill performance.
February 5, 2004 7:49 PM
 

Mike Dimmick said:

Throw a DoEvents into a tight loop and you'll have a headache. Use a thread. Use a C++ object if you have to get that thread, but use a thread.

Almost no-one is prepared for the unlimited reentrancy that DoEvents causes, allows, and the stack overflows that inevitably seem to result. I recently had to deal with a situation where a process was trying to make an out-of-process COM call while already having made one. The culprit? DoEvents.

As for optimisation: make it simple. The simpler your code, the better job the optimiser does. Optimise for space, not speed, unless the function is absolutely time critical. What you save on small time shavings is normally obliterated by overflowing the processor's instruction cache, additional paging overhead, and larger working set.
February 5, 2004 8:21 PM
 

John said:

I think that some who uses DoEvents probably doesn't know what DoEvents does..
February 6, 2004 12:10 AM
 

TrackBack said:

Thinking outside the box
February 5, 2004 2:24 PM
New Comments to this post are disabled

About Rory

I *own* this site, you loser.