Archive for December, 2008

Not Using Common Sense Considered Harmful

Friday, December 26th, 2008

I’m a little crosseyed from looking at code non-stop for the past few days, so I’ll take a minute to wax philosophical.

I’ve been seeing a few posts here and there that take a very dogmatic approach to things. Take any number of “Considered Harmful” posts, or the “You’re doing it wrong” post. The most recent example is “Comments Considered Harmful” at the PHP In Action blog. All of these posts invariably end up with the author backpedaling either in the comments or in a subsequent blog post.

Me, I find all this silly. It’s very obvious that the same set of “rules” won’t apply to every environment. While these sorts of posts tend to “stimulate” discussion, which Dagfinn actually claims as his goal, I definitely think there are better ways to achieve the same thing. I’m sure that a post titled “Making code more readable by modularization” with the exact same content would produce a much healthier, yet equally useful dialog.

The answer lies quite frequently in moderation.

Splitting out complex logic into smaller, well-named functions helps, but no amount of modularization will help a developer who, a year later, revisits my code that performs a 6-fold union to determine which resources a user has permissions for based on multiple relationship paths. That shit is confusing even to me. Nor will an ORM be able to do this work for me, at least not efficiently.

Using a framework is nice, but if you’re in an environment where the stack has been hacked up and customized, it frequently won’t work or will produce unexpected behavior. Naturally, there’s always the performance issue. There is nothing wrong with writing your own “framework,” whatever that may mean to you, that suits your own specific needs.

A careful examination of a problem tends to lead to a good solution. There are very few right solutions.

Here’s my list of “You’re doing it wrong if:”

  • You don’t take the time to fully understand the problem you’re trying to solve
  • You don’t document the thought process in a centralized, well known location
  • You don’t discuss the problem with anybody
  • You discuss the problem with EVERYBODY, including people with their own agendas that have no interest in that particular problem
  • You don’t consider how future changes will affect your work
  • You aren’t aware of already-existing solutions (I see SO much crappy code being written while there is a well-supported solution already available, sometimes from within the company)
  • You refuse to give up the crappy half-written solution you’ve created in favor of a better one you’ve discovered, having already started – out of pride
  • You don’t take ANY pride in what you do, and commit crappy code “just to get it done”
  • You don’t write any unit tests – you know that something broke when a user notices it
  • You write a unit test for EVERYTHING – your build takes longer than it would take to manually execute every usecase, and you’ve likely missed deadlines
  • You don’t know enough about the tools available to pick the right ones
  • You don’t follow the development of tools within your stack (saw a lot of that with the PHP4->PHP5 transition)
  • You don’t ask for input from someone that knows more than you do
  • You throw the book at everything – “Gang of Four” etc. are all books, but I think it’s a waste of time to write double the code needed for something, just to say that it follows a certain pattern (this will likely be its own post sometime in the future)
  • You don’t learn from past mistakes or refuse to admit they were mistakes
  • You give up common sense in favor of dogmatic principles like “Comments are harmful” and “Never write your own framework”

At the end of the day, the last bullet can probably sum up this entire post: use your common sense, and you’ll probably be fine. Naturally, if you’re a moron and lack any semblance thereof, disregard everything I’ve written, find some polarizing views on some blogs, and cherish ‘em like they’re your last pair of clean underpants the day your girlfriend gets back into town.

Gettin My Mitts on YUI 3 Widgets: Piemenu

Tuesday, December 23rd, 2008

I was given a small assignment at work – to develop a jazzed up front page for my group with links to all the apps/systems we offer (for those catching up, I work in Yahoo! Ops, making internal tools).

My manager suggested a pie menu as one of the ideas. “That’s crazy talk,” I thought at first. Pie menus are for non-profitable European video social networks. Then I thought about it, and decided to just write it – see if I’ve got some JavaScript chops (turns out, I do). If Schill and Dimitry can make pages with JavaScript that look like they were made in Flash, so can I, damn it!

I’d been taking an occasional look at YUI3 PR2, and decided I’d try it on for size for this project. I must say, it fits. Extending the Widget class properly using Y.mix to add attributes and Y.extend to add methods ends up handling a lot of the dirty work for you, such as parameters and defaults. This is definitely a case where the framework is well designed, and staying within its bounds has enormous benefits.

Though I’m a huge code hoarder by nature, I’m just gonna throw the initial wireframe up on Github and link it here. Feedback is welcome.

Docs on extending the widget class are here.

A small sample widget is here.

UPDATE: a simple proof of concept is here (click on the wrenches, and don’t expect too much)

Define Failure…

Saturday, December 20th, 2008

Just saw a salient example of something I notice quite a bit in various documentation sources.

Lots of manuals and API references will say stuff like “Returns FALSE on failure” with little to no clarification as to what that means. Though it is usually intuitive, there are frequent cases where a little more attention should be paid. Example: PHP manual page for Memcache::delete.

The method takes two parameters: the key to be deleted and the optional timeout. The second parameter specifies how long Memcache should wait before deleting the key. Like many others, the function “Returns TRUE on success or FALSE on failure.”

The primary use case of this method is obviously just deleting a value stored in memcached. But let’s actually examine the possible outcomes.

  • Key is found and successfully deleted. Obvious Success.
  • Memcached server cannot be reached. Obvious Failure.
  • Key is found, but some sort of network or server glitch prevents it from being deleted. Obvious Failure.
  • Key doesn’t exist. How do we classify this? On one hand, the “goal” of calling the method is accomplished – the key is not in Memcache. However, the method itself didn’t technically do what it was supposed to do. Its purpose is to delete a key, and it didn’t. Furthermore, it appears that the developer was mistaken as to the state of the cache. In a lot of cases that’s fine, but what if the developer is privately counting on the key actually being there? (This is fleshed out further in the second use case.) Matters are further complicated by things like memcachedb, a persistent storage backend that uses the memcache protocol. Here, a missing key could present a serious problem, and the developer should definitely know about it. Granted, this could just be another argument against putting persistent storage behind a protocol meant for the opposite. One way or another, it’s not clear from the documentation what the method would return in this event.

Things get a little more complicated once the second parameter is used. The timeout parameter allows the developer to delay the deletion of the key. The first three bullets above apply in roughly the same way with minor, obvious adjustments (e.g., in the third bullet replace “being deleted” with “having its ttl adjusted”).

The fourth bullet, however, is even more salient. The setting of a timeout value implies that the developer indeed not only expects the key to be there, but is counting on the key to be there n seconds later.

Naturally, I fully understand that one should rarely COUNT on certain things being in the cache, so the aforementioned concerns will likely be irrelevant in most applications. However, I’m not singling out PHP or Memcache: the same concern applies to plenty of other APIs. I remember wondering what “Failure” meant to the YUI Get utility (I thought it was just a non-200 HTTP code, but directing it to a non-existent URL didn’t seem to trigger it, so it’s unclear), and there are plenty of other cases.

I’m not a fan of returning error codes and having to use huge switch statements to determine what to do in the event of every failure in the application, nor do I advocate throwing fine-grained exceptions left and right (the two are nearly identical in my mind). However, I do believe that more care should be taken to document what constitutes a failure and, for functions that don’t have a clearly defined return value, success. Perhaps @failure and @success tags can be added to the docblock spec to facilitate such documentation. For every place in a method where false is returned to signify failure, a @failure block would be added, and likewise for success.
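To make the tag idea a bit more concrete, here is a rough sketch in C, the other language that shows up on this blog. Everything in it is made up for illustration: the toy_cache struct and toy_cache_delete function are hypothetical stand-ins, not Memcache, and @failure and @success are only the tags proposed above, not part of any existing docblock tool as far as I know.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_CACHE_SLOTS 8

/* Hypothetical in-memory cache: a fixed set of key slots. */
struct toy_cache {
    const char *keys[TOY_CACHE_SLOTS];   /* NULL marks an empty slot */
};

/**
 * Remove a key from the toy cache.
 *
 * @success the key was present and its slot has been emptied
 * @failure cache or key is NULL
 * @failure the key was not in the cache
 */
bool toy_cache_delete(struct toy_cache *cache, const char *key)
{
    if (cache == NULL || key == NULL)
        return false;                    /* @failure: bad arguments */

    for (size_t i = 0; i < TOY_CACHE_SLOTS; i++) {
        if (cache->keys[i] != NULL && strcmp(cache->keys[i], key) == 0) {
            cache->keys[i] = NULL;
            return true;                 /* @success: key removed */
        }
    }
    return false;                        /* @failure: key not found */
}
```

With one tag per return path, the “key doesn’t exist” case stops being a guessing game: the reader can see that deleting a missing key counts as a failure in this particular function.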

PS: if anybody happens to know the correct answer to my Memcache::delete question, let me know!

A Couple of Quick PHP Tricks

Friday, December 19th, 2008

More seasoned PHP hackers probably already know this stuff, but I thought I’d share a couple things that made me think “Man this is a cool fucking feature” when I used them in my work yesterday.

array_keys is smarter than it looks

The short description for this function in the PHP docs says “array_keys: Return all the keys of an array.” Unfortunately, that is as far as a lot of people read. However, there is an incredibly useful feature hidden in the optional parameters: “If the optional search_value is specified, then only the keys for that value are returned.” So apart from just dumbly getting all the keys the array contains, you can actually parse out some really useful stuff.

Use Case:

You have a table that lists a bunch of entities, and you have to give the user the ability to perform one of n actions on each entity, where n > 1 and the actions are mutually exclusive. Simply defining a bunch of checkbox arrays won’t work, so you have to use radio buttons. (You could use checkboxes and use JavaScript to ensure the exclusivity, but then you’d be a jackass).

Radio buttons, as you know, group by name. So you have to have n radio buttons per row, each with a different value, and a name that indicates which entity the input pertains to. If you’re really smart, you could make each radio group a member of an array, where the array’s key indicates the ID of the entity:

<input type="radio" name="action[<?= $entity->id ?>]" value="delete" />

When the form is submitted, you’ll end up with $_POST['action'][12] => 'delete', etc.

Now that you’ve got your input in a nice, tidy array, you can start the magic:

$deletions = array_keys($_POST['action'], 'delete');
$approvals = array_keys($_POST['action'], 'approve');

Now you have just the IDs of the entities that need to be deleted or approved. Unless you have some really shitty logic libraries, that should make your life incredibly easy. I obviously skipped stuff like validation for the sake of brevity and readability, but I would also recommend mapping your actions to an indexed array, so that your radio values are actually more like 0, 1, 2, 3, etc. so as to minimize the amount of data posted. Some would also say that you shouldn’t use PHP short tags, but I really just don’t care.

I should also mention that in PHP 5 array_keys has a third parameter, bool $strict, which causes it to use === comparison instead of == when searching the array. Full manual here.

array_splice’s 4th optional parameter might be its most useful

I found myself somewhat dumbfounded when a co-worker asked me if there was a standard function in PHP to insert a set of values at a given point inside an indexed array – I couldn’t think of it! My intuition drew me towards array_splice, even though I knew that the default behavior of that function was the opposite. Having just used the aforementioned obscure feature of array_keys, I guessed that array_splice would either have a similar feature or would direct me to the manual page for its complement function. The former was correct.

The 4th parameter of array_splice is $replacement. It is pretty self-explanatory, but I’ll hold your hand like a small child and do an example anyway.

<?php
$alphabet = $alphabet_broken = range('a','z');
//default behavior. We just got rid of b and c
$missing = array_splice($alphabet_broken, 1, 2);
//same offset, 0 length so nothing is removed
//the missing letters go in as $replacement
array_splice($alphabet_broken, 1, 0, $missing);
//if I don't suck at life, the two should be identical.
//!== checks values and order; array_diff would ignore order
if ($alphabet !== $alphabet_broken) {
    echo "Something got fucked up!";
}
?>

The full manual entry for array_splice is here.

My wordpress is more better now too!

Sunday, December 14th, 2008

Upgraded to 2.7 and swapped to a very basic, unfinished theme. Just had to get rid of the orange-ish monstrosity I had up here before; obviously still working on it.

WordPress 2.7 is very nice so far (using the QuickPress feature for this)

My C is more better now!

Friday, December 12th, 2008

I have pushed an update for my silly php extension to github for anyone that wants to have a looksie. I’ve optimized the human_interval_precise function to only declare two variables (the long to hold the number of seconds passed in from userspace and the char array that gets passed back). Still need to figure out a sensible max length for the return value and replace the arbitrary char[60] I have in there now.
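One possible way to pick that max length, sketched under assumptions: if the output looks like "Nw Nd Nh Nm Ns " (a guess based on the "%dd " fragment used elsewhere on this blog, not the extension’s confirmed format), then the longest result comes from the biggest long, and snprintf with a NULL buffer and size 0 will report how many characters that would take. The function name interval_max_size is made up.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: bytes needed (terminating NUL included) for the
 * longest possible result, assuming a "Nw Nd Nh Nm Ns " layout. */
size_t interval_max_size(void)
{
    /* The week count is the only field that grows without bound;
     * the others top out at 6d, 23h, 59m and 59s. */
    long weeks = LONG_MAX / 604800L;

    /* With a NULL buffer and size 0, snprintf writes nothing and
     * returns the length the formatted string would have. */
    int n = snprintf(NULL, 0, "%ldw 6d 23h 59m 59s ", weeks);

    return n < 0 ? 0 : (size_t)n + 1;
}
```

Under those assumptions the answer depends on how wide long is on the platform, which is a decent argument for computing it rather than hardcoding a number.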

I’ve also added a human_interval (not precise) function for approximating in the largest possible units. However, I just realized as I was writing this that I forgot to make it round in any way, so it probably comes up with fairly bogus results right now. Fail. Good thing this is just an exercise, and I’ll have another 3 hours on a shuttle to kill on Monday… and on Tuesday… and on Wednesday…
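For what the missing rounding might look like, here is a minimal sketch, not the extension’s code. The name interval_approx, the unit suffixes, and the caller-supplied buffer are assumptions; the idea is just to add half a unit before the integer division so the result rounds to the nearest whole unit instead of truncating.

```c
#include <stdio.h>

/* Hypothetical sketch: express seconds in the largest unit that fits,
 * rounded to the nearest whole unit, e.g. 90 seconds becomes "2m". */
void interval_approx(long seconds, char *out, size_t outsize)
{
    static const struct { const char *suffix; long length; } units[] = {
        { "w", 604800L }, { "d", 86400L }, { "h", 3600L },
        { "m", 60L }, { "s", 1L },
    };

    for (size_t i = 0; i < sizeof units / sizeof units[0]; i++) {
        if (seconds < units[i].length)
            continue;               /* unit is too big, try a smaller one */

        /* Adding half a unit before dividing rounds to nearest. */
        long count = (seconds + units[i].length / 2) / units[i].length;

        /* Rounding up can reach the next unit (3599s rounds to 60m),
         * so report one of the larger unit instead. */
        if (i > 0 && count * units[i].length >= units[i - 1].length) {
            i--;
            count = 1;
        }
        snprintf(out, outsize, "%ld%s", count, units[i].suffix);
        return;
    }
    snprintf(out, outsize, "0s");   /* zero or negative input */
}
```

Whether 10 days should read as “1w” or something finer is a judgment call; this sketch simply rounds within whichever unit is the largest that fits.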

As I said before, I’m also working on a redo of the WordPress theme, as well as another more “grown up” C project.

Also, did not get laid off. Best of luck to all that did. Crazy times we live in.

PS: just saw on my nifty little WordPress toolbar that 2.7 is available. FUCKING ROCK! New design coming with the quickness.

Learning to walk after running for 4 years: from PHP to C

Sunday, December 7th, 2008

In an effort to stop being such a fucking noob, I’ve seriously taken up C. As a natural transition/crutch, I have started with PHP extensions. The PHP API basically picks your droppings up after you with a plastic bag, so it’s pretty easy. Using ext_skel gave me pretty much everything I needed, since I’ve only written one function so far and have not done anything fancy.

The extension is called “human,” short for “human readable.” The first function, human_interval_precise, simply represents a number of seconds in the largest “precise” time units possible. That is, thousands of seconds are converted to weeks, days, hours, minutes, and seconds as needed. The source is here.
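To make the conversion concrete, here is a minimal sketch of the idea in plain C, without any of the PHP extension plumbing. It is not the extension’s actual source: the name interval_precise, the caller-supplied buffer, and the choice to skip units that come out as zero are all assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical sketch: writes a breakdown such as "1w 2d 3h 4m 5s "
 * into out, skipping any unit whose count is zero. */
void interval_precise(long seconds, char *out, size_t outsize)
{
    static const struct { const char *suffix; long length; } units[] = {
        { "w", 604800L }, { "d", 86400L }, { "h", 3600L },
        { "m", 60L }, { "s", 1L },
    };
    size_t used = 0;

    if (outsize == 0)
        return;
    *out = 0;                       /* start from an empty string */
    if (seconds <= 0) {
        snprintf(out, outsize, "0s ");
        return;
    }

    for (size_t i = 0; i < sizeof units / sizeof units[0]; i++) {
        long count = seconds / units[i].length;
        seconds %= units[i].length; /* carry the remainder down */
        if (count == 0)
            continue;

        /* Append at the current end of the string, never past outsize. */
        int n = snprintf(out + used, outsize - used, "%ld%s ",
                         count, units[i].suffix);
        if (n < 0 || (size_t)n >= outsize - used)
            break;                  /* out of room: last unit may be cut short */
        used += (size_t)n;
    }
}
```

Called with 694861 seconds and a reasonably sized buffer, this should produce "1w 1d 1h 1m 1s ", since 694861 is exactly one of each unit added together.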

Lessons

I’m obviously still a giant noob, but there are some things that I have picked up along the way (mostly via Crowley) that made things a lot clearer. Hopefully my posting this will help other noobs following in my footsteps.

“Pointers are just numbers”

Yup. A pointer is just an address: a count of memory blocks from the start of the segment of memory you’re dealing with. “Pointer math” is really just math: p+5 is the position in memory 5 blocks past p. The size of the blocks is determined by the type the pointer points to, so if you have a char pointer, the blocks are 1 byte.
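A small sketch to check the block-size claim, with made-up function names (char_step_bytes and int_step_bytes are not from the extension). Each one measures, in bytes, how far apart p and p + n are for a given pointer type.

```c
#include <stddef.h>

/* Byte distance between p and p + n for a char pointer (n must be <= 16). */
size_t char_step_bytes(size_t n)
{
    char text[16] = {0};
    char *p = text;
    /* Subtracting two char pointers counts 1-byte blocks. */
    return (size_t)((p + n) - p);
}

/* Byte distance between p and p + n for an int pointer (n must be <= 16). */
size_t int_step_bytes(size_t n)
{
    int values[16] = {0};
    int *p = values;
    /* Cast to char * so the subtraction counts bytes, not ints. */
    return (size_t)((char *)(p + n) - (char *)p);
}
```

char_step_bytes(5) comes out as 5, while int_step_bytes(5) comes out as 5 * sizeof(int): same “+5”, bigger blocks.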

An array is just a pointer to a set of consecutive memory blocks

In most expressions, the name of the array is treated as a pointer to its first element. This is particularly important with char arrays. When I do the following in my code, I am just adding output to the end of an already existing “string.”

sprintf(retstr+strlen(retstr), "%dd ", days);
The K&R section on the relationship between arrays and pointers is incredibly useful.
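Here is that append trick wrapped in a tiny function, as a sketch only. The name append_days is hypothetical, and the caller is assumed to have left enough room in the buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends "<days>d " to the string already in retstr.
 * The caller must make sure retstr has room for the extra characters. */
void append_days(char *retstr, int days)
{
    /* retstr + strlen(retstr) is the address of the terminating NUL,
     * i.e. the same place as &retstr[strlen(retstr)]. */
    char *end = retstr + strlen(retstr);

    sprintf(end, "%dd ", days);
}
```

Starting with a buffer holding "2w ", a call to append_days(buf, 3) leaves it holding "2w 3d ".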

Always remember to initialize your strings

Nothing says “FAIL” like running your PHP function more than once in the same script and seeing it return its result appended to the end of the result from the previous call. A simple

*retstr = 0;
at the top of the function takes care of the problem. Otherwise, your char pointer ends up pointing to memory that isn’t claimed by anything else, but isn’t cleared either. In the case of consecutive PHP function runs, it just happens to be the previous output. It could end up being any sort of useless garbage.
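The effect is easy to reproduce outside of PHP. In this sketch (hypothetical names throughout) a static buffer stands in for memory that still holds the previous result, and the reset flag exists only so both behaviors can be seen side by side.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical formatter whose buffer outlives each call. Pass reset = 0
 * to see the stale-buffer bug, reset = 1 to see the fix. */
const char *format_hours(int hours, int reset)
{
    static char retstr[64];     /* static storage starts out zeroed */

    if (reset)
        *retstr = 0;            /* empty the string before appending */

    /* Append at the current end; snprintf keeps the write in bounds. */
    size_t used = strlen(retstr);
    snprintf(retstr + used, sizeof retstr - used, "%dh ", hours);
    return retstr;
}
```

Calling format_hours(1, 0) and then format_hours(2, 0) gives "1h " followed by "1h 2h ", which is exactly the appended-output symptom. A following format_hours(3, 1) gives a clean "3h ".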

I started writing this entry a few weeks ago and got sidetracked by a bigger and more interesting project/thanksgiving. As a result, I don’t remember what else I was going to write, so I’ll just wrap it up. More C learnings coming soon. It has been a humbling experience to say the least.