Posts tagged "Technology"

Drupal Edu Initiative launched

I mentioned a while back that one of my coworkers and I were considering putting together a simple resource site for Drupal users in educational contexts; today, http://drupaledu.org is live (many thanks to Acquia for providing the hosting)!

The initiative was born out of discussions at DrupalCamp Austin last fall; many of the folks we spoke with there were frustrated at how infrequently universities share information with one another on web issues. Since we at UNT CWS believe Drupal is a very useful tool in solving a lot of the typical web problems educational institutions face, we figured this kind of site would help tear down the silos and improve things for everyone.

At the moment, the site is very simple—authenticated users can post links to relevant outside sites, participate in forums, and vote on the usefulness of other user-contributed content. We hope the community will really get involved in this; the more people contribute content, the more useful the initiative becomes.

If this sounds interesting to you, please take a moment to check out http://drupaledu.org. And of course, if you have some Drupal-in-education-related content to share, please do; the more the merrier!

Better performance please

Over the last few months I've been pretty dissatisfied with the performance of this blog. Not only were page load times sometimes upwards of 10 seconds, but occasionally my swap usage would max out and crash the server, requiring a hard reboot. And it's a blog, for crying out loud—nothing this simple should ever flat-out crash a server, even if it's only got 256MB of RAM.

Well, this past weekend my employer closed down for a couple of days due to our own little Dallas Snowpocalypse, and I had the chance to implement a single, simple fix I'd been planning for some time. Here are the results in terms of home page ping time:

[Figure: graph showing a significant decrease in ping time around February 11]

So what did I do? Simple: I switched my web server software from Apache to Nginx. The hardest part was setting up the PHP FastCGI process; although there are lots of instructions online as to how to do this, most of them seem a bit outdated. I ended up using an init script from the Nginx wiki; once that was taken care of, it was a simple matter of converting my Apache confs to Nginx's syntax, switching the ports over, and watching my site's performance improve fantastically.

So there you have it—my blog is now practically readable again, and it turns out the performance problems had nothing to do with my programming! Good news on all fronts today.

DrupalCamp video posted

Last November my coworker Adrian Rollett and I got the opportunity to present a talk at DrupalCamp Austin; I've blogged about the content previously, but now that the video is up, I thought I'd post it here as well.


Zend Framework Cron Tasks in Parallel

At the end of my recent post on building a cron service for Zend Framework applications, I mentioned a couple of weaknesses in my approach, most notably the lack of any kind of locking mechanism. This post shows how to fix that.


Cron tasks in Zend Framework apps

I recently took the opportunity to build a simple cron task manager for this blog; since the resulting system could easily be adapted to other Zend Framework applications, I figured I'd better share.


Keeping your listeners in order

A couple of days ago I blogged about how Doctrine's SoftDelete behavior can keep other listeners' preDelete() hooks from firing; after a bit of coding this morning, I believe I have a solution.


Know thy bottlenecks

One of my projects at work lately has been a searchable index of about 80,000 images, each involving about 20 fields' worth of metadata. It's a Drupal project, so it was pretty easy to set up the appropriate content types, fields, and so forth, but when it came time to set up searching, I made a few regrettable assumptions that cost me a lot of time.

Given the record count, I decided it didn't make much sense to use Drupal's core search functionality; I was under the impression that the core search just grepped through the contents of the node table, and would therefore not perform particularly well. That's regrettable assumption #1. Regrettable assumption #2 is simpler: I didn't think search would ever perform well as long as the index was stored in the database.

As a result, I went on an odyssey of sorts looking for replacement search engines. Some of the contenders:

Apache Solr from Acquia
Apache Solr is a Java-based search indexing platform with a supporting Drupal module, and as it happens, the Drupal support company Acquia provides a hosted Solr service that can be leveraged by subscribers. We do have an Acquia subscription, but unfortunately we also have hundreds of Drupal sites, and the subscription doesn't quite cover that many.
Self-hosted Apache Solr
We've occasionally considered setting up our own Solr instance as a service for our users around campus, but the administrative overhead doesn't really fit our schedules just yet. So again, I moved on.
Search Lucene API
Unlike the two Solr-based options, the Search Lucene API module handles its search indexing via PHP (specifically, via Zend_Search_Lucene). It also has a pretty good selection of helper modules available for things like faceted search, content suggestion, and so forth.

Of the three options, Search Lucene API seemed like the best choice with the least administrative overhead. Over the next couple of weeks I hacked away amid intermittent user support requests, slowly but surely piecing together the necessary components for a killer faceted search system. Once I was ready to try it, I started to import the content. Node by node it arrived, and the search kept on scaling successfully as it went. Pleased as punch, I went home for the evening so that the rest of the records could import.

The next morning my inbox was stuffed to the brim with out-of-memory errors from Drupal cron runs. I checked the search index settings; the system had managed to pull in around 33,000 records, but indexing had ground to a halt. It was so bad that I couldn't even access the index statistics page to tell it to rebuild. And this on a system with 112MB dedicated to PHP.

I was confused. I'd never experienced scaling problems with Zend Framework components before, and I couldn't imagine that Drupal added that much overhead. Not wanting to admit defeat, I posted an issue. Soon, the maintainer politely informed me that Search Lucene API was only intended to scale up to about 10,000 records, and fewer than that if they were particularly complicated.

It would seem I was hosed. However, I realized that there was one more contender I hadn't quite considered yet:

Drupal core search
Drupal comes with a built-in search module, and it's supported by any number of contributed helper modules providing the functionality it doesn't have on its own (e.g., faceted search).

Despairing of all other hopes, I turned off Search Lucene API and turned on the core search module with the appropriate helpers, and it handled everything without a hiccup!

As it turns out, Drupal's core search is a lot smarter than I'd given it credit for. Yes, it's searching against the database, but not against the node table; it has a dedicated search index table that is built up on cron runs, much as the other search modules build their own indexes. With that in mind, it's no surprise that it's a lot faster than I had expected. It also doesn't introduce nearly the same PHP memory overhead as Search Lucene API, because a lot of the heavy lifting is offloaded to the database server (which, in our case, is more than powerful enough).
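To make the difference concrete, here is a minimal sketch of the general idea: words are written to a separate, indexed lookup table ahead of time, so a search is an indexed lookup rather than a scan through every node body. It is written in Python with SQLite purely for illustration. The table and function names (`word_index`, `index_node`, `search`) are assumptions of mine, not Drupal's actual schema or API, and the real module does considerably more (scoring, stemming hooks, and so on).

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (nid INTEGER PRIMARY KEY, body TEXT);
    -- Hypothetical lookup table: one row per (word, node) pair.
    CREATE TABLE word_index (word TEXT NOT NULL, nid INTEGER NOT NULL);
    CREATE INDEX word_index_word ON word_index (word);
""")


def index_node(conn, nid, body):
    """Store a node and record each distinct word it contains.

    In the post's setup this work happens during cron runs, not at search time.
    """
    conn.execute("INSERT INTO node (nid, body) VALUES (?, ?)", (nid, body))
    words = set(re.findall(r"[a-z0-9]+", body.lower()))
    conn.executemany(
        "INSERT INTO word_index (word, nid) VALUES (?, ?)",
        [(word, nid) for word in words],
    )


def search(conn, word):
    """Return matching node ids using the indexed table, never node.body."""
    rows = conn.execute(
        "SELECT nid FROM word_index WHERE word = ? ORDER BY nid",
        (word.lower(),),
    )
    return [row[0] for row in rows]


index_node(conn, 1, "Campus library at sunset")
index_node(conn, 2, "Library reading room")
index_node(conn, 3, "Football stadium")

print(search(conn, "library"))
```

Because the lookup is an exact match against an indexed column, the database server does the heavy lifting and the application only handles the short list of matching ids.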

The moral of the story? Know thy bottlenecks. If I had realized how well Drupal's core search performed I never would have tried to optimize it out of the equation, and I would have saved myself a significant amount of development time. Good to know; lesson learned; hope this helps someone else.

When is a DELETE not a DELETE?

In my recent post on using Zend_Acl with Doctrine record listeners, I described a way to automate a Doctrine-based application's access control logic based on certain event hooks in Doctrine's record listener system. I still think it's a fairly elegant approach, but as I've been working with it, I discovered one behavior I didn't quite expect.

As it happened, one of the models on which I was using this technique also implemented Doctrine's core SoftDelete behavior. With SoftDelete enabled, calling $record->delete() doesn't actually remove the record from the database; instead, it provides and sets a deleted_at column and then adjusts all your other queries to treat any record with a deleted_at value as though it isn't there. In other words, all SQL DELETEs become UPDATEs, and all SQL SELECTs get an extra WHERE clause that ensures no "deleted" records are ever returned unless you explicitly ask for them. Pretty ingenious, really; it's nice if you think you'll ever need to recover from accidental deletions (though fortunately I haven't had to use it yet).
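For anyone who hasn't seen the pattern before, here is a rough sketch of soft deletion in general, using Python and SQLite rather than Doctrine itself. The `post` table and the helper names are hypothetical; only the `deleted_at` column name comes from the behavior described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, deleted_at TEXT)"
)
conn.executemany(
    "INSERT INTO post (title) VALUES (?)", [("first",), ("second",)]
)


def soft_delete(conn, post_id):
    # What would have been a DELETE is issued as an UPDATE instead.
    conn.execute(
        "UPDATE post SET deleted_at = datetime('now') WHERE id = ?",
        (post_id,),
    )


def visible_titles(conn):
    # Every ordinary SELECT gets the extra WHERE clause.
    rows = conn.execute(
        "SELECT title FROM post WHERE deleted_at IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]


def all_titles(conn):
    # "Deleted" rows are still there if you explicitly ask for them.
    rows = conn.execute("SELECT title FROM post ORDER BY id")
    return [row[0] for row in rows]


soft_delete(conn, 1)
print(visible_titles(conn))
print(all_titles(conn))
```

The first query no longer sees the "deleted" row, while the second shows it is still in the table and could be recovered by clearing `deleted_at`.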

However, I recently discovered something I probably should have expected in the first place: when I called my SoftDelete-powered record's delete() method, my record listener's preDelete() hook wasn't firing; after some further research I discovered that it was firing preUpdate() instead.

As it turns out, since SoftDelete turns what would have been a SQL DELETE operation into a SQL UPDATE, the pre- and postDelete hooks are overridden with their *Update equivalents (at least after SoftDelete's own delete hooks have finished up). The unfortunate side effect? Since the preUpdate hook allows users with the "update" permission to proceed, users who had "update" could delete records, even if they didn't have the "delete" permission. Not a great setup, all things considered.
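The gap is easier to see in a stripped-down model. The sketch below is Python, not Doctrine code, and every name in it (`AclListener`, `hard_delete`, `soft_delete`) is hypothetical; it only mimics the hook behavior described above, where a soft delete reaches the listener as an update.

```python
class AclListener:
    """Hypothetical listener that checks the permission matching each hook."""

    def __init__(self, permissions):
        self.permissions = set(permissions)
        self.calls = []  # records which hooks actually fired

    def pre_delete(self, record):
        self.calls.append("pre_delete")
        if "delete" not in self.permissions:
            raise PermissionError("delete not allowed")

    def pre_update(self, record):
        self.calls.append("pre_update")
        if "update" not in self.permissions:
            raise PermissionError("update not allowed")


def hard_delete(record, listener):
    # A real delete fires the delete hook, so the "delete" permission is checked.
    listener.pre_delete(record)
    record["exists"] = False


def soft_delete(record, listener):
    # The delete has been rewritten as an update, so only the update hook
    # fires and only the "update" permission is ever checked.
    listener.pre_update(record)
    record["deleted_at"] = "now"


editor = AclListener(["update"])  # may update, may NOT delete
record = {"exists": True, "deleted_at": None}

soft_delete(record, editor)  # succeeds despite the missing "delete" permission
print(editor.calls, record)
```

With the same listener, `hard_delete` would raise `PermissionError`, which is the check the soft-delete path silently skips.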

Now, I do have other protections in place. For one, I'm not only checking permissions at the model layer; my controllers do still have a few remaining ACL checks to avoid showing users interfaces they won't actually be able to use. That said, I'd love to find a workaround for this, especially if I ever release any of this code for public use.

At the moment the only thing I can think to try is to figure out a way to ensure that my record listener is registered earlier in the stack than SoftDelete is. I'm not sure this is possible with how Doctrine behaviors are registered, but I figure it's worth some experimentation. I'll let you know how it goes.

Using Zend_Acl with Doctrine record listeners

Zend_Acl is a powerful tool to help manage access control logic, but it can be difficult to determine just where and how to use it. Today, I'll show you how this blog manages access control using Zend_Acl with some custom Doctrine record listeners.


On being a framework guy in a Drupal world

Recently, I've stumbled across a few rather interesting blog posts on how Drupal is perceived in other parts of the PHP development community. Thought I'd post the links and make a few comments here.

