The Strigi vs Tracker debate

A few days ago Aaron Seigo blogged about Strigi vs Tracker vs other search engines. I agree with Aaron that we are wasting a lot of effort by duplicating code with very similar features. Not only that, but we spend time in discussions with each other, trying to find ways to get some overlap and share some code. It’s not easy and can be really frustrating. As free software developers, we all make a huge commitment by coding in our free time to make the world a better place with our code. At least that’s my motivation. So let us try to maximize the effect of our efforts.

Despite the code duplication, I think that having different ideas is important. When we talk about Strigi and Tracker, we see a lot of common functionality, but also some features unique to each application. I have to give kudos to Jamie for posting his point of view on Aaron’s blog. The post is completely accurate in describing the history of attempted collaboration between the two projects. So far the only thing we collaborate on is the Xesam specification. Sadly, every day on which we do not merge at least parts of our efforts is a day we grow further apart.

I think this is a real shame, because we could easily share more code. For example, for more than half a year now, Strigi has been split into two libraries that offer its core functionality to any application that would like to use it. This follows the most important programming lesson I have learned: code reuse.

If you have functionality you want to implement and you think others may benefit from it, make sure it is reusable: put it in a library. Throughout the development of Strigi I’ve taken great pains to make the code as reusable as possible. For example, the analyzer code and the indexing daemon have almost no external dependencies. I’m not even using Qt! (And yes, not using Qt is frustrating.) The result is a library, libstreamanalyzer, that contains all of Strigi’s information extraction capabilities. This library holds the coolest part of Strigi, and it is out in the open for any indexer to reuse! Beagle and Tracker can go right ahead and use it.

Something Jamie did not mention is that I tailored the xmlindexer application to output XML that suits Tracker, so he could use it for indexing. This was sometime in February, I think. Since then he has not gotten around to actually using this functionality, which I think is a shame. libstreamanalyzer is a separate deb package in Ubuntu, so the dependency is really small. If you use it, you potentially get the benefit of all the metadata extractors we have implemented, in one nice and stable package. I really hope you guys start using it, because the reason I write free software is that I want to improve the experience for every user. I do not care if it’s a KDE, GNOME or Windows user.

I hope this blog post gets picked up on Planet GNOME too, because I think this is an important issue. We should share more than just X and D-Bus. Divide and conquer is a proven strategy, and in the free software world we help our adversaries by handing them our division for free.

Comments

share meta-data extractors

Being able to share metadata extractors across KDE and GNOME would be a huge benefit. There are so many file types out there, and these scanners are only useful if they have info to scan...

By eean at Fri, 08/10/2007 - 22:12

Code Reuse and Programming Languages

One of the big problems of code reuse between open-source projects, especially between GNOME and KDE, is the different programming languages... sad but true. With KDE using C++ and GNOME using C + GObject, building solutions that work for both platforms without a lot of headaches is pretty difficult, I think. It would be sooo cool if both GNOME and KDE relied on the same programming language, preferably one that is object-oriented at the language level like C++; that would make things a lot simpler.

By denisw at Sat, 08/11/2007 - 11:42

2007-08-10
