With Pentaho 5.0, the repository went from a file-based master repository to a database one. This has some advantages in that it’s typically easier to configure a single database in a clustered environment and point to that. However, it has the downside that modifying text and XML files in the repository has a slower development cycle: edit, upload, test, repeat.
With the release of the Community Text Editor (CTE), that is no longer the case. You can edit and test in a highly iterative manner. For example, I’ve been working on an action sequence (xaction) that’s in the repository. I simply open it up in CTE, edit, and run. I have the editor in one tab and the running action sequence in another. Change, test, change, test. Simple.
And lest you think it’s just a basic editor, think again. It has syntax highlighting, undo, and even cmd-S to save. It’s a very well done editor that I recommend any Pentaho developer have installed. Check it out in the marketplace.
I was recently working with a client and saw an interesting approach to a classic problem dealing with holidays as a dimension. Now maybe this is a common solution, but I hadn’t seen it before, so I thought I’d share. This same solution could be used for similar problems as well.
The goal is to be able to analyze the impact of holidays on various measures. Imagine you are analyzing sales or hours worked. Knowing if a particular day is a holiday is pretty important to understanding spikes in the data.
A first approach might be to have the holiday as a property in the date dimension. That would only work if the data you are dealing with has the same holiday for all data that points back to that particular date. This isn’t even a true case for the United States, where some states have their own holidays, much less on a global scale.
So what this client did was solve the problem at the ETL level. For each fact, they check the client calendar to see whether the data is for a holiday, and then set that in the fact table as a degenerate dimension. You could have a separate dimension as well, but they decided to avoid the join that would get created with Mondrian. Just make sure to index that column so you don’t do a table scan when grabbing the members.
A simple solution to what at first might appear to be a complex problem. I like that.
Somewhat random post. I saw a quote I’ve long loved attributed to Steve Jobs: “Deciding what not to do is as important as deciding what to do.” I don’t think I can really improve on that statement. So maybe I can share what I do to stay focused. For the free-wheeling types out there, this may seem a bit O-C, but it works for me.
I split my focus areas into four categories that I think are important:
- Work / Professional
- Family / Home
- Health and Fitness
These are the categories that I feel are very important. Feel free to have your own. Many might add “spiritual”, but I lump that in with Health and Fitness.
Within each of these categories I create 2-3 goals with some being more important than others. For example, I’m trying to get into shape for a major bike ride this month. I’d also like to drop a few pounds, so that’s secondary.
I won’t go into detail on how I deal weekly with the goals, since that’s not the point of this post. What this does do is allow me to evaluate new ideas and things I want to do. If the idea is something already on the list, then I can do something with it. If not, then I put it on my backlog to address in the future.
I’ve been using this technique for a few years now and it works really well. It helps me stay focused on a few things I can accomplish, while allowing me the flexibility to change and focus on new things. Definitely something worth considering if you feel overwhelmed or that you are having problems accomplishing your goals.
- I like to share things that I figure out that may not be so obvious to others. I like to experiment and play around with technologies, and when I learn something new and non-obvious it seems nice to share.
- It makes a great resource for me to go back to. I’ve been asked questions about some of the things I’ve blogged about and I can point people to my blog. Sometimes the solution to a problem is complex enough that you can’t really remember all the steps you used, esp. when you solved the problem six months ago.
- It’s a good form of professional self promotion. I suppose there are some professionals out there that don’t use their blogs in part to help establish their credibility. But I don’t think I’ve met any. I could probably go on and on about why pros should blog, but will leave it for now.
It’s hard to believe it’s been almost a year since we announced that a book on Mondrian was in the works. But we’re finally getting to the point where it feels like it’s almost finished. We are getting ready to go into the final series of chapter reviews (11 total).
It’s still going to be a few more months as we finish up the appendices and indexes and update based on reviews and then the production guys make it look nice and finished. We also know that by the time the book is published some pretty big things are likely to have happened in the Mondrian technology sphere, like 4.0 actually being released and Pentaho 5.0 hopefully being released as well. But that’s the drawback to technical books. (It gives me interesting stuff to blog about.)
On the whole, I’m very happy with what we’ve put together. We’ve managed to put a lot of information into the book. So much so that we’re now looking for ways to trim back to get within our allotted page count. This book will be a great one to give to anyone who wants to learn about Mondrian and doesn’t want to visit a whole bunch of different sites and blogs. I hope you enjoy it and find it useful.
A quiet, maybe too quiet, new feature of Analyzer in Pentaho 4.8 was the addition of the AnalyzerBusinessGroup annotation. This annotation will let you specify that a measure should go into a specific group rather than be lumped in with a bunch of measures. If you have just a few measures, it’s not that big a deal. But many users have a lot of measures that can be categorized and it would be nicer to have them in separate groups. I have not tried this with dimensions, but if you define them correctly it seems that it would be overkill. I also suspect that Mondrian 4’s Measure Groups will make this obsolete, but don’t know that for a fact.
Using the feature is very simple. Just add an annotation to the measure and specify the AnalyzerBusinessGroup. For example:
results in the following (using the Steel Wheels example):
In a noble attempt to start playing around with Pentaho 5.0 (aka Sugar), I downloaded the CI version and PostgreSQL, since it’s replaced MySQL as of 4.8 as the default database for the repository. After dutifully reading the README file and changing some shared memory settings (don’t skip this step), rebooting, and remounting the install image, I kicked off the install app. Nothing. No message, no popup, NOTHING. Except for a not so helpful message in the Console that indicated one of the install files was in quarantine because of TextEdit.
After a bit of searching, I found a helpful site that told me how to turn off quarantine (search for “quarantine”). It turns out there is a handy bash command for this: “$ xattr -r -d com.apple.quarantine <file-name>”. The ‘-d’ deletes the named attribute and the ‘-r’ recurses if the target is a directory. So, I cd’d to /Volumes/PostgreSQL 9.2.2-1 and ran “$ xattr -r -d com.apple.quarantine postgresql-9.2.2-1-osx.app”, since a .app file is really a directory. And …. it failed because it’s a mounted disk without write permissions.
That should have been obvious to me, but it’s late. So I copied the .app file to ~/Downloads and ran the command again. After a quick return I ran the install app with no further problems.
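For anyone who hits the same wall, here is a rough sketch of those two steps as one small bash function. The function name `clear_quarantine` and its arguments are my own invention for illustration, not part of any installer or Apple tool. It assumes the macOS `xattr` command with the `-r` and `-d` flags used above, and it skips that step on anything that isn’t macOS.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and arguments are illustrative only):
# copy an app bundle off a read-only disk image, then strip the
# quarantine attribute from the writable copy.
clear_quarantine() {
  local app="$1"                      # path to the .app on the mounted image
  local dest="${2:-$HOME/Downloads}"  # writable directory to copy into
  local copy
  copy="$dest/$(basename "$app")"

  # A .app is really a directory, so copy it recursively.
  cp -R "$app" "$dest/" || return 1

  # The xattr flags below are the macOS ones; skip on other systems.
  if [ "$(uname)" = "Darwin" ]; then
    # -r recurses into the bundle, -d deletes the named attribute.
    xattr -r -d com.apple.quarantine "$copy"
  fi

  # Print where the usable copy ended up.
  echo "$copy"
}

# Example call (paths taken from the post above):
# clear_quarantine "/Volumes/PostgreSQL 9.2.2-1/postgresql-9.2.2-1-osx.app" ~/Downloads
```

Copying first is the whole trick: the attribute can’t be removed on the mounted image because it’s read-only, so the command has to run against a copy you can write to.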
Now to do all the cool stuff I had originally planned before getting sidetracked.