I've had an idea for a while now to make a Facebook application that will let lovers of New Haven Pizza sample a pie (no slices here) when they are not able to get to their shrines of Pepe's, Sally's and Modern.
For those of you NOT familiar with New Haven Pizza, it, like all great things, inspires a great deal of loyalty, religious devotion and controversy. The Facebook group 'I live for New Haven Pizza' has 6000+ members and is a 24/7 debate room full of arguments over which place is the best and when to go to avoid the lines, along with rude dismissals of other, inferior sources of tomato pie.
As an act of community service to the lovers of my hometown's pizza, I thought I would create an application that would let users create pies that faithfully represent the topping combinations from the three best spots while respecting their quirky hours (only Pepe's is open Mondays, only Modern has late hours and Sally's never opens before 5pm).
It would also be a good way to explore some of the themes I've been working on including modeling content in XML, using XQuery to build applications and delivering content to new distribution channels.
Since there is a lot to cover, I've broken this up into a series to help get it started:
- In this first segment I will look at the XML model behind the application.
- The next segment will cover using XQuery to power the application and how MarkLogic Server can be the single resource you need to power an information product (even if it's just about fake tomato pie).
- In the last segment I'll talk about delivery and how XQuery is perfect for integrating with the new web application platforms like Facebook.
Part 1: An XML model for a Pizza place
Using the right data model for an application is something that has been coming up a lot lately, particularly for applications that want to use XQuery but may not have XML. XQuery has come a long way and I think that the arguments I made in this blog entry a couple of years ago really ring true: modeling data as XML can be much more flexible than relational and lets you use great tools like MarkLogic Server. The proof is in great applications such as MarkMail and AuthorMapper, all built with only XML data and powered by MarkLogic.
For our example we'll take a Pizza place. The first thing you will notice is that I didn't say 'let's take Pizza'. This is because the application is about these places as much as it is about the food (see above re: the fanatical user base). So I want to capture whatever details I can and use them in the app if possible . . . and I want to be able to add more later.
Instead of breaking up the information into little bits and making tables, I simply made one document with an XML schema that looks like this:
<notes>Sally's was established in 1938 by Salvatore Consiglio. Renown for thin crust pizza ...</notes>
<notes>At Sally's mozzarella is considered a topping, so if you want it on your pie let us know!. </notes>
This schema lets me capture everything I need to run my application and also capture many of the differences between the places.
For instance, Modern and Sally's call it APizza, but at Pepe's you order a Tomato Pie. So the entry for Pepe's has a <tomato-pie> element instead of the <apizza> one. And even though Sally's still has just a photo of their menu, they are the only ones with some text useful to include. So the others don't, for the moment, have notes.
The toppings and kinds of pie are also pretty different. While Sally's just has toppings, Pepe's has a mix of toppings and specials. This is the topping section for Pepe's:
<special>Clam white or red sauce only</special>
<special>Clam with mozzarella</special>
Modern also has specials, but it's a mix of things that you can add to regular pies and real specials like the famous 'Italian Bomb'. The data is faithful to the Modern menu but has a <combine-specials> element to record what the waiter would explain if you asked.
The item ordering of the menus is also preserved. They each have different, unique, non-alphabetic order. Since this was easy to capture just by putting them in document order I thought it best to record it and stay on the good side of the fanatics out there.
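To make the point about document order concrete, here is a minimal sketch. It is written in Python rather than XQuery only so that it runs with nothing but the standard library. The two <special> entries are the ones listed for Pepe's; placing them directly inside <tomato-pie> is an assumption about the nesting, not the actual document.

```python
import xml.etree.ElementTree as ET

# Assumed fragment: the two <special> lines come from the Pepe's entry;
# wrapping them directly in <tomato-pie> is a guess at the real nesting.
PEPES_FRAGMENT = """
<tomato-pie>
  <special>Clam white or red sauce only</special>
  <special>Clam with mozzarella</special>
</tomato-pie>
"""


def menu_items_in_order(xml_text):
    """Return the text of each <special> in the order it appears."""
    root = ET.fromstring(xml_text)
    # iter() walks the tree in document order, so the menu order
    # survives without storing any explicit position field.
    return [el.text for el in root.iter("special")]


print(menu_items_in_order(PEPES_FRAGMENT))
```

Nothing in the data says "this item is second"; the position in the document is the ordering.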
Finally, the hours structure is modeled on the way restaurants describe their hours - Mondays we're closed, Tue-Fri we are open until 10, etc. - and is easy to extend with new sets of rules such as holidays and summer vacations (which I didn't put in but are indeed part of the picture).
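As a rough illustration of rule-style hours, the sketch below uses Python and a made-up <hours> fragment. The <rule> element, its days, open, close and closed attributes, and the times themselves are all hypothetical; they only mirror the "Mondays closed, Tue-Fri open until 10" phrasing and are not any place's real schedule.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element names, attribute names and times are
# assumptions for illustration, not the structure used in the real file.
HOURS_XML = """
<hours>
  <rule days="Mon" closed="true"/>
  <rule days="Tue Wed Thu Fri" open="17:00" close="22:00"/>
</hours>
"""


def is_open(hours_xml, day, time_hhmm):
    """Apply the first rule that mentions `day`; no matching rule means closed."""
    root = ET.fromstring(hours_xml)
    for rule in root.iter("rule"):
        if day not in rule.get("days", "").split():
            continue
        if rule.get("closed") == "true":
            return False
        # Zero-padded HH:MM strings compare correctly as plain strings.
        return rule.get("open") <= time_hhmm < rule.get("close")
    return False


print(is_open(HOURS_XML, "Wed", "18:00"))
```

Under this assumed layout, a holiday or summer closure would be one more <rule> element rather than a change to the structure.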
Approaching even this simple example as a relational model gets really complex, really fast. A rough estimate for a practical (not fully normalized) schema would be at least 5 or 6 main tables and 8 to 10 relationship tables. This would take at least a couple of hours to model and then create in a database.
But to top it off (sorry), it would be pretty hard to model the differences. In fact, the only way to do it is to add more complexity to the table structure and even more relationships to note that one place has specials and another lets you mix specials and toppings etc etc. And when you want to add more information to power new features of the application, you would go back to the drawing board, make some new tables and redo the whole thing again.
By contrast, the XML I am using is a simple document I made in 15 minutes and, when I load it into MarkLogic, it lets me do queries like this to see where I can get a Bacon APizza:
Or a Clam Tomato Pie:
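A rough sketch of this kind of lookup, written in Python rather than XQuery so it runs without a MarkLogic server, might look like the following. The <places>, <place>, <name> and <topping> element names are assumptions, and the Bacon entries are placeholders rather than transcribed menus; only <apizza>, <tomato-pie> and the two Pepe's <special> lines come from the data described above.

```python
import xml.etree.ElementTree as ET

# Illustrative data only: <places>, <place>, <name> and <topping> are
# assumed names, and the Bacon entries are placeholders, not real menus.
PLACES_XML = """
<places>
  <place>
    <name>Sally's</name>
    <apizza><topping>Bacon</topping></apizza>
  </place>
  <place>
    <name>Modern</name>
    <apizza><topping>Bacon</topping></apizza>
  </place>
  <place>
    <name>Pepe's</name>
    <tomato-pie>
      <special>Clam white or red sauce only</special>
      <special>Clam with mozzarella</special>
    </tomato-pie>
  </place>
</places>
"""


def places_serving(xml_text, pie_element, word):
    """Names of places whose <pie_element> has an item mentioning `word`."""
    root = ET.fromstring(xml_text)
    found = []
    for place in root.iter("place"):
        pie = place.find(pie_element)
        if pie is None:
            continue  # this place calls its pie something else
        # Case-insensitive substring match over every item under the pie.
        if any(word.lower() in (item.text or "").lower() for item in pie.iter()):
            found.append(place.findtext("name"))
    return found


print(places_serving(PLACES_XML, "apizza", "Bacon"))
print(places_serving(PLACES_XML, "tomato-pie", "Clam"))
```

The same function handles both questions because the difference between the places lives in the element names, not in separate tables.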
Having created and loaded my data, I'm now ready to start the next tasks of building an application that will let users query the XML to find out where they can get a pie and create a virtual pizza and then work on wrapping it up so it can be used within Facebook.
Right now I'm getting pretty hungry and sadly am nowhere near New Haven!