May 08, 2009

Getting Ready: T-minus 3 Days to the Mark Logic User Conference

The run-up to the Mark Logic User Conference (The World's Largest XQuery Gathering) is usually hectic and this year is no different.  I've been busy preparing for the 1-day orientation class that I will be teaching and desperately trying to squeeze in bike rides so I don't embarrass myself on the 'unofficial' conference ride I'm putting together.  Getting ready for the class is somewhat easier as it mostly involves brushing up on Shakespeare.  We use the complete Shakespeare XML as a basis for the examples and it's always fun to do some exploring of the text with the class.  I am hoping to expand my set of queries for cool things like King Lear's Falchion.

Other folks are also plenty busy.  Scott Abel, AKA The Content Wrangler, will be our official blogger at the conference and his interview (located here) with MarkLogic's Norm Walsh is a great warm-up for the conference.  It has excellent examples of XML and XQuery in action, some sound reasoning on the correctness of using XML for content and some thoughts on its widespread (but still unsung?) use in almost everything.  I particularly like his retelling of how Dave Kellogg steered him in the right direction in a response to his newbie question on a MarkLogic internal list.  Norm, the rest of us just thought you were crazy :)

Dave's attention to detail does pay off - this week he was busy collecting awards:  MarkLogic Server just won the SIIA Codie Award for Best Database Management System and Dave's blog (where among other things he related that story about Norm and talked about the bike ride) won for Best Corporate Blog.  Way to go Dave . . . way to go MarkLogic Server team!

Over at O'Reilly, Kurt Cagle has been pushing the boundaries of MarkLogic Server and our implementation of XQuery for his presentation at the conference.  This post looks at some of the neat features that Mark Logic has added to XQuery to make programming easier.  Function mapping will automatically execute a function for every item in a sequence and maps let you work with hash maps within XQuery.  I particularly like Kurt's post because he gives good XQuery code samples, something I am always trying to create more of.
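For anyone who hasn't tried these two extensions yet, here is a minimal sketch of how they might be combined. It assumes MarkLogic's 1.0-ml dialect with function mapping left at its default (enabled); local:shout is a hypothetical helper made up for illustration and is not taken from Kurt's post.

```xquery
xquery version "1.0-ml";

(: hypothetical helper that is declared to take exactly one string :)
declare function local:shout($word as xs:string) as xs:string {
  fn:upper-case($word)
};

(: function mapping: passing a sequence where a single item is expected
   calls local:shout once for each item in the sequence :)
let $shouted := local:shout(("lear", "falchion"))

(: map:map() creates an empty, mutable hash map :)
let $lengths := map:map()
let $_ :=
  for $word in $shouted
  return map:put($lengths, $word, fn:string-length($word))
return
  for $key in map:keys($lengths)
  return fn:concat($key, ": ", map:get($lengths, $key))
```

In strict XQuery 1.0 the call with a two-item sequence would be a type error, so this only works where function mapping is switched on. The order in which map:keys returns the keys should not be relied on.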

I'll have to make sure I add Kurt's session (Tuesday at 2:45) to my conference plan.  It's getting pretty full with so many great customer sessions (gotta make BusinessWeek and Wiley) and can't-miss technical presentations (AtomPub from Norm, Kurt's session and the MarkLogic feature sessions) that I'm afraid I'll miss out on all the good side conversations about how XQuery is changing the face of content applications.

If I can just make it through getting ready, it is going to be really fun!

Matt

April 28, 2009

World's Largest XQuery Gathering Only Two Weeks Away!

The Mark Logic User Conference is just two weeks away!   Once again, XQuery enthusiasts will flock to San Francisco for a great mix of hands on technical sessions, best practices and, my favorite, customer application profiles.  Look for lots of live demos, interactive and frank discussions about using XQuery and some very eye opening applications. 

Publishing customers like BusinessWeek and Wiley will be highlighting their new content applications, companies like JetBlue will be talking about how XQuery-driven applications have impacted their enterprise, government integrators Boeing and Booz Allen will be talking about how MarkLogic Server helps them get the job done, and organizations like The Church of Jesus Christ of Latter-day Saints will showcase their use of MarkLogic Server to power all kinds of applications.

More than a few key members of the XML and XQuery community will also take part, with Kurt Cagle talking about REST services and Norm Walsh highlighting AtomPub powered by XQuery.

What all these sessions share is a strong belief that the XML model and XQuery programming really work, get the job done faster and help create wonderful new applications that just aren't possible using the traditional (old?) tools like databases and search engines.

And with over 400 people attending, it is indeed the World's Largest XQuery event!  So be there or be ... relational or something.

Matt

Mark Logic User Conference 2009 Breakout Session Guide

April 02, 2009

Making Pizza Right Part 2: The XQuery Application

[This is part 2 of a series on the creation of a Facebook application to let fans of New Haven pizza order a virtual pie.  Part 1 is here.]

Now that I've created our XML data model for the Pizza places, I can move on to developing the XQuery code to power the application.   Creating applications using XQuery is something I do a lot at MarkLogic and seems to be getting some mainstream attention lately.   IBM has been developing resources such as this article on using XQuery to develop applications, this tutorial on XQuery idioms and this guide to using XQuery to create dashboards.  An open source group called 28msec recently published this paper which takes the even better approach of describing an end-to-end XQuery development paradigm.

MarkLogic's own Norm Walsh has also been leading the way with an all XQuery AtomPub server and a recent project that was all XML + XQuery.  And the results of using XQuery from top to bottom can be seen in such great applications as the  MarkMail email archive and the Digital Library from the Princeton Theological Seminary Library.  The Princeton app opens up the archives of the world's 3rd largest seminary library to the public with rich search and actual images of the original books all with just the XML about the books and XQuery.

For our application, we'll use XQuery to let New Haven pizza lovers create a pie from their favorite pizza place.  The pie will be accurate in the toppings you can select, and it will only be available when the actual place is open in New Haven (a big part of New Haven pizza is waiting for it).

Our XML data for the application looks like this:

<new-haven-pizza>
    <restaurant>
        <name>Sally's</name>
        <hours>
            <set>
                <day>Tuesday</day>
                ...

                <open>17:00:00</open>
                <close>22:30:00</close>
            </set>
        </hours>
        <apizza>
            <sizes>
                <size>Small</size>
                ...

            </sizes>
            <toppings>
                <topping>Mozzarella</topping>
                <topping>Anchovies</topping>
                <special>...</special>
            </toppings>
        </apizza>
    </restaurant>
</new-haven-pizza>


There are some variations that we explored in Part 1 and we'll account for these in our code.

I plan to have the application be a Facebook app that allows users to select a restaurant and then order a pie.  To do this we need a set of services to return values for each step in the process.  I'll collect these services in a single XQuery module and then call them from the main code of the Facebook interface (that I'll put together in Part 3).

The first service is a list of restaurants.   Since I want to be true to the restaurants' hours, I need to use some of the extensive XQuery date functions to see if things are open as well as use the day of week name function from FunctX, the XQuery function library.
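A day-of-week function is really just date arithmetic. As a rough sketch of the idea (not the FunctX source itself), the day name can be derived by counting whole days from a known Sunday. The function name local:day-name is hypothetical, and the sketch assumes dates on or after the reference date.

```xquery
xquery version "1.0";

(: hypothetical helper: day name for a date, counting whole days from a known Sunday :)
declare function local:day-name($date as xs:date) as xs:string {
  let $names := ("Sunday", "Monday", "Tuesday", "Wednesday",
                 "Thursday", "Friday", "Saturday")
  (: 1901-01-06 was a Sunday; subtracting two dates yields an xs:dayTimeDuration,
     and dividing by one day turns that into a plain number of days :)
  let $days := ($date - xs:date("1901-01-06")) div xs:dayTimeDuration("P1D")
  return $names[xs:integer($days) mod 7 + 1]
};

(: the date this post was published :)
local:day-name(xs:date("2009-04-02"))
```

Using the library function keeps the application code shorter, but it helps to know that nothing more exotic than date subtraction is involved.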

I'll start with this query that returns the list of restaurants:

    for $restaurant in /new-haven-pizza/restaurant/name return $restaurant

I need to add to this basic function to check the day and time and only show what's open.  I also need to have an error message and, since this is a service that will be feeding a web application directly, I'll go ahead and make everything return the appropriate XHTML.  So <select> elements from forms and <p>s for error messages etc.

(: module declaration and namespace :)
module namespace nh="http://xquery.typepad.com/nh-apizza";

(: import the functx module with the  day of week name function :)
import module namespace functx="http://www.functx.com" at "functx.xqy";

(: get restaurant function :)
declare function nh:get-restaurants() {

(: get the current date and time :)
    let $current-daytime := fn:current-dateTime()

 (: use the day of week name function to see what day it is :)
    let $day := functx:day-of-week-name($current-daytime)
 (: get the time :)
    let $time := xs:time($current-daytime)
 (: use the day and time to get the restaurants that are open :)
    let $restaurants :=  /new-haven-pizza/restaurant[./hours/set/day = $day]
                            (: [xs:time(./hours/set/open) < $time]
                            [xs:time(./hours/set/close) > $time] :)     
return
(: return the restaurants Or an error :)
    if ($restaurants) then
        <select name="restaurant">
            {for $restaurant in $restaurants
            return
(: let's use the full name if it's available for the label :)
                <option value="{fn:string($restaurant/name)}">{fn:string(if ($restaurant/full-name) then $restaurant/full-name else $restaurant/name)}</option>
        }</select>
    else
        <p>Nothing is open right now!  It is {$day}{" "}{$time} Eastern Time.</p>
};
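The open and close test is commented out in the function above. One possible way to switch it on, sketched here against inline sample data so it can run in any XQuery 1.0 processor, is to test the day, the opening time and the closing time inside the same <set>. Sally's hours come from the sample XML; Modern's hours, and the fixed day and time, are made up for illustration.

```xquery
xquery version "1.0";

(: sample data in the shape used by the application; Modern's hours are invented :)
let $data :=
  <new-haven-pizza>
    <restaurant>
      <name>Sally's</name>
      <hours>
        <set>
          <day>Tuesday</day>
          <day>Wednesday</day>
          <open>17:00:00</open>
          <close>22:30:00</close>
        </set>
      </hours>
    </restaurant>
    <restaurant>
      <name>Modern</name>
      <hours>
        <set>
          <day>Tuesday</day>
          <open>11:00:00</open>
          <close>23:00:00</close>
        </set>
      </hours>
    </restaurant>
  </new-haven-pizza>
(: fixed values instead of fn:current-dateTime() so the result is repeatable :)
let $day := "Tuesday"
let $time := xs:time("12:15:00")
return
  (: all three conditions must hold for the same <set> :)
  $data/restaurant[
    hours/set[day = $day
              and xs:time(open) le $time
              and xs:time(close) ge $time]
  ]/name
```

With these sample values only Modern qualifies, since Sally's does not open until 5pm. Keeping the three tests inside one hours/set predicate matters once a restaurant has several sets of hours. A closing time past midnight would need extra handling that this sketch does not attempt.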


The next functions pick up from here by listing the options for making a pie once you have selected a restaurant.  These are the sizes, the toppings (with the specials as toppings if the restaurant allows it) and the specials.

For each function we do a simple value test on the name element and then go get what we need:

declare function nh:get-sizes($restaurant) {
    <select name="size">
    {for $size in /new-haven-pizza/restaurant[./name eq $restaurant]/(apizza | tomato-pie)/sizes/size
    return
        <option value="{fn:string($size)}">{fn:string($size)}</option>
    }</select>
};

To get the full topping list and include the specials that could be used as toppings (an oddity of Modern) we need to do some additional checking on the <special> element and also make use of the flexible XPath model (element1 | element2) to return topping *and* special elements in the same statement:



declare function nh:get-toppings($restaurant) {
    for $topping in /new-haven-pizza/restaurant[./name eq $restaurant]/(apizza | tomato-pie)/toppings/(topping | special[../../combine-specials eq "yes"])
    return
      <input type="checkbox" name="topping" value="{fn:string($topping)}">{fn:string($topping)}</input>
};
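To see the union step on its own, here is a small sketch that runs against an inline element rather than the database. The topping and special names are invented for illustration; the placement of <combine-specials> directly under <apizza> follows from the ../../ step in the function above.

```xquery
xquery version "1.0";

(: inline sample in the shape of the Modern entry; item names are made up :)
let $restaurant :=
  <restaurant>
    <name>Modern</name>
    <apizza>
      <combine-specials>yes</combine-specials>
      <toppings>
        <topping>Mozzarella</topping>
        <special>Eggplant</special>
        <topping>Bacon</topping>
      </toppings>
    </apizza>
  </restaurant>
(: the union returns topping and special elements together, in document order :)
for $item in $restaurant/(apizza | tomato-pie)/toppings
               /(topping | special[../../combine-specials eq "yes"])
return fn:concat(fn:local-name($item), ": ", fn:string($item))
```

The special comes back between the two toppings because a path expression returns nodes in document order. If <combine-specials> is anything other than "yes", or is missing, only the topping elements are returned.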

And then the pretty simple list of specials:


declare function nh:get-specials($restaurant) {
    for $topping in /new-haven-pizza/restaurant[./name eq $restaurant]/(apizza | tomato-pie)/toppings/special
    return
        <input type="radio" name="special" value="{fn:string($topping)}">{fn:string($topping)}</input>
};


All of these functions provide the information needed to create the pie selection forms.   Once we gather that data back, I need a pie maker function to present the finished pie:

declare function nh:make-a-pie($size, $toppings, $special, $restaurant) {
    let $pie := if ($special) then
                    $special
                else
                    fn:string-join($toppings, ', ')
    return
        <p>Here is your {$size}{" "}{$pie} from {$restaurant}.  Enjoy and thanks for using the New Haven Pizza Maker!</p>
 };
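One detail to watch in constructors like this is boundary whitespace. When two enclosed expressions are separated only by whitespace, that whitespace is dropped under the default boundary-space policy (strip), so the values run together. The sketch below shows the effect and two ways around it; the variable values are just sample input.

```xquery
xquery version "1.0";

let $size := "Large"
let $pie := fn:string-join(("Mozzarella", "Bacon"), ", ")
return (
  (: whitespace-only text between two enclosed expressions is boundary whitespace :)
  <p>{$size} {$pie}</p>,
  (: an explicit string survives, because it is the result of an expression :)
  <p>{$size}{" "}{$pie}</p>,
  (: atomic values in a single enclosed expression are joined with one space :)
  <p>{$size, $pie}</p>
)
```

Under the strip policy the first paragraph reads "LargeMozzarella, Bacon" and the other two read "Large Mozzarella, Bacon". Adding declare boundary-space preserve; to the prolog is another option, though it also keeps the indentation whitespace inside every constructor.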

And that's really it!  It's under 60 lines of code to make a complete application.  This simplicity has a lot to do with the fact that I'm never leaving the world of XML.  There is no impedance mismatch (that Dave Kellogg blogged about in this post) and XQuery has everything you need to execute a content application, all in one language whose output *is* XML.

This is why XQuery applications are a growing trend - people are discovering what Joris Petrus Maria Graaumans proved in his Doctoral Thesis three years ago - XQuery is just plain better!  My entry on his paper is  here and the whole thing is here.

In the next part of this series I'll put it all together, build the Facebook interface and launch the New Haven Pizza maker.

See you next time,

Matt

March 10, 2009

Making Pizza Right Part 1: An XML Model for a Pizza Place

I've had an idea for a while now to make a Facebook application that will let lovers of New Haven Pizza sample a pie (no slices here) when they are not able to get to their shrines of Pepe's, Sally's and Modern.

For those of you NOT familiar with New Haven Pizza, it, like all great things, inspires a great deal of loyalty, religious devotion and controversy.  The Facebook group 'I live for New Haven Pizza' has 6000+ members and is a 24/7 debate room full of arguments over which is the best, when to go to avoid the lines and rude dismissals of other inferior sources of tomato pie.

As an act of community service to the lovers of my hometown's pizza, I thought I would create an application that would let users create pies that faithfully represented the topping combinations from the three best spots while respecting their quirky hours (only Pepe's is open Mondays, only Modern has late hours and Sally's never opens before 5pm).

It would also be a good way to explore some of the themes I've been working on including modeling content in XML, using XQuery to build applications and delivering content to new  distribution channels.

Since there is a lot to cover and to help get this started, I've broken this up into a series:

  • In this first segment I will look at the XML model behind the application.
  • The next segment will cover using XQuery to power the application and how MarkLogic Server can be the single resource you need to power an information product (even if it's just about fake tomato pie).
  • In the last segment I'll talk about delivery and how XQuery is perfect for integrating with the new web application platforms like Facebook.


Part 1: An XML model for a Pizza place

Using the right data model for an application is something that has been coming up a lot lately and in particular for applications that want to use XQuery but may not have XML.   XQuery has come a long way and I think that the arguments I made in this blog entry a couple years ago really ring true:  modeling data as XML can be much more flexible than relational and lets you use great tools like MarkLogic Server.  The proof is in great applications such as MarkMail or AuthorMapper all with only XML data and powered by MarkLogic.

For our example we'll take a Pizza place.  The first thing you will notice is that I didn't say 'let's take Pizza'.   This is because the application is about these places as much as it is about the food (see above re: the fanatical user base).  So I want to capture whatever details I can and use them in the app if possible . . . and I want to be able to add more later.

Instead of breaking up the information into little bits and making tables, I simply made one document with an XML schema that looks like this:

<new-haven-pizza>
    <restaurant>
        <name>Sally's</name>
        <location>Wooster Street</location>
        <payment>cash only</payment>
        <est>1938</est>
        <website>sallysapizza.net</website>
        <notes>Sally's was established in 1938 by Salvatore Consiglio. Renown for thin crust pizza ...</notes>
        <hours>
            <set>
                <day>Tuesday</day>
                <day>Wednesday</day>
                ...

                <open>17:00</open>
                <close>22:30</close>
            </set>
            </hours>
        <apizza>
            <sizes>
                <size>Small</size>
                <size>Medium</size>
                <size>Large</size>
            </sizes>
            <toppings>
                <topping>Mozzarella</topping>
                <topping>Anchovies</topping>
                <topping>Bacon</topping>
            </toppings>
            <notes>At Sally's mozzarella is considered a topping, so if you want it on your pie let us know!.                        </notes>
        </apizza>
    </restaurant>
</new-haven-pizza>

This schema lets me capture everything I need to run my application and also capture many of the differences between the places. 

For instance Modern and Sally's call it APizza, but at Pepe's you order a Tomato Pie.  So the entry for Pepe's has a <tomato-pie> element instead of the <apizza> one.  And even though Sally's still has just a photo of their menu, they are the only ones with some text useful to include.  So the others don't, for the moment, have notes.

The toppings and kinds of pie are also pretty different.  While Sally's just has toppings, Pepe's has a mix of toppings and specials.  This is the topping section for Pepe's:

            <tomato-pie>
                ...
                <toppings>

                <topping>Mozzarella</topping>
                <topping>Sausage</topping>
                <topping>Mushroom</topping>
                ...
                <special>Clam white or red sauce only</special>
                <special>Clam with mozzarella</special>
                <special>Chicken</special>
                ...

                </toppings>
            </tomato-pie>

Modern also has specials, but it's a mix of things that you can add to regular pies and real specials like the famous 'Italian Bomb'.  The data is faithful to the Modern menu but has a <combine-specials> element to record what the waiter would explain if you asked.

The item ordering of the menus is also preserved.  They each have different, unique, non-alphabetic order.  Since this was easy to capture just by putting them in document order I thought it best to record it and stay on the good side of the fanatics out there.

Finally, the hours structure is modeled on the way restaurants describe hours - Mondays we're closed, Tue-Fri we are open until 10 etc. - and is easy to extend with new sets of rules such as holidays and summer vacations (which I didn't put in but are indeed part of the picture).

Approaching even this simple example as a relational model gets really complex, really fast.  A rough estimate for a practical (not fully normalized) schema would be at least 5 or 6 main tables and 8 to 10 relationship tables.  This would take at least a couple of hours to model and then create in a database.

But to top it off (sorry), it would be pretty hard to model for the differences.  In fact, the only way to do it is to add more complexity to the table structure and even more relationships to note that one place has specials and another lets you mix specials and toppings etc etc.  And when you want to add more information to power new features of our application, you would go back to the drawing board, make some new tables and have to redo the whole thing again.

By contrast, the XML I am using is a simple document I made in 15 minutes and, when I load it into MarkLogic, lets me do queries like this to see where I can get a Bacon APizza:

/new-haven-pizza/restaurant[./apizza/toppings/topping = "Bacon"]/name
->
<name>Modern</name>
<name>Sally's</name>            


Or a Clam Tomato Pie:

/new-haven-pizza/restaurant[./tomato-pie/toppings/topping = "Clam"]/name
->
<name>Pepe's</name>
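If you don't have the data loaded yet, the same kind of lookup can be tried against an inline copy. This is only a sketch: the toppings for Sally's and Pepe's are taken from the listings above, while Modern's list is a guess apart from Bacon. It uses the general comparison =, which is true when any one of the toppings matches; the value comparison eq expects a single item on each side.

```xquery
xquery version "1.0";

(: trimmed sample data; Modern's topping list is partly invented :)
let $data :=
  <new-haven-pizza>
    <restaurant>
      <name>Modern</name>
      <apizza>
        <toppings>
          <topping>Mozzarella</topping>
          <topping>Bacon</topping>
        </toppings>
      </apizza>
    </restaurant>
    <restaurant>
      <name>Sally's</name>
      <apizza>
        <toppings>
          <topping>Mozzarella</topping>
          <topping>Anchovies</topping>
          <topping>Bacon</topping>
        </toppings>
      </apizza>
    </restaurant>
    <restaurant>
      <name>Pepe's</name>
      <tomato-pie>
        <toppings>
          <topping>Mozzarella</topping>
          <topping>Sausage</topping>
        </toppings>
      </tomato-pie>
    </restaurant>
  </new-haven-pizza>
return
  (: the union step covers both naming conventions in a single query :)
  $data/restaurant[(apizza | tomato-pie)/toppings/topping = "Bacon"]/name
```

With this sample the query returns the <name> elements for Modern and Sally's. Because of the (apizza | tomato-pie) step, the same query would also pick up a tomato pie place that offered bacon.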


Having created and loaded my data, I'm now ready to start the next tasks of building an application that will let users query the XML to find out where they can get a pie and create a virtual pizza and then work on wrapping it up so it can be used within Facebook.

Right now I'm getting pretty hungry and sadly am nowhere near New Haven!

Thanks,

Matt

October 17, 2008

Good News (for XQuery at least)

Lots of good things are happening with XQuery which is a nice change from the regular news.

First and foremost, MarkLogic Server 4.0 is out!  With this release, MarkLogic Server is updated to XQuery 1.0 and has added some very cool features to extend the capabilities of XQuery.

The highlights are:

  • Geospatial support.  Lets you add location as a dimension to queries and find where your content is at.  Like this search result from some content about earthquakes where the search is for the word 'new' AND anything within a 500 mile radius of Utica, NY:

Geo-search  
  • Entity Enrichment.  Puts some smarts right into your content by marking up the people, places and things.  This is coupled with an enrichment framework to plug your own text mining tool into the process.
  • Alerting.  Lets you notify people about new content they may be interested in.  A key part of this is something called reverse-query which returns the queries that match incoming content.  Very cool.

As usual there's a lot more in MarkLogic Server 4.0 and here are links to an overview and the developer site.

But it's not just Mark Logic that has been busy.  At Elsevier the Article 2.0 contest is off and running.  It's a chance to win $4000 by coming up with new ways to present scholarly articles.  Who couldn't use a little bit of extra $$$ in these times?  All you have to do is create the best new way to display the source XML Elsevier is providing.  Here are some hints on how to use XQuery to give you an edge.

And overall XQuery is getting more and more attention.  When I first started writing this blog (back in April of 2006 - wow, that's more than 2 years ago!), XQuery was that strange language that used squiggly brackets and smiley faces and there just wasn't a lot of discussion about it.

In just the last couple of weeks I've seen:

  • A video of an XQuery lecture from the University of Irvine
  • Posts on XRX (XForms, REST, XQuery) outlining a new application stack where adopting the XML model and using XQuery greatly simplifies processing (here are some similar thoughts from me, me again and Dave Kellogg)
  • The continuation of the wonderful XQuery, Multi-platform future meme/spam.  As I discussed in this post a year ago, this very lightweight article was replicated throughout the internet as a simple traffic driving trick.  XQuery had finally made enough of an impact to merit someone actually spending time spreading it around . . . and he's still doing it - just last week I got this article in another Google Alert.

So it's nice to see XQuery bring us some good news - sure beats the other kind.

Matt

September 04, 2008

It's (Really) About XQuery

I finally got on the Wordle bandwagon and ran the Discovering XQuery Blog through their clever wordcloud generator thingy.  Actually, I ran it more than a couple of times since it's a lot of fun to let it create random patterns (warning: time suck!).

Here is what the most recent posts (it uses the RSS feed) produced:

Wordle-xquery
(click here to see it on Wordle)

What a shocker!  Yup, XQuery is front and center here at Discovering XQuery and with XML to base it on you have all you need to create and work with content (of course using MarkLogic Server).  I also like the accidental sentence 'XQuery can script'.  Youbetcha!

A good example of this is this paper from the upcoming International Conference on Music Information Retrieval (ISMIR2008).  In between the sessions on music recognition and automated transcription (my favorite: Multiple-Feature Fusion Based Onset Detection for Solo Singing Voice) is this session on working with MusicXML: Using XQuery on MusicXML Databases for Musicological Analysis.  The authors got some MusicXML from wikifonia (a free site that collects MusicXML and provides a nice rendering of the music) and used XQuery like a buzzsaw to slice up the content sorting out duplicate titles, listing key signatures and time signatures and even trying to find motives (sequences of chords or rhythms) in the corpus.  Pretty cool stuff!

I thought I'd give it a try myself and maybe add to their collection of queries.

The first step is to get some content.  Loading XML from an external source looks like this in MarkLogic:


for $i in (1 to 200)
let $song-source := concat("http://static.wikifonia.org/", $i, "/musicxml.xml")
let $database-uri := concat("/songs/", $i, "/music.xml")
return
xdmp:document-load($song-source,
    <options xmlns="xdmp:document-load">
        <uri>{$database-uri}</uri>
        <format>xml</format>
    </options>
)

This grabs the first 200 songs and inserts them into a database.

The XML is pretty neat and, like all good XML, nicely self-describing:

<score-partwise version="2.0">


    <movement-title>What's New</movement-title>
    <identification>
        <creator type="composer">Bob Haggard</creator>
   ...

    </identification>
    <part-list>
        <score-part id="P1">
            <part-name>Voice</part-name>
            <score-instrument id="P1-I1">
                <instrument-name>Voice</instrument-name>

<!-- Then the actual music in measures -->

   <measure number="1">
        <note>
            <pitch>
               <step>D</step>
               <octave>4</octave>
            </pitch>
            <duration>384</duration>
            <voice>1</voice>
            <type>eighth</type>
            <lyric number="1" name="verse">
                <syllabic>end</syllabic>
                <text>You</text>
            </lyric>
        </note>

        ....
    </measure>
....
</score-partwise>

Having loaded some MusicXML, it's time to use some XQuery to explore the music.  You can run the ones in the paper like this one that lists the titles that have no rests:

for $i in //score-partwise
return
$i[count($i//rest) eq 0]//movement-title

This gives us a sequence of titles:

Zachtjes gaan de paardenvoetjes


Sinterklaas is jarig
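The same test can also be written as a single predicate with fn:empty. Here is a small sketch that runs against two stand-in scores held in a variable instead of the loaded songs; the titles and notes are invented for illustration.

```xquery
xquery version "1.0";

(: two tiny stand-in scores; titles and notes are made up :)
let $scores := (
  <score-partwise version="2.0">
    <movement-title>No Rests Here</movement-title>
    <part id="P1">
      <measure number="1">
        <note>
          <pitch><step>C</step><octave>4</octave></pitch>
          <duration>4</duration>
        </note>
      </measure>
    </part>
  </score-partwise>,
  <score-partwise version="2.0">
    <movement-title>Take a Breath</movement-title>
    <part id="P1">
      <measure number="1">
        <note>
          <rest/>
          <duration>4</duration>
        </note>
      </measure>
    </part>
  </score-partwise>
)
(: keep only the scores with no <rest> anywhere inside them :)
return $scores[fn:empty(.//rest)]/movement-title
```

Only the first title comes back, since the second score contains a <rest>. Against the database, the equivalent would be to replace $scores with //score-partwise.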


Or you can do your own thing - like listing the instruments that have parts written for them:

fn:distinct-values(//score-partwise//score-instrument/instrument-name)

And then using one of the values in the list to get all the notes (that aren't rests) for a given instrument:

let $notes := //score-partwise[.//score-instrument/instrument-name eq "Grand Piano"]//note
let $rest-notes := //score-partwise[.//score-instrument/instrument-name eq "Grand Piano"]//note[./rest]
return
$notes except $rest-notes


And we get our notes for the piano:

<note>
<pitch>
<step>B</step>
<octave>4</octave>
</pitch>
<duration>2</duration>
<type>eighth</type>
...
</note>
<note>
...


This makes use of the cool 'except' operator to give you only the <note> elements in $notes that are NOT in the sequence of <note> elements that have <rest> children in $rest-notes.  It turns out that these are always <note> elements with <pitch> children, but this is a very handy tool when you are only sure what you *don't* want.
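To make the comparison concrete, here is a small sketch on an inline measure (the notes are invented) that computes the non-rest notes both ways, with except and with a negated predicate, and checks that the two agree.

```xquery
xquery version "1.0";

(: a made-up measure with two pitched notes and one rest :)
let $measure :=
  <measure number="1">
    <note>
      <pitch><step>D</step><octave>4</octave></pitch>
      <duration>2</duration>
    </note>
    <note>
      <rest/>
      <duration>2</duration>
    </note>
    <note>
      <pitch><step>B</step><octave>4</octave></pitch>
      <duration>2</duration>
    </note>
  </measure>
let $notes := $measure/note
let $rest-notes := $measure/note[rest]
(: except compares nodes by identity and returns them in document order :)
let $by-except := $notes except $rest-notes
(: the same selection written as a negated predicate :)
let $by-predicate := $measure/note[fn:not(rest)]
return
  <result same="{fn:deep-equal($by-except, $by-predicate)}">{
    for $note in $by-except
    return fn:string($note/pitch/step)
  }</result>
```

For this sample the result is <result same="true">D B</result>. The predicate form is shorter when the unwanted nodes are easy to describe; except earns its keep when the two sequences come from separate, more complicated queries.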

The abstract of the paper ends with these thoughts:

This shows the feasibility of automated musicological
analysis on digital score libraries using the latest software
tools. Bottom line: it’s easy.


I couldn't agree more - and I have the Wordle to prove it:  if you have some XML, just put some XQuery on top and off you go!

Matt

June 23, 2008

Have it Your Way with Article 2.0 and XQuery

At the MarkLogic User Conference, Darin McBeath of Elsevier asked the question what would you do if you were the publisher?  How would you present an article?  How would you render figures and references?  What would you do if you were in control?

To answer, Darin showed some very cool examples of cutting across articles to present figures in new ways, a really cool interactive reference information browsing interface and a new take on the word cloud within an article to analyze article content.

And it was all done with XQuery.  The source XML is stored in a MarkLogic Server instance and served up via XQuery powered web services.  Each demo was then executed with XQuery running on another instance of MarkLogic that got the content from the first and, using XQuery, presented the new versions of the article from the source XML.  Darin said that each demo was just a few lines of code and didn't take any more than a couple of days to complete.  Very nice!

But what's really cool is that Elsevier is giving everyone the access to the same tools and a chance to make a new article presentation in the  ... (drumroll please) ... Article 2.0 contest!

Check out this super cool idea:  you get access to source XML content and, using XML tools like XQuery, you create your own idea of how an academic research article should be displayed.   How would you present an article?  It's really up to you.  And with prize money at stake you can bet you won't be alone in showing how some new ideas can shake up scientific publishing.

Check out the full details here.  

It all starts in September so you have plenty of time to plan your XQuery masterpiece . . . and Discovering XQuery can help with some examples of how to transform and render XML, how to grab some content from the web and present it, how to enrich content to power cool displays and a tutorial on how to get started with my favorite XQuery engine, MarkLogic Server.

Good luck and happy coding,

Matt

May 28, 2008

Mark Logic User Conference 2008: Generate Some XQuery Buzz

The Mark Logic User Conference, the world's largest gathering of XQuery users, experts and fans, is only two weeks away!

The jam packed agenda of sessions on ground breaking and diverse XQuery applications is a feast for XQuery fans.  You'll get to see real-world XQuery applications in action from content syndication at Simon & Schuster to the Electronic Flight Bag at United Airlines to the Army's Knowledge Management Systems.  There are also best practice sessions and sessions on the latest XQuery tools from Mark Logic.

So sign up here and come on out to San Francisco to see how XQuery is generating some buzz across a wide range of industries and uses.

And really why would we expect anything less?  We've seen that XQuery (with some help from the easy to use MarkLogic Server) can quickly get you up and querying XML in minutes, scrape the web with ease, create new XML, enrich your content and power AJAX.   All from just plain old XML (which is the right model for content of course).

And as for generating buzz . . . well how about generating XQuery?  Like all good languages XQuery can be used in generative (or automatic) programming.  Thankfully with AJAX you don't need to generate Javascript so much any more (phew!) . . . but this approach still has plenty of uses.  For instance, I recently had to generate some XQuery for the very handy performancemeters to run on some sample content.  To generate the script, I used XQuery (of course):
<h:script xmlns:h="http://marklogic.com/xdmp/harness"> {
for $count in (1 to 100)  (: get 100 calls :)
let $selected := xdmp:random(2999) + 1 (: get a random number from 1 to 3000 to choose from the 3000 documents :)
let $uri := fn:base-uri(doc()[$selected]) (: get the uri of the selected document :)
return
   <h:test>
     <h:name>sample test</h:name>
     <h:comment-expected-result>transform {$uri}</h:comment-expected-result>
     <h:set-up/>
     <h:query>
     (: generate the XQuery to run for the test :)
      import module namespace tx="http://www.marklogic.com/test/transformtest" at  
         "transform-test.xqy"
    
         tx:transform("{$uri}")
     </h:query>
     <h:tear-down/>
   </h:test>
}</h:script>
In this simple XQuery I'm collecting some random URIs and creating the calls to the transform function (which is a simple transformation à la XQuery Transformers), but you can see that it would be very handy to use the values in the XML to write custom XQuery for just a test . . . or even as part of your XQuery content application.
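As a small sketch of that last idea, here is XQuery that writes query text from values held in XML. The <plan> document, its URIs and its element names are all hypothetical; the result is just a sequence of strings, one generated query per <check>.

```xquery
xquery version "1.0";

(: hypothetical plan: each entry names a document and the element to count in it :)
let $plan :=
  <plan>
    <check uri="/plays/hamlet.xml" element="SPEECH"/>
    <check uri="/plays/lear.xml" element="SCENE"/>
  </plan>
for $check in $plan/check
return
  (: build the text of one query from the attribute values :)
  fn:concat('fn:count(fn:doc("', $check/@uri, '")//', $check/@element, ')')
```

The first string produced is fn:count(fn:doc("/plays/hamlet.xml")//SPEECH). In MarkLogic, strings like these could then be handed to xdmp:eval to run, or written into a test script as in the example above.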

With all that XQuery can do it's no wonder it's good at generating buzz!

See you at the user conference,

Matt

April 11, 2008

XQuery: The Real X in AJAX

Like the real Napster in the movie The Italian Job (the remake), XQuery might have a bit of a chip on its shoulder about the X in AJAX.

Sure, it stands for XML since the idea is to return an XML fragment to the browser to update content in a div, fill in form fields or even create drop down menu options on the fly.

But how do you create that XML?  Using static XML files works, but the whole idea is to dynamically respond to user actions and give them information without reloading the whole page.

And what's the best way to dynamically create XML?  XQuery of course!

To prove it, let's do a simple example to create a drop-down form field for Shakespeare's characters (using the Shakespeare XML we loaded in the tutorial).  This field will auto-complete using AJAX and XQuery and give the user the characters found in the XML that begin with the letters entered into the field.

To get a head start, we'll use the popular AJAX tool Scriptaculous, which takes care of all of the hard JavaScript stuff and lets us just work on creating the backend to deliver the content.

We'll also make use of MarkLogic Server's app server built-ins and its ability to run an HTTP server to make a complete application that presents the HTML, including Scriptaculous.  To do this, we'll start with the /modules directory we created and accessed with WebDAV in the tutorial (where the CQ application was placed).

Assuming you have that set up, here are the steps to get the client side set up:

  1. Create a js/ directory and install all of the .js scriptaculous libraries that came with the distribution (located here)
  2. Create a lookup.css in /modules with the sample styles from this page
  3. Create lookup.xqy under the /modules directory with the following HTML in it:

xdmp:set-response-content-type("text/html; charset=utf-8")
(:  sets the mime type :)
,
<html>
    <head>
        <title>Shakespeare Lookup</title>
        <!-- reference the stylesheet (an XQuery comment here would be output as literal text) -->
        <link rel="stylesheet" type="text/css" href="lookup.css" media="screen"/>

        <!-- load the scriptaculous scripts - the " " keeps them from being serialized as an
        empty XML node like <script...  /> which some browsers don't like -->

        <script src="js/prototype.js">{" "}</script>
        <script src="js/scriptaculous.js">{" "}</script>

    </head>
<body>
    <h1>Shakespeare Character Lookup</h1>

    <div>
        <form>
        <!-- create the placeholder for the autocomplete field -->
        <input type="text" id="autocomplete" name="autocomplete_parameter"/>
        <input type="submit" value="select character"/>
        </form>
        <div id="autocomplete_choices" class="autocomplete"></div>

        <!-- run the scriptaculous autocomplete -->
        <script type="text/javascript">
            new Ajax.Autocompleter("autocomplete", "autocomplete_choices",                
            "request.xqy", {{}});
        </script>
    </div>
</body>
</html>

If you are like me and learned HTML and JavaScript way, way back, it may take you a moment to realize that the plain <input> element with the id 'autocomplete' is NOT the whole story of what ends up in the browser.  The script Ajax.Autocompleter takes over that element and turns it into the fully decked out, event-enabled <input> that does all of the auto-lookup work, including making the calls to your request.xqy.  You can create all this yourself . . . but Scriptaculous does it for you, so go ahead and enjoy it!

This should all result in a simple web page with a text field on it that you can get to at http://localhost:8002/lookup.xqy (or wherever your MarkLogic Server is installed).  But it won't do anything until we create the backend XQuery.

In the Ajax.Autocompleter code, we gave request.xqy as our source for the lookups.  We need to create this in the /modules directory and its contents can be something as simple as this:

(: get the value of the field - sent to us as a POST from the Scriptaculous autocompleter :)

let $query-base := substring-after(xdmp:get-request-body(),"autocomplete_parameter=")

(: add an '*' to it to create a wildcard search :)

let $query := fn:concat($query-base, "*")

return
    <ul>{

        (: use the MarkLogic built-in cts:element-value-match()* to search all of the values in the PERSONA element in the loaded XML plays :)

        for $item in cts:element-value-match(xs:QName("PERSONA"),$query)
        return

        (: return the <li> elements scriptaculous expects for its list :)

        <li>{fn:string($item)}</li>
        }
    </ul>

Yup - that's all there is to it:  9 lines of code to query some XML and return XML.
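
One caveat with the substring-after() approach: the autocompleter posts the field as URL-encoded form data, so a typed space may arrive encoded (for example as %20), and any extra parameters would be tacked on after an &.  Here is a minimal sketch of an alternative request.xqy, assuming your MarkLogic version provides the xdmp:get-request-field() built-in (it is not part of the example above) so the server decodes the field for you:

```xquery
(: Sketch of an alternative request.xqy.  Assumes the xdmp:get-request-field
   built-in, which returns the decoded value of a named request field. :)

(: read the field by name; fall back to an empty string if it is missing :)
let $query-base := xdmp:get-request-field("autocomplete_parameter", "")[1]

return
    <ul>{
        (: skip the lookup entirely when nothing has been typed yet :)
        if ($query-base eq "") then ()
        else
            for $item in cts:element-value-match(xs:QName("PERSONA"), fn:concat($query-base, "*"))
            return <li>{fn:string($item)}</li>
    }</ul>
```

The empty-input check is a design choice, not a requirement: without it, a bare "*" would match every PERSONA value in the index.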

When all this is hooked up and running, you should have a mini-application that looks something like this:

[Screenshot: Shakespeare lookup]

It's now up to you how you will use the power of XQuery to create dynamic content elements for AJAX.    Will you populate complex taxonomies and even bring back the content from leaf nodes?  Will you create search interfaces that give you the answers in the form?

How about an amazing interface for searching XML content?  Check out markmail.org, which is all XQuery and AJAX, for some inspiration.

So XQuery really must be the real X in AJAX, right?

Well it turns out there is a bit of room under the X in AJAX these days.  JSON is a popular alternative to XML (and XQuery's got that covered - check out this library Jason Hunter wrote to generate JSON from XQuery) and there are lots of ways to generate both XML and JSON.

But for those of us in the know, XQuery is the only way to go.  And with the growing number of XQuery powered content applications, no one can shut down the real X!

Matt

*NOTE: cts:element-value-match() is a MarkLogic built-in that requires an Element Range Index to be configured for the element in question.  This is pretty straightforward: select your database under the Databases tab in the MarkLogic admin interface (also covered in the tutorial). Under Element Range Index, select Add.  For the scalar type select string, leave the namespace blank and enter PERSONA for localname.  This will create an index for ordering that can also be used to perform the character lookup.
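
If you would rather script that index than click through the admin interface, something like the following may work.  This is only a sketch: it assumes a server version that ships the Admin API library module, a user with admin rights, and a database named "Documents" (a hypothetical name, substitute your own).

```xquery
xquery version "1.0-ml";

(: Sketch only: assumes the Admin API library module is available and that
   the plays were loaded into a database named "Documents" (hypothetical). :)
import module namespace admin = "http://marklogic.com/xdmp/admin"
    at "/MarkLogic/admin.xqy";

let $config := admin:get-configuration()
let $db := xdmp:database("Documents")

(: string-typed range index on PERSONA: no namespace, the default collation,
   and range value positions turned off :)
let $index := admin:database-range-element-index(
    "string", "", "PERSONA", "http://marklogic.com/collation/", fn:false())

(: add the index to the in-memory configuration, then save it :)
let $config := admin:database-add-range-element-index($config, $db, $index)
return admin:save-configuration($config)
```

The settings mirror the manual steps: scalar type string, blank namespace, and PERSONA as the localname.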

*ALSO NOTE:  there are plenty of other ways to get a list of PERSONA values based on the user's input, such as:

//PERSONA[cts:contains(., "p*")] (: still uses MarkLogic search built-ins :)
//PERSONA[fn:starts-with(fn:lower-case(string(.)), "p")] (: standard XQuery :)

But like scriptaculous, MarkLogic's search built-ins do the work for you (and also do it much more efficiently) so let's just enjoy using them too!
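
If you do want to go the standard XQuery route, note that the one-liners above return PERSONA elements (repeats included), while cts:element-value-match() hands back distinct, ordered values.  A rough sketch of closing that gap with standard functions only, where $prefix is a hypothetical stand-in for the user's input:

```xquery
(: Sketch: a standard-XQuery take on the lookup, no range index needed.
   $prefix is a hypothetical stand-in for whatever the user typed. :)
let $prefix := "p"
for $name in fn:distinct-values(
    //PERSONA[fn:starts-with(fn:lower-case(fn:string(.)), fn:lower-case($prefix))]
)
order by $name
return <li>{$name}</li>
```

This version has to look at every PERSONA element rather than reading values out of an index, so expect it to be the slower option on a large set of plays.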

February 17, 2008

XML is 10!

As this post from Eliot Kimber reminded me, it was 10 years ago (!!) that XML was officially born with the publication of the recommendation on February 10, 1998.

Unlike Eliot, who was in the middle of the standards process, I was very much a user of XML in 1997-1998.  I was working at PC World Online and we had just started to really think about how to model the articles for a multi-channel delivery process.  Getting them from Quark to the website was hard enough, but with the start of online syndication there were requests for simple HTML, ASCII text and who knows what else.

As we sat around in the fall of 1997 trying to come up with a plan, the idea of tags that we could control and name emerged as a model that would let us get to almost any other format.  Pretty soon we were learning all about SGML and the soon to be created XML.

Things moved fast back then, and by February of 1998 we were already right in the middle of developing our newly designed XML publishing system.  It featured an Oracle storage system with XML in a BLOB and key fields as columns (called partial decomposition), Tcl script (!!) running on the first version of Vignette, and some very basic XML tools that looked a bit like XSLT, developed for us by Vignette.

Somehow we put the new system in place and ran our first issue on it in April of 1998 - that's 5 months from idea to production!  (If you want to know more about that project see Just One Question for Matt Turner and this paper I gave at XML 1998).

I think of this project as real proof that the principles of XML and its simplicity compared to SGML really did enable the technology to make that huge leap from a niche idea to mainstream content model.

For me the most exciting part of the story is just beginning.  As I often say, 'Oh how we wish we had XQuery back then', and it's true.  We were trying to program and transform XML and had to use so many layers of code (even Tcl) and a horrible data model.

XQuery lets you do all that same work in one application layer directly against the native XML content.  It's no wonder that XML and content applications are seeing a huge resurgence now that XML (born in 1998) has its match in XQuery (born in 2007).

Happy Anniversary XML!

Matt