May 28, 2008

Mark Logic User Conference 2008: Generate Some XQuery Buzz

The Mark Logic User Conference, the world's largest gathering of XQuery users, experts and fans, is only two weeks away!

The jam-packed agenda of sessions on groundbreaking and diverse XQuery applications is a feast for XQuery fans. You'll get to see real-world XQuery applications in action, from content syndication at Simon & Schuster to the Electronic Flight Bag at United Airlines to the Army's Knowledge Management Systems. There are also best-practice sessions and sessions on the latest XQuery tools from Mark Logic.

So sign up here and come on out to San Francisco to see how XQuery is generating some buzz across a wide range of industries and uses.

And really, why would we expect anything less? We've seen that XQuery (with some help from the easy-to-use MarkLogic Server) can get you up and querying XML in minutes, scrape the web with ease, create new XML, enrich your content and power AJAX. All from just plain old XML (which is the right model for content, of course).

And as for generating buzz . . . well, how about generating XQuery? Like all good languages, XQuery can be used in generative (or automatic) programming. Thankfully, with AJAX you don't need to generate JavaScript so much any more (phew!) . . . but this approach still has plenty of uses. For instance, I recently had to generate some XQuery for the very handy performance meters to run on some sample content. To generate the script, I used XQuery (of course):
<h:script xmlns:h="http://marklogic.com/xdmp/harness"> {
for $count in (1 to 100)  (: generate 100 test calls :)
let $selected := xdmp:random(2999) + 1 (: random position from 1 to 3000, one per document :)
let $uri := fn:base-uri(doc()[$selected]) (: get the uri of the selected document :)
return
   <h:test>
     <h:name>sample test</h:name>
     <h:comment-expected-result>transform {$uri}</h:comment-expected-result>
     <h:set-up/>
     <h:query>
     (: generate the XQuery to run for the test :)
      import module namespace tx="http://www.marklogic.com/test/transformtest" at  
         "transform-test.xqy"
    
         tx:transform("{$uri}")
     </h:query>
     <h:tear-down/>
   </h:test>
}</h:script>
In this simple XQuery I'm collecting some random URIs and creating the calls to the transform function (which is a simple transformation à la XQuery Transformers). You can see how handy it would be to use the values in the XML to write custom XQuery for just a test . . . or even as part of your XQuery content application.

With all that XQuery can do, it's no wonder it's good at generating buzz!

See you at the user conference,

Matt

November 28, 2007

XML lists on MarkMail and XQuery at XML 2007

Just in time for the upcoming XML conference, the MarkMail team (none other than Mark Logic's XQuery gurus Jason Hunter and Ryan Grimm) has added the xml-dev, xquery-talk and xsl-list mailing lists to the already impressive collection of developer mailing lists in MarkMail.

MarkMail is an XQuery-powered, next-generation email search application that not only lets you easily find a specific message in a big stack of email (over 4,000,000 messages) but also presents analytics to let you see the patterns and trends across topics.

So far I've been looking at the big picture with the XML lists, and it's interesting to see the activity around XQL in the late 90s when it was picking up steam . . .

[Chart: MarkMail list activity for XQL]
(click here to see it live)

the way Quilt (the real ancestor to XQuery) sort of picked up the ball at first but quickly faded . . .

[Chart: MarkMail list activity for Quilt]
(click here to see it live)

and how XQuery really took over as it became the standard (and got its own list so then things tapered off).

[Chart: MarkMail list activity for XQuery]
(click here to see it live)

I also like how Jonathan Robie is the number 1 poster for all of these searches - thank you Jonathan!!

But most of all I like that I was able to find all this great info with just a few searches and some drilling down into content.

MarkMail is a great example of a content application: its data is email modeled as XML and its application layer is XQuery, taking full advantage of the structure in the content and the powerful set of application-building features XQuery provides.

And since it's powered entirely from the XML content, you can read every post in the same interface. So for the Quilt search I was able to scan the messages and get a feel that most people were just mentioning Quilt in passing while actually talking about other things (like XQuery):

[Image: MarkMail message view from the Quilt search]
(to see this click any message in this result)

So whether you are looking for trends - such as the decline of DSSSL - or your favorite entry in a perma-thread, I think you will find MarkMail a valuable resource for all things on the XML lists.

And the timing couldn't be better. Next week at the XML conference Jason will be giving the closing keynote "You're Darn Right XML has a Future on the Web" and the power of applications like MarkMail certainly underscores just what XML and XQuery can do.

And I will also be speaking at the conference, subbing for Kelly Stirman in a session called "First Encounters with Open Office XML". I'll be looking at this format, then doing some live demos using XQuery to query it, take it apart and put it back together.

Hope to see you there, and enjoy the new XML-savvy MarkMail.

Matt

November 12, 2007

Code with the XQuery Experts . . . in San Carlos, CA NOT London

Sorry to all the folks over in the U.K. but it turns out the Code with the XQuery Experts event is NOT in London, but in Mark Logic's offices in San Carlos, CA.

New event details are:

Code with the XQuery Experts
Friday, November 30, 2007
8:30 am PT - 5:00 pm PT
Mark Logic Offices
San Carlos, CA

Registration is here

What better way to work off the soporific Thanksgiving turkey than some vigorous XQuery coding!  Plus the Best XQuery App contest lets you bring your XQuery chops and maybe win an iPhone.

For those of you in and around London, Mark Logic is hosting a cocktail party Wednesday November 5th during London Online. Feel free to come by. You can hang out with some other Mark Logic XQuery gurus and get a free drink to lessen your disappointment. Registration for this event is here.

Sorry for the confusion - it is, however, nice to have so much XQuery activity to be confused about.

Matt

November 05, 2007

Agile Publishing and XQuery

This week, Mark Logic is sponsoring a breakfast seminar on agile publishing. 

The speakers are great.  I've seen Howard Ratner, the CTO of Nature Publishing, speak a couple of times and he is entertaining and informative.  I've also met with David Worlock who is both an amazing font of knowledge and an engaging speaker.

And the topic is spot on:  Looking at product development through the agile lens yields many new possibilities for how you can both build and deliver new content products.

For me, it has always been about ways to speed up and simplify the product development process.  As the Technical Director at PC World Online I was on the receiving end of hundreds of small, 'can't you just' requests.  I hated saying no - these were the good ideas that made us innovate and if I said 'this isn't in the schedule' or 'maybe next year' then we'd get nowhere.  And in 1997 at the start of web publishing we had a LOT of ground to cover!

So we did a lot of small projects, we launched things in days rather than months and we made the most of our tools, pushing XML into databases the best we could and using Tcl (!!) and Vignette as a basic framework for rapidly developing many simultaneous projects. It was certainly agile (if with a little 'a') and that term applied not just to the tech team, but to everyone working together to create new products.

These same trends are still around and are maybe even more important as publishers today need to invent, create new products and break the mold of the traditional products (which we were very busy inventing 10 years ago!).

And the tech teams are still getting those 'can't you just' questions . . . but now you can use XQuery instead of all those clunky database/webcms/app layer toolsets. XQuery is *the* native programming language for XML and XML is *the* model for content. So with XQuery you don't spend a lot of time translating your content between tables, objects and outputs. Instead, you just get right to work on those 'can't you just' questions.

And when you use an XQuery engine like MarkLogic Server, things go even faster because you can load any content without up-front configuration and can perform any query on any part of the XML. This turbo-charges the process by letting you get hold of some content and right away start writing your application (like we did in the first tutorial).

I often say that we wish we had XQuery back then.  Well, you do have XQuery now and what a difference it makes!

So come on out this week and explore the world of agile content products - here are the details:

The Agile Publishing Imperative:
Accelerate the Creation of Information Products

Thursday, November 8
8:00 am - 11:00 am
Four Seasons Hotel
Cost: Complimentary

Registration

Hope to see you there,

Matt

 

October 24, 2007

XQuery: The Search Language For A Multi-Platform Future

I keep a couple of google alerts for all things XQuery and I was very pleased to see this headline pop up a couple of days ago:

XQuery: The Search Language For A Multi-Platform Future
  The advent of wireless internet access has made web design a very complicated matter. Previously, all web browsers were created equal. HTML was the only language used to create web sites, and it was only possible to go online with a ...

Wow! Along with the XML.com article XQuery, the Server Language, here is someone who really gets the power of XQuery as an application language . . . and for search too!

And it's true, XQuery really is the search language for a multi-platform future.  Where XML is the powerful model for content from meta-data to books, XQuery is the application language that unlocks the potential of this content to build content applications to deliver content to multiple formats.

With MarkLogic's search extensions (which anticipate the additions to the standard) XQuery also becomes a search platform to power applications. Unlike a search engine, which can only point to the content, XQuery can search, manipulate and render the content all in one system.

This lets you build full applications on one platform. For instance, as Jim Pretin (the author of the article) suggests, a dating site.

If the data for the dating site was modeled in XML that looked something like this:

<singles>
    <person>
        <name>Peter</name>
        <sex>male</sex>
        <age>32</age>
        <interests>golf, skiing, camping, gazing at the stars</interests>
    </person>
    <person>
        <name>Jane</name>
        <sex>female</sex>
        <age>27</age>
        <interests>skiing, swimming, horseback riding</interests>
     </person>
    <person>
        <name>Fred</name>
        <sex>male</sex>
        <age>32</age>
        <interests>golf, football, car racing</interests>
    </person>
</singles>

XQuery would let you perform the basic operations of searching for a date by matching conditions (a man over 30) and also let you do full text search against the content - say in the <interests> element where content is entered as free text:

for $person in input()/singles/person[./sex eq "male"][./age > 30][cts:contains(./interests, "gazing")]
return
    <date>{$person/name}</date>

*cts:contains() is a MarkLogic Server search built-in that lets you do full text search instead of the plain substring matching of contains() in the current XQuery spec.

With this single query we get a potential date who is male, over 30 and mentioned 'gazing' in his interests. We can then output this in any format - maybe an SMS or a post to Facebook or some simple XML:

<date>
    <name>Peter</name>
</date>
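
If you don't have MarkLogic Server handy, here is a rough sketch of the same match in plain XQuery 1.0. Treat it as an approximation only: fn:contains() does simple substring matching rather than real full text search, and the inline $singles variable is my stand-in for the stored document (both are assumptions, not part of the original example).

```xquery
(: Assumption: inline sample data stands in for the stored document :)
let $singles :=
  <singles>
    <person>
      <name>Peter</name><sex>male</sex><age>32</age>
      <interests>golf, skiing, camping, gazing at the stars</interests>
    </person>
    <person>
      <name>Fred</name><sex>male</sex><age>32</age>
      <interests>golf, football, car racing</interests>
    </person>
  </singles>
(: Same three conditions as above: male, over 30, mentions "gazing" :)
for $person in $singles/person[sex eq "male"]
                              [xs:integer(age) gt 30]
                              [fn:contains(interests, "gazing")]
return
  <date>{$person/name}</date>
```

The predicates are chained exactly as in the MarkLogic version; only the last one changes, which is a nice illustration of how small the gap is between plain query and search.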

We can do this all in one language  - yup XQuery is powerful stuff and certainly, as the article points out, the right tool for a multi-platform future.

But there is just one little problem with the article . . . it keeps appearing over and over and over. 

At first I thought it was a mistake - maybe I was rereading the same google alert twice?  But then it just kept coming, sometimes two and three times in a single alert, sometimes with different titles (XQuery the Search Language of Tomorrow), but always there - a constant companion in my google alert.

So it looks like Jim Pretin (who runs a service called forms4free that will "GUARANTEE you a working form") is actually much more interested in spotting trendy keywords and spamming the world with content than actually promoting XQuery.
[Image: Google search results for the article]
And he's quite good - check out this link from page FIVE (!) of a google search for the article.

But, all in all, this makes me pretty happy.  XQuery is now a buzzword worth spending who knows how much time replicating content around the internet to get search hits!

Just another milestone for XQuery:

    Standard (check!)
    Powerful search and query across loads of content (check!)
    Powers innovative content applications (check!)
    Internet buzzword (check!)

Yes, we have truly arrived!

July 04, 2007

Celebrate (XML) Independence

A couple of weeks ago Kurt Cagle posted XQuery, The Server Language on XML.com. I like this article a whole lot because it:

  • Explores how XQuery is more than a query language
  • Shows how XQuery is actually THE server-side scripting language to create HTML
  • and contains this very nice example of how things used to be:
$buf ="<html><head><title>".$myTitle;
$buf .= "</title><body>";
$buf .= "<h1>This is a test.</h1>";
$buf .= "<p>If this were an actual emergency, we'd be out of here by now.";
echo $buf;

Yup - back before XQuery, in the days when your dad had to ride his bike uphill both ways to school and people didn't even have answering machines (you had to just call back later) . . . this is how you had to make HTML.

Creating strings to represent elements is fraught with danger - while the above works, it's hardly valid. But it was the only simple choice (the other options, as Kurt points out, were Java pipelines with multiple moving parts).

Aren't we glad we have XQuery?

This sort of thing is now done in an entirely XML-centric environment where you just create elements instead of strings you hope will work out:

let $mytitle := "title from input"
return
<html><head><title>{$mytitle}</title></head>
<body>
<h1>This is a test.</h1>
<p>If this were an actual emergency, we'd be out of here by now.</p>
</body></html>

This is a very liberating moment. Here is a scripting language, built to create XML - the language of the web. There is no impedance mismatch between text, objects, and elements . . . it's all about the tags.
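
One more reason building elements beats building strings: text dropped into an element is escaped for you when the result is serialized. Here is a minimal sketch, assuming a made-up title (my own, for illustration) that contains an ampersand and a less-than sign:

```xquery
(: Hypothetical title; &amp; and &lt; are entity references inside the
   string literal, so the value is really: Q&A: is 1 < 2? :)
let $mytitle := "Q&amp;A: is 1 &lt; 2?"
return
<html><head><title>{$mytitle}</title></head>
<body>
<h1>{$mytitle}</h1>
</body></html>
```

With string concatenation those two characters would land in the page raw and break the markup. Here the title is just a text node, so the serializer writes it back out as escaped, well-formed content.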

So celebrate the XML independence by getting a good XQuery engine like MarkLogic Server and bring a little XQuery into your content applications. 

But be careful, it's pretty addictive . . . luckily, these days you can just let the answering machine pick up.

June 19, 2007

XML and XQuery as the Model

A couple of weeks ago at the Mark Logic user conference (where there were many great presentations including a super Tim O'Reilly keynote) Jason Hunter floated the idea that we may be entering into a paradigm where XML tools, like MarkLogic Server, are used as the basis for applications even if the data source isn't XML.

The idea is that the flexibility and functionality we've seen with XML content applications like Congressional Quarterly's Legislative Impact and Harvard Business School Publishing's content logic is fundamentally different and can be applied outside of traditional XML content environments.

Jason's example had to do with email archives.  There have been many attempts to work with email archives and most have used databases where the email is broken up into fields.  However email, like many data sources, is full of anomalies.  So even though it looks simple, the resulting database schema is often very complex and inadequate (especially if you start to deal with representing threads).

XML provides a much better model where complexity and variations are actually expected and, it turns out, it's fairly easy to turn email into XML.

However, the key is what to do with the XML. I think you'd be hard pressed to say let's turn email into XML and then use a database or filesystem to store it, SQL to extract it and XSLT to transform it. So no wonder people put up with complicated database schemas, since from there the application was at least a standard database -> app server affair.

But this all changes with XQuery and, in particular, MarkLogic Server which can process XQuery at search engine speed and has added a few helpful extensions.

With emails in XML that look like this

<email>
<author>matt</author>
<subject>test email</subject>
<!-- some more headers, etc. -->
<body>Example email body</body>
</email>

using MarkLogic Server it's super easy to let people search the contents of emails . . . and restrict the search by a certain author or other header info:

for $email in cts:search(/email,
    cts:and-query((
        cts:element-query(xs:QName("author"), "matt"),
        cts:element-query(xs:QName("body"), "email"))))
return
    <div>
        <b>Email from:</b> {$email/author}<br/>
        {$email/body}
    </div>

The element-query search built-in lets you target your search against the XML elements and fine-tune it. And displaying the result is a snap: we just output the HTML we want.

With a traditional database approach this isn't even possible: you either do less complicated SQL queries or you need a search engine to index the database and application code to call the search engine and then get the content out of the database.  It's certainly not 5 or 6 lines of code.

But let's make something really useful: what if we don't know the author's name? What if we started with just the keyword search but wanted to give the users some options to drill into the result?

Some of the very cool new features in MarkLogic Server 3.2 make this a snap. 

The first step is to make use of the element values built-in:

cts:element-values(xs:QName("author"))

With no arguments beyond the element name, this gives us all of the unique values of the author element. This is hugely useful - especially if you are dealing with a bunch of content that was organically created or has a semi-controlled vocabulary (like an email archive). It lets you see all the actual values in the content. You can use it to correct or normalize the content or even make a lookup list for users even though there is no 'lookup table', just the actual values in the content.

Super cool and super flexible:  if you encounter a new header value - or decide to process the content to create some new lookup values (like first and last name) you can apply the same functionality, just like that.

But, like any good ginsu knife commercial . . . there's more: element-values also takes a search query and, new with 3.2, provides the frequency within that search:

<div><b><u>Authors in your search</u></b><br/>{
for $author in cts:element-values(xs:QName("author"),"",(), cts:element-query(xs:QName("body"), "email") )
return
    <a href="refinesearch.xqy">{$author} ({cts:frequency($author)} emails) <br/></a>
}</div>

This code asks for the values of the author field, but restricts the results to the authors of emails that matched our keyword search of the body.  Then it creates a neat little widget to refine the search by the authors that looks like this:

Authors in your search
matt (11 emails)
brian (8 emails)
peter (4 emails)

Now we're really cooking. All the user does is enter a keyword search and the application presents them with really advanced features to refine it by author. This can be applied to any header value like recipient, or even to dates and time ranges.
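
To make the idea concrete outside of MarkLogic Server, here is a sketch of the same widget in plain XQuery 1.0, counting matches per distinct author by hand. The inline $emails archive is made-up sample data and fn:contains() stands in for the keyword search (both assumptions on my part). It also walks every email, so it has none of the index-backed speed of cts:element-values and cts:frequency.

```xquery
(: Assumption: a tiny inline archive instead of a real email database :)
let $emails :=
  <archive>
    <email><author>matt</author><body>test email one</body></email>
    <email><author>brian</author><body>another email</body></email>
    <email><author>matt</author><body>a second email</body></email>
    <email><author>peter</author><body>no match here</body></email>
  </archive>
(: Substring match stands in for the keyword search of the body :)
let $hits := $emails/email[fn:contains(body, "email")]
return
  <div><b><u>Authors in your search</u></b><br/>{
    for $author in fn:distinct-values($hits/author)
    (: count how many matching emails each author wrote :)
    let $count := fn:count($hits[author eq $author])
    order by $count descending
    return
      <a href="refinesearch.xqy">{$author} ({$count} emails) <br/></a>
  }</div>
```

The shape is the same as the version above: get the distinct values within the search, attach a count to each, and render the links. The built-ins just get you there from the indexes instead of from a scan.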

In another one of the user conference talks, Alan Darnell from the University of Toronto talked about building a Digital Library with MarkLogic Server. He focused on this kind of user interaction as being an ideal complement to the one-box search habits fostered by Google. While Google can't really augment your search, a library or a content application can in fact be built to guide you since it is built with a specific purpose. For Alan, this was an opportunity to reverse a trend: instead of replacing the librarian, make a search application that actually includes a virtual librarian, waiting and ready to help the user find what they need.

Starting with XML as the model, search and discovery applications are starting to go places we haven't yet considered.  If email can benefit from the XML and XQuery model, what other data sets are out there waiting to be tapped?

So have a look in your files, peek into the rigid databases holding content and give those old data marts a good shake:  a new life as XML is waiting to unlock the hidden value in that content.



 

May 14, 2007

Teaching XQuery

This morning I'm in San Francisco teaching a class on XQuery before the Mark Logic User Conference swings into session tomorrow.

As part of the class we'll be doing a lab using the Shakespeare XML and (the real reason for this post) the sample content is right here:

Download bill.zip

We're going to try, in about an hour, to download, install and configure MarkLogic Server and then load this content and do some queries with CQ.  Very much like the first part of the tutorial.

Yup - that's a live session with ~30 people, all of whom will be executing XQuery after about an hour. XQuery makes this possible - it really just works.

Wish us luck!

Matt

May 10, 2007

Mark Logic User Conference Next Week plus MarkLogic Server 3.2 Released!

Just a quick reminder about the Mark Logic user conference next week in San Francisco.

As previously reported, there will be no naked people, but there will be lots of XQuery enthusiasts and talks on some of the most innovative information products from Congressional Quarterly, Harvard Business School Publishing, McGraw-Hill Education and more.

Tim O'Reilly will be giving a keynote on Wednesday and Mark Logic CEO Dave Kellogg will kick things off on Tuesday.

It's not too late to sign up and it's still FREE!

To make things even better, on the eve of this great event Mark Logic has released MarkLogic Server 3.2!

As Mark Logic Products VP Ian Small says, this release is a real diamond that really puts the power of XQuery in your hands.

You can work with more types of content with extensive language support (down to the node level - cool!), new conversions including Office 2007 (yup - now you can actually do something with Office XML) and more encodings (so you can now scrape non-utf-8 web pages).

It also adds powerful search capabilities for efficient complex searches, makes the already blindingly fast performance even faster and adds content analytics to power user navigation and information displays. Plus there are lots of goodies for XQuery developers including debugger support.

I'll be posting from the user conference so more on all of this to come.

Hope to see you in San Francisco!

Matt

P.S. maybe next year we'll hire Spencer Tunick

May 07, 2007

Go Native! CMS(s), XML and XQuery

This Thursday, May 10th 2007, I'll join Lisa Bos from Really Strategies to talk about the benefits of a native XML CMS and have a look at RSuite, an XML Content Management System powered by XQuery and MarkLogic Server.

This is a really interesting topic:  at the 2006 London Online show 51 of 60 software vendors selected the 'content management' category and of these 31 were offering a Content Management System.  In that crowded field, things like automatic accessibility, ease of use and eCommerce built-ins stood out.  But full support for XML and the powerful sub-document access and control it brings was almost totally absent.

For people who work with content, investing in XML is key to their business.  But as far as CMS systems are concerned, they often have to shoehorn, change or otherwise mangle their XML to work with them.  A couple of weeks ago, Lisa discussed this on the Really Strategies blog.  Dave Kellogg also has some thoughts on this.

So join Lisa and me on Thursday to hear more about XML and content management.

Sign up here and hope to see you there.

MT