An Opinion
http://anopinion.net/blog
RSS Generated by Dottext 0.94

Mike Deem
Overhead
http://anopinion.net/posts/285.aspx
Mon, 24 Nov 2003 08:58:00 GMT
<p>In response to my <a href="/posts/282.html">WinFS API Lean and Mean post</a>, <span><a id="_ctl0_Header1_HeaderTitle" href="http://dotnetjunkies.com/weblog/kenbrubaker/">Ken Brubaker</a> <a href="http://dotnetjunkies.com/weblog/kenbrubaker/posts/3917.aspx">points out</a> that for ObjectSpaces, “<a href="http://blogs.gotdotnet.com/aconrad/">Andrew Conrad</a> has <a href="http://blogs.gotdotnet.com/aconrad/PermaLink.aspx/5d3bf937-16a2-4c3c-a59d-13f1e583de0b">lead us to believe</a> that the 30% performance loss is due to object creation.” The full quote from Andrew's posting is:</span></p> <blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p>In general, this performance difference comes from the materializing of the objects and some overhead from the mapping layer.<span style="mso-spacerun: yes">  </span>Note – if one was to materialize their own objects over the SqlDataReader, they probably won’t see a significant difference between ObjectSpaces and their custom solution.</p></blockquote> <p>Some of the overhead is mapping. Some is object creation. The WinFS API reduces the object creation overhead to a minimum by taking advantage of the fact that WinFS objects are derived from our base class. </p> <p>Andrew goes on to say:</p> <blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p>So unless one does a join on the server and then normalizes the results them self, it will be hard to beat the ObjectSpaces’ performance. As the hierarchy becomes deeper and more complex, this performance gain will be more noticeable.</p></blockquote> <p>In the end, objects have costs and benefits. 
I assert the benefits outweigh the costs. Achieving good performance is a priority zero requirement for us and we'll probably still be tweaking things as they pry the bits from our carpal-tunneled hands for pressing onto CDs (DVDs?).</p>

Mike Deem
WinFS API Lean and Mean
http://anopinion.net/posts/282.aspx
Sun, 23 Nov 2003 10:10:00 GMT
<a href="http://adoguy.com/">Shawn Wildermuth</a> is right, “<a href="http://adoguy.com/content.aspx?id=rantview&amp;rantid=56">WinFS has to be quick...not just fast...lightning fast</a>.” As previously stated, <a href="/posts/273.html">we are using OPath</a>, but if you read carefully I don't think anyone has said that we are actually using the full ObjectSpaces stack from top to bottom. Given the prescriptive nature of the WinFS data and storage model, we don't need a flexible mapping layer, and it has essentially been optimized out of the WinFS API stack.

Mike Deem
Properties vs. Methods Take Two
http://anopinion.net/posts/277.aspx
Fri, 21 Nov 2003 17:21:00 GMT
In a private e-mail, <a href="http://blogs.gotdotnet.com/BradA/">Brad Abrams</a> asked me to clarify the <a href="/posts/273.html">property vs. method issue</a> in the WinFS API. I put together this hopefully simplified explanation. What follows is essentially a conceptual description; I’m glossing over a lot of details, and the attributes shown aren’t exactly the ones we use. 
<p>In WinFS we have an item table and a relationship table. The item table has an ItemId column. The relationship table has SourceItemId and TargetItemId columns. At the SQL level, when querying for related items you join through the relationship table. </p> <p>When mapping items and relationships to objects we end up with something like the classes shown below. This lets us use properties in the classes to represent the joins. </p> <blockquote> <p><font face="Courier New">public class Relationship <br /></font><font face="Courier New">{ <br /></font><font face="Courier New">  public ItemId SourceItemId { get; } <br /></font><font face="Courier New">  public ItemId TargetItemId { get; } <br /><br /></font><font face="Courier New">  [Join("SourceItemId", "Item.ItemId")] <br /></font><font face="Courier New">  public Item Source { get; } <br /><br /></font><font face="Courier New">  [Join("TargetItemId", "Item.ItemId")] <br /></font><font face="Courier New">  public Item Target { get; } <br /></font><font face="Courier New">} </font></p> <p><font face="Courier New">public class Item <br /></font><font face="Courier New">{ <br /><br /></font><font face="Courier New">  public ItemId ItemId { get; } <br /><br /></font><font face="Courier New">  public string DisplayName { get; set; } <br /><br /></font><font face="Courier New">  [Join("ItemId", "Relationship.SourceItemId")] <br /></font><font face="Courier New">  public VirtualRelationshipCollection OutgoingRelationships { get; } <br /><br /></font><font face="Courier New">  [Join("ItemId", "Relationship.TargetItemId")] <br /></font><font face="Courier New">  public VirtualRelationshipCollection IncomingRelationships { get; } <br /></font><font face="Courier New"><br /></font><font face="Courier New">}           </font></p></blockquote> <p>These properties show up in two places: in applications, which use them to “navigate” from object to object in memory, and in queries that traverse the joins between the underlying tables. 
Let’s look at the query case. If I want to find all items targeted by a relationship from any item with the display name “foo”, I can use the following code: </p> <blockquote> <p><font face="Courier New">foreach( Item item in Item.FindAll( "IncomingRelationships.Source.DisplayName='foo'" ) ) ...; </font></p></blockquote> <p>What this does is cause us to generate a query against the item table that joins to the relationship table using Item.ItemId = Relationship.TargetItemId and then back to the Item table using Relationship.SourceItemId = Item.ItemId. </p> <p>Now, it is very common for an item to be related to many other items. Hence we use our VirtualRelationshipCollection type to represent the relationships in the Item class. This type doesn’t load all the relationship objects when the item object is loaded. They are not even loaded when the application accesses the property, as the collection can be used for Add and Remove operations without loading all relationships into memory. The relationships are loaded only if the application tries to enumerate over the collection. </p> <p>Similarly, there are scenarios where an application may want to work with a relationship object without actually loading the source and/or target item object into memory. The source/target object is loaded only when the Source/Target property is first accessed. </p> <p>We could use GetSource and GetTarget methods instead of properties in the Relationship class, but what should we then do in OPath? One of the goals of OPath is to allow queries to exactly mirror the object models. If we use GetSource in the object, the OPath should become “IncomingRelationships.GetSource().DisplayName”. The problem is, in general, we cannot let you use the methods on a class in OPath. </p> <p>The principle we are trying to follow in the WinFS API is: if you see a property in the object browser, you can use that property in your OPath but you can’t use any methods in OPath. 
We have gone out of our way to make things you can’t use in queries accessible only by calling methods. </p> <p>So we have conflicting requirements: using properties/methods to convey to the user what can and cannot be accessed efficiently and with minimal chance of failure vs. using properties/methods to convey to the user what they can and cannot use in OPath. Given that the former is the established rule, it should take precedence. However, that leaves us with the problem of trying to identify what can be used in OPath. </p> <p>Maybe the current situation isn’t so bad. The Source and Target properties in relationship classes are the only ones that are loaded on first access. There are no other properties in all of the WinFS API like this. Similarly, the number of places we used methods instead of properties because a value cannot be used in OPath is also limited.</p>
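The two loading behaviors described in this post (a Source/Target property that loads on first access, and a relationship collection that loads only when enumerated) can be sketched in a few lines. This is only an illustration under stated assumptions: `Item`, `Relationship`, `VirtualCollection<T>`, and the loader delegates are hypothetical stand-ins written for this example, not the WinFS types, and the delegates merely simulate a round trip to the store.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-ins for illustration; these are NOT the real WinFS classes.
public class Item
{
    public Guid ItemId { get; }
    public string DisplayName { get; set; }
    public Item(Guid id, string name) { ItemId = id; DisplayName = name; }
}

public class Relationship
{
    private readonly Func<Guid, Item> loadItem; // simulates a round trip to the store
    private Item source;                        // stays null until first access

    public Guid SourceItemId { get; }

    public Relationship(Guid sourceItemId, Func<Guid, Item> loadItem)
    {
        SourceItemId = sourceItemId;
        this.loadItem = loadItem;
    }

    // Loaded on first access, then cached for later reads.
    public Item Source
    {
        get
        {
            if (source == null) source = loadItem(SourceItemId);
            return source;
        }
    }
}

// A collection that accepts Add without loading and loads only when enumerated.
public class VirtualCollection<T> : IEnumerable<T>
{
    private readonly Func<IEnumerable<T>> loadAll; // simulates the store query
    private readonly List<T> pendingAdds = new List<T>();

    public VirtualCollection(Func<IEnumerable<T>> loadAll) { this.loadAll = loadAll; }

    public void Add(T value) { pendingAdds.Add(value); } // no load required

    public IEnumerator<T> GetEnumerator()
    {
        foreach (T stored in loadAll()) yield return stored;
        foreach (T added in pendingAdds) yield return added;
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class LazyLoadDemo
{
    public static void Main()
    {
        int roundTrips = 0;
        var foo = new Item(Guid.NewGuid(), "foo");
        var rel = new Relationship(foo.ItemId, id => { roundTrips++; return foo; });

        Console.WriteLine(roundTrips);             // 0: nothing loaded yet
        Console.WriteLine(rel.Source.DisplayName); // foo (first access loads)
        Console.WriteLine(rel.Source.DisplayName); // foo (served from the cache)
        Console.WriteLine(roundTrips);             // 1

        int loads = 0;
        var rels = new VirtualCollection<string>(() => { loads++; return new[] { "stored" }; });
        rels.Add("added");                         // does not touch the store
        Console.WriteLine(loads);                  // 0
        foreach (string s in rels) Console.WriteLine(s); // stored, then added
        Console.WriteLine(loads);                  // 1
    }
}
```

The point of the sketch is the trade-off discussed above: a caller reading `rel.Source` cannot tell from the syntax that the first read may be slow or may fail, which is exactly what a `GetSource()` method would normally signal.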

Mike Deem
WinFS API, OPath, and Properties
http://anopinion.net/posts/273.aspx
Thu, 20 Nov 2003 09:33:00 GMT
<p>The recent posts about <a href="http://dotnetjunkies.com/weblog/kenbrubaker/posts/3712.aspx">WinFS using OPath</a> (which is correct) and about the use of <a href="http://blogs.gotdotnet.com/BradA/permalink.aspx/96f582da-42e0-4a4d-9e53-e57a5b0c6839">properties vs. methods</a> in an API juxtapose in an interesting way in the WinFS API. Basically OPath lets me write a boolean expression using the properties exposed by a class. OPath doesn't let you use methods in the expression because OPath is actually converted to a SQL query and executed in the database.</p> <p>In the WinFS API we have a Relationship class. The Relationship type in WinFS has two properties: source item id and target item id. When writing an OPath expression, we want you to be able to navigate from an item through relationships to the source or target item. To make this easy, we want to expose properties that represent the source and target item objects themselves. For example, to find all the people who live in the New York metropolitan region, you would write code like the following:</p> <blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p><font face="Courier New">Person.FindAll("IncomingRelationships.Cast(System.Storage.Contacts.HouseholdMember).Household.OutgoingRelationships.Cast(System.Storage.Core.ContactLocations).Location.MetropolitanRegion = 'New York'" );</font></p></blockquote> <p>Follow along in the diagram below. Start with the Person items. <em>IncomingRelationships.Cast(...HouseholdMember)</em> gets you to the HouseholdMembership relationships that target all Person items. 
<em>Household</em> is a property that gets you to the Household item that is the source of the relationship. <em>OutgoingRelationships.Cast(...ContactLocations)</em> gets you to ContactLocation relationships. <em>Location</em> gets you to Location items. The Location items have a MetropolitanRegion property which is compared to the string 'New York'.</p> <p><img src="/other/relationship.gif" /></p> <p>Now, here is the dilemma: we added Source and Target properties to Relationship to make writing a query easy, but actually providing a value for these properties may require a round trip to the store (it can take a while and fail in unexpected ways). In the query this is fine (we actually ask the store to do a join based on the item ids), but in application code each access to such a property may trigger that round trip. For example:</p> <blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p><font face="Courier New">// householdMembers: any collection of HouseholdMember relationships<br />foreach( HouseholdMember hm in householdMembers )</font><font face="Courier New"><br />{<br />  Person p = hm.Person; // could take some time<br />  Household h = hm.Household; // could take some time<br />}</font></p></blockquote> <p>It is the property vs. method debate, but with the twist that we want some things to be properties so they can be used in a query. </p>
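To make the "converted to a SQL query" step more concrete, here is a toy sketch of how a dotted property path could be walked to collect the joins it implies. Everything in it is an assumption made for illustration: the `PathToJoins` class, the `Navigation` table, and the column names are hypothetical, and this is not how OPath is actually implemented.

```csharp
using System;
using System.Collections.Generic;

public static class PathToJoins
{
    // One navigation step: the join it implies and the type it lands on.
    private sealed class Step
    {
        public readonly string JoinCondition;
        public readonly string ResultType;
        public Step(string joinCondition, string resultType)
        {
            JoinCondition = joinCondition;
            ResultType = resultType;
        }
    }

    // Hypothetical mapping from "Type.Property" to the join it represents.
    private static readonly Dictionary<string, Step> Navigation = new Dictionary<string, Step>
    {
        { "Item.OutgoingRelationships", new Step("Item.ItemId = Relationship.SourceItemId", "Relationship") },
        { "Item.IncomingRelationships", new Step("Item.ItemId = Relationship.TargetItemId", "Relationship") },
        { "Relationship.Source", new Step("Relationship.SourceItemId = Item.ItemId", "Item") },
        { "Relationship.Target", new Step("Relationship.TargetItemId = Item.ItemId", "Item") },
    };

    // Returns the join conditions implied by a dotted path starting at rootType.
    // The last segment is treated as a scalar property and produces no join.
    public static List<string> Translate(string rootType, string path)
    {
        var joins = new List<string>();
        string current = rootType;
        string[] segments = path.Split('.');
        for (int i = 0; i < segments.Length - 1; i++)
        {
            string key = current + "." + segments[i];
            Step step;
            if (!Navigation.TryGetValue(key, out step))
                throw new ArgumentException("Not a navigable property: " + key);
            joins.Add(step.JoinCondition);
            current = step.ResultType;
        }
        return joins;
    }

    public static void Main()
    {
        foreach (string join in Translate("Item", "IncomingRelationships.Source.DisplayName"))
            Console.WriteLine(join);
        // Item.ItemId = Relationship.TargetItemId
        // Relationship.SourceItemId = Item.ItemId
    }
}
```

In this sketch a segment such as `GetSource()` has no entry in the mapping table, so `Translate` throws. That mirrors the constraint above: a property can be described as a join the store understands, while an arbitrary method cannot.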

Mike Deem
Warm Fuzzy
http://anopinion.net/posts/267.aspx
Wed, 19 Nov 2003 23:11:00 GMT
<p>People have been asking me for days if I have read <a href="http://www.ozzie.net/blog/stories/2003/11/14/640kbOughtToBeEnoughForAnyone.html">Ray Ozzie's post on WinFS</a>. I finally took the time to read it slowly and enjoy it. Makes me feel good about what I'm working so hard to accomplish.</p> <p>But you know what... posts like <a href="http://www.grack.com/news/WinFSAvalonLonghornblam.html">Matthew Mastracci's</a> are the really valuable ones. <a href="/posts/164.html">Challenging our assumptions</a>. There are some interesting comments in <a href="http://longhornblogs.com/scobleizer/posts/1368.aspx">Robert Scoble's response</a> as well.</p> <p>A few specific points:</p> <p>1) <strong>“...I can't see how calling the WinFS APIs is any different than calling the vCard/EXIF/ID3 APIs directly”</strong>: Each of those APIs (if you can call a file format an API) is different. The WinFS API provides a common programming model for working with all kinds of meta-data. This programming model fits in with the rest of WinFX, creating yet another kind of network effect resulting in increased programmer productivity. Also note that WinFS doesn't devalue these formats; it just provides a common way to program and search them.</p> <p>2) <strong>“How does the record executive get his contact for 'Marvin Gaye' to link to a list of albums and lyrics, while ensuring that my friend Marvin Gaye's address isn't associated for any of the same things.”</strong>: Yep, this is one of our <a href="/posts/164.html">hard questions</a>. 
What fault is there in WinFS for trying to solve this problem or even simply providing a platform that can be used by an ISV to solve the problem?</p> <p>3) <strong>“The existance of an open data format doesn't mean that your favorite (or mission-critical) application stores its data in the open!”</strong>: But providing a simple standard store that is integrated with the shell and many Windows applications will provide a reason for such mission-critical applications to use WinFS for “offline” scenarios. WinFS security will make this data at least as secure as saving it in an XML document, an Excel spreadsheet or an Access database, which is what people do with this data today when they go on a business trip.</p> <p>4) <strong>“What I see in WinFS is the Windows-centric design philosophy”</strong>: Yes. So? We are investing a lot in making Windows the best platform out there. If it ends up being better than Linux... sorry. But I don't buy into the second part of the statement: <strong>“<em>everyone is using Windows, so we can just assume that all of their data will be somewhere in the Windows domain!”</em></strong> The WinFS sync infrastructure will allow data to be moved into and out of WinFS as needed. The back end data sources don't have to be Windows. The work we are doing to support XML data in WinFS is also an indication that we don't believe all data is somewhere in the Windows domain.</p>

Mike Deem
WinFS Schema Language
http://anopinion.net/posts/266.aspx
Wed, 19 Nov 2003 22:41:00 GMT
<p>In his <a href="http://www.eweek.com/article2/0,4149,1388597,00.asp">recent interview</a> with Jonathan Schwartz, Steve Gillmor makes the statement that “XSD [is] being baked into Office, but is being deprecated in favor of a new schema structure for WinFS.” This reflects some statements by <a href="http://weblog.infoworld.com/udell/2003/10/31.html#a836">Jon Udell</a>. <a href="http://www.25hoursaday.com/weblog/PermaLink.aspx?guid=d0b398f5-9343-4f91-8e1c-340c5a6669e2">Dare</a> had what I thought was a good response, and <a href="http://blogs.gotdotnet.com/johnmont/">John Montgomery</a> posted <a href="http://blogs.gotdotnet.com/johnmont/permalink.aspx/b7823239-aa48-457a-9915-b76b6fb41eac">my opinion</a> on this before I had my blog set up.</p> <p>I'll say it as plainly as I can: the choice to not use XSD to describe WinFS types in no way deprecates the use of XSD to describe XML documents or data.</p> <p>For a number of really good reasons we decided early on that the first-class things WinFS would store would be items, relationships, and extensions, not XML elements and attributes. As such, we needed a language to describe our items, relationships and extensions. Trying to do this with XSD was sort of like trying to <a href="http://www.comp.leeds.ac.uk/royce/papers/icmc2002_neagle.pdf">describe dance movement</a> using the <a href="http://www.americanscientist.org/template/BookReviewTypeDetail/assetid/14361;jsessionid=aaa8FODkQqbUyL">language of physics</a>. </p> <p>Sure, we could have used some sort of XSD + Annotations - Features We Can't Support. 
But this is also a <a href="http://www.gotdotnet.com/team/dbox/default.aspx?key=2003-11-04T09:28:58Z">bad idea</a> and would result in other kinds of criticism. It is interesting that nobody is saying we should have used a modified SQL grammar (CREATE ITEMTYPE Foo ...).</p> <p>I want to make one final point: even though the <strong>first class</strong> things stored in WinFS are not elements and attributes, we are working on a really good XML storage story for WinFS. With the emphasis of Office on XML, it would just be plain stupid for us to do anything else. Oh, and I also think it is what our customers will want.</p>

Mike Deem
WinFS and Interfaces
http://anopinion.net/posts/264.aspx
Wed, 19 Nov 2003 21:52:00 GMT
<p><a id="_ctl0_pageBody-1_CommentList__ctl9_NameLink" href="http://www.tallent.us/" target="_blank">Richard Tallent</a> asks <a href="In addition to a type hierarchy, will there be inherent support for Interfaces instead of classes?">in addition to a type hierarchy, will there be inherent support for Interfaces instead of classes</a>? This is a really interesting question and answering it gives me an opportunity to talk about the third major type hierarchy in WinFS: extensions (the other two are <a href="/posts/241.html">items</a> and <a href="/posts/261.html">relationships</a>).</p> <p>In CLR, interfaces are typically used to describe a behavioral contract. Since a class can implement multiple interfaces (but can be derived from only one class), interfaces allow for a kind of extensibility over time. An existing class can implement a new interface when adding new behavior, allowing it to play a new role in a system.</p> <p>WinFS schemas are primarily about describing data that will be stored, not behavior, so interfaces don't really fit into the WinFS data model. However, the API classes generated from the schema can implement behavior and can leverage interfaces just like any other CLR class.</p> <p>But WinFS items do provide for an extensibility mechanism. In a schema I can define an extension type with properties. Instances of extensions can be attached to item instances. 
For example, if I wanted to add a “hair color” property to the standard Person item type I could define an extension as follows:</p> <blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p><font face="Courier New">&lt;ExtensionType Name="PhysicalDescription" BaseType="System.Storage.Extension"&gt;<br />  &lt;Property Name="HairColor" Type="String" Size="50"/&gt;<br />&lt;/ExtensionType&gt;</font></p></blockquote> <p dir="ltr">I can add an instance of this extension to a Person item as follows:</p> <blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p dir="ltr"><font face="Courier New">ItemContext ic = ItemContext.Open();<br />Person p = Person.FindItemById( id );<br />PhysicalDescription d = new PhysicalDescription();<br />d.HairColor = "Brown";<br />p.Extensions.Add( d );<br />ic.SaveChanges();</font></p></blockquote> <p dir="ltr">I can find all Person items representing people with brown hair using the code:</p> <blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p dir="ltr"><font face="Courier New">ItemContext ic = ItemContext.Open();<br />FindResult result = Person.FindAll(<br />  "Extensions.Cast(PhysicalDescription).HairColor='Brown'");</font><font face="Courier New"><br />foreach( Person p in result ) {<br />  Console.WriteLine( p.DisplayName );<br />}</font></p></blockquote>
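For readers who want to experiment with the idea without a store, the following is a rough in-memory analogue of attaching an extension to an item and then filtering on an extension property. It is a sketch under assumptions: the `Extension`, `PhysicalDescription`, and `Person` classes below are hypothetical stand-ins defined for this example, and LINQ's `OfType` is used here only to play the role that the cast to PhysicalDescription plays in the query string above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins; these are NOT the generated WinFS API classes.
public abstract class Extension { }

public class PhysicalDescription : Extension
{
    public string HairColor { get; set; }
}

public class Person
{
    public string DisplayName { get; set; }
    public List<Extension> Extensions { get; } = new List<Extension>();
}

public static class ExtensionDemo
{
    // Keeps the people who carry a PhysicalDescription extension with the given hair color.
    public static IEnumerable<Person> WithHairColor(IEnumerable<Person> people, string color)
    {
        return people.Where(p => p.Extensions
            .OfType<PhysicalDescription>()      // ignore extensions of other types
            .Any(d => d.HairColor == color));
    }

    public static void Main()
    {
        var ann = new Person { DisplayName = "Ann" };
        ann.Extensions.Add(new PhysicalDescription { HairColor = "Brown" });
        var bob = new Person { DisplayName = "Bob" }; // no extension attached

        foreach (Person p in WithHairColor(new[] { ann, bob }, "Brown"))
            Console.WriteLine(p.DisplayName); // Ann
    }
}
```

Note that `Person` itself never changes here: the new data rides along as an attached instance, which is the kind of extensibility over time that interfaces provide for behavior.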

<p><a href="http://anopinion.net/posts/261.aspx">Relationships</a>, by Mike Deem, Mon, 17 Nov 2003 23:10:00 GMT</p>
<p><a href="/posts/248.html">Shawn Smith</a> and <a href="http://longhornblogs.com/robert/posts/1355.aspx">Robert McLaws</a> have both asked questions that are hard to answer without explaining WinFS relationships.</p>
<p>Relationships are used to construct complex structures from individual <a href="http://longhornblogs.com/robert/posts/1355.aspx">items</a>. Relationships are just like the sticks that you use between the round spool thingies in <a href="http://www.strongmuseum.org/NTHoF/tinkertoy.html">Tinker Toys</a> (the spool thingies are the items). A relationship has two ends (we call them source and target) and is always stuck between two items. An item can be the source and target of any number of relationships.</p>
<p>Like items, all relationships have a type. The relationship type hierarchy is rooted in the System.Storage.Relationship type. Also like item types, relationship types can define properties that will be stored with relationship instances. In addition, relationship types can specify the required source and target item types.</p>
<p>Here is an example relationship type I could use to create a graph of <a href="http://longhornblogs.com/robert/posts/1355.aspx">Foo items</a>:</p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p><font face="Courier New">&lt;RelationshipType Name="FooToFoo" BaseType="System.Storage.Relationship"&gt;<br />&nbsp;&nbsp;&nbsp;&nbsp;&lt;Source Name="SourceFoo" Type="Foo"/&gt;<br />&nbsp;&nbsp;&nbsp;&nbsp;&lt;Target Name="TargetFoo" Type="Foo"/&gt;<br />&nbsp;&nbsp;&nbsp;&nbsp;&lt;Property Name="X" Type="System.Storage.WinFSTypes.Int32"/&gt;<br />&lt;/RelationshipType&gt;</font></p></blockquote>
<p>If I have two Foo items, I can relate them as follows:</p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p><font face="Courier New">ItemContext ic = ItemContext.Open();<br />Foo f1 = ic.FindItemById( id1 );<br />Foo f2 = ic.FindItemById( id2 );<br />f1.OutRelationships.Add( new FooToFoo( f2 ) );<br />ic.SaveChanges();</font></p></blockquote>
<p>So, now to answer the questions. WinFS comes with an item type “Folder” and a relationship type “FolderMember”. The FolderMember relationship requires that the source item type be Folder but allows any target item type. So, Robert, because items are put in folders using relationships and an item can be targeted by more than one relationship, you can put an item in more than one folder. And Shawn, in WinFS foldering isn't really obsolete, just expanded to encompass a much more powerful concept: relationships.</p>
<p>I'm glossing over some important details, such as the fact that there are three different relationship “modes” (holding, embedding, and reference) that determine how the relationship controls the lifetime of the target object, how the namespace exposed by WinFS is built up, and how security is inherited. Read <a href="http://longhorn.msdn.microsoft.com/lhsdk/winfs/daconwhatiswinfsdatamodel.aspx#winfs_relationships_winfsconceptualmodel">this section of the Longhorn SDK</a> and look at <a href="http://download.microsoft.com/download/6/6/9/669C56E3-12AF-48C5-AB2A-E7705F1BE37F/CLI320.ppt">this PDC presentation</a> for details.</p>
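<p>The "one item in more than one folder" answer can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the <code>Item</code>, <code>Folder</code>, <code>Relationship</code>, and <code>FolderMember</code> classes below are hypothetical and are not the WinFS classes, and unlike <code>new FooToFoo( f2 )</code> in the post, the constructors here take the source explicitly to keep the model simple. Relationship modes, lifetime, and security are ignored.</p>

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory model; NOT the WinFS API.
class Item
{
    public string Name { get; }

    // Relationships for which this item is the source.
    public List<Relationship> OutRelationships { get; } = new List<Relationship>();

    public Item(string name) { Name = name; }
}

class Folder : Item
{
    public Folder(string name) : base(name) { }
}

// A relationship always sits between exactly two items.
class Relationship
{
    public Item Source { get; }
    public Item Target { get; }

    public Relationship(Item source, Item target)
    {
        Source = source;
        Target = target;
    }
}

// The source must be a Folder; the target may be any Item.
// The constructor signature enforces that constraint at compile time.
class FolderMember : Relationship
{
    public FolderMember(Folder source, Item target) : base(source, target) { }
}

static class RelationshipDemo
{
    public static void AddToFolder(Folder folder, Item item)
    {
        folder.OutRelationships.Add(new FolderMember(folder, item));
    }

    // An item is "in" every folder that has a FolderMember relationship
    // targeting it, so membership is not limited to a single folder.
    public static List<string> FoldersContaining(Item item, IEnumerable<Folder> folders)
    {
        return folders
            .Where(f => f.OutRelationships
                .OfType<FolderMember>()
                .Any(r => ReferenceEquals(r.Target, item)))
            .Select(f => f.Name)
            .ToList();
    }

    static void Main()
    {
        var work = new Folder("Work");
        var personal = new Folder("Personal");
        var photo = new Item("photo");

        AddToFolder(work, photo);
        AddToFolder(personal, photo);

        // prints "Work, Personal"
        Console.WriteLine(string.Join(", ", FoldersContaining(photo, new[] { work, personal })));
    }
}
```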

<p><a href="http://anopinion.net/posts/250.aspx">Items Without Files</a>, by Mike Deem, Sun, 16 Nov 2003 14:17:00 GMT</p>
<p><a id="_ctl0_pageBody-1_CommentList__ctl4_NameLink" href="http://www.1968.org/" target="_blank"><font color="#355ea0">Kirk Marple</font></a> asks about WinFS being an “<a href="/posts/241.html">object storage layer over top of terabytes of geographically distributed files</a>.” This is one of the scenarios we are thinking about. In my <a href="/posts/241.html">explanation</a> of WinFS, I talked about schemas that are “likely to be used for file backed items most of the time.” The idea is that the schema doesn't dictate whether an item is file backed or not (file backedness isn't part of the type). That allows for situations where you have the object but want to point to a file in a different location. Of course, this leads to <a href="/posts/248.html">synchronization issues</a>, but it is acceptable in many circumstances.</p>
<p><a href="http://anopinion.net/posts/248.aspx">Keeping Files and Items in Sync</a>, by Mike Deem, Sun, 16 Nov 2003 14:10:00 GMT</p>
<p>In a comment on <a href="/posts/241.html">my explanation</a> of WinFS, Shawn Smith <a href="/posts/241.html">wonders</a> why we would not store all the data (including the file) in the database. In fact, we do. The file storage itself is managed by the database engine. In the relational schema, there is a column with a binary type and an attribute that tells the system to store the data in a file in NTFS. Transactions are coordinated between the database and NTFS to maintain consistency. The advantage you get over a normal binary column is that you can ask the system for a special UNC path that can be used to open the file for I/O, leveraging all the performance of NTFS. It really is the best of both worlds.</p>
<p><a href="http://anopinion.net/posts/241.aspx">WinFS = Windows Foo System</a>, by Mike Deem, Sat, 15 Nov 2003 23:30:00 GMT</p>
<p>I need to clear up a misconception about WinFS: it isn't a system for storing metadata associated with files.</p>
<p>WinFS is an item store. Items have a type, which defines the properties that make up that item. Item types are arranged in a type hierarchy, rooted at a base Item type. For example, here is the WinFS schema for a Foo item type:</p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p><font face="Courier New">&lt;Schema Name="FooSchema" xmlns="http://schemas.microsoft.com/winfs/2002/11/18/schema"&gt;</font></p> <p><font face="Courier New">&nbsp;&nbsp;&lt;Using Namespace="System.Storage"/&gt;<br />&nbsp;&nbsp;&lt;Using Namespace="System.Storage.WinFSTypes"/&gt;</font></p> <p><font face="Courier New">&nbsp;&nbsp;&lt;ItemType Name="Foo" BaseType="System.Storage.Item"&gt;<br />&nbsp;&nbsp;&nbsp;&nbsp;&lt;Property Name="Bar" Type="System.Storage.WinFSTypes.Int32"/&gt;<br />&nbsp;&nbsp;&lt;/ItemType&gt;</font></p> <p><font face="Courier New">&lt;/Schema&gt;</font></p></blockquote>
<p>If I run this schema through our API generator and install the schema in WinFS (neither of which is possible with the PDC release, sorry), I can write the following code to create a Foo item in the store:</p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p><font face="Courier New">ItemContext ic = ItemContext.Open();<br />Foo foo = new Foo();<br />foo.Bar = 42;<br />Folder folder = Folder.GetRootFolder( ic );<br />folder.OutFolderMemberRelationships.AddMember( foo );<br />ic.Update();</font></p></blockquote>
<p>I can find a Foo item using this code:</p>
<blockquote dir="ltr" style="MARGIN-RIGHT: 0px"> <p><font face="Courier New">ItemContext ic = ItemContext.Open();<br />Foo foo = Foo.FindOne( ic, "Bar = 42" );</font></p></blockquote>
<p>At no time was a file created in order to store my Foo item. The best conceptual analogy (and more or less the technical truth) is that a row was inserted into a database.</p>
<p>Now, it so happens that we are using this item store to store metadata associated with a file. We do this as follows:</p>
<p>1) A file system redirector is part of WinFS. That is the thing that handles UNC paths like <a href="/file//localhost/DefaultStore/index.html">\\localhost\DefaultStore</a> (if you are on a Longhorn box, you should be able to click on this link to open your WinFS store).</p>
<p>2) When a file is created through this redirector, WinFS creates an item that represents that file. This item and the file are tightly bound. When the file is deleted, the item is deleted.</p>
<p>3) WinFS looks up a registered file property handler using the file's extension. The handler knows what type of item to create and how to move metadata between the file content and the item (in the PDC release, this happens in only one direction: from the file to the item. This will work in both directions before WinFS is finished).</p>
<p>In WinFS lingo, such an item is called a <em>file backed item</em>. It is distinct from the Foo item I created in the example above, which we call a <em>native item</em>. Many of the types in the WinFS schemas we intend to ship with Windows are likely to be used for file backed items most of the time, for example Document, Track, and Image. Other types will rarely be used for file backed items. Person, Organization, and Group come to mind.</p>
<p><em>Note: I decided to change the schema and code examples in this post to a version that is a little newer than the one that shipped with the PDC release. We are making changes for good reason... like the new stuff makes more sense and is easier to explain.</em></p>
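<p>Steps 2 and 3 above can be sketched as a small in-memory simulation. Everything here is an assumption made for illustration: <code>FileBackedItem</code>, <code>OnFileCreated</code>, and <code>OnFileDeleted</code> are hypothetical names, the extension-to-type table is invented (only the type names Document, Track, and Image come from the post), and the fallback to a generic "Item" type for unregistered extensions is a choice of this sketch, not documented WinFS behavior.</p>

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical item record; NOT a WinFS class.
class FileBackedItem
{
    public string TypeName;
    public string FilePath;
}

static class FileBackedDemo
{
    // Registered "file property handlers", looked up by file extension.
    // The mappings are invented for this sketch.
    static readonly Dictionary<string, Func<string, FileBackedItem>> Handlers =
        new Dictionary<string, Func<string, FileBackedItem>>(StringComparer.OrdinalIgnoreCase)
        {
            { ".doc", path => new FileBackedItem { TypeName = "Document", FilePath = path } },
            { ".mp3", path => new FileBackedItem { TypeName = "Track", FilePath = path } },
            { ".jpg", path => new FileBackedItem { TypeName = "Image", FilePath = path } },
        };

    // The "store": one item per file, keyed by file path.
    static readonly Dictionary<string, FileBackedItem> Store =
        new Dictionary<string, FileBackedItem>();

    public static int ItemCount { get { return Store.Count; } }

    // Step 2 and 3: creating a file creates the item that represents it,
    // using the handler registered for the file's extension.
    public static FileBackedItem OnFileCreated(string path)
    {
        string extension = Path.GetExtension(path);
        FileBackedItem item = Handlers.TryGetValue(extension, out var handler)
            ? handler(path)
            // Assumption of this sketch: unknown extensions get a generic item.
            : new FileBackedItem { TypeName = "Item", FilePath = path };
        Store[path] = item;
        return item;
    }

    // The item and the file are tightly bound: deleting the file
    // deletes the item. Returns false if no such item existed.
    public static bool OnFileDeleted(string path)
    {
        return Store.Remove(path);
    }

    static void Main()
    {
        Console.WriteLine(OnFileCreated("report.doc").TypeName); // prints "Document"
        Console.WriteLine(OnFileCreated("cover.jpg").TypeName);  // prints "Image"
        OnFileDeleted("report.doc");
        Console.WriteLine(ItemCount);                            // prints "1"
    }
}
```

<p>A native item, by contrast, would simply be added to the store with no file path and no handler lookup at all.</p>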

<p><a href="http://anopinion.net/posts/240.aspx">WinFS at Scale</a>, by Mike Deem, Sat, 15 Nov 2003 22:08:00 GMT</p>
<p><a id="_ctl0_pageBody-1_CommentList__ctl10_NameLink" href="http://www.jonhoneyball.com/" target="_blank">Jon Honeyball</a> asks some questions about <a href="/posts/184.html">how well WinFS will handle very large amounts of data</a>. The thing to keep in mind is that WinFS is, at its heart, a relational database. There are <a href="http://terraserver.com">very</a> large <a href="http://skyserver.sdss.org/dr1/en/">databases</a> built on the same technology. I would expect WinFS to ultimately have roughly the same capabilities. However, I can't say (because we don't yet know) exactly how much of this will be achieved in the Longhorn release.</p>
<p><a href="http://anopinion.net/posts/239.aspx">WinFS Sync</a>, by Mike Deem, Sat, 15 Nov 2003 21:50:00 GMT</p>
<p><a href="/posts/184.html">Chris Adams asks some questions about how WinFS sync works</a>. <a href="http://longhornblogs.com/abudja/">Andrej Budja</a> has been doing some digging into the <a href="http://longhorn.msdn.microsoft.com">Longhorn SDK</a> material on this subject and has <a href="http://longhornblogs.com/abudja/posts/1283.aspx">pulled out some content</a> that may answer these questions.</p>

<p><a href="http://anopinion.net/posts/238.aspx">WinFS API Usability</a>, by Mike Deem, Sat, 15 Nov 2003 21:04:00 GMT</p>
<p><a href="http://blogs.gotdotnet.com/stevencl/">Steven Clarke</a> <a href="http://blogs.gotdotnet.com/stevencl/PermaLink.aspx/2fa015b9-439f-4c67-a38c-15afbaa2db5a">wants you to sign up</a> for the WinFS API usability test. So do I.</p>
<p><a href="http://anopinion.net/posts/222.aspx">Technology and Schemas</a>, by Mike Deem, Thu, 13 Nov 2003 21:32:00 GMT</p>
<p><a href="http://primates.ximian.com/~miguel//texts/pdc.html">Miguel de Icaza writes about his impression of WinFS</a>. Some people seem to assume that the technology behind WinFS is all that makes it what it is. Many people have pointed out that <a href="http://www.novell.com/products/ifolder/">other products</a> have, or will soon have, similar features.</p>
<p>However, WinFS's schemas play an even larger part in making WinFS what it is. The idea that there will be a common schema for “Person” and “Document” and “Album” that can be shared, and extended, by thousands of Windows applications is incredibly powerful. <a href="http://www.thinkingin.net/2003/11/12.aspx#a505">Larry O'Brien gets it</a>. And yes, <a id="_ctl0__ctl5_CommentList__ctl1_NameLink" href="http://weblogs.asp.net/asanto" target="_blank">Addy</a>, this does <a href="http://weblogs.asp.net/aaguiar/posts/37347.aspx#37399">hark back to what made Hailstorm interesting</a>.</p>
