Free!  FoxPro to WPF screen converter
   Almost free!  One day of FREE VFP to WPF evaluation of your FoxPro app ONSITE
   Free!  Read the foreword to Les' new book (available here)...
Skip Navigation Links

Speeding up your queries with Projection

How to do more with less...


Les Pinter

Login or register to download source code

Introduction

Surprisingly, LINQ queries can be quite fast. Not so surprisingly, they can also be very, very slow. This articles shows how to speed them up.

I'm currently working on a project that has a MASTER table of 75,000 people with joins to fifteen other tables. The form used to maintain the MASTER table consists of a TabPage with a list on page 1, and details on pages 2 through 6. When I load the people collection for page 1, the app bombs with an OutOfMemory exception. Even if I filter it to only people from a particular state, it takes freaking forever.

But in the grid on page 1, I really only need a dozen columns from the MASTER table, plus two fields from the related States table so that I can also show the StateCode and StateName in the grid. As my user scrolls through the GridView, I retrieve the corresponding MASTER record (including ALL columns) and ALL of the related records from the 15 JOINs, and I display those fields in controls on pages 2 through 6 of the tabPage.

So how do I retrieve only the columns that I need for the gridview, but also get all of the other columns, plus the related data from all of the navigation collections, to display on the other tabpages? Projection, plus clever use of the setter for the public property bound to the SelectedItem attribute of the gridview.

Projection consists of directing a query to return only some of the columns in a query. If there are related tables, the query can be defined as a JOIN of only some of the tables that the primary table is related to. The normal entity query that's constructed when you save the EDMX file can be used to return the complete entity with all of the navigation collections containing all of the related records for only the selected MASTER record. But for this common design pattern, the collection used in the GridView on page one doesn't need all of that.

The resulting speed increase is huge. My MASTER table went from bombing after a minute (when only about 15,000 records had loaded) to loading all 75,000 rows in 6.2 seconds.

The test case

I can't show you my client's data, so for my test case, I used the US Postal Service's ZIPCODES table, without any related tables. Although a more complex case with a dozen related tables would better showcase the speed gains that projections permit, the complexity of such a real-world case would be too confusing. So my test case will deal only with a single table. It makes the case just as well.

I downloaded the ZIP code file from the post office. And I already paid for it! (You did too, if you pay taxes. Even if you don't, the good folks at the USPS make it available expressly for testing. Says so right there on the USPS website.) The DBF is included in the zip file available from the download link at the top of this article (once you register/sign in). You can use my Load DBFs TO SQL utility, the last article in the FoxPro TO .NET tab on this website.

Here's the structure of the ZIPCODES table:

    1  zip          VarChar( 5)
    2  type         VarChar( 8)
    3  primary_ci   VarChar(13)
    4  acceptable   VarChar(17)
    5  unacceptab   VarChar(254)
    6  state        VarChar( 5)
    7  county       VarChar(14)
    8  timezone     VarChar(19)
    9  area_codes   VarChar(10)
   10  latitude     VarChar( 8)
   11  longitude    VarChar( 9)
   12  world_regi   VarChar(12)
   13  country      VarChar( 7)
   14  decommissi   VarChar(14)
   15  estimated_   VarChar(20)
   16  notes        VarChar(13)

PRIMARY_CI is the City. I don't really know what some of these fields are, but since the whole point of this articles is that we usually don't need all of the columns in our tables, that's perfect. And if you're working with tables that you inherited from someone else, your column names are probably just as goofy as mine.

My test form contains a two-page TabPage:

The List page of the form
The Details page of the form

On the first page, I display four of the columns from the ZIPCODES table. On page 2, I display all of the fields in the currently-selected record from the grid on page 1. If we had related tables, their values could appear on this and other tabpages.

The other fields in the ZIPCODES table, and any values located in related tables, aren't in the zipcodes ObservableCollection that feeds the grid. So where do you get them?

DataBinding makes this easy. You can use the SelectedItem property of the GridView to poke the selected ZIPCODE record from the grid into a ViewModel public property called SelectedZIP (derived from the ZIPCODE entity). As the user scrolls through the grid, the SelectedZIP property's setter uses the "zip" value in the SelectedZIP object to retrieve the corresponging ZIPCODE record into another ZIPCODE object, in this case called One. The controls on page 2 are bound to that property. If there were additional related tables and corresponding navigation properties, they could be displayed here.

Normal queries versus projections

When you use a query, you usually return all columns. A SELECT statement in SQL that looks like this

SELECT * FROM ZIPCODES

would be rendered in LINQ as

public ObservableCollection zipcodes { get; set; }
...
var query = dbContext.ZIPCODES;
foreach {ZIPCODE Z in query) { zipcodes.Add (Z); }

The rows returned into the collection contain navigation properties to hold records returned from all related tables.

The LINQ() method does that:

    void LINQ()  // (some code removed)
    { zipcodes.Clear();
      var query = mgr.ZIPCODES.OrderBy( x => x.zip);
      foreach (ZIPCODE Z in query) { zipcodes.Add(Z); }
    }

This query returns all columns from each record in the ZIPCODES table and stores them as rows in the zipcodes ObservableCollection. If you had related tables in the EDMX diagram, the entity would contain one navigation collection for each related table, and each would consist of all of the columns in the corresponding table. You refer to related objects for each row using the navigation property name. (We don't have any in my example, but that's how you'd get to them.)

But I don't need to return all of the columns in the table, and all of the columns in its navigation property collections, to appear in the list on page 1; I only need four of them. So how do I return only the four columns that appear in the GridView? In a nutshell, you modify the query to reflect only the JOINs that you need for the GridView, and select only the columns that are needed.

In our case, it looks like this:

    void Project()
    { zipcodes.Clear();
      var query = mgr.ZIPCODES.OrderBy(x => x.zip)
        .Select ( x => new { a = x.zip, b = x.primary_ci, c = x.state, d = x.county });
      foreach (var Z in query)
      { zipcodes.Add(new ZIPCODE { zip = Z.a, primary_ci = Z.b, state = Z.c, county = Z.d });}
    }

I could have used more meaningful column names, but I wanted to make the point that the names don't matter. You can call them anything - as long as they match. The zipcodes collection contains all of the columns in the entity structure; it just doesn't contain all of the data in each row.

Note that you can use a dynamic type, and pass it the query (including a projection defined in the Select clause) to load your collection:

Using a dynamic type to create an ObservableCollection:

   public new ObservableCollection<dynamic> ZIPList { get; set; }

   ...

   ZIPList = new ObservableCollection<dynamic>(
      from    zp in mgr.ZIPCODES
      orderby zp.zip
      select new
      { zip = zp.zip,
        city = zp.primary_ci,
        state = zp.state,
        county = zp.county
      });

However, you would need to include a class definition with the same structure as the dynamic collection in order to use SelectedItem to pass the current record to a public property in the ViewModel.

Building the DataModel

I created a database called WPFTest in an earlier article, and that's a good place to put this table as well. The download contains a ZIPCODES.DBF that you can load to this database using LoadDBFtoSQL utility mentioned above. Download it from my website.

I started by creating a new WPF project, with a single form called MainWindow. (You might want to download the entire project from the above link (you'll have to register and/or sign in to see the "download" link) before continuing.) Use "Tools, NuGet Package Manager, " and select "Manage Nuget Packages for Solution." The IDE will add a reference to EntityFramework to both projects.

If you decide to battle your way through this, it will be good for you. To create the datamodel, add a separate class library project to the solution, and then delete the "Class1.cs" file that it automatically adds. Added a New Item (using ADO as the filter) and selecte an ADO Entity Data Model, naming it "DataModel.cs". Created a connection to my WPFTest database and select the ZIPCODES table. Then, build the DataModel project. Next, drag and drop the App.config file from the DataModel project to the WPF project, and add a reference from the DataModel project to the WPF project,

In the download for this article, you'll find a little class file called ViewModelBaseClass.cs. All ViewModels need a mechanism to implement "Property Change Notification" and Routed Commands, and my ViewModelBaseClass has both. For each form, add a ViewModel.cs class, and after its name, add this:

   : ViewModelBaseClass

You can then add public properties to bind to, and code to execute when said properties are set. There are 4 lines of code that you need to add to the CodeBehind of each form. But the ViewModel is where the rest of your code goes.

Creating the WPF Form using Extended Application Markup Language (XAML)

The form (the "View" in "Model-View-ViewModel") is built in the designer. I tweak the XAML quite a bit, because it's ugly and I like my published articles about code to look pretty.

Let's look at the XAML needed to display this form. Click the button below to show/hide the code. Comments follow the code.

The Window declaration

The Window declaration centers the form and provides a title, width and height; the namespace (xmlns) declarations provide internal names for resources used by the screen renderer.

The Window Resources section

The Window.Resources section provides some visual styles for buttons and textboxes; ordinarily, you would have a single resource file or an App.XAML file to style all controls. These can be quite robust; some are 5,000 lines or more long.

The Grid

The preferred layout mechanism for a WPF form is a Grid. Don't use a StackPanel! StackPanel tries to re-render its contents every time a row is added, and it can cause a 10-second data loading process to take five minutes. So don't EVER enclose a DataGrid or RadGridView inside a StackPanel.

All of my Grid Rows are 30 pixels high, except for row 2, which holds the grid. The "*" height specification means that it occupies the remaining space in the Grid.

Grid.Row "0" collects the State parameter optionally used in the query. I say optionally because the corresponding Where clause is this:

.Where ( x => StateCode == ""? true: x.state == StateCode)

which means that if StateCode is the empty string the record is included; if it's not, it has to match the user's selection. (Make sure that your ItemsSource collection for the combo includes a null string at the top of the list so that you can set it to the nulll string with combobox row 1.}

WARNING: You have to "RaisePropertyChanged" (see ViewModelBase.cs, below) after you set the value of a property; otherwise, WPF can't see that it changed - unless it's an ObservableCollection, which automatically does Property Change Notification.

Grid.Row "1" displays a textbox where you can enter the name, or the first few characters of the name, of a city, to see how many city names match. The Where clause for that is

.Where ( x => City == ""? true: x.primary_ci.StartsWith(City))

By the way, the parameter values aren't sent to their bound properties in the ViewModel until you tab out of these controls. So if you change the combo to "Arkansas" and then press Ctrl+L to use the LINQ query to load the GridView's bound zipcodes collection, it won't load the Arkansas data. There's a way to work around it, but "bug-free software is software that works according to the instructions." So tell your users to tab past the control before loading the gridview.

Grid.Row "2" holds the gridview. I used a WPF gridview, although in my practice I use Telerik's RadGridView, which does my filtering and searching. The ItemsSource is the name of the public ObservableCollection that I load in the ViewModel. The SelectedItem binding causes the currently selected row to be sent to a ViewModel property SelectedZIP, consisting of one record with the same structure as the rows in the table on page 1. As the user moves the row selector up and down, the "current record" is stuffed into a structure based on one record in the ZIPCODES table.

The setter for the SelectedItem binding property SelectedZIP in the ViewModel uses SelectedZIP.zip to retrieve the one record in the ZIPCODES table whose zipcode matches and store it in a ZIPCODE object named One. Once One has been populated, all of the controls on page 2 can be bound to it. The DataContext = {Binding One} attribute in the Grid tag on page 2 of the TabPage tells WPF to bind all of the contained textboxes to that record (except for the two Buttons - see below).

Grid Row "2" also contains two labels bound to two ViewModel properties to display the number of matching records in the list, and the time it took to retrieve them.

Grid.Row "3" contains a StackPanel with buttons, each with a Command attribute. Commands rely on elements of the ViewModelBaseClass that I pirated from somewhere or other. Each named command corresponds to a member of the _commands collection in the ViewModel, which are exposed via the Commands property of the ViewModel.

Page 2

Page 2 of the TabControl contains a grid with sixteen labels and sixteen textboxes, corresponding to the 16 columns in the ZIPCODES table. The last row of the grid contains Save and Cancel buttons. Note that since the grid's DataContext is set to "One", all of the controls automatically look in that object (the currently-selected record returned by the SelectedItem binding to SelectedZIP) for their Path property value. The Commands collection isn't in the One object - hence the somewhat convoluted Command= reference back to the top of the DataContext.

(The grid on page 2 is bound to "DataContext.One". Although One is a property of the ViewModel, bindings are actually bound to the DataContext. So since the grid is already bound to "DataContext.One", you have to go "up one level" to find the Commands property. That's why the RelativePath nonsense is required. I personally favor a better way of specifying "global" DataContext/Viewmodel properties like this - in case the guys in Redmond read my posts.)

The codebehind

The codebehind for the form is tiny, and almost always the same:

Listing 2 - the code-behind file

MainWindow.xaml.cs

using System.Windows;

namespace Projection
{
  public partial class MainWindow : Window
  {
    public MainWindowViewModel vm;

    public MainWindow()
    {
      InitializeComponent();
      vm = new MainWindowViewModel();
      vm.CallingForm = this; //so that the ViewModel can close the form
      DataContext = vm;
    }

  }
}

Its purpose is to instantiate an object based on the form's viewmodel, and assign it to the form's DataContext. It also passes the form object (this) into the ViewModel's CallingForm object so that we can close the form.

The ViewModel

The ViewModel, usually named <FormName>VM.cs, contains the code used by the form.

Click the little button to see my MainWindowViewModel code:

Properties in the ViewModel class

RaisePropertyChanged("PropName") is implemented in the ViewModelBase class, It sends "property change notification" back to the form. Without it, the form won't know that the value was changed in your code. If other public properties are changed, you'll also have to RaisePropertyChanged for those properties. I often forget to include it, and then spend hours wondering why my lovely code doesn't work. So you've been forewarned.

CallingForm, the first entry in the ViewModel class is actually a public field, not a property, As we'll see later, the calling form is assigned to this field so that the calling form that uses this viewmodel can be closed using CallingForm.Close(). Not pure MVVM, but i'm feeling naughty.

There are several public properties that are used to communicate with the form.

  • The RecordCount property is set to the number of rows in the people collection, explained below; the RaisePropertyChanged("RecordCount") is needed to tell the form that the value changed.
  • The ElapsedTime property reports how long the query took;
  • The CommandMap, consisting of two properties: One to hold the collection of commands, and another to make them easy to reference from WPF;
  • The dbContext class (WPFTestEntities) generated by ADO, and the mgr class created from it;
  • An ObservableCollection of ZIPCODE objects, used to populate the grid;
  • A ZIPCODE object called SelectedZIP, which received a copy of the selected Grid row when the user moves the pointer;
  • Two additional ZIPCODE objects: One which contains the complete current row retrieved from SQL as SelectedZIP changes, and SavedOne, a copy of One to be used to restore the previous state if the user cancels changes; and
  • StateCode and City, bound to the two controls in the third row of the grid on page 1, used as parameters in the queries.

The Constructor (which always has the same name as the containing class in C#) creates the three objects used in the code, (two ObservableCollections and mgr, the dbContext), and adds the commands that are bound to our buttons. The data is loaded using normal LINQ if the user clicks on the "using LINQ" button, or projection if they click on the "Using Projection" button.

LINQ() clears the people collection, and then defines and executes a query and loads the records (one at a time) into zipcodes. It also stores the number of records loaded into RecordCount (which contains its own RaisePropertyChanged() call), and loads the first row into SelectedZIP, just as scrolling the grid to the first row would do.

Project() does the same thing as LINQ, except that only four columns are loaded into the zipcodes object. If you set a breakpoint on the line

 { zipcodes.Add ( Z );

you'll see that all of the other columns in each row are null.

Save() is simple in Entity-Framework - just call mgr.SaveChanges().

Cancel() is relatively complicated, and something of a surprise. One might expect a "RejectChanges()" method on the mgr object, but no. And calling LoadData() after canceling seems a bit extreme, especially if your data file has hundreds of thousands of records. And for the moment, PAge 2 isn't being refreshed when I restore the previous state of "One" when I Cancel. I'll look for a better approach.

CanSave() is a delegate that determines whether buttons are enabled or disabled. The Save and Cancel buttons are only enabled if the user changes something in one of the controls on page 2. This feature is implemented using the CanExecute delegate of the AddCommand method which adds the commands that are bound to buttons on the form.

Close() uses the Close() method of the CallingForm property, which is set in the CodeBehind.

The ViewModel Base Class

The ViewModel Base Class provides the RaisePropertyChanged event (which tells your form that you changed a property's value in your ViewModel), and provides the mechanism for "routed commands", which are bound to buttons in the form. Routed commands have two delegates: The "Execute" delegate is executed when the button is clicked; the "CanExecute" delegate returns a bool value which is passed to the associated button's IsEnabled property - .ChangeTracker.HasChanges() in this case. When data is changed, Save and Cancel are enabled. My dbContext is called mgr. ViewModelBaseClass.cs appears in Listing 3, below. You really don't need to see it, but I have to include it.

Final remarks

The completed project looks like this:

Fig. 3 - The completed project

The techniques explained in this article should make it relatively easy to implement list/detail forms of this type. Let me know if you would like to see variations on this theme.

Les

qqq


Copyright(C) Pinter Consulting, 2012Tel: +1 (650) 464-6924
Automated conversion between C# and VB by Visible C#/Visible VB from Tangible Software Solutions