How many times have you stored something in key/value collection? Most probably it was Dictionary or some kind of implementation of IEnumerable<KeyValuePair<TKey, TValue>>. More than a few times I wanted to store more than one value under single key, most common solution for this situation is Dictionary with collection of some kind as value type, but do you remember about type that was designed just for that and is seen much less often in code?
This type is Lookup<TKey, TValue> and if you haven’t ever used it it’s probably because it doesn’t even have public constructor. You can acquire it by using ToLookup method in your LINQ pipeline and if you remember one of my previous posts about GroupBy and Joins, you probably know a thing or two about interface IGrouping. It’s important because interface ILookup, which our ToLookup method will return, implements interface IEnumerable<IGrouping<TKey, TElement>> which is return type of GroupBy method (although underlying concrete class which is being used by GroupBy is GroupedEnumerable, you can peek it’s source code here). I’m writing about that because Lookup class share some similarities, I really don’t want to repeat things from my previous posts so if you’re not familiar with GroupBy and/or ToDictionary just read this post first.
Lookup is collection designed for storing multiple values with shared key, one of simplest examples is that you as a person are unique and you can have few or none cars, phones, computers or other items. If you don’t have any items at all and I’ll try to fetch your items to see what you have I shouldn’t get exception that you don’t exist in the system and nothing is stored under your key, I should have empty collection which tells me that nothing was found under provided key.
And that is our Lookup. First of all we’ve just asking “show me everything you have under this key” and if there’s nothing we shouldn’t hear “huh, what key?” as an answer. We never, ever get KeyNotFound exception while using ILookup indexer, worst case scenario you just end with empty Enumerable. Although you can Count all keys with values in your Lookup or check if it contains some key. Here’s interface which is being implemented by Lookup class implementation.
1 2 3 4 5 6 |
public interface ILookup<TKey, TElement> : IEnumerable<IGrouping<TKey, TElement>>, IEnumerable { IEnumerable<TElement> this[TKey key] { get; } int Count { get; } bool Contains(TKey key); } |
As you can see above this collection is read only. You can iterate through it, you can access all those values but you just can’t add anything, along with lack of public constructor those are only major restrictions of Lookups.
What are benefits then? Because for now it looks a lot like Dictionary with collection as a value or just some fancy combo of ToDictionary with GroupBy. It is possible and I’ve seen and used this quite often. Still, you need to be careful about unique keys or you’ll get an exception. With ToLookup you just need some kind of key and everything will be grouped, then you can store your collection and access it anytime you want.
Let’s say we have some collection of Item with two properties Name and Date. Let’s then say we want to group this collection by month number in this date. Those are two simplest ways to do it with LINQ.
1 2 3 4 5 |
var dict = items .GroupBy(i => i.Date.Month) .ToDictionary(g => g.Key, g => g.Select(i => i.Name).AsEnumerable()); var lookup = items.ToLookup(i => i.Date.Month, i => i.Name); |
Accessing collections stored upon those keys would be simple too. This time I’ll show you three examples, first one is simple LINQ Where used on array of Item objects.
1 2 3 4 5 6 7 8 9 10 11 |
//Array var x = items.Where(i => i.Date.Month == 1) .Select(i => i.Name) .ToList(); //Dictionary if (dictionary.ContainsKey(1)) var x = dictionary[1].ToList(); //Lookup var x = lookup[1].ToList(); |
In my opinion in this case using LookUp is simplest solution.
Performance
I’ve created a simple demo with benchmarks that you can clone and run yourself to compare creating and accessing data in LookUp and Dictionary, you can see it on my GitHub here. It’s nothing fancy but a little sample.
I’ve used Item class with static method that’ll create array containing one million of them.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
public class Item { public int Id {get; set;} public string Name { get; set; } public DateTime Date { get; set; } public Item(int n, DateTime date) { Id = n; Name = $"Item-{n}"; Date = date.AddHours(-n); } public static Item[] GetItems() { var now = new DateTime(2017, 1, 1); return Enumerable.Range(1, 1000000).Select(n => new Item(n, now)).ToArray(); } } |
Then I’ve tested creation of Dictionary and LookUp from this array.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
public class Create { private Item[] items; [Setup] public void Setup() { items = Item.GetItems(); } [Benchmark] public Dictionary<int, IEnumerable<string>> CreateDictionary() { return items .GroupBy(i => i.Date.Month) .ToDictionary(g => g.Key, g => g.Select(i => i.Name).AsEnumerable()); } [Benchmark] public ILookup<int, string> CreateLookup() { return items.ToLookup(i => i.Date.Month, i => i.Name); } } |
Second benchmark I’ve ran against accessing collections stored upon those keys. I’ve forced enumeration with ToList method too.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
public class GetValue { private Item[] items; private Dictionary<int, IEnumerable<string>> dict; private ILookup<int, string> lookup; [Setup] public void Setup() { items = Item.GetItems(); dict = items .GroupBy(i => i.Date.Month) .ToDictionary(g => g.Key, g => g.Select(i => i.Name).AsEnumerable()); lookup = items.ToLookup(i => i.Date.Month, i => i.Name); } [Benchmark] public List<string> GetFromArray() { return items.Where(i => i.Date.Month == 2) .Select(i => i.Name) .ToList(); } [Benchmark] public List<string> GetFromDictionary() { return dict[2].ToList(); } [Benchmark] public List<string> GetFromLookup() { return lookup[2].ToList(); } } |
Those are really simple tests, and below you can take a look at their results.
Create
Method | Mean | Error | StdDev |
---|---|---|---|
CreateDictionary | 88.2519 ms | 0.6195 ms | 0.5794 ms |
CreateLookup | 88.4566 ms | 0.8182 ms | 0.7253 ms |
Get
Method | Mean | Error | StdDev |
---|---|---|---|
GetFromArray | 50,655.8704 us | 550.8332 us | 488.2993 us |
GetFromDictionary | 2,334.8219 us | 46.3560 us | 47.6041 us |
GetFromLookup | 211.7846 us | 4.1913 us | 6.4005 us |
As you can see creation times are almost the same. What’s really different is times needed to access stored data. Why there is so much difference is content for another post.
Edit (3 V 2017)
I’ve done some more research on the matter. It appears that iterating through LookUp is faster only because of underlying concrete collection underneath which is Grouping in case of LookUp and GroupedEnumerable made with GroupBy method.
I’ve updated samples on github with some more examples so you can run benchmarks yourself. There are also great comment underneath this post which were main reason for this little update (thanks guys!). I’ll work on this some more and will write a followup post in near future, but for now I can tell you that accessing value by key in LookUp is a bit slower than in Dictionary. However in most common scenarios of creating Dictionary like one being tested with LINQ iterating through values after accessing them will be faster.
Total execution time is distributed through creation of LookUp/Dictionary, accessing stored values and iterating through them/accessing stored values. Differences in performance aren’t so big (and if performance were critical you shouldn’t use LINQ) to concern yourself with them, just choose what is more comfortable for you (in my case it would be LookUp)
Summary
I hope you’ll try using LookUp yourself if you see opportunity for that. If you want to take a closer look at how it works just take see some sources here.