View
A View is a map/reduce-powered method of quickly accessing information inside of a Collection. A View can only belong to one Collection.
Views define two important associated types: a Key type and a Value type. You can think of these as the equivalent entries in a map/dictionary-like collection that supports more than one entry for each Key. The Key is used to filter the View's results, and the Value is used by your application or the reduce() function.
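To make the map/dictionary analogy concrete, here is a minimal standard-library sketch. It does not use BonsaiDb; ToyIndex, build_index, and values_for are hypothetical names invented for this illustration.

```rust
use std::collections::BTreeMap;

/// A toy stand-in for a view's index: each key maps to every value emitted for it.
/// (Hypothetical type for illustration only, not a BonsaiDb type.)
type ToyIndex = BTreeMap<Option<String>, Vec<u32>>;

/// Builds the index from (key, value) pairs, keeping every entry for a key.
fn build_index(entries: &[(Option<&str>, u32)]) -> ToyIndex {
    let mut index = ToyIndex::new();
    for (key, value) in entries {
        index
            .entry(key.map(String::from))
            .or_default()
            .push(*value);
    }
    index
}

/// Filtering by key returns every value that was stored under that key.
fn values_for(index: &ToyIndex, key: Option<&str>) -> Vec<u32> {
    index
        .get(&key.map(String::from))
        .cloned()
        .unwrap_or_default()
}

fn main() {
    let index = build_index(&[(Some("Rust"), 1), (Some("Rust"), 1), (Some("Cooking"), 1)]);
    // Two entries share the key Some("Rust"), so both values come back.
    println!("{:?}", values_for(&index, Some("Rust")));
}
```

The Key selects which entries are returned, and the returned values are what an application or a reduce step would consume.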
Views are a powerful, yet abstract concept. Let's look at a concrete example: blog posts with categories.
#[derive(Serialize, Deserialize, Debug, Collection)]
#[collection(name = "blog-post", views = [BlogPostsByCategory])]
pub struct BlogPost {
    pub title: String,
    pub body: String,
    pub category: Option<String>,
}
Let's insert this data for these examples:
BlogPost {
    title: String::from("New version of BonsaiDb released"),
    body: String::from("..."),
    category: Some(String::from("Rust")),
}
.push_into(&db)
.await?;
BlogPost {
    title: String::from("New Rust version released"),
    body: String::from("..."),
    category: Some(String::from("Rust")),
}
.push_into(&db)
.await?;
BlogPost {
    title: String::from("Check out this great cinnamon roll recipe"),
    body: String::from("..."),
    category: Some(String::from("Cooking")),
}
.push_into(&db)
.await?;
All examples on this page are available in their full form in the repository at book/book-examples/tests.
While category should be an enum, let's first explore using String and upgrade to an enum at the end (it requires one additional step). Let's implement a View that will allow users to find blog posts by their category as well as count the number of posts in each category.
#[derive(Debug, Clone, View)]
#[view(collection = BlogPost, key = Option<String>, value = u32, name = "by-category")]
pub struct BlogPostsByCategory;
impl ViewSchema for BlogPostsByCategory {
    type View = Self;

    fn map(&self, document: &BorrowedDocument<'_>) -> ViewMapResult<Self::View> {
        let post = document.contents::<BlogPost>()?;
        Ok(document.emit_key_and_value(post.category, 1))
    }

    fn reduce(
        &self,
        mappings: &[ViewMappedValue<Self::View>],
        _rereduce: bool,
    ) -> ReduceResult<Self::View> {
        Ok(mappings.iter().map(|mapping| mapping.value).sum())
    }
}
The two traits being implemented are View and ViewSchema. These traits are designed to allow keeping the View implementation in a shared code library that is used by both client-side and server-side code, while keeping the ViewSchema implementation in the server executable only.
Views for SerializedCollection
For users who are using SerializedCollection, CollectionViewSchema can be implemented instead of ViewSchema. The only difference between the two is that the map() function takes a CollectionDocument instead of a BorrowedDocument.
Value Serialization
For views to function, the Value type must be able to be serialized to and deserialized from storage. To accomplish this, all views must implement the SerializedView trait. For Serde-compatible data structures, DefaultSerializedView is an empty trait that can be implemented instead to provide the default serialization that BonsaiDb recommends.
Map
The first line of the map() function calls Document::contents() to deserialize the stored BlogPost. The second line returns an emitted Key and Value: in our case, the post's category and the value 1_u32. With the map() function, we're able to use query() and query_with_docs():
let rust_posts = db
    .view::<BlogPostsByCategory>()
    .with_key(Some(String::from("Rust")))
    .query_with_docs()
    .await?;
for mapping in &rust_posts {
    let post = mapping.document.contents::<BlogPost>()?;
    println!(
        "Retrieved post #{} \"{}\"",
        mapping.document.header.id, post.title
    );
}
The above snippet queries the Database for all documents in the BlogPost Collection that emitted a Key of Some("Rust").
If you're using a SerializedCollection, you can use query_with_collection_docs() to have the deserialization done automatically for you:
let rust_posts = db
    .view::<BlogPostsByCategory>()
    .with_key(Some(String::from("Rust")))
    .query_with_collection_docs()
    .await?;
for mapping in &rust_posts {
    println!(
        "Retrieved post #{} \"{}\"",
        mapping.document.header.id, mapping.document.contents.title
    );
}
Reduce
The second function to learn about is the reduce() function. It is responsible for turning an array of Key/Value pairs into a single Value. In some cases, BonsaiDb might need to call reduce() with values that have already been reduced one time. If this is the case, rereduce is set to true.
In this example, we're using the built-in Iterator::sum() function to turn the emitted Values of 1_u32 into a single u32 representing the total number of documents.
let rust_post_count = db
    .view::<BlogPostsByCategory>()
    .with_key(Some(String::from("Rust")))
    .reduce()
    .await?;
assert_eq!(rust_post_count, 2);
Changing an existing view
If you have data stored in a view, but want to update the view to store data differently, implement ViewSchema::version() and return a unique number. When BonsaiDb checks the view's integrity, it will notice that there is a version mismatch and automatically re-index the view.
There is no mechanism to access the data until this operation is complete.
Understanding Re-reduce
Let's examine this data set:
Document ID | BlogPost Category |
---|---|
1 | Some("Rust") |
2 | Some("Rust") |
3 | Some("Cooking") |
4 | None |
When updating views, each view entry is reduced and the value is cached. These are the view entries:
View Entry ID | Reduced Value |
---|---|
Some("Rust") | 2 |
Some("Cooking") | 1 |
None | 1 |
When a reduce query is issued for a single key, the value can be returned without further processing. But if the reduce query matches multiple keys, the View's reduce() function will be called with the already reduced values with rereduce set to true. For example, retrieving the total count of blog posts:
let total_post_count = db.view::<BlogPostsByCategory>().reduce().await?;
assert_eq!(total_post_count, 3);
Once BonsaiDb has gathered each of the key's reduced values, it needs to further reduce that list into a single value. To accomplish this, the View's reduce() function is invoked with rereduce set to true, and with mappings containing:
Key | Value |
---|---|
Some("Rust") | 2 |
Some("Cooking") | 1 |
None | 1 |
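Summing these values produces a final value of 4 for the four-document data set above. The assertion in the snippet above expects 3 because this page's examples insert only the three categorized posts; no post with a category of None is inserted.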
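The two-stage process can be sketched with only the standard library. This is an illustrative model, not BonsaiDb's implementation: reduce_per_key and total are hypothetical helpers, and reduce here takes plain u32 values rather than ViewMappedValue entries.

```rust
use std::collections::BTreeMap;

/// Mirrors the page's reduce(): sums whatever values it is given.
/// With `rereduce == false` the values are the raw emitted 1s; with
/// `rereduce == true` they are previously reduced per-key counts.
fn reduce(values: &[u32], _rereduce: bool) -> u32 {
    values.iter().sum()
}

/// Reduces each key's emitted values once, like the cached view entries.
fn reduce_per_key(mappings: &[(Option<&str>, u32)]) -> BTreeMap<Option<String>, u32> {
    let mut grouped: BTreeMap<Option<String>, Vec<u32>> = BTreeMap::new();
    for (key, value) in mappings {
        grouped.entry(key.map(String::from)).or_default().push(*value);
    }
    grouped
        .into_iter()
        .map(|(key, values)| (key, reduce(&values, false)))
        .collect()
}

/// Combines the cached per-key values with a second reduce() call.
fn total(cached: &BTreeMap<Option<String>, u32>) -> u32 {
    let reduced: Vec<u32> = cached.values().copied().collect();
    reduce(&reduced, true)
}

fn main() {
    // The four documents from the table: two Rust, one Cooking, one uncategorized.
    let mappings: [(Option<&str>, u32); 4] = [
        (Some("Rust"), 1),
        (Some("Rust"), 1),
        (Some("Cooking"), 1),
        (None, 1),
    ];
    let cached = reduce_per_key(&mappings);
    println!("per-key: {:?}, total: {}", cached, total(&cached));
}
```

Because summing is associative, reducing the cached per-key values gives the same answer as reducing every raw mapping at once. A reduce() whose output has a different meaning than its input (an average, for example) would need to inspect rereduce to combine partial results correctly.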
How does BonsaiDb make this efficient?
When saving Documents, BonsaiDb does not immediately update related views. It instead notes what documents have been updated since the last time the View was indexed.
When a View is accessed, the queries include an AccessPolicy. If you aren't overriding it, UpdateBefore is used. This means that when the query is evaluated, BonsaiDb will first check if the index is out of date due to any updated data. If it is, it will update the View before evaluating the query.
If you want to get results quickly and are willing to accept data that might not be up to date, the access policies UpdateAfter and NoUpdate can be used depending on your needs.
If multiple simultaneous queries are being evaluated for the same View and the View is outdated, BonsaiDb ensures that only a single view indexer will execute while all of the queries wait for it to complete.
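The deferred-indexing idea can be modeled in a few lines. This is a toy, single-threaded sketch and not how BonsaiDb is implemented: ToyView and ToyAccessPolicy are hypothetical names, every saved ID is assumed to be a new document, and the UpdateAfter branch re-indexes synchronously after capturing the stale answer.

```rust
use std::collections::BTreeSet;

/// Toy versions of the three access policies named above.
#[derive(Clone, Copy)]
enum ToyAccessPolicy {
    UpdateBefore,
    UpdateAfter,
    NoUpdate,
}

/// A toy view that counts documents and tracks which ones are not yet indexed.
#[derive(Default)]
struct ToyView {
    indexed_count: usize,
    pending: BTreeSet<u64>,
}

impl ToyView {
    /// Saving a document only records its ID; the index itself is untouched.
    fn note_saved(&mut self, document_id: u64) {
        self.pending.insert(document_id);
    }

    /// Folds every pending document into the index.
    fn reindex(&mut self) {
        self.indexed_count += self.pending.len();
        self.pending.clear();
    }

    /// Answers a count query, refreshing the index as the policy dictates.
    fn query_count(&mut self, policy: ToyAccessPolicy) -> usize {
        match policy {
            ToyAccessPolicy::UpdateBefore => {
                self.reindex();
                self.indexed_count
            }
            ToyAccessPolicy::UpdateAfter => {
                let stale = self.indexed_count;
                self.reindex();
                stale
            }
            ToyAccessPolicy::NoUpdate => self.indexed_count,
        }
    }
}

fn main() {
    let mut view = ToyView::default();
    view.note_saved(1);
    view.note_saved(2);
    // NoUpdate answers from the stale index; UpdateBefore refreshes first.
    println!("stale: {}", view.query_count(ToyAccessPolicy::NoUpdate));
    println!("stale, then refresh: {}", view.query_count(ToyAccessPolicy::UpdateAfter));
    println!("fresh: {}", view.query_count(ToyAccessPolicy::UpdateBefore));
}
```

The trade-off is visible in the return values: only UpdateBefore is guaranteed to reflect every saved document, while the other two policies return immediately with whatever was indexed last.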
Using arbitrary types as a View Key
In our previous example, we used String (wrapped in an Option) for the Key type. The reason is important: Keys must be sortable by our underlying storage engine, which means special care must be taken. Most serialization types do not guarantee binary sort order. Instead, BonsaiDb exposes the Key trait. On that documentation page, you can see that BonsaiDb implements Key for many built-in types.
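To see why binary sort order needs care, consider a small standard-library sketch (sort_by_encoded_bytes is a hypothetical helper written for this illustration). It sorts integers by comparing their encoded bytes, as a byte-oriented storage engine would, using two different encodings.

```rust
/// Sorts the numbers by comparing their encoded bytes lexicographically and
/// returns them in that order.
fn sort_by_encoded_bytes(values: &[u32], encode: fn(u32) -> [u8; 4]) -> Vec<u32> {
    let mut sorted = values.to_vec();
    sorted.sort_by_key(|value| encode(*value));
    sorted
}

fn main() {
    let values = [256, 1, 2];
    // Big-endian bytes compare in the same order as the numbers themselves.
    println!("{:?}", sort_by_encoded_bytes(&values, u32::to_be_bytes));
    // Little-endian bytes do not: 256 encodes as [0, 1, 0, 0] and sorts first.
    println!("{:?}", sort_by_encoded_bytes(&values, u32::to_le_bytes));
}
```

An encoding that is perfectly valid for serialization can therefore still produce the wrong order for range queries over keys.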
Using an enum as a View Key
The easiest way to expose an enum is to derive num_traits::FromPrimitive and num_traits::ToPrimitive using num-derive, and add an impl EnumKey line:
#[derive(
    Serialize, Deserialize, Debug, num_derive::FromPrimitive, num_derive::ToPrimitive, Clone,
)]
pub enum Category {
    Rust,
    Cooking,
}

impl EnumKey for Category {}
The View code remains unchanged, although the associated Key type can now be set to Option<Category>. The queries can now use the enum instead of a String:
let rust_post_count = db
    .view::<BlogPostsByCategory>()
    .with_key(Some(Category::Rust))
    .reduce()
    .await?;
BonsaiDb will convert the enum to a u64 and use that value as the Key. A u64 was chosen to ensure fairly wide compatibility even with some extreme usages of bitmasks. If you wish to customize this behavior, you can implement Key directly.
Implementing the Key trait
The Key trait declares two functions: as_big_endian_bytes() and from_big_endian_bytes(). The intention is to convert the type to bytes using network byte order (big-endian) for numerical types; for non-numerical types, the bytes need to be stored in binary-sortable order.
Here is how BonsaiDb implements Key for EnumKey:
impl<'a, T> Key<'a> for T
where
    T: EnumKey,
{
    type Error = std::io::Error;
    const LENGTH: Option<usize> = None;

    fn as_big_endian_bytes(&'a self) -> Result<Cow<'a, [u8]>, Self::Error> {
        let integer = self
            .to_u64()
            .map(Unsigned::from)
            .ok_or_else(|| std::io::Error::new(ErrorKind::InvalidData, IncorrectByteLength))?;
        Ok(Cow::Owned(integer.to_variable_vec()?))
    }

    fn from_big_endian_bytes(bytes: &'a [u8]) -> Result<Self, Self::Error> {
        let primitive = u64::decode_variable(bytes)?;
        Self::from_u64(primitive)
            .ok_or_else(|| std::io::Error::new(ErrorKind::InvalidData, UnknownEnumVariant))
    }
}
By implementing Key you can take full control of converting your view keys.
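As a rough illustration of what a hand-written conversion might look like, the sketch below uses a hypothetical, simplified trait named ToyKey. It borrows the two function names from this page but is not BonsaiDb's Key trait: it has no lifetime parameter, no Cow, no LENGTH constant, and returns Option instead of Result. YearMonth is likewise an invented example type.

```rust
/// Hypothetical, simplified stand-in for a key trait (NOT BonsaiDb's `Key`).
trait ToyKey: Sized {
    fn as_big_endian_bytes(&self) -> Vec<u8>;
    fn from_big_endian_bytes(bytes: &[u8]) -> Option<Self>;
}

/// A composite key that should sort by year first, then by month.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct YearMonth {
    year: u16,
    month: u8,
}

impl ToyKey for YearMonth {
    fn as_big_endian_bytes(&self) -> Vec<u8> {
        // Most significant field first, each field in big-endian order, so
        // byte-wise comparison matches the derived `Ord` on the struct.
        let mut bytes = self.year.to_be_bytes().to_vec();
        bytes.push(self.month);
        bytes
    }

    fn from_big_endian_bytes(bytes: &[u8]) -> Option<Self> {
        // Accept exactly three bytes; anything else is rejected.
        match *bytes {
            [high, low, month] => Some(Self {
                year: u16::from_be_bytes([high, low]),
                month,
            }),
            _ => None,
        }
    }
}

fn main() {
    let earlier = YearMonth { year: 2021, month: 12 };
    let later = YearMonth { year: 2022, month: 1 };
    // The encoded bytes preserve the ordering of the original values.
    println!("{}", earlier.as_big_endian_bytes() < later.as_big_endian_bytes());
    // Decoding the encoded bytes yields the original key.
    println!("{:?}", YearMonth::from_big_endian_bytes(&later.as_big_endian_bytes()));
}
```

The property to preserve in any real implementation is the same one shown here: comparing two encoded keys byte by byte must give the same result as comparing the original values.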