View

A View is a map/reduce-powered method of quickly accessing information inside of a Collection. A View can only belong to one Collection.

Views define two important associated types: a Key type and a Value type. You can think of these as the equivalent entries in a map/dictionary-like collection that supports more than one entry for each Key. The Key is used to filter the View's results, and the Value is used by your application or the reduce() function.

Views are a powerful, yet abstract concept. Let's look at a concrete example: blog posts with categories.

#[derive(Serialize, Deserialize, Debug, Collection)]
#[collection(name = "blog-post", views = [BlogPostsByCategory])]
pub struct BlogPost {
    pub title: String,
    pub body: String,
    pub category: Option<String>,
}

Let's insert this data for these examples:

    BlogPost {
        title: String::from("New version of BonsaiDb released"),
        body: String::from("..."),
        category: Some(String::from("Rust")),
    }
    .push_into(&db)
    .await?;

    BlogPost {
        title: String::from("New Rust version released"),
        body: String::from("..."),
        category: Some(String::from("Rust")),
    }
    .push_into(&db)
    .await?;

    BlogPost {
        title: String::from("Check out this great cinnamon roll recipe"),
        body: String::from("..."),
        category: Some(String::from("Cooking")),
    }
    .push_into(&db)
    .await?;

All examples on this page are available in their full form in the repository at book/book-examples/tests.

While category should be an enum, let's first explore using String and upgrade to an enum at the end (it requires one additional step). Let's implement a View that will allow users to find blog posts by their category as well as count the number of posts in each category.

#[derive(Debug, Clone, View)]
#[view(collection = BlogPost, key = Option<String>, value = u32, name = "by-category")]
pub struct BlogPostsByCategory;

impl ViewSchema for BlogPostsByCategory {
    type View = Self;

    fn map(&self, document: &BorrowedDocument<'_>) -> ViewMapResult<Self::View> {
        let post = BlogPost::document_contents(document)?;
        document.header.emit_key_and_value(post.category, 1)
    }

    fn reduce(
        &self,
        mappings: &[ViewMappedValue<Self::View>],
        _rereduce: bool,
    ) -> ReduceResult<Self::View> {
        Ok(mappings.iter().map(|mapping| mapping.value).sum())
    }
}

The two traits being implemented are View and ViewSchema. These traits are designed to allow keeping the View implementation in a shared code library that is used by both client-side and server-side code, while keeping the ViewSchema implementation in the server executable only.

Views for SerializedCollection

For users who are using SerializedCollection, CollectionViewSchema can be implemented instead of ViewSchema. The only difference between the two is that the map() function takes a CollectionDocument instead of a BorrowedDocument.

Value Serialization

For views to function, the Value type must able to be serialized and deserialized from storage. To accomplish this, all views must implement the SerializedView trait. For Serde-compatible data structures, DefaultSerializedView is an empty trait that can be implemented instead to provide the default serialization that BonsaiDb recommends.

Map

The first line of the map function calls SerializedCollection::document_contents() to deserialize the stored BlogPost. The second line returns an emitted Key and Value -- in our case a clone of the post's category and the value 1_u32. With the map function, we're able to use query() and query_with_docs():

    let rust_posts = db
        .view::<BlogPostsByCategory>()
        .with_key(Some(String::from("Rust")))
        .query_with_docs()
        .await?;
    for mapping in &rust_posts {
        let post = BlogPost::document_contents(mapping.document)?;
        println!(
            "Retrieved post #{} \"{}\"",
            mapping.document.header.id, post.title
        );
    }

The above snippet queries the Database for all documents in the BlogPost Collection that emitted a Key of Some("Rust").

If you're using a SerializedCollection, you can use query_with_collection_docs() to have the deserialization done automatically for you:

    let rust_posts = db
        .view::<BlogPostsByCategory>()
        .with_key(Some(String::from("Rust")))
        .query_with_collection_docs()
        .await?;
    for mapping in &rust_posts {
        println!(
            "Retrieved post #{} \"{}\"",
            mapping.document.header.id, mapping.document.contents.title
        );
    }

Reduce

The second function to learn about is the reduce() function. It is responsible for turning an array of Key/Value pairs into a single Value. In some cases, BonsaiDb might need to call reduce() with values that have already been reduced one time. If this is the case, rereduce is set to true.

In this example, we're using the built-in Iterator::sum() function to turn our Value of 1_u32 into a single u32 representing the total number of documents.

    let rust_post_count = db
        .view::<BlogPostsByCategory>()
        .with_key(Some(String::from("Rust")))
        .reduce()
        .await?;
    assert_eq!(rust_post_count, 2);

Changing an exising view

If you have data stored in a view, but want to update the view to store data differently, implement ViewSchema::version() and return a unique number. When BonsaiDb checks the view's integrity, it will notice that there is a version mis-match and automatically re-index the view.

There is no mechanism to access the data until this operation is complete.

Understanding Re-reduce

Let's examine this data set:

Document IDBlogPost Category
1Some("Rust")
2Some("Rust")
3Some("Cooking")
4None

When updating views, each view entry is reduced and the value is cached. These are the view entries:

View Entry IDReduced Value
Some("Rust")2
Some("Cooking")1
None1

When a reduce query is issued for a single key, the value can be returned without further processing. But, if the reduce query matches multiple keys, the View's reduce() function will be called with the already reduced values with rereduce set to true. For example, retrieving the total count of blog posts:

    let total_post_count = db.view::<BlogPostsByCategory>().reduce().await?;
    assert_eq!(total_post_count, 3);

Once BonsaiDb has gathered each of the key's reduced values, it needs to further reduce that list into a single value. To accomplish this, the View's reduce() function to be invoked with rereduce set to true, and with mappings containing:

KeyValue
Some("Rust")2
Some("Cooking")1
None1

This produces a final value of 4.

How does BonsaiDb make this efficient?

When saving Documents, BonsaiDb does not immediately update related views. It instead notes what documents have been updated since the last time the View was indexed.

When a View is accessed, the queries include an AccessPolicy. If you aren't overriding it, UpdateBefore is used. This means that when the query is evaluated, BonsaiDb will first check if the index is out of date due to any updated data. If it is, it will update the View before evaluating the query.

If you're wanting to get results quickly and are willing to accept data that might not be updated, the access policies UpdateAfter and NoUpdate can be used depending on your needs.

If multiple simulataneous queries are being evaluted for the same View and the View is outdated, BonsaiDb ensures that only a single view indexer will execute while both queries wait for it to complete.

Using arbitrary types as a View Key

In our previous example, we used String for the Key type. The reason is important: Keys must be sortable by our underlying storage engine, which means special care must be taken. Most serialization types do not guarantee binary sort order. Instead, BonsaiDb exposes the Key trait. On that documentation page, you can see that BonsaiDb implements Key for many built-in types.

Using an enum as a View Key

The easiest way to expose an enum is to derive num_traits::FromPrimitive and num_traits::ToPrimitive using num-derive, and add an impl EnumKey line:

#[derive(
    Serialize, Deserialize, Debug, num_derive::FromPrimitive, num_derive::ToPrimitive, Clone,
)]
pub enum Category {
    Rust,
    Cooking,
}

impl EnumKey for Category {}

The View code remains unchanged, although the associated Key type can now be set to Option<Category>. The queries can now use the enum instead of a String:

    let rust_post_count = db
        .view::<BlogPostsByCategory>()
        .with_key(Some(Category::Rust))
        .reduce()
        .await?;

BonsaiDb will convert the enum to a u64 and use that value as the Key. A u64 was chosen to ensure fairly wide compatibility even with some extreme usages of bitmasks. If you wish to customize this behavior, you can implement Key directly.

Implementing the Key trait

The Key trait declares two functions: as_ord_bytes() and from_ord_bytes. The intention is to convert the type to bytes using a network byte order for numerical types, and for non-numerical types, the bytes need to be stored in binary-sortable order.

Here is how BonsaiDb implements Key for EnumKey:

impl<'a, T> Key<'a> for T
where
    T: EnumKey,
{
    type Error = std::io::Error;
    const LENGTH: Option<usize> = None;

    fn as_ord_bytes(&'a self) -> Result<Cow<'a, [u8]>, Self::Error> {
        let integer = self
            .to_u64()
            .map(Unsigned::from)
            .ok_or_else(|| std::io::Error::new(ErrorKind::InvalidData, IncorrectByteLength))?;
        Ok(Cow::Owned(integer.to_variable_vec()?))
    }

    fn from_ord_bytes(bytes: &'a [u8]) -> Result<Self, Self::Error> {
        let primitive = u64::decode_variable(bytes)?;
        Self::from_u64(primitive)
            .ok_or_else(|| std::io::Error::new(ErrorKind::InvalidData, UnknownEnumVariant))
    }
}

By implementing Key you can take full control of converting your view keys.