Michele Caini 3 years ago
parent
commit
3a85d0f179
1 changed file with 253 additions and 0 deletions

+ 253 - 0
docs/md/graph.md

@@ -0,0 +1,253 @@
+# Crash Course: graph
+
+<!--
+@cond TURN_OFF_DOXYGEN
+-->
+# Table of Contents
+
+* [Introduction](#introduction)
+* [Data structures](#data-structures)
+  * [Adjacency matrix](#adjacency-matrix)
+* [Flow builder](#flow-builder)
+  * [Tasks and resources](#tasks-and-resources)
+  * [Fake resources and order of execution](#fake-resources-and-order-of-execution)
+  * [Sync points](#sync-points)
+  * [Execution graph](#execution-graph)
+
+<!--
+@endcond TURN_OFF_DOXYGEN
+-->
+
+# Introduction
+
+`EnTT` doesn't aim to offer everything one needs to work with graphs. Anyone
+looking for a complete graph library in the _graph_ submodule will therefore be
+disappointed.<br/>
+Rather, this submodule is minimal and contains only the data structures and
+algorithms strictly necessary for the development of some tools, such as the
+_flow builder_.
+
+# Data structures
+
+As anticipated in the introduction, the aim isn't to offer all possible data
+structures suitable for representing and working with graphs. Many will likely
+be added or refined over time. However, I want to discourage anyone from
+expecting a tight schedule on the subject.<br/>
+The data structures presented in this section are mainly useful for the
+development and support of some tools which are also part of the same submodule.
+
+## Adjacency matrix
+
+The adjacency matrix is (for now) designed to represent a directed graph.<br/>
+The interface deviates slightly from the typical double indexing of C and offers
+an API that is perhaps more familiar to a C++ programmer. Therefore, elements
+are accessed and modified via the `contains`, `insert` and `erase` functions
+rather than a double call to `operator[]`:
+
+```cpp
+if(adjacency_matrix.contains(0u, 1u)) {
+	adjacency_matrix.erase(0u, 1u);
+} else {
+	adjacency_matrix.insert(0u, 1u);
+}
+```
+
+Both `insert` and `erase` are idempotent functions: they have no effect if the
+element already exists or has already been deleted, respectively.<br/>
+The first one returns an `std::pair` containing the iterator to the element and
+a boolean value indicating whether the element has been inserted or was already
+present. The second one instead returns the number of deleted elements (0 or 1).
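
As a rough illustration of these return conventions, the following sketch uses only the standard library. The `tiny_matrix` type is hypothetical and is not `EnTT`'s implementation: it assumes a dense row-major table of flags, and its `insert` returns only the boolean half of the pair for brevity.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a directed adjacency matrix; not EnTT's class.
class tiny_matrix {
public:
    explicit tiny_matrix(std::size_t vertices)
        : count{vertices}, cells(vertices * vertices, false) {}

    // True if the edge lhs -> rhs is present.
    bool contains(std::size_t lhs, std::size_t rhs) const {
        return cells[lhs * count + rhs];
    }

    // Returns true only when the edge was actually added (idempotent).
    bool insert(std::size_t lhs, std::size_t rhs) {
        const bool added = !cells[lhs * count + rhs];
        cells[lhs * count + rhs] = true;
        return added;
    }

    // Returns the number of removed edges, 0 or 1 (idempotent).
    std::size_t erase(std::size_t lhs, std::size_t rhs) {
        const std::size_t removed = cells[lhs * count + rhs] ? 1u : 0u;
        cells[lhs * count + rhs] = false;
        return removed;
    }

private:
    std::size_t count;       // number of vertices
    std::vector<bool> cells; // row-major edge flags
};
```

Calling `insert` or `erase` twice with the same pair of vertices changes the stored state only the first time, and the return value is what tells the two cases apart.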
+
+An adjacency matrix must be initialized with the number of elements (vertices)
+when constructing it but can also be resized later using the `resize` function:
+
+```cpp
+entt::adjacency_matrix adjacency_matrix{3u};
+```
+
+To visit all vertices, the class offers a function named `vertices` that returns
+an iterable object suitable for the purpose:
+
+```cpp
+for(auto &&vertex: adjacency_matrix.vertices()) {
+	// ...
+}
+```
+
+Note that the same result can be obtained with the following snippet, since the
+vertices are unsigned integral values:
+
+```cpp
+for(std::size_t pos{}, last = adjacency_matrix.size(); pos < last; ++pos) {
+	// ...
+}
+```
+
+As for visiting the edges, a few functions are available.<br/>
+When the purpose is to visit all the edges of a given adjacency matrix, the
+`edges` function returns an iterable object that can be used to get them as
+pairs of vertices:
+
+```cpp
+for(auto [lhs, rhs]: adjacency_matrix.edges()) {
+	// ...
+}
+```
+
+On the other hand, if the goal is to visit all the in- or out-edges of a given
+vertex, the `in_edges` and `out_edges` functions are meant for that:
+
+```cpp
+for(auto [lhs, rhs]: adjacency_matrix.out_edges(3u)) {
+	// ...
+}
+```
+
+As might be expected, these functions take the vertex to visit (that is, the
+one to return the in- or out-edges for) as an argument.<br/>
+Finally, the adjacency matrix is an allocator-aware container and offers most of
+the functionality one would expect from this type of container, such as `clear`
+or `get_allocator` and so on.
+
+# Flow builder
+
+A flow builder is used to create execution graphs from tasks and resources.<br/>
+The implementation is as generic as possible and doesn't bind to any other part
+of the library, so it can be used independently of everything else to create an
+execution graph for one's own purposes.
+
+This class is designed as a sort of _state machine_: a specific task is bound
+to it, and then the resources that the task accesses in read-only or read-write
+mode are specified.<br/>
+Most of the functions in the API also return the flow builder itself, as is
+customary for builder classes.
+
+Once all tasks have been registered and resources assigned to them, an execution
+graph in the form of an adjacency matrix is returned to the user.<br/>
+This graph contains all the tasks assigned to the flow builder in the form of
+_vertices_. The _vertex_ itself can be used as an index to get the identifier
+passed during registration.
+
+## Tasks and resources
+
+Although these terms are used extensively in the documentation, the flow builder
+has no real concept of tasks and resources.<br/>
+This class works mainly with _identifiers_, that is, values of type `id_type`.
+In other words, both tasks and resources are identified by integral values.<br/>
+This avoids coupling the class itself to the rest of the library or to any
+particular data structure. On the other hand, it requires the user to keep track
+of the association between identifiers and operations or actual data.
+
+Once a flow builder has been created (which requires no constructor arguments),
+the first thing to do is to bind a task. This tells the builder which task
+intends to consume the resources that are specified immediately after:
+
+```cpp
+entt::flow builder{};
+builder.bind("task_1"_hs);
+```
+
+Note that the example uses the `EnTT` hashed string to generate an identifier
+for the task.<br/>
+Indeed, the use of `id_type` as an identifier type is not by accident. In fact,
+it matches well with the internal hashed string class. Moreover, it's also the
+same type returned by the hash function of the internal RTTI system, in case the
+user wants to rely on that.<br/>
+However, being an integral value, it leaves the user full freedom to rely on
+their own tools if necessary.
+
+Once a task has been associated with the flow builder, it can be assigned
+read-only or read-write resources, as appropriate:
+
+```cpp
+builder
+    .bind("task_1"_hs)
+        .ro("resource_1"_hs)
+        .ro("resource_2"_hs)
+    .bind("task_2"_hs)
+        .rw("resource_2"_hs);
+
+As mentioned, many functions return the builder itself and it's therefore easy
+to concatenate the different calls.<br/>
+Resources are also identified by numeric values of type `id_type`. As above,
+the choice is not accidental: it goes well with the tools offered by the
+library while leaving room for maximum flexibility.
+
+Finally, both the `ro` and `rw` functions also offer an overload that accepts a
+pair of iterators, so that one can pass a range of resources in one go.
+
+## Fake resources and order of execution
+
+The flow builder doesn't offer the ability to specify that a task should execute
+before or after another task.<br/>
+In fact, the order of _registration_ on the resources also determines the order
+in which the tasks are processed during the generation of the execution graph.
+
+However, there is a way to force the execution order of two tasks.<br/>
+Briefly, since accessing a resource in opposite modes requires sequential rather
+than parallel scheduling, it's possible to make use of fake resources to force
+the order of execution:
+
+```cpp
+builder
+    .bind("task_1"_hs)
+        .ro("resource_1"_hs)
+        .rw("fake"_hs)
+    .bind("task_2"_hs)
+        .ro("resource_2"_hs)
+        .ro("fake"_hs)
+    .bind("task_3"_hs)
+        .ro("resource_2"_hs)
+        .ro("fake"_hs);
+```
+
+This snippet forces the execution of `task_2` and `task_3` **after** `task_1`.
+This is because `task_1` sets a read-write requirement on a fake resource that
+the other tasks also want to access in read-only mode.<br/>
+Similarly, it's possible to force a task to run after a certain group:
+
+```cpp
+builder
+    .bind("task_1"_hs)
+        .ro("resource_1"_hs)
+        .ro("fake"_hs)
+    .bind("task_2"_hs)
+        .ro("resource_1"_hs)
+        .ro("fake"_hs)
+    .bind("task_3"_hs)
+        .ro("resource_2"_hs)
+        .rw("fake"_hs);
+```
+
+In this case, since there are a number of tasks that want to read a specific
+resource, they will do so in parallel, forcing `task_3` to run after all the
+other tasks.
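
The rule behind fake resources can be sketched with the standard library alone. The model below is a deliberate simplification and an assumption of this example, not the algorithm used by `EnTT`: the `access` type and the `conflicts` function are hypothetical names, tasks are numbered by bind order, and two tasks are serialized whenever they touch the same resource and at least one of them writes to it.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One registration: the task with index `task` (bind order) accesses `resource`.
struct access {
    std::size_t task;
    std::string resource;
    bool writes; // true for read-write, false for read-only
};

// Returns pairs (earlier, later) of tasks that cannot run in parallel.
// Simplified model for illustration only; not EnTT's actual algorithm.
std::vector<std::pair<std::size_t, std::size_t>> conflicts(const std::vector<access> &accesses) {
    std::vector<std::pair<std::size_t, std::size_t>> result;

    for(std::size_t i = 0; i < accesses.size(); ++i) {
        for(std::size_t j = i + 1; j < accesses.size(); ++j) {
            const auto &lhs = accesses[i];
            const auto &rhs = accesses[j];

            // Same resource, different tasks, at least one writer: serialize.
            if(lhs.task != rhs.task && lhs.resource == rhs.resource && (lhs.writes || rhs.writes)) {
                result.emplace_back(lhs.task, rhs.task);
            }
        }
    }

    return result;
}
```

Feeding this function the accesses of the last snippet (two readers of `fake` followed by one writer) yields a constraint from each reader to the writer and none between the readers, which matches the behavior described above. A pair may appear more than once if two tasks conflict on several resources.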
+
+## Sync points
+
+To be done. Coming soon.
+
+## Execution graph
+
+Once both the resources and their consumers have been properly registered, the
+purpose of this tool is to generate an execution graph that takes into account
+all specified constraints to return the best scheduling for the vertices:
+
+```cpp
+entt::adjacency_matrix graph = builder.graph();
+```
+
+The search for the main vertices, that is, those without in-edges, is usually
+the first thing required:
+
+```cpp
+for(auto &&vertex: graph.vertices()) {
+	if(auto in_edges = graph.in_edges(vertex); in_edges.begin() == in_edges.end()) {
+		// ...
+	}
+}
+```
+
+Starting from them and using the other functions appropriately (such as
+`out_edges` to retrieve the children of a given task or `edges` to access their
+identifiers), it's possible to instantiate an execution graph.
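
One possible way to turn such a graph into a concrete execution order is Kahn's algorithm, sketched here on a plain edge list rather than on `entt::adjacency_matrix`. The `edge` alias and the `schedule` function are hypothetical names introduced for this example, the edges are assumed to be pairs of vertices like those described for `edges`, and the graph is assumed to be acyclic.

```cpp
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

using edge = std::pair<std::size_t, std::size_t>; // (from, to)

// Orders the vertices so that each one appears after all of its parents.
// Kahn's algorithm on a plain edge list; assumes the graph is acyclic.
std::vector<std::size_t> schedule(std::size_t vertices, const std::vector<edge> &edges) {
    std::vector<std::size_t> pending(vertices, 0u); // unmet in-edges per vertex

    for(const auto &[from, to]: edges) {
        ++pending[to];
    }

    std::queue<std::size_t> ready;

    for(std::size_t vertex = 0; vertex < vertices; ++vertex) {
        if(pending[vertex] == 0u) {
            ready.push(vertex); // a main vertex: no in-edges
        }
    }

    std::vector<std::size_t> order;

    while(!ready.empty()) {
        const auto current = ready.front();
        ready.pop();
        order.push_back(current);

        // Releasing the out-edges may make children ready to run.
        for(const auto &[from, to]: edges) {
            if(from == current && --pending[to] == 0u) {
                ready.push(to);
            }
        }
    }

    return order;
}
```

Vertices that sit in the queue at the same time have no constraints between them, so a real scheduler could dispatch them in parallel instead of one after the other.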