gpu: nvgpu: Revamp semaphore support

Revamp the support the nvgpu driver has for semaphores.

The original problem with nvgpu's semaphore support is that it required
a SW based wait for every semaphore release. This was because a new
semaphore was created for every fence that
gk20a_channel_semaphore_wait_fd() waited on. This semaphore would then
get released by SW when the fence signaled. This meant that for every
release there was necessarily a sync_fence_wait_async() call which
could block. The latency of this SW wait was enough to cause massive
degradation in performance.

To fix this a fast path was implemented. When a fence that is backed by
a GPU semaphore is passed to gk20a_channel_semaphore_wait_fd(), a
semaphore acquire is used directly to block the GPU. A
sync_fence_wait_async() is no longer performed, nor is an extra
semaphore created.

To implement this fast path the semaphore memory had to be shared
between channels. Previously, since a new semaphore was created every
time through gk20a_channel_semaphore_wait_fd(), the address space a
semaphore was mapped into was irrelevant. However, when using the fast
path a semaphore may be released in one address space but acquired in
another.

Sharing the semaphore memory was done by making a fixed GPU mapping in
all channels. This mapping points to the semaphore memory (the
so-called semaphore sea). This global fixed mapping is read-only to
make sure no semaphores can be incremented (i.e. released) by a
malicious channel. Each channel then gets a RW mapping of its own
semaphore. This way a channel may only acquire other channels'
semaphores but may both acquire and release its own semaphore.

The gk20a fence code was updated to allow introspection of the GPU
backed fences. This allows detection of when the fast path can be
taken. If the fast path cannot be used (for example when a fence is
sync-pt backed) the original slow path is still present. This gets used
when the GPU needs to wait on an event from something which only
understands how to use sync-pts.

Bug 1732449
JIRA DNVGPU-12

Change-Id: Ic0fea74994da5819a771deac726bb0d47a33c2de
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1133792
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
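
The choice between the fast and slow path described above can be sketched as follows. This is an illustrative model only, not the driver's code: the types `fence_backing`, `wait_path`, and `fence_model`, and the function `choose_wait_path()`, are hypothetical names introduced here, and the real introspection done by the gk20a fence code is not shown in this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of what a fence is backed by (assumption). */
enum fence_backing {
	FENCE_BACKED_BY_GPU_SEMAPHORE,
	FENCE_BACKED_BY_SYNCPT,
};

/* Hypothetical names for the two wait strategies in the message. */
enum wait_path {
	WAIT_FAST_GPU_ACQUIRE, /* GPU blocks on the existing semaphore */
	WAIT_SLOW_SW_RELEASE,  /* new semaphore, released by SW later */
};

struct fence_model {
	enum fence_backing backing;
};

/*
 * Pick the wait strategy: a GPU-semaphore-backed fence can be
 * acquired directly by the GPU; anything else falls back to the
 * original SW-released semaphore.
 */
static enum wait_path choose_wait_path(const struct fence_model *f)
{
	assert(f != NULL);

	if (f->backing == FENCE_BACKED_BY_GPU_SEMAPHORE)
		return WAIT_FAST_GPU_ACQUIRE;

	return WAIT_SLOW_SW_RELEASE;
}
```

In this model only the slow path would involve a blocking SW wait, which is the latency the commit message says the fast path avoids.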
author: Alex Waterman <alexw@nvidia.com> 2016-04-27 15:27:36 -0400
committer: Terje Bergstrom <tbergstrom@nvidia.com> 2016-06-28 18:49:11 -0400
commit: dfd5ec53fcce4ebae27f78242e6b788350337095 (patch)
tree: 073ea380b9ee4734391d381745f57600c3525be5 /drivers/gpu/nvgpu/gk20a/channel_gk20a.h
parent: b30990ea6db564e885d5aee7a1a5ea87a1e5e8ee (diff)
1 files changed, 2 insertions, 0 deletions
diff --git a/drivers/gpu/nvgpu/gk20a/channel_gk20a.h b/drivers/gpu/nvgpu/gk20a/channel_gk20a.h
index acd272b4..c5a1bd24 100644
--- a/drivers/gpu/nvgpu/gk20a/channel_gk20a.h
+++ b/drivers/gpu/nvgpu/gk20a/channel_gk20a.h
@@ -108,6 +108,8 @@ struct channel_gk20a {
        atomic_t ref_count;
        wait_queue_head_t ref_count_dec_wq;
 
+        struct gk20a_semaphore_int *hw_sema;
+
        int hw_chid;
        bool wdt_enabled;
        bool bound;
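
To illustrate the access rule from the commit message (read-only view of the whole semaphore sea, read-write view of a channel's own semaphore), here is a small host-side model. It is a sketch under stated assumptions: `sema_sea_model`, `channel_model`, `MODEL_NUM_CHANNELS`, and the helper functions are hypothetical and do not reflect the layout of `struct gk20a_semaphore_int` or the real GPU mappings. The acquire is modelled as a non-blocking "has the value reached this threshold" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NUM_CHANNELS 4 /* hypothetical size, for illustration */

/* Hypothetical stand-in for the shared semaphore memory. */
struct sema_sea_model {
	uint32_t value[MODEL_NUM_CHANNELS];
};

/* Hypothetical stand-in for a channel's two views of that memory. */
struct channel_model {
	int hw_chid;
	uint32_t *hw_sema;                   /* RW: own semaphore only */
	const struct sema_sea_model *sea_ro; /* RO: every semaphore */
};

static void channel_model_init(struct channel_model *ch,
			       struct sema_sea_model *sea, int chid)
{
	assert(chid >= 0 && chid < MODEL_NUM_CHANNELS);
	ch->hw_chid = chid;
	ch->hw_sema = &sea->value[chid];
	ch->sea_ro = sea;
}

/* Any channel may read (acquire on) any semaphore via the RO view. */
static bool channel_model_acquire_ready(const struct channel_model *ch,
					int target_chid, uint32_t threshold)
{
	assert(target_chid >= 0 && target_chid < MODEL_NUM_CHANNELS);
	return ch->sea_ro->value[target_chid] >= threshold;
}

/* A channel can only increment (release) through its own RW pointer. */
static void channel_model_release(struct channel_model *ch)
{
	(*ch->hw_sema)++;
}
```

Because the only writable pointer a channel holds refers to its own slot, a release by one channel can be observed by others but never performed on their behalf, which mirrors the protection the read-only global mapping is meant to give.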