author Alex Waterman <alexw@nvidia.com> 2016-05-04 13:07:36 -0400
committer Terje Bergstrom <tbergstrom@nvidia.com> 2016-05-06 15:13:52 -0400
commit 70d531388205852865d48469cfbd9d0c996acd53
tree 7b2db040485e844e4015e4dfc8a970b14278fbee /drivers
parent 1fc23d1280d2777c1a32544e787f257769cf8834
gpu: nvgpu: update priv_cmdbuff computation
Update the priv_cmdbuff computation to take into account the amount of
memory semaphores take. Since semaphores always require more memory than
sync-pts the sync-pt computation has been dropped.

Bug 1732449
JIRA DNVGPU-12

Change-Id: Ic05c26b4d1ed9cbd03d3239655c4607bb418396c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1141420
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
Diffstat (limited to 'drivers')
-rw-r--r-- drivers/gpu/nvgpu/gk20a/channel_gk20a.c | 30
1 files changed, 19 insertions, 11 deletions
diff --git a/drivers/gpu/nvgpu/gk20a/channel_gk20a.c b/drivers/gpu/nvgpu/gk20a/channel_gk20a.c
index 697861e2..0d7a6bec 100644
--- a/drivers/gpu/nvgpu/gk20a/channel_gk20a.c
+++ b/drivers/gpu/nvgpu/gk20a/channel_gk20a.c
@@ -1278,17 +1278,25 @@ static int channel_gk20a_alloc_priv_cmdbuf(struct channel_gk20a *c)
 	u32 size;
 	int err = 0;
 
-	/* Kernel can insert gpfifos before and after user gpfifos.
-	   Before user gpfifos, kernel inserts fence_wait, which takes
-	   syncpoint_a (2 dwords) + syncpoint_b (2 dwords) = 4 dwords.
-	   After user gpfifos, kernel inserts fence_get, which takes
-	   wfi (2 dwords) + syncpoint_a (2 dwords) + syncpoint_b (2 dwords)
-	   = 6 dwords.
-	   Worse case if kernel adds both of them for every user gpfifo,
-	   max size of priv_cmdbuf is :
-	   (gpfifo entry number * (2 / 3) * (4 + 6) * 4 bytes */
-	size = roundup_pow_of_two(
-		c->gpfifo.entry_num * 2 * 12 * sizeof(u32) / 3);
+	/*
+	 * Compute the amount of priv_cmdbuf space we need. In general the worst
+	 * case is the kernel inserts both a semaphore pre-fence and post-fence.
+	 * Any sync-pt fences will take less memory so we can ignore them for
+	 * now.
+	 *
+	 * A semaphore ACQ (fence-wait) is 8 dwords: semaphore_a, semaphore_b,
+	 * semaphore_c, and semaphore_d. A semaphore INCR (fence-get) will be 10
+	 * dwords: all the same as an ACQ plus a non-stalling intr which is
+	 * another 2 dwords.
+	 *
+	 * Lastly the number of gpfifo entries per channel is fixed so at most
+	 * we can use 2/3rds of the gpfifo entries (1 pre-fence entry, one
+	 * userspace entry, and one post-fence entry). Thus the computation is:
+	 *
+	 * (gpfifo entry number * (2 / 3) * (8 + 10) * 4 bytes.
+	 */
+	size = roundup_pow_of_two(c->gpfifo.entry_num *
+		2 * 18 * sizeof(u32) / 3);
 
 	err = gk20a_gmmu_alloc_map(ch_vm, size, &q->mem);
 	if (err) {
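
To make the arithmetic in the new comment concrete, the following is a minimal userspace sketch of the size expression from the hunk above. It is not the driver code: `roundup_pow_of_two_sketch()` is a stand-in for the kernel's `roundup_pow_of_two()`, and `priv_cmdbuf_size_sketch()` and its parameter names are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's roundup_pow_of_two().
 * Assumes n is nonzero and small enough that the shift cannot overflow.
 */
static size_t roundup_pow_of_two_sketch(size_t n)
{
	size_t p = 1;

	assert(n != 0);
	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Hypothetical helper mirroring the expression in the diff:
 *   entry_num * 2 * dwords_per_pair * sizeof(u32) / 3
 * rounded up to a power of two. dwords_per_pair is the combined
 * pre-fence plus post-fence cost in dwords (18 after this change,
 * 12 in the code it replaces).
 */
static size_t priv_cmdbuf_size_sketch(size_t entry_num, size_t dwords_per_pair)
{
	return roundup_pow_of_two_sketch(
		entry_num * 2 * dwords_per_pair * sizeof(uint32_t) / 3);
}
```

For an assumed channel with 1024 gpfifo entries (a value chosen only for this example), `priv_cmdbuf_size_sketch(1024, 12)` gives 32768 bytes and `priv_cmdbuf_size_sketch(1024, 18)` gives 65536 bytes: the raw value rises from 32768 to 49152, and the power-of-two rounding then doubles the allocation.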