nvgpu.git - Tegra GPU Driver. Originally from nv-tegra.nvidia.com/linux-nvgpu.git.

	Commit message	Author	Age
*	gpu: nvgpu: update pwm source enum & VFE entry	Mahantesh Kumbar	2016-10-30
	JIRA DNVGPU-123 Change-Id: Ia28db5d645aa431f11dc8720bf1d08e6d756e20f Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/1227670 (cherry picked from commit 2c7f89ceef3f9173fefa44b1a959345744e66536) Reviewed-on: http://git-master/r/1244659 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: voltage changes	Mahantesh Kumbar	2016-10-30
	Add voltage interface and ctrl defines. JIRA DNVGPU-122 Change-Id: Ia1a4c655c3c5faa638cafcdc75bdfb0e3c3be54f Signed-off-by: Mahantesh Kumbar <mkumbar@nvidia.com> Reviewed-on: http://git-master/r/1222775 (cherry picked from commit 46ff4d54d3cc02d9f039091f09eea09a5d6c22ce) Reviewed-on: http://git-master/r/1244654 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: gp10x: update pmu revision	Vijayakumar	2016-10-27
	JIRA DNVGPU-70 Change-Id: I927240432c4e27c01912d073ad9725f0c526288c Signed-off-by: Vijayakumar <vsubbu@nvidia.com> Reviewed-on: http://git-master/r/1239804 Reviewed-on: http://git-master/r/1242203 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Add SW_THRESHOLD policy support	Lakshmanan M	2016-10-27
	Added SW_THRESHOLD policy support for over power protection. JIRA DNVGPU-70 Change-Id: I021f47f234d42be15ddbfd02a22e9299fd486636 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1233051 (cherry picked from commit 301e0ac123a7a65a7f83e5615f3a89e55253a0bd) Reviewed-on: http://git-master/r/1241958 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: Add pmgr support	Lakshmanan M	2016-10-27
	This CL covers the following: 1) Power Sensor Table parsing. 2) Power Topology Table parsing. 3) Add a debugfs interface to get the current power (mW), current (mA) and voltage (uV) information from the PMU. 4) Power Policy Table parsing. 5) Implement the PMU boardobj interface for the pmgr module. 6) Over current protection. JIRA DNVGPU-47 Change-Id: I620f4470aa704f1cc920e03947831440fbb0eb05 Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1217176 (cherry picked from commit ed56743c2ac8dc325c75f85a82271d2d5ed8d96a) Reviewed-on: http://git-master/r/1241952 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	include: nvgpu: change alloc_gpfifo ioctl #	Sachit Kadle	2016-10-27
	This moves the newly added ioctl, NVGPU_IOCTL_CHANNEL_ALLOC_GPFIFO_EX, to the end of the ioctl list. This ensures that nvrm_gpu can correctly determine whether the kernel has this IOCTL. Bug 1795076 Change-Id: Ic42d88142809e71b5d7a4328388338c937252b8b Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1242442 GVS: Gerrit_Virtual_Submit Reviewed-by: David Martinez Nieto <dmartineznie@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix semaphore wakeup logic	Sachit Kadle	2016-10-26
	Currently, when we receive a semaphore wakeup interrupt, we call the channel_update callback, which schedules deferred job clean-up. For deterministic channels, we don't allow semaphore-backed syncs anyway. That means that for these channels, a semaphore wakeup interrupt must be for a userspace-managed semaphore. In this case, there is no need to call into the channel_update callback, so for deterministic channels we skip it. Bug 1795076 Change-Id: I4cdfecd53144078c5cd4be8a41c5c3b7d74c338e Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1225620 (cherry picked from commit 64a6db0080c3b198ddc2029544f52eb590dc08ff) Reviewed-on: http://git-master/r/1225615 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Remove global debugfs variable	Alex Waterman	2016-10-26
	Remove a global debugfs variable and instead save the allocator debugfs root node in the gk20a struct. Bug 1799159 Change-Id: If4eed34fa24775e962001e34840b334658f2321c Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1225611 (cherry picked from commit 1908fde10bb1fb60ce898ea329f5a441a3e4297a) Reviewed-on: http://git-master/r/1242390 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Move CE cleanup	Alex Waterman	2016-10-26
	Move the CE cleanup to before the FIFO cleanup. Because the CE closes a channel during its cleanup, the FIFO must still be initialized at that point, since the FIFO code maintains the vmalloc()'ed channels. Bug 1816516 Change-Id: Ia7a97059a12a0c2b52368ffe411e597f803e8e6e Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1225613 (cherry picked from commit 707bd2a6d4672c6a7b7a8b2e581ea3a606ed971d) Reviewed-on: http://git-master/r/1240106 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Only cleanup existing semaphore pools	Alex Waterman	2016-10-26
	Not all VMs have semaphore pools made for them even when semaphores are going to be used. Thus only VMs with existing semaphore pools should have their pools cleaned up. Bug 1816516 Change-Id: I07828708faef451f1711f58c0d5b3f8e4d296dd0 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1225612 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> (cherry picked from commit 6cdb7b6650765465dca68dc3c23b3d795ccdafb5) Reviewed-on: http://git-master/r/1240105 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix prealloc resource alloc error handling	Konsta Holtta	2016-10-26
	Only free the per-channel preallocated job-tracking resources on the channel allocation error path if they have actually been allocated. Bug 1795076 Change-Id: I2de90504f1042ce372337b68c5405727b4e4abb4 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1234983 (cherry picked from commit 62cb75c6baa02d0edecd1f81f1b8b80a985fd715) Reviewed-on: http://git-master/r/1238329 GVS: Gerrit_Virtual_Submit Reviewed-by: Sami Kiminki <skiminki@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: Add proper memset size during cleanup	Lakshmanan M	2016-10-25
	This CL covers the following small modifications: 1) Add proper memset size handling during pmu surface cleanup. 2) Reset the pmu surface mem desc pointer after deallocating the memory. JIRA DNVGPU-47 Change-Id: I400f8c4d3f5dc650d4fc6669cef6a1e41a70f4ab Signed-off-by: Lakshmanan M <lm@nvidia.com> Reviewed-on: http://git-master/r/1220100 (cherry picked from commit 1f171b977be51db20c2dfc56b3f6e3dd6b4b9095) Reviewed-on: http://git-master/r/1240881 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: fix compile when CONFIG_PM=n	Timo Alho	2016-10-25
	The nvgpu driver fails to compile when the CONFIG_PM build option is set to 'n'. Fix this by guarding struct gk20a_pm_ops, and the functions pointed to by that struct, with #ifdefs. Bug 1827482 Change-Id: I27f3535e89cc741f79824cdc427ef3572e2779e6 Signed-off-by: Timo Alho <talho@nvidia.com> Reviewed-on: http://git-master/r/1237110 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gm206: fix out of boundary memory access	Richard Zhao	2016-10-21
	Avoid out-of-bounds access when searching for the bit header. JIRA VFND-2826 Change-Id: Icbde7c7e04c35c29f316d8a0ad93c76fcb8fae7a Signed-off-by: Richard Zhao <rizhao@nvidia.com> Reviewed-on: http://git-master/r/1240185 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: make deferred clean-up conditional	Sachit Kadle	2016-10-21
	This change makes the invocation of the deferred job clean-up mechanism conditional. For submissions that require job tracking, deferred clean-up is only required if any of the following conditions are met: 1) Channel's deterministic flag is not set. 2) Rail-gating is enabled. 3) Channel WDT is enabled. 4) Buffer refcounting is enabled. 5) Dependency on Sync Framework. If deferred clean-up is not needed, we clean up a single job tracking resource in the submit path. For deterministic channels, we do not allow deferred clean-up to occur and fail any submits that require it. Bug 1795076 Change-Id: I4021dffe8a71aa58f12db6b58518d3f4021f3313 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1220920 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com> (cherry picked from commit b09f7589d5ad3c496e7350f1ed583a4fe2db574a) Reviewed-on: http://git-master/r/1223941 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
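Illustrative note, not part of the commit: the five conditions listed in this message can be read as a single predicate. The sketch below is a guess at that shape only. The struct, its field names, both function names, and the choice of -EINVAL as the rejection code are assumptions made for illustration and are not taken from the nvgpu sources.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-submit view of the state named in the commit message. */
struct submit_ctx {
	bool deterministic;        /* channel's deterministic flag */
	bool railgating_enabled;
	bool wdt_enabled;          /* channel watchdog timer */
	bool buffer_refcounting;
	bool uses_sync_framework;
};

/* Deferred clean-up is needed if any one of the five conditions holds. */
static bool need_deferred_cleanup(const struct submit_ctx *c)
{
	return !c->deterministic ||
	       c->railgating_enabled ||
	       c->wdt_enabled ||
	       c->buffer_refcounting ||
	       c->uses_sync_framework;
}

/*
 * Deterministic channels must never fall back to deferred clean-up, so a
 * submit that would need it is rejected (error code is an assumption).
 */
static int check_submit(const struct submit_ctx *c)
{
	if (c->deterministic && need_deferred_cleanup(c))
		return -EINVAL;
	return 0;
}
```

With this reading, a deterministic channel passes the check only when railgating, the watchdog, buffer refcounting and sync-framework dependencies are all off.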
*	gpu: nvgpu: Add flag for running preos ucode	Terje Bergstrom	2016-10-20
	Add a per-platform flag, run_preos, which indicates whether to run preos ucode or not. Leave it set to false for all known boards. Bug 1799537 Bug 1815139 Change-Id: I1818970b0f70f636277443d6de199d3683fc565a Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1233410 (cherry picked from commit 8bea05dbfa64af88587edb8927a8ec71c6b0d807) Reviewed-on: http://git-master/r/1239956 GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: fix page alloc slab error condition	Konsta Holtta	2016-10-20
	Return NULL instead of ERR_PTR from __gk20a_alloc_slab to be consistent with __gk20a_alloc_pages, and thus to work with an error check in gk20a_page_alloc in out-of-memory conditions. Bug 1799159 JIRA DNVGPU-100 Change-Id: I8c3c0e121840758c6aba860baac86a38e873e359 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1227730 (cherry picked from commit 209927a6b3bae4fddc2a6a745c1b4b1f46c6675c) Reviewed-on: http://git-master/r/1235192 Reviewed-by: Alex Waterman <alexw@nvidia.com> Tested-by: Alex Waterman <alexw@nvidia.com> GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
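Illustrative note, not part of the commit: the mismatch described here arises because an ERR_PTR-style value is a non-NULL pointer, so a caller that only tests for NULL treats the failure as success. The userspace sketch below re-creates the idea with stand-ins for the kernel's ERR_PTR()/IS_ERR() helpers. Every function name in it is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/IS_ERR() helpers. */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical allocators that always fail, one per error convention. */
static void *alloc_fail_errptr(void)
{
	return err_ptr(-ENOMEM);
}

static void *alloc_fail_null(void)
{
	return NULL;
}

/* A caller that only checks for NULL, as the commit describes. */
static int caller_sees_failure(const void *p)
{
	return p == NULL;
}
```

Only the NULL-returning variant is caught by `caller_sees_failure()`; the ERR_PTR-style value slips through unless the caller also checks `is_err()`.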
*	gpu: nvgpu: Fix coverity problem	Alex Waterman	2016-10-20
	Coverity detected a possible overflow during the left shift. This is likely not a big problem, though, since the number of pages to allocate would have to be greater than 2^32 (that would be 4 TB of memory assuming 4k page size and the literal 1 being a signed int by default). Bug 1799159 Change-Id: Ie1d6522defd13c794eb95aeee8c5c4203db00ebf Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1238632 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
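Illustrative note, not part of the commit: this class of warning usually comes from shifting an `int` literal. A minimal sketch of the usual remedy is to widen the literal before shifting. The helper name `pages_for_order` is hypothetical, and the actual change in the driver may look different.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of pages covered by an allocation of the
 * given order.
 */
static inline uint64_t pages_for_order(unsigned int order)
{
	/*
	 * A plain `1 << order` is evaluated as a signed int and overflows
	 * once order reaches 31. Widening the literal first keeps the shift
	 * well-defined for any order below 64.
	 */
	return (uint64_t)1 << order;
}
```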
*	gpu: nvgpu: fixes for 32-bit compatibility	Sachit Kadle	2016-10-20
	Fixes to the fence framework's usage of allocator APIs to be compatible with 32-bit architectures. Bug 1795076 Change-Id: Ia677f9842c36d482d4e82e9fa09613702f3111b3 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1237904 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: correct gpfifo size calculation	Sachit Kadle	2016-10-20
	This change fixes up the calculation of the number of gpfifo entries to be allocated, depending on the ioctl used: 1) For the legacy ALLOC_GPFIFO ioctl, we preserve the calculation of gpfifo entries within the kernel. 2) For the new ALLOC_GPFIFO_EX ioctl, we assume that userspace has pre-calculated a power-of-2 value. We process this value unmodified and only verify that it is a valid power of 2. Bug 1795076 Change-Id: I8d2ddfdae40b02fe6b81e63dfd8857ad514a3dfd Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1220968 (cherry picked from commit c42396d9836e9b7ec73e0728f0c502b63aff70db) Reviewed-on: http://git-master/r/1223937 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
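Illustrative note, not part of the commit: the two paths described here could be sketched as below. The commit does not say what the legacy in-kernel calculation is, so rounding up to the next power of two is an assumption, as are all function names and the use of 0 to mean "reject".

```c
#include <stdbool.h>
#include <stdint.h>

/* True when n is a non-zero power of two. */
static bool is_pow2_u32(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Round n up to the next power of two. Returns 0 when n is 0 or when the
 * result would not fit in 32 bits.
 */
static uint32_t roundup_pow2_u32(uint32_t n)
{
	uint32_t p = 1;

	if (n == 0 || n > 0x80000000u)
		return 0;
	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Returns the number of gpfifo entries to allocate, or 0 to reject.
 * Legacy path (assumed): the kernel rounds the request up itself.
 * Extended path: the value is used unmodified, but must be a power of two.
 */
static uint32_t gpfifo_entries(uint32_t requested, bool extended_ioctl)
{
	if (extended_ioctl)
		return is_pow2_u32(requested) ? requested : 0;
	return roundup_pow2_u32(requested);
}
```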
*	gpu: nvgpu: update channel max_arg_size	Sachit Kadle	2016-10-20
	Increase channel ioctl max arg size to support ALLOC_GPFIFO_EX ioctl. Bug 1795076 Change-Id: Ifb8e7c564333f4c6dd244d7d85acfee4e029b41b Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1218048 (cherry picked from commit 75c52887a0350433e7334681009f5c7cac3fb6a9) Reviewed-on: http://git-master/r/1223936 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: add deterministic submit flag	Sachit Kadle	2016-10-20
	This change adds a new ioctl flag, NVGPU_SUBMIT_GPFIFO_FLAGS_DETERMINISTIC, which indicates that a gpfifo submission must exhibit deterministic behavior within the kernel. For submissions that require job tracking and also set this flag, we require the channel to have previously pre-allocated job tracking resources. Bug 1795076 Change-Id: I0496a2513c6c683fcda161b32db9e7ee6712d45c Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1210527 (cherry picked from commit 0a36a0ce3a6cbe398931993e742fc928f7b2c0aa) Reviewed-on: http://git-master/r/1223935 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add support for pre-allocated resources	Sachit Kadle	2016-10-20
	Add support for pre-allocation of job tracking resources with a new (extended) ioctl. The goal is to avoid dynamic memory allocation in the submit path. This patch does the following: 1) Introduces a new ioctl, NVGPU_IOCTL_CHANNEL_ALLOC_GPFIFO_EX, which enables pre-allocation of tracking resources per job: a) 2x priv_cmd_entry b) 2x gk20a_fence. 2) Implements a circular ring buffer for job tracking to avoid lock contention between producer (submitter) and consumer (clean-up). Bug 1795076 Change-Id: I6b52e5c575871107ff380f9a5790f440a6969347 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1203300 (cherry picked from commit 9fd270c22b860935dffe244753dabd87454bef39) Reviewed-on: http://git-master/r/1223934 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
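Illustrative note, not part of the commit: a single-producer, single-consumer ring lets the submitter and the clean-up path each own one index, which is one way to avoid a shared lock. The sketch below shows only the index arithmetic over preallocated slots. The type names, the slot count, and the `id` payload are hypothetical, and the memory barriers that real concurrent use would need are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define JOB_RING_SLOTS 8 /* hypothetical; one slot always stays empty */

struct job {
	int id; /* placeholder payload */
};

struct job_ring {
	struct job slots[JOB_RING_SLOTS]; /* preallocated, no malloc on submit */
	size_t put; /* advanced only by the submitter */
	size_t get; /* advanced only by the clean-up side */
};

static bool job_ring_full(const struct job_ring *r)
{
	return (r->put + 1) % JOB_RING_SLOTS == r->get;
}

static bool job_ring_empty(const struct job_ring *r)
{
	return r->put == r->get;
}

/* Producer: claim the next preallocated slot, or NULL if the ring is full. */
static struct job *job_ring_push(struct job_ring *r)
{
	struct job *j;

	if (job_ring_full(r))
		return NULL;
	j = &r->slots[r->put];
	r->put = (r->put + 1) % JOB_RING_SLOTS;
	return j;
}

/* Consumer: retire the oldest job, or NULL if nothing is pending. */
static struct job *job_ring_pop(struct job_ring *r)
{
	struct job *j;

	if (job_ring_empty(r))
		return NULL;
	j = &r->slots[r->get];
	r->get = (r->get + 1) % JOB_RING_SLOTS;
	return j;
}
```

Keeping one slot empty distinguishes "full" from "empty" without a separate counter that both sides would have to update.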
*	gpu: nvgpu: use inplace allocation in sync framework	Sachit Kadle	2016-10-20
	This change is the first of a series of changes to support the usage of pre-allocated job tracking resources in the submit path. With this change, we still maintain a dynamically-allocated joblist, but make the necessary changes in the channel_sync and fence framework to use in-place allocations. Specifically, we: 1) Update channel sync framework routines to take in pre-allocated priv_cmd_entry(s) and gk20a_fence(s) rather than dynamically allocating them themselves. 2) Move allocation of priv_cmd_entry(s) and gk20a_fence(s) to gk20a_submit_prepare_syncs. 3) Modify the fence framework to have separate allocation and init APIs. We expose allocation as a separate API, so the client can allocate the object before passing it into the channel sync framework. 4) Fix clean_up logic in the channel sync framework. Bug 1795076 Change-Id: I96db457683cd207fd029c31c45f548f98055e844 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1206725 (cherry picked from commit 9d196fd10db6c2f934c2a53b1fc0500eb4626624) Reviewed-on: http://git-master/r/1223933 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: clear events from channel/TSG while closing	Deepak Nibade	2016-10-20
	If the user does not close the event fd created on a channel/TSG, the event can stay on the channel/TSG structure and reappear when the channel/TSG is re-opened. This causes a false failure when we try to enable an event on the channel/TSG, since we do not allow enabling the same event twice on the same channel/TSG. Fix this by removing all enabled events from the channel/TSG while closing it. Bug 200243092 Bug 1818654 Change-Id: I9d5ffc89f87cf4c44124f8015c2c2f0587ad2ef4 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1237723 (cherry picked from commit 2737a5c86cf5fbfe8a04f6a87764e8ecb9b30555) Reviewed-on: http://git-master/r/1238266 GVS: Gerrit_Virtual_Submit Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: Require version 30 for device 0x1c35	Terje Bergstrom	2016-10-20
	Raise the minimum VBIOS version for 0x1c35 to 86.06.30.00. Bug 1811880 Change-Id: I35f7511c3346394af45b8347e8c40f8d367bf3e0 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1233401 (cherry picked from commit 9a92ef0f0ccf33c08d48d3a769158b30a0a1496c) Reviewed-on: http://git-master/r/1239429 GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: VBIOS version check	Terje Bergstrom	2016-10-20
	Add a minimum VBIOS version field for each SKU. This requires the gk20a_platform structure to be per SKU. Also set power_on back to false if there was any error in booting the GPU. Bug 1811880 Change-Id: I23ef312f0db7061b31a3d503ded7e41ef45ad6b3 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1227229 (cherry picked from commit 69c9ab4349ec7526a7f8a2fcad01f9128ed4769c) Reviewed-on: http://git-master/r/1239428 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: enable probing of gp106 sku10	Sachit Kadle	2016-10-20
	Enable probing of gp106 sku10 by adding the device id to the known device table. JIRA DNVGPU-72 Change-Id: I2c10914c8510c6081202a374f50ef40371d7d183 Signed-off-by: Sachit Kadle <skadle@nvidia.com> Reviewed-on: http://git-master/r/1221123 (cherry picked from commit 17ff6de69f19212b6c6be39496f8e76c8554b861) Reviewed-on: http://git-master/r/1239427 Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com> Tested-by: Terje Bergstrom <tbergstrom@nvidia.com> GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: add support for devinit scripts exec	David Nieto	2016-10-20
	JIRA DNVGPU-117 Change-Id: I8c79e5b2fcad25588c950e786289443ed64fd48d Signed-off-by: David Nieto <dmartineznie@nvidia.com> Reviewed-on: http://git-master/r/1223221 (cherry-picked from commit f3185ad9f141ab32a224046185d0a409a8a513ff) Reviewed-on: http://git-master/r/1227254 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Move ELCG programming to therm	Terje Bergstrom	2016-10-19
	Move ELCG parameter programming to a new function in therm, elcg_init_idle_filter. Implement a gk20a variant and use it for gk20a and gm20b. JIRA DNVGPU-74 Change-Id: I8ef400f3a6195311fb9e7da8db6c34993d62f461 Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com> Reviewed-on: http://git-master/r/1220433 (cherry picked from commit f6654ae4d83d31cd40b317bf55922964bbfa575d) Reviewed-on: http://git-master/r/1239421 GVS: Gerrit_Virtual_Submit
*	gpu: nvgpu: check engine ctx_status in wait_idle	Deepak Nibade	2016-10-19
	We have the following bug, where GPU Host reports non-idle when it should report engine idle: - if a context is preempted off the GPU and there is no other context to load, NV_PGRAPH_ENGINE_STATUS will not be idle until a new context is loaded - this could cause gr_gk20a_wait_idle() to fail, since here we rely only on NV_PGRAPH_ENGINE_STATUS to decide whether the engine is busy. To fix this, first check whether the context is valid from NV_PFIFO_ENGINE_STATUS_CTX_STATUS. If the context is invalid, return immediately. Otherwise, continue as before. Also, add accessors for invalid ctx_status. Bug 1826768 Change-Id: Id627be3f02e79f4beac59a8b5195d08eabf651f2 Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1237521 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
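Illustrative note, not part of the commit: the ordering of the two checks can be sketched as below. The struct stands in for values that the driver would read from NV_PFIFO_ENGINE_STATUS_CTX_STATUS and NV_PGRAPH_ENGINE_STATUS. The enum encoding, the field names, and the function name are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical encoding of the context status field. */
enum ctx_status {
	CTX_STATUS_INVALID,
	CTX_STATUS_VALID
};

/* Stand-in for one snapshot of the two status registers. */
struct engine_regs {
	enum ctx_status ctx_status; /* from the FIFO engine status */
	bool gr_engine_busy;        /* from the graphics engine status */
};

static bool engine_is_idle(const struct engine_regs *r)
{
	/*
	 * With no valid context loaded, the graphics status may keep
	 * reporting busy, so treat the engine as idle right away.
	 */
	if (r->ctx_status == CTX_STATUS_INVALID)
		return true;

	/* Otherwise fall back to the original busy check. */
	return !r->gr_engine_busy;
}
```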
*	gpu: nvgpu: Avoid 64 bit division	Alex Waterman	2016-10-18
	Avoid doing 64-bit division in the page allocator. This causes problems on 32-bit platforms. Bug 1799159 Change-Id: I5166a71a4e84454686cce6d6cdca678a862a7ae7 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1236799 (cherry picked from commit 21c091d1c00433acb9965c3d348d16fbb4c50c1a) Reviewed-on: http://git-master/r/1236195 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
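Illustrative note, not part of the commit: on 32-bit kernels a plain `/` on 64-bit operands is typically turned into a call to a compiler helper that the kernel does not provide, so the build fails at link time. When the divisor is a power of two, a shift and a mask avoid the division entirely. The commit does not say which approach it took; the sketch below assumes the shift-based one, and both helper names are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helpers. The page size is assumed to be a power of two,
 * expressed as 1 << page_shift.
 */

/* Whole pages contained in `bytes`, computed without a 64-bit divide. */
static inline uint64_t bytes_to_pages(uint64_t bytes, unsigned int page_shift)
{
	return bytes >> page_shift;
}

/* Remainder of `bytes` within its last page, computed without a modulo. */
static inline uint64_t bytes_in_last_page(uint64_t bytes, unsigned int page_shift)
{
	return bytes & (((uint64_t)1 << page_shift) - 1);
}
```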
*	gpu: nvgpu: SLAB allocation for page allocator	Alex Waterman	2016-10-18
	Add the ability to do "SLAB" allocation in the page allocator. This is generally useful since the allocator manages 64K pages but often we only need 4k chunks (for example when allocating memory for page table entries). Bug 1799159 JIRA DNVGPU-100 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1225322 (cherry picked from commit 299a5639243e44be504391d9155b4ae17d914aa2) Change-Id: Ib3a8558d40ba16bd3a413f4fd38b146beaa3c66b Reviewed-on: http://git-master/r/1227924 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: bitmap allocator optimization	Alex Waterman	2016-10-18
	Add an optimization to the bitmap allocator for handling sequences of allocations. A common pattern of allocs from the priv_cmdbuf is to do many allocs and then many frees. In such cases it makes sense to store the last allocation offset and start searching for the next alloc from there. For such a pattern we know that the previous bits are already allocated, so it doesn't make sense to search them unless we have to. If no space is found ahead of the previous alloc's block, we fall back to the remaining space. In random allocation patterns this optimization should not have any negative effects. It merely shifts the start point for searching for allocs, and assuming each bit has an equal probability of being free, the start location does not matter. Bug 1799159 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1205958 (cherry picked from commit 759c583962d6d57cb8cb073ccdbfcfc6db4c1e18) Change-Id: I267ef6fa155ff15d6ebfc76dc1abafd9aa1f44df Reviewed-on: http://git-master/r/1227923 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
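Illustrative note, not part of the commit: the search strategy described here is often called next-fit. A minimal sketch, assuming a small fixed-size bitmap stored as one flag per bit for readability, is given below. All names and the bitmap size are hypothetical, and the fallback simply rescans from the start of the bitmap.

```c
#include <stdbool.h>
#include <stddef.h>

#define NBITS 64 /* hypothetical bitmap size */

struct bitmap_alloc {
	bool used[NBITS];  /* one flag per bit, for readability */
	size_t next_hint;  /* bit just past the previous allocation */
};

/* Find `len` consecutive free bits in [start, end); return index or -1. */
static long find_free_run(const struct bitmap_alloc *a, size_t start,
			  size_t end, size_t len)
{
	size_t run = 0;
	size_t i;

	for (i = start; i < end; i++) {
		run = a->used[i] ? 0 : run + 1;
		if (run == len)
			return (long)(i + 1 - len);
	}
	return -1;
}

/* Allocate `len` bits: search from the hint first, then from the start. */
static long bitmap_alloc_bits(struct bitmap_alloc *a, size_t len)
{
	long off;
	size_t i;

	if (len == 0 || len > NBITS)
		return -1;

	off = find_free_run(a, a->next_hint, NBITS, len);
	if (off < 0)
		off = find_free_run(a, 0, NBITS, len); /* fallback rescan */
	if (off < 0)
		return -1;

	for (i = 0; i < len; i++)
		a->used[(size_t)off + i] = true;
	a->next_hint = (size_t)off + len;
	return off;
}

static void bitmap_free_bits(struct bitmap_alloc *a, size_t off, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		a->used[off + i] = false;
}
```

In the many-allocs-then-many-frees pattern the hint skips the densely allocated prefix, so each search starts close to where free space is likely to be.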
*	gpu: nvgpu: implement PCIe Gen2 frequency swap	Alex Waterman	2016-10-18
	Implement the ability to swap between different PCIe bus speeds. This code is called during init in case the GPU is not running at the max supported PCIe bus speed. JIRA DNVGPU-89 Signed-off-by: Alex Waterman <alexw@nvidia.com> Reviewed-on: http://git-master/r/1218178 (cherry picked from commit 8dcd3e10f46f524c9bac9fd5dae0f0a899123c23) Change-Id: I21f96110578a68d5c5e30ae21776cff69aefba5d Reviewed-on: http://git-master/r/1227922 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: support suspend/resume with user disabled railgating	Deepak Nibade	2016-10-18
	We take an extra power refcount when we disable railgating through the railgate_enable sysfs node, and that breaks suspend/resume, since we check the power refcount first in gk20a_pm_suspend(). Fix this as follows: - set a flag user_railgate_disabled when the user disables railgating through sysfs railgate_enable - in gk20a_pm_suspend(), drop one power refcount if the flag is set - in gk20a_pm_resume(), take one refcount again if the flag is set. Fix __gk20a_do_idle() to consider this extra refcount as well: add a new variable target_ref_cnt and use it instead of assuming a target refcount of 1. If the user has disabled railgating, set this target refcount to 2. Also, export gk20a_idle_nosuspend(), which drops the power refcount without triggering suspend. Bug 200233570 Change-Id: Ic0e35c73eb74ffefea1cd90d1b152650d9d2043d Signed-off-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-on: http://git-master/r/1236047 (cherry picked from commit 6e002d57da4b5c58ed79889728bb678d3aa1f1b1) Reviewed-on: http://git-master/r/1235219 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
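Illustrative note, not part of the commit: the bookkeeping described here can be modelled with a plain counter and a flag. The sketch below is a simplified model, not the driver code. The struct, the function names, the "refuse suspend while references remain" rule, and the -EBUSY return value are all assumptions.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the power-management state. */
struct gpu_pm_state {
	int power_refcount;
	bool user_railgate_disabled;
};

/* Models writing 0 to railgate_enable: hold one extra reference. */
static void user_disable_railgating(struct gpu_pm_state *s)
{
	if (!s->user_railgate_disabled) {
		s->power_refcount++;
		s->user_railgate_disabled = true;
	}
}

/* Suspend succeeds only if nothing but the user's extra reference is held. */
static int pm_suspend(struct gpu_pm_state *s)
{
	if (s->user_railgate_disabled)
		s->power_refcount--; /* drop the user's extra reference */

	if (s->power_refcount > 0) {
		/* Still busy: restore the reference and refuse to suspend. */
		if (s->user_railgate_disabled)
			s->power_refcount++;
		return -EBUSY;
	}
	return 0;
}

/* Resume re-takes the reference that suspend dropped. */
static void pm_resume(struct gpu_pm_state *s)
{
	if (s->user_railgate_disabled)
		s->power_refcount++;
}
```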
*	gpu: nvgpu: add func ptr for gpc exceptions	Seema Khowala	2016-10-17
	Add a function pointer for enabling gpc exceptions. JIRA GV11B-28 JIRA GV11B-27 Change-Id: I4c7e4300825bf096c22f229ae7196f324ce40037 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: http://git-master/r/1236902 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: skip pramin barriers for page tables	Konsta Holtta	2016-10-17
	Page table updates have an explicit write barrier at the end of a pte update operation in update_gmmu_ptes_locked(), so the per-wr32 wmb()s are not necessary. Jira DNVGPU-23 Change-Id: I2e2596f0900d840fadb369ee1261c5e2305f2070 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1225150 (cherry picked from commit 6664a667ea326e9663a6b502765f858d8669f4d9) Reviewed-on: http://git-master/r/1227475 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: allow skipping pramin barriers	Konsta Holtta	2016-10-17
	A wmb() next to each gk20a_mem_wr32() via PRAMIN may be overly careful, so support not inserting these barriers, for performance, in cases where they are not necessary because the caller does an explicit barrier after a batch of writes. Also, move those optional wmb()s to the end of the whole internally batched write for gk20a_mem_{wr_n,memset}, instead of the per-batch subloops that may run multiple times. Jira DNVGPU-23 Change-Id: I61ee65418335863110bca6f036b2e883b048c5c2 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1225149 (cherry picked from commit d2c40327d1995f76e8ab9cb4cd8c76407dabc6de) Reviewed-on: http://git-master/r/1227474 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: Fix wmb() order in pramin access	Alex Waterman	2016-10-17
	wmb() should come after the writes to ensure that the writes have completed before progressing. Bug 1811382 Change-Id: I98fba317b1760240c0b5de531accf398fe69c9b3 Signed-off-by: Alex Waterman <alexw@nvidia.com> (cherry picked from commit 1b1201b9c109061590e6e25260d7230ae2c89888) Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1225251 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: gm20b (un)railgate common clock support	Peter Boonstoppel	2016-10-17
	Add support for GPU railgating using the common clock framework and Tegra DVFS on k4.4. Bug: 200233943 Change-Id: Ief9afd7a5bf3f447e9b91ab181f26dcefff0a8c8 Signed-off-by: Peter Boonstoppel <pboonstoppel@nvidia.com> Reviewed-on: http://git-master/r/1232290 GVS: Gerrit_Virtual_Submit Reviewed-by: Deepak Nibade <dnibade@nvidia.com> Reviewed-by: Bharat Nihalani <bnihalani@nvidia.com>
*	gpu: nvgpu: add ioctl for querying memory state	Konsta Holtta	2016-10-14
	Add NVGPU_GPU_IOCTL_GET_MEMORY_STATE to read the amount of free device-local video memory, if applicable. Some reserved fields are added to support different types of queries in the future (e.g. context-local free amount). Bug 1787771 Bug 200233138 Change-Id: Id5ffd02ad4d6ed3a6dc196541938573c27b340ac Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1223762 (cherry picked from commit 96221d96c7972c6387944603e974f7639d6dbe70) Reviewed-on: http://git-master/r/1235980 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: add space query in page and buddy allocs	Konsta Holtta	2016-10-14
	The amount of free space in the buddy allocator is computed as the complete capacity minus the currently used bytes. The page allocator just queries its underlying allocator. Bug 1787771 Bug 200233138 Change-Id: I9b6f5ef90119236a13de14e14cd0a3ee72144a11 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1223761 (cherry picked from commit 0b324a60ebdf67e793ade869c252a8ddd56c04f8) Reviewed-on: http://git-master/r/1235979 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: support free space query in allocators	Konsta Holtta	2016-10-14
	Add gk20a_alloc_space() to query the amount of free memory in bytes in gk20a_allocator. Bug 1787771 Bug 200233138 Change-Id: Ia381cafd5d2dbf394072d07be96991974f9289ae Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1223760 (cherry picked from commit 0c7fd96acc3e3d581bd25ccbe40b0821a310760f) Reviewed-on: http://git-master/r/1235978 GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: track pending bytes for vidmem clears	Konsta Holtta	2016-10-14
	Change clears_pending to bytes_pending and accordingly track the number of bytes to be freed instead of the number of buffers. This, atomically combined with the amount of space in the allocator, is the total amount of free memory available. Bug 200233138 Change-Id: Ibbb4e80a32728781ba19a74307d8a8ac1a4d7431 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1231422 (cherry picked from commit 025e765f312c253b201ecf2dbbe0f4972fe1d4bc) Reviewed-on: http://git-master/r/1235957 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: mark vidmem.cleared volatile	Konsta Holtta	2016-10-14
	The boolean flag mm_gk20a.vidmem.cleared is shared across threads, so mark it volatile to prevent the compiler from wrongly optimizing accesses to it. Jira DNVGPU-84 Change-Id: I1fe66b26966685d3f74ed95ba53b198f810231b9 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1233016 (cherry picked from commit dc6c9db56ea8a5f55f28f97fdfc3c1ac60d8b195) Reviewed-on: http://git-master/r/1235317 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: fix zcull programming	Seema Khowala	2016-10-14
	There are eight tiles per map tile register and, depending on how many tpcs are present, there is a chance that s/w will access unallocated memory when reading tile values from temp buffers. Bug 1735760 Change-Id: I5c0e09ec75099aaf6ad03dde964b9e93c2dc2408 Signed-off-by: Seema Khowala <seemaj@nvidia.com> Reviewed-on: http://git-master/r/1221580 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: program sw veid bundles	seshendra Gadagottu	2016-10-14
	Query sw veid bundles from sim/netlist and initialize hardware with those bundles. JIRA GV11B-11 Change-Id: I26f174781f0b00b919afac407e2bb9e1fa7b158a Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1231597 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
*	gpu: nvgpu: compact pte buffers	Konsta Holtta	2016-10-13
	The lowest page table level may hold very few entries for mappings of large pages, but a new page is allocated for each list of entries at the lowest level, wasting memory and performance. Compact these so that the new "allocation" of ptes is appended at the end of the previous allocation, if there is space. A 4 KB page is still the smallest size requested from the allocator; any possible overhead in the allocator (e.g., internally allocating big pages only) is not taken into account. Bug 1736604 Change-Id: I03fb795cbc06c869fcf5f1b92def89a04583ee83 Signed-off-by: Konsta Holtta <kholtta@nvidia.com> Reviewed-on: http://git-master/r/1221841 (cherry picked from commit fa92017ed48e1d5f48c1a12c512641c6ce9924af) Reviewed-on: http://git-master/r/1234996 Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com> Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
*	gpu: nvgpu: setup chip specific rop mapping	seshendra Gadagottu	2016-10-12
	Add support for setting up chip-specific rop mapping. JIRA GV11B-21 Change-Id: If94f0de7d767f572095602a831ad6be4b764fff4 Signed-off-by: seshendra Gadagottu <sgadagottu@nvidia.com> Reviewed-on: http://git-master/r/1234547 Reviewed-by: Automatic_Commit_Validation_User GVS: Gerrit_Virtual_Submit Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>