| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of using Linux APIs for mutex and spinlocks
directly, use new APIs defined in <nvgpu/lock.h>
Replace Linux specific mutex/spinlock declaration,
init, lock, unlock APIs with new APIs
e.g
struct mutex is replaced by struct nvgpu_mutex and
mutex_lock() is replaced by nvgpu_mutex_acquire()
And also include <nvgpu/lock.h> instead of including
<linux/mutex.h> and <linux/spinlock.h>
Add explicit nvgpu/lock.h includes to below
files to fix complilation failures.
gk20a/platform_gk20a.h
include/nvgpu/allocator.h
Jira NVGPU-13
Change-Id: I81a05d21ecdbd90c2076a9f0aefd0e40b215bd33
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/1293187
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reorganize the HW headers of gk20a. The headers are moved to a
new directory:
include/nvgpu/hw/gk20a
And from the code are included like so:
#include <nvgpu/hw/gk20a/hw_pwr_gk20a.h>
This is the first step in reorganizing all of the HW headers for
gm20b, gm206, etc. This is part of a larger effort to re-structure
and make the driver more readable and scalable.
Bug 1799159
Change-Id: Ic151155cbc2e6f75009f2d9d597b364a1bed2c4c
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/1244790
Reviewed-by: mobile promotions <svcmobile_promotions@nvidia.com>
Tested-by: mobile promotions <svcmobile_promotions@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
is_fmodel flag will be set in gk20a_probe().
Updated code for is_fmodel check, instead of
check for supported simulated platforms.
Bug 1735760
Change-Id: I7cbac2196130fe5ce4c1a910504879e6948c13da
Signed-off-by: Seema Khowala <seemaj@nvidia.com>
Reviewed-on: http://git-master/r/1177869
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Tested-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-by: Adeel Raza <araza@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In gk20a L2 is off when we run L2 floorsweeping function, and the
register accesses cause an error.
Remove the code to disable L2 comptag interrupt on gk20a.
Bug 1741521
Change-Id: I58ee425adf46e80ce4d045750190e930439d419b
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1151323
|
|
|
|
|
|
|
|
|
|
|
| |
Setting max_ways_evict reserves some of L2 for CB. In gk20a CB is in
dedicated RAM, so we don't need to reserve space for it.
The code gets invoked only on gk20a.
Change-Id: Ib8efec8c5e90c135bd0c10bb1eaa3f797ec68698
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1144993
|
|
|
|
|
|
|
|
|
| |
Move per-chip constants to be returned by a chip specific function.
Implement get_litter_value() for each chip.
Change-Id: I2a2730fce14010924d2507f6fa15cc2ea0795113
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1121383
|
|
|
|
|
|
|
|
|
| |
Use struct device instead of struct platform_device wherever
possible. This allows adding other bus types later.
Change-Id: I1657287a68d85a542cdbdd8a00d1902c3d6e00ed
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1120466
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Illegal comptag interrupt is triggered when a page is mapped with
two different kinds with incompatible compression status. This can
be intentional, so disable the interrupt.
Change-Id: I84a212beac147991d09d2d381a9e770b1364f4d8
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/1029663
(cherry picked from commit 819607a768f9fccdd0b233d58bcf88b9eee4ee19)
Reviewed-on: http://git-master/r/1031010
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Restore comptags to be bitmap-allocated, like they were before we had
the buddy allocator.
The new buddy allocator introduced by
e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff (originally
6ab2e0c49cb79ca68d2f83f1d4610783d2eaa79b) is fine for the big VAs, but
unsuitable for the small compbit store.
This commit reverts partially the combination of the above commit and
also one after it, 86fc7ec9a05999bea8de320840b962db3ee11410, that fixed
a bug which is not present when using a bitmap. With a bitmap allocator,
pruning the extra allocation necessary for user-mapped mode is possible,
so that is also restored.
The original generic bitmap allocator is not restored; instead, a
comptag-only allocator is introduced.
Bug 200145635
Change-Id: I87f3a911826a801124cfd21e44857dfab1c3f378
Signed-off-by: Konsta Holtta <kholtta@nvidia.com>
Reviewed-on: http://git-master/r/837180
(cherry picked from commit 5a504aeb54f3e89e6561932971158a397157b3f2)
Reviewed-on: http://git-master/r/839742
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- operands not affecting result (id = 12845)
- logically dead code (id = 12890)
- dereference after null check (id = 12968)
- unsigned compared to 0 (id = 13176)
- resource leak (id = 13338, 18673)
- unused pointer value (id = 13916)
Bug 1703084
Change-Id: I2f401dd93126af27748c53fa1b3a59cb154af36b
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/835143
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2e5803d0f2b7d7a1577a40f45ab9f3b22ef2df80 since
the issue seen with bug 200106514 is fixed with change
http://git-master/r/#/c/752080/.
Bug 200112195
Change-Id: I588151c2a7ea74bd89dc3fd48bb81ff2c49f5a0a
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/752503
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit ce1cf06b9a8eb6314ba0ca294e8cb430e1e141c0 since
it causes GPU pbdma interrupt to be generated.
Bug 200106514
Change-Id: If3ed9a914c4e3e7f3f98c6609c6dbf57e1eb9aad
Signed-off-by: Bharat Nihalani <bnihalani@nvidia.com>
Reviewed-on: http://git-master/r/749291
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 7eb42bc239dbd207208ff491c3fb65c3d83274d8.
The original commit was actually fine.
Change-Id: I564ce6530ac73fcfad17dcec9c53f0353b4f02d4
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/743300
(cherry picked from commit e99aa2485f8992eabe3556f3ebcb57bdc8ad91ff)
Reviewed-on: http://git-master/r/743301
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the return code for both gk20a_ and gm20b_ltc_cbc_ctrl()
functions. Before a positive return woudl always happen. Now,
if there's a timeout -EBUSY is returned.
Change-Id: Id76dc44af1376fceebf5043afb057c153cb0752e
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/729165
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2e235ac150fa4af8632c9abf0f109a10973a0bf5.
Change-Id: I3aa745152124c2bc09c6c6dc5aeb1084ae7e08a4
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/741469
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com>
Tested-by: Hiroshi Doyu <hdoyu@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement a new buddy allocation scheme for the GPU's VA space.
The bitmap allocator was using too much memory and is not a scaleable
solution as the GPU's address space keeps getting bigger. The buddy
allocation scheme is much more memory efficient when the majority
of the address space is not allocated.
The buddy allocator is not constrained by the notion of a split
address space. The bitmap allocator could only manage either small
pages or large pages but not both at the same time. Thus the bottom
of the address space was for small pages, the top for large pages.
Although, that split is not removed quite yet, the new allocator
enables that to happen.
The buddy allocator is also very scalable. It manages the relatively
small comptag space to the enormous GPU VA space and everything in
between. This is important since the GPU has lots of different sized
spaces that need managing.
Currently there are certain limitations. For one the allocator does
not handle the fixed allocations from CUDA very well. It can do so
but with certain caveats. The PTE page size is always set to small.
This means the BA may place other small page allocations in the
buddies around the fixed allocation. It does this to avoid having
large and small page allocations in the same PDE.
Change-Id: I501cd15af03611536490137331d43761c402c7f9
Signed-off-by: Alex Waterman <alexw@nvidia.com>
Reviewed-on: http://git-master/r/740694
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reduce amount of duplicate code around memory allocation by using
common helpers, and common data structure for storing results of
allocations.
Bug 1605769
Change-Id: I7c1662b669ed8c86465254f6001e536141051ee5
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/720435
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use busy looping for l2 tag flush and elpg flush
operations. This is making total flash time more
accurate and reduced overall time compared with
usleep. Also added trace points to measure
performance for these operations.
Also corrected timeout error check for non-silicon
platforms.
Bug 200081799
Change-Id: I63410bb7528db9258501633996fbdee5fdec1c74
Signed-off-by: Seshendra Gadagottu <sgadagottu@nvidia.com>
Reviewed-on: http://git-master/r/710472
(cherry picked from commit 18684cf9d5d6870a1a1fd5711c4fc2d733caad20)
Reviewed-on: http://git-master/r/710986
GVS: Gerrit_Virtual_Submit
Reviewed-by: Yu-Huan Hsu <yhsu@nvidia.com>
|
|
|
|
|
|
|
|
|
| |
We have infinite timeouts for loops in emulation. Some functions with
the loops still return error if we exceed the original retry count.
Change-Id: I1f9ddbfc0acd9f30f6bd49d9e748d8d8fbefa154
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/709491
|
|
|
|
|
|
|
| |
Change-Id: I8b7e86afb68adf6dd33b05995d0978f42d57e7b7
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/554185
GVS: Gerrit_Virtual_Submit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix below sparse warnings :
warning: Using plain integer as NULL pointer
warning: symbol <variable/funcion> was not declared. Should it be static?
warning: Initializer entry defined twice
Also, remove dead functions
Bug 1573254
Change-Id: I29d71ecc01c841233cf6b26c9088ca8874773469
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/593363
Reviewed-by: Amit Sharma (SW-TEGRA) <amisharma@nvidia.com>
Reviewed-by: Automatic_Commit_Validation_User
Reviewed-by: Sachin Nikam <snikam@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 41b82e97164138f45fbdaef6ab6939d82ca9419e.
Change-Id: Iabd01fcb124e0d22cd9be62151a6552cbb27fc94
Signed-off-by: Sam Payne <spayne@nvidia.com>
Reviewed-on: http://git-master/r/592221
Tested-by: Hoang Pham <hopham@nvidia.com>
GVS: Gerrit_Virtual_Submit
Reviewed-by: Mitch Luban <mluban@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Convert GR and LTC HALs to use const structs, and initialize them
with macros.
Bug 1567274
Change-Id: Ia3f24a5eccb27578d9cba69755f636818d11275c
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/590371
|
|
|
|
|
|
|
|
|
|
| |
gk20a and gm20b calculate L2 size with different parameters. Split
the function for calculating size so that it does not query GPU id.
Bug 1567274
Change-Id: I09510c1bf0286c9df125d74e51df322c32bde646
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
L2 bypass registers have moved in gm20b. Move the code to
ltc_common.c, which gets compiled once per chip version.
Change-Id: I0ab4dd03c78b8ad8abc7a7b18c094b6002827587
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/499220
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Deepak Nibade <dnibade@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current implementation is based on config GK20A_PHYS_PAGE_TABLES
to have APIs to create/free/map/unmap phys pages
Remove this config based implementaion and move the APIs so that
they are called at runtime based on tegra_platform_is_linsim()
In generic APIs, we first check if platform is linsim and if it
is then we forward the call to phys page specific APIs
Change-Id: I23eb6fa6a46b804441f18fc37e2390d938d62515
Signed-off-by: Deepak Nibade <dnibade@nvidia.com>
Reviewed-on: http://git-master/r/488843
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia780e6a7cb3579f0d6ed2dca9949a349799535fd
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/448115
Reviewed-by: Automatic_Commit_Validation_User
GVS: Gerrit_Virtual_Submit
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
When exiting rail gate, we reloaded default ZBC values. The correct
behavior is to reload the values.
Bug 1447255
Change-Id: I7aad3586dda91a91a3629062a27001af281b955e
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/418346
|
|
|
|
|
|
|
|
|
|
|
| |
ELPG flush is initiated from a common broadcast register, but must be
waited on via per-L2 registers. Split gk20a and gm20b versions of
the flush.
Change-Id: I75c2d65e8da311b50d35bee70308b60464ec2d4d
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/401545
Reviewed-by: Automatic_Commit_Validation_User
|
|
|
|
|
|
|
|
|
|
| |
Bug 1507804
Change-Id: I3cca0e83dbf911c94422f8bb0b2df675a170b990
Signed-off-by: Kevin Huang <kevinh@nvidia.com>
Reviewed-on: http://git-master/r/403213
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
Tested-by: Terje Bergstrom <tbergstrom@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
| |
CBC frontdoor access works incorrectly in the simulator if CBC
is allocated from IOVA. This patch makes CBC allocation to happen
from physical memory if are running in simulator.
Bug 1409151
Change-Id: Ia1d1ca35b5a0375f4707824df3ef06ad1b9117d4
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
|
|
|
|
|
|
|
| |
Bug 1409151
Change-Id: I232af159d402f818cf972498d721c3b57846ce74
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
|
|
|
|
|
|
|
|
|
|
| |
This patch adds necessary code to store the gpu configuration into
gr structure.
Bug 1409151
Change-Id: I045b21ebdc849833380a3d953d951f8352842ac7
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We poll completion of flush sequence by polling the broadcast
register. The polling should be done for a per-slice register
instead.
Bug 1457723
Change-Id: I10aba939175b6d05b05f5f26eebebcbe09d9b4a7
Signed-off-by: Terje Bergstrom <tbergstrom@nvidia.com>
Reviewed-on: http://git-master/r/382521
Reviewed-by: Juha Tukkinen <jtukkinen@nvidia.com>
Tested-by: Juha Tukkinen <jtukkinen@nvidia.com>
|
|
This patch moves the NVIDIA GPU driver to a new location.
Bug 1482562
Change-Id: I24293810b9d0f1504fd9be00135e21dad656ccb6
Signed-off-by: Arto Merilainen <amerilainen@nvidia.com>
Reviewed-on: http://git-master/r/383722
Reviewed-by: Terje Bergstrom <tbergstrom@nvidia.com>
|