| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Locks the k_itimer's it_lock member when handling the alarm timer's
expiry callback.
The regular posix timers defined in posix-timers.c have this lock held
during timout processing because their callbacks are routed through
posix_timer_fn(). The alarm timers follow a different path, so they
ought to grab the lock somewhere else.
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: Richard Larocque <rlarocque@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Avoids sending a signal to alarm timers created with sigev_notify set to
SIGEV_NONE by checking for that special case in the timeout callback.
The regular posix timers avoid sending signals to SIGEV_NONE timers by
not scheduling any callbacks for them in the first place. Although it
would be possible to do something similar for alarm timers, it's simpler
to handle this as a special case in the timeout.
Prior to this patch, the alarm timer would ignore the sigev_notify value
and try to deliver signals to the process anyway. Even worse, the
sanity check for the value of sigev_signo is skipped when SIGEV_NONE was
specified, so the signal number could be bogus. If sigev_signo was an
unitialized value (as it often would be if SIGEV_NONE is used), then
it's hard to predict which signal will be sent.
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: Richard Larocque <rlarocque@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Returns the time remaining for an alarm timer, rather than the time at
which it is scheduled to expire. If the timer has already expired or it
is not currently scheduled, the it_value's members are set to zero.
This new behavior matches that of the other posix-timers and the POSIX
specifications.
This is a change in user-visible behavior, and may break existing
applications. Hopefully, few users rely on the old incorrect behavior.
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: Richard Larocque <rlarocque@google.com>
[jstultz: minor style tweak]
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
timeval_to_jiffies tried to round a timeval up to an integral number
of jiffies, but the logic for doing so was incorrect: intervals
corresponding to exactly N jiffies would become N+1. This manifested
itself particularly repeatedly stopping/starting an itimer:
setitimer(ITIMER_PROF, &val, NULL);
setitimer(ITIMER_PROF, NULL, &val);
would add a full tick to val, _even if it was exactly representable in
terms of jiffies_ (say, the result of a previous rounding.) Doing
this repeatedly would cause unbounded growth in val. So fix the math.
Here's what was wrong with the conversion: we essentially computed
(eliding seconds)
jiffies = usec * (NSEC_PER_USEC/TICK_NSEC)
by using scaling arithmetic, which took the best approximation of
NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
x/(2^USEC_JIFFIE_SC), and computed:
jiffies = (usec * x) >> USEC_JIFFIE_SC
and rounded this calculation up in the intermediate form (since we
can't necessarily exactly represent TICK_NSEC in usec.) But the
scaling arithmetic is a (very slight) *over*approximation of the true
value; that is, instead of dividing by (1 usec/ 1 jiffie), we
effectively divided by (1 usec/1 jiffie)-epsilon (rounding
down). This would normally be fine, but we want to round timeouts up,
and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this
would be fine if our division was exact, but dividing this by the
slightly smaller factor was equivalent to adding just _over_ 1 to the
final result (instead of just _under_ 1, as desired.)
In particular, with HZ=1000, we consistently computed that 10000 usec
was 11 jiffies; the same was true for any exact multiple of
TICK_NSEC.
We could possibly still round in the intermediate form, adding
something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
convert usec->nsec, round in nanoseconds, and then convert using
time*spec*_to_jiffies. This adds one constant multiplication, and is
not observably slower in microbenchmarks on recent x86 hardware.
Tested: the following program:
int main() {
struct itimerval zero = {{0, 0}, {0, 0}};
/* Initially set to 10 ms. */
struct itimerval initial = zero;
initial.it_interval.tv_usec = 10000;
setitimer(ITIMER_PROF, &initial, NULL);
/* Save and restore several times. */
for (size_t i = 0; i < 10; ++i) {
struct itimerval prev;
setitimer(ITIMER_PROF, &zero, &prev);
/* on old kernels, this goes up by TICK_USEC every iteration */
printf("previous value: %ld %ld %ld %ld\n",
prev.it_interval.tv_sec, prev.it_interval.tv_usec,
prev.it_value.tv_sec, prev.it_value.tv_usec);
setitimer(ITIMER_PROF, &prev, NULL);
}
return 0;
}
Cc: stable@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Turner <pjt@google.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Paul Turner <pjt@google.com>
Reported-by: Aaron Jacobs <jacobsa@google.com>
Signed-off-by: Andrew Hunter <ahh@google.com>
[jstultz: Tweaked to apply to 3.17-rc]
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The update_walltime() code works on the shadow timekeeper to make the
seqcount protected region as short as possible. But that update to the
shadow timekeeper does not update all timekeeper fields because it's
sufficient to do that once before it becomes life. One of these fields
is tkr.base_mono. That stays stale in the shadow timekeeper unless an
operation happens which copies the real timekeeper to the shadow.
The update function is called after the update calls to vsyscall and
pvclock. While not correct, it did not cause any problems because none
of the invoked update functions used base_mono.
commit cbcf2dd3b3d4 (x86: kvm: Make kvm_get_time_and_clockread()
nanoseconds based) changed that in the kvm pvclock update function, so
the stale mono_base value got used and caused kvm-clock to malfunction.
Put the update where it belongs and fix the issue.
Reported-by: Chris J Arges <chris.j.arges@canonical.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1409050000570.3333@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The local nohz kick is currently used by perf which needs it to be
NMI-safe. Recent commit though (7d1311b93e58ed55f3a31cc8f94c4b8fe988a2b9)
changed its implementation to fire the local kick using the remote kick
API. It was convenient to make the code more generic but the remote kick
isn't NMI-safe.
As a result:
WARNING: CPU: 3 PID: 18062 at kernel/irq_work.c:72 irq_work_queue_on+0x11e/0x140()
CPU: 3 PID: 18062 Comm: trinity-subchil Not tainted 3.16.0+ #34
0000000000000009 00000000903774d1 ffff880244e06c00 ffffffff9a7f1e37
0000000000000000 ffff880244e06c38 ffffffff9a0791dd ffff880244fce180
0000000000000003 ffff880244e06d58 ffff880244e06ef8 0000000000000000
Call Trace:
<NMI> [<ffffffff9a7f1e37>] dump_stack+0x4e/0x7a
[<ffffffff9a0791dd>] warn_slowpath_common+0x7d/0xa0
[<ffffffff9a07930a>] warn_slowpath_null+0x1a/0x20
[<ffffffff9a17ca1e>] irq_work_queue_on+0x11e/0x140
[<ffffffff9a10a2c7>] tick_nohz_full_kick_cpu+0x57/0x90
[<ffffffff9a186cd5>] __perf_event_overflow+0x275/0x350
[<ffffffff9a184f80>] ? perf_event_task_disable+0xa0/0xa0
[<ffffffff9a01a4cf>] ? x86_perf_event_set_period+0xbf/0x150
[<ffffffff9a187934>] perf_event_overflow+0x14/0x20
[<ffffffff9a020386>] intel_pmu_handle_irq+0x206/0x410
[<ffffffff9a0b54d3>] ? arch_vtime_task_switch+0x63/0x130
[<ffffffff9a01937b>] perf_event_nmi_handler+0x2b/0x50
[<ffffffff9a007b72>] nmi_handle+0xd2/0x390
[<ffffffff9a007aa5>] ? nmi_handle+0x5/0x390
[<ffffffff9a0d131b>] ? lock_release+0xab/0x330
[<ffffffff9a008062>] default_do_nmi+0x72/0x1c0
[<ffffffff9a0c925f>] ? cpuacct_account_field+0xcf/0x200
[<ffffffff9a008268>] do_nmi+0xb8/0x100
Lets fix this by restoring the use of local irq work for the nohz local
kick.
Reported-by: Catalin Iacob <iacobcatalin@gmail.com>
Reported-and-tested-by: Dave Jones <davej@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Benjamin Herrenschmidt pointed out that I further missed modifying
update_vsyscall after the wall_to_mono value was changed to a
timespec64. This causes issues on powerpc32, which expects a 32bit
timespec.
This patch fixes the problem by properly converting from a timespec64 to
a timespec before passing the value on to the arch-specific vsyscall
logic.
[ Thomas is currently on vacation, but reviewed it and wanted me to send
this fix on to you directly. ]
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer and time updates from Thomas Gleixner:
"A rather large update of timers, timekeeping & co
- Core timekeeping code is year-2038 safe now for 32bit machines.
Now we just need to fix all in kernel users and the gazillion of
user space interfaces which rely on timespec/timeval :)
- Better cache layout for the timekeeping internal data structures.
- Proper nanosecond based interfaces for in kernel users.
- Tree wide cleanup of code which wants nanoseconds but does hoops
and loops to convert back and forth from timespecs. Some of it
definitely belongs into the ugly code museum.
- Consolidation of the timekeeping interface zoo.
- A fast NMI safe accessor to clock monotonic for tracing. This is a
long standing request to support correlated user/kernel space
traces. With proper NTP frequency correction it's also suitable
for correlation of traces accross separate machines.
- Checkpoint/restart support for timerfd.
- A few NOHZ[_FULL] improvements in the [hr]timer code.
- Code move from kernel to kernel/time of all time* related code.
- New clocksource/event drivers from the ARM universe. I'm really
impressed that despite an architected timer in the newer chips SoC
manufacturers insist on inventing new and differently broken SoC
specific timers.
[ Ed. "Impressed"? I don't think that word means what you think it means ]
- Another round of code move from arch to drivers. Looks like most
of the legacy mess in ARM regarding timers is sorted out except for
a few obnoxious strongholds.
- The usual updates and fixlets all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (114 commits)
timekeeping: Fixup typo in update_vsyscall_old definition
clocksource: document some basic timekeeping concepts
timekeeping: Use cached ntp_tick_length when accumulating error
timekeeping: Rework frequency adjustments to work better w/ nohz
timekeeping: Minor fixup for timespec64->timespec assignment
ftrace: Provide trace clocks monotonic
timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC
seqcount: Add raw_write_seqcount_latch()
seqcount: Provide raw_read_seqcount()
timekeeping: Use tk_read_base as argument for timekeeping_get_ns()
timekeeping: Create struct tk_read_base and use it in struct timekeeper
timekeeping: Restructure the timekeeper some more
clocksource: Get rid of cycle_last
clocksource: Move cycle_last validation to core code
clocksource: Make delta calculation a function
wireless: ath9k: Get rid of timespec conversions
drm: vmwgfx: Use nsec based interfaces
drm: i915: Use nsec based interfaces
timekeeping: Provide ktime_get_raw()
hangcheck-timer: Use ktime_get_ns()
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
By caching the ntp_tick_length() when we correct the frequency error,
and then using that cached value to accumulate error, we avoid large
initial errors when the tick length is changed.
This makes convergence happen much faster in the simulator, since the
initial error doesn't have to be slowly whittled away.
This initially seems like an accounting error, but Miroslav pointed out
that ntp_tick_length() can change mid-tick, so when we apply it in the
error accumulation, we are applying any recent change to the entire tick.
This approach chooses to apply changes in the ntp_tick_length() only to
the next tick, which allows us to calculate the freq correction before
using the new tick length, which avoids accummulating error.
Credit to Miroslav for pointing this out and providing the original patch
this functionality has been pulled out from, along with the rational.
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The existing timekeeping_adjust logic has always been complicated
to understand. Further, since it was developed prior to NOHZ becoming
common, its not surprising it performs poorly when NOHZ is enabled.
Since Miroslav pointed out the problematic nature of the existing code
in the NOHZ case, I've tried to refactor the code to perform better.
The problem with the previous approach was that it tried to adjust
for the total cumulative error using a scaled dampening factor. This
resulted in large errors to be corrected slowly, while small errors
were corrected quickly. With NOHZ the timekeeping code doesn't know
how far out the next tick will be, so this results in bad
over-correction to small errors, and insufficient correction to large
errors.
Inspired by Miroslav's patch, I've refactored the code to try to
address the correction in two steps.
1) Check the future freq error for the next tick, and if the frequency
error is large, try to make sure we correct it so it doesn't cause
much accumulated error.
2) Then make a small single unit adjustment to correct any cumulative
error that has collected over time.
This method performs fairly well in the simulator Miroslav created.
Major credit to Miroslav for pointing out the issue, providing the
original patch to resolve this, a simulator for testing, as well as
helping debug and resolve issues in my implementation so that it
performed closer to his original implementation.
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Reported-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In the GENERIC_TIME_VSYSCALL_OLD update_vsyscall implementation,
we take the tk_xtime() value, which returns a timespec64, and
store it in a timespec.
This luckily is ok, since the only architectures that use
GENERIC_TIME_VSYSCALL_OLD are ia64 and ppc64, which are both
64 bit systems where timespec64 is the same as a timespec.
Even so, for cleanliness reasons, use the conversion function
to assign the proper type.
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Tracers want a correlated time between the kernel instrumentation and
user space. We really do not want to export sched_clock() to user
space, so we need to provide something sensible for this.
Using separate data structures with an non blocking sequence count
based update mechanism allows us to do that. The data structure
required for the readout has a sequence counter and two copies of the
timekeeping data.
On the update side:
smp_wmb();
tkf->seq++;
smp_wmb();
update(tkf->base[0], tk);
smp_wmb();
tkf->seq++;
smp_wmb();
update(tkf->base[1], tk);
On the reader side:
do {
seq = tkf->seq;
smp_rmb();
idx = seq & 0x01;
now = now(tkf->base[idx]);
smp_rmb();
} while (seq != tkf->seq)
So if a NMI hits the update of base[0] it will use base[1] which is
still consistent, but this timestamp is not guaranteed to be monotonic
across an update.
The timestamp is calculated by:
now = base_mono + clock_delta * slope
So if the update lowers the slope, readers who are forced to the
not yet updated second array are still using the old steeper slope.
tmono
^
| o n
| o n
| u
| o
|o
|12345678---> reader order
o = old slope
u = update
n = new slope
So reader 6 will observe time going backwards versus reader 5.
While other CPUs are likely to be able observe that, the only way
for a CPU local observation is when an NMI hits in the middle of
the update. Timestamps taken from that NMI context might be ahead
of the following timestamps. Callers need to be aware of that and
deal with it.
V2: Got rid of clock monotonic raw and reorganized the data
structures. Folded in the barrier fix from Mathieu.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
All the function needs is in the tk_read_base struct. No functional
change for the current code, just a preparatory patch for the NMI safe
accessor to clock monotonic which will use struct tk_read_base as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The members of the new struct are the required ones for the new NMI
safe accessor to clcok monotonic. In order to reuse the existing
timekeeping code and to make the update of the fast NMI safe
timekeepers a simple memcpy use the struct for the timekeeper as well
and convert all users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Access to time requires to touch two cachelines at minimum
1) The timekeeper data structure
2) The clocksource data structure
The access to the clocksource data structure can be avoided as almost
all clocksource implementations ignore the argument to the read
callback, which is a pointer to the clocksource.
But the core needs to touch it to access the members @read and @mask.
So we are better off by copying the @read function pointer and the
@mask from the clocksource to the core data structure itself.
For the most used ktime_get() access all required data including the
@read and @mask copies fits together with the sequence counter into a
single 64 byte cacheline.
For the other time access functions we touch in the current code three
cache lines in the worst case. But with the clocksource data copies we
can reduce that to two adjacent cachelines, which is more efficient
than disjunct cache lines.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
cycle_last was added to the clocksource to support the TSC
validation. We moved that to the core code, so we can get rid of the
extra copy.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The only user of the cycle_last validation is the x86 TSC. In order to
provide NMI safe accessor functions for clock monotonic and
monotonic_raw we need to do that in the core.
We can't do the TSC specific
if (now < cycle_last)
now = cycle_last;
for the other wrapping around clocksources, but TSC has
CLOCKSOURCE_MASK(64) which actually does not mask out anything so if
now is less than cycle_last the subtraction will give a negative
result. So we can check for that in clocksource_delta() and return 0
for that case.
Implement and enable it for x86
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We want to move the TSC sanity check into core code to make NMI safe
accessors to clock monotonic[_raw] possible. For this we need to
sanity check the delta calculation. Create a helper function and
convert all sites to use it.
[ Build fix from jstultz ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| | |
Provide a ktime_t based interface for raw monotonic time.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
timekeeping_clocktai() is not used in fast pathes, so the extra
timespec conversion is not problematic.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| | |
No more users. Remove it
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Subtracting plain nsec values and converting to timespec is simpler
than the whole timespec math. Not really fastpath code, so the
division is not an issue.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
get_monotonic_boottime() is not used in fast pathes, so the extra
timespec conversion is not problematic.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| | |
No more users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| | |
Required for moving drivers to the nanosecond based interfaces.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| | |
No more users.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
ktime based conversion function to map a monotonic time stamp to a
different CLOCK.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| | |
No need to juggle with timespecs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| | |
No need to juggle with timespecs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| | |
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| | |
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| | |
Speed up the readout.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Provide a helper function which lets us implement ktime_t based
interfaces for real, boot and tai clocks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Speed up ktime_get() by using ktime_t based data. Text size shrinks by
64 bytes on x8664.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The ktime_t based interfaces are used a lot in performance critical
code pathes. Add ktime_t based data so the interfaces don't have to
convert from the xtime/timespec based data.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
We already have a function which does the right thing, that also makes
sure that the coming ktime_t based cached values are getting updated.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
struct timekeeper is quite badly sorted for the hot readout path. Most
time access functions need to load two cache lines.
Rearrange it so ktime_get() and getnstimeofday() are happy with a
single cache line.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| | |
No users outside of the core.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
To convert callers of the core code to timespec64 we need to provide
the proper interfaces.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Right now we have time related prototypes in 3 different header
files. Move it to a single timekeeping header file and move the core
internal stuff into a core private header.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Convert the core timekeeping logic to use timespec64s. This moves the
2038 issues out of the core logic and into all of the accessor
functions.
Future changes will need to push the timespec64s out to all
timekeeping users, but that can be done interface by interface.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Helper and conversion functions for timespec64.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
With the plain nanoseconds based ktime_t we can simply use
ktime_divns() instead of going through loops and hoops of
timespec/timeval conversion.
Reported-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The non-scalar ktime_t implementation is basically a timespec
which has to be changed to support dates past 2038 on 32bit
systems.
This patch removes the non-scalar ktime_t implementation, forcing
the scalar s64 nanosecond version on all architectures.
This may have additional performance overhead on some 32bit
systems when converting between ktime_t and timespec structures,
however the majority of 32bit systems (arm and i386) were already
using scalar ktime_t, so no performance regressions will be seen
on those platforms.
On affected platforms, I'm open to finding optimizations, including
avoiding converting to timespecs where possible.
[ tglx: We can now cleanup the ktime_t.tv64 mess, but thats a
different issue and we can throw a coccinelle script at it ]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Rather then having two similar but totally different implementations
that provide timekeeping state to the hrtimer code, try to unify the
two implementations to be more simliar.
Thus this clarifies ktime_get_update_offsets to
ktime_get_update_offsets_now and changes get_xtime... to
ktime_get_update_offsets_tick.
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Provide a default stub function instead of having the extra
conditional. Cuts binary size on a m68k build by ~100 bytes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Create a module that allows udelay() to be executed to ensure that
it is delaying at least as long as requested (with a little bit of
error allowed).
There are some configurations which don't have reliably udelay
due to using a loop delay with cpufreq changes which should use
a counter time based delay instead. This test aims to identify
those configurations where timing is unreliable.
Signed-off-by: David Riley <davidriley@chromium.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We call hrtimer_enqueue_reprogram() only when we are in high resolution
mode now so we don't need to check that again in hrtimer_enqueue_reprogram().
Once the check is removed, hrtimer_enqueue_reprogram() turns to be an
useless wrapper over hrtimer_reprogram() and can be dropped.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1403393357-2070-6-git-send-email-fweisbec@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In lowres mode, hrtimers are serviced by the tick instead of a clock
event. It works well as long as the tick stays periodic but we must also
make sure that the hrtimers are serviced in dynticks mode targets,
pretty much like timer list timers do.
Note that all dynticks modes are concerned: get_nohz_timer_target()
tries not to return remote idle CPUs but there is nothing to prevent
the elected target from entering dynticks idle mode until we lock its
base. It's also prefectly legal to enqueue hrtimers on full dynticks CPU.
So there are two requirements to correctly handle dynticks:
1) On target's tick stop time, we must not delay the next tick further
the next hrtimer.
2) On hrtimer queue time. If the tick of the target is stopped, we must
wake up that CPU such that it sees the new hrtimer and recalculate
the next tick accordingly.
The point 1 is well handled currently through get_nohz_timer_interrupt() and
cmp_next_hrtimer_event().
But the point 2 isn't handled at all.
Fixing this is easy though as we have the necessary API ready for that.
All we need is to call wake_up_nohz_cpu() on a target when a newly
enqueued hrtimer requires tick rescheduling, like timer list timer do.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/3d7ea08ce008698e26bd39fe10f55949391073ab.1403507178.git.viresh.kumar@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In lowres mode, hrtimers are serviced by the tick instead of a clock
event. Now it works well as long as the tick stays periodic but we
must also make sure that the hrtimers are serviced in dynticks mode.
Part of that job consist in kicking a dynticks hrtimer target in order
to make it reconsider the next tick to schedule to correctly handle the
hrtimer's expiring time. And that part isn't handled by the hrtimers
subsystem.
To prepare for fixing this, we need __hrtimer_start_range_ns() to be
able to resolve the CPU target associated to a hrtimer's object
'cpu_base' so that the kick can be centralized there.
So lets store it in the 'struct hrtimer_cpu_base' to resolve the CPU
without overhead. It is set once at CPU's online notification.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1403393357-2070-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|