path: root/Documentation/lguest/lguest.c
Commit message (author, date)
* lguest: move the initial guest page table creation code to the host (Matias Zabaljauregui, 2008-12-29)
  This patch moves the initial guest page table creation code to the host, so the launcher keeps working with PAE-enabled configs.
  Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* virtio: use LGUEST_VRING_ALIGN instead of relying on pagesize (Rusty Russell, 2008-12-29)
  This doesn't really matter, since lguest is i386-only at the moment, but we could actually choose a different value (lguest doesn't have a guaranteed ABI).
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
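To see why the alignment value affects ring size, here is a minimal sketch of the ring layout arithmetic. It assumes the original virtio ring layout (16-byte descriptors, 16-bit available entries, 8-byte used entries, no event-index fields); `ring_bytes` is a hypothetical helper for illustration, not the kernel's own function.

```c
#include <stddef.h>

/* Hypothetical helper: bytes needed for a ring of `num` descriptors when
 * the used ring must start on an `align` boundary.
 * Assumes `align` is a power of two. */
size_t ring_bytes(size_t num, size_t align)
{
	size_t desc  = 16 * num;         /* descriptor table */
	size_t avail = 2 + 2 + 2 * num;  /* flags, idx, num 16-bit entries */
	size_t used  = 2 + 2 + 8 * num;  /* flags, idx, num 8-byte entries */

	/* Round the first part up so that the used ring begins aligned. */
	size_t first = (desc + avail + align - 1) & ~(align - 1);

	return first + used;
}
```

With 256 entries, a 4096-byte alignment gives 10244 bytes, while a 64-byte alignment would give 6724: a smaller alignment wastes less padding, which is the sense in which a different value could be chosen.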
* lguest: fix example launcher compile after moved asm-x86 dir. (Rusty Russell, 2008-10-30)
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* doc/x86: fix doc subdirs (Uwe Hermann, 2008-10-28)
  The Documentation/i386 and Documentation/x86_64 directories and their contents have been moved into Documentation/x86. Fix references to those files accordingly.
  Signed-off-by: Uwe Hermann <uwe@hermann-uwe.de>
  Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* lguest: update commentary (Rusty Russell, 2008-08-25)
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: don't set MAC address for guest unless specified (Rusty Russell, 2008-08-12)
  This shows up when trying to bridge:
      tap0: received packet with own address as source address
  As Max Krasnyansky points out, there's no reason to give the guest the same MAC address as the TUN device.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  Cc: Max Krasnyansky <maxk@qualcomm.com>
* lguest: turn Waker into a thread, not a process (Rusty Russell, 2008-07-28)
  lguest uses a Waker process to break it out of the kernel (i.e. actually running the guest) when a file descriptor needs attention.
  Changing this from a process to a thread somewhat simplifies things: it can directly access the fd_set of things to watch. More importantly, it means that the Waker can see Guest memory correctly, so /dev/vring file descriptors will work as anticipated (the alternative is to actually mmap MAP_SHARED, but you can't do that with /dev/zero).
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Enlarge virtio rings (Rusty Russell, 2008-07-28)
  With big packets, 128 entries is a little small.
  Guest -> Host 1GB TCP:
    Before: 8.43625 seconds xmit 95640 recv 198266 timeout 49771 usec 1252
    After:  8.01099 seconds xmit 49200 recv 102263 timeout 26014 usec 2118
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Use GSO/IFF_VNET_HDR extensions on tun/tap (Rusty Russell, 2008-07-28)
  Guest -> Host 1GB TCP:
    Before: 20.1974 seconds xmit 214510 recv 5 timeout 214491 usec 278
    After:  8.43625 seconds xmit 95640 recv 198266 timeout 49771 usec 1252
  Host -> Guest 1GB TCP:
    Before: 9.98854 seconds xmit 172166 recv 5344 timeout 172157 usec 251
    After:  5.72803 seconds xmit 244322 recv 9919 timeout 244302 usec 156
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Remove 'network: no dma buffer!' warning (Rusty Russell, 2008-07-28)
  This warning can happen a lot under load, and it should be warnx, not warn, anyway.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Adaptive timeout (Rusty Russell, 2008-07-28)
  Since the correct timeout value varies, use a heuristic which adjusts the timeout depending on how many packets we've seen. This gives slightly worse results, but doesn't need tweaking when GSO is introduced.
    500 usec       19.1887 xmit 561141 recv 1 timeout 559657
    Dynamic (278)  20.1974 xmit 214510 recv 5 timeout 214491 usec 278
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Tell Guest net not to notify us on every packet xmit (Rusty Russell, 2008-07-28)
  virtio_ring has the ability to suppress notifications. This prevents a guest exit for every packet, but we need to set a timer on packet receipt to re-check if there were any remaining packets.
  Here are the times for 1G TCP Guest->Host with different timeout settings (it matters because the TCP window doesn't grow big enough to fill the entire buffer):
    Timeout value   Seconds   Xmit/Recv/Timeout
    None (before)   25.3784   xmit 7750233 recv 1
    2500 usec       62.5119   xmit 207020 recv 2 timeout 207020
    1000 usec       34.5379   xmit 207003 recv 2 timeout 207003
    750 usec        29.2305   xmit 207002 recv 1 timeout 207002
    500 usec        19.1887   xmit 561141 recv 1 timeout 559657
    250 usec        20.0465   xmit 214128 recv 2 timeout 214110
    100 usec        19.2583   xmit 561621 recv 1 timeout 560153
  (Note that these values are sensitive to the GSO patches which come later, and probably other traffic-related variables, so take with a large grain of salt.)
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: net block unneeded receive queue update notifications (Rusty Russell, 2008-07-28)
  Number of exits transmitting 10GB Guest->Host:
    Before: network xmit 7858610 recv 118136
    After:  network xmit 7750233 recv 1
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: wrap last_avail accesses. (Rusty Russell, 2008-07-28)
  To simplify the transition to when we publish indices in the ring (and make shuffling my patch queue easier), wrap them in a lg_last_avail() macro.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: virtio-rng support (Rusty Russell, 2008-07-28)
  This is a simple patch to add support for the virtio "hardware random generator" to lguest. It gets about 1.2 MB/sec reading from /dev/hwrng in the guest.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Support assigning a MAC address (Mark McLoughlin, 2008-07-28)
  If you've got a nice DHCP configuration which maps MAC addresses to specific IP addresses, then you're going to want to start your guest with one of those MAC addresses.
  Also, in Fedora, we have persistent network interface naming based on the MAC address, so with randomly assigned addresses you're soon going to hit eth13. Who knows what will happen then!
  Allow assigning a MAC address to the network interface with e.g.
      --tunnet=bridge:eth0:00:FF:95:6B:DA:3D
  or:
      --tunnet=192.168.121.1:00:FF:95:6B:DA:3D
  which is pretty unintelligible, but ...
  (includes Rusty's minor rework)
  Signed-off-by: Mark McLoughlin <markmc@redhat.com>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Don't leak /dev/zero fd (Mark McLoughlin, 2008-07-28)
  Signed-off-by: Mark McLoughlin <markmc@redhat.com>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: fix verbose printing of device features. (Rusty Russell, 2008-07-28)
  %02x is more appropriate for bytes than %08x.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
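As an illustration of the format choice, the sketch below prints a byte array with %02x, so each byte takes exactly two hex digits. `format_feature_bytes` is a hypothetical helper written for this note, not code from the launcher.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write `len` bytes as two hex digits each.
 * `out` must have room for 2 * len + 1 characters. */
void format_feature_bytes(const unsigned char *bytes, size_t len, char *out)
{
	for (size_t i = 0; i < len; i++)
		sprintf(out + 2 * i, "%02x", (unsigned)bytes[i]);
	out[2 * len] = '\0';
}
```

For the bytes 0x01, 0xa0, 0x00 this yields "01a000". With %08x each byte would instead be padded to eight digits, which suggests a 32-bit quantity that isn't there.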
* lguest: notify on empty (Rusty Russell, 2008-05-30)
  This is the lguest implementation of the VIRTIO_F_NOTIFY_ON_EMPTY feature. It is currently only published for network devices, but it is turned on for everyone.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: make Launcher see device status updates (Rusty Russell, 2008-05-02)
  This brings us closer to Real Life, where we'd examine the device features once it's set the DRIVER_OK status bit.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* virtio: de-structify virtio_block status byte (Rusty Russell, 2008-05-02)
  Ron Minnich points out that a struct containing a char is not always sizeof(char); simplest to remove the structure to avoid confusion.
  Cc: "ron minnich" <rminnich@gmail.com>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: comment documentation update. (Rusty Russell, 2008-03-27)
  Took some cycles to re-read the Lguest Journey end-to-end, fix some rot and tighten some phrases.
  Only comments change. No new jokes, but a couple of recycled old jokes.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Don't need comment terminator before disk section. (Rusty Russell, 2008-03-27)
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Do not append space to guest's kernel command line (Paul Bolle, 2008-03-10)
  The lguest launcher appends a space to the kernel command line (if kernel arguments are specified on its command line). This space is unneeded. More importantly, this appended space will make Red Hat's nash script interpreter (used in a Fedora-style initramfs) add an empty argument to init's command line. This empty argument will make kernel arguments like "init=/bin/bash" fail (because the shell will try to execute a script with an empty name). This could be considered a bug in nash, but is easily fixed in the lguest launcher too.
  Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
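A minimal sketch of the idea, assuming the arguments are simply concatenated into one buffer: put the separator before every argument except the first, so nothing trails the last one. `join_args` is a hypothetical helper for illustration and is not the launcher's actual function.

```c
#include <string.h>

/* Hypothetical helper: join args[0..n-1] into dst with a single space
 * between arguments and nothing after the last one.
 * The caller must make dst large enough for the result. */
void join_args(char *dst, char *const args[], int n)
{
	dst[0] = '\0';
	for (int i = 0; i < n; i++) {
		if (i > 0)
			strcat(dst, " ");   /* separator goes before, never after */
		strcat(dst, args[i]);
	}
}
```

Joining "root=/dev/vda" and "init=/bin/bash" this way ends the string at "bash", so a parser that splits on spaces sees no empty final argument.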
* virtio: reset function (Rusty Russell, 2008-02-04)
  A reset function solves three problems:
  1) It allows us to renegotiate features, e.g. if we want to upgrade a guest driver without rebooting the guest.
  2) It gives us a clean way of shutting down virtqueues: after a reset, we know that the buffers won't be used by the host, and
  3) It helps the guest recover from messed-up drivers.
  So we remove the ->shutdown hook, and the only way we now remove feature bits is via reset.
  We leave it to the driver to do the reset before it deletes queues: the balloon driver, for example, needs to chat to the host in its remove function.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* virtio: clarify NO_NOTIFY flag usage (Rusty Russell, 2008-02-04)
  The other side (host) can set the NO_NOTIFY flag as an optimization, to say "no need to kick me when you add things". Make it clear that this is advisory only; especially that we should always notify when the ring is full.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* virtio: simplify config mechanism. (Rusty Russell, 2008-02-04)
  Previously we used a type/len pair within the config space, but this seems overkill. We now simply define a structure which represents the layout in the config space: the config space can now only be extended at the end.
  The main driver-visible changes:
  1) We indicate what fields are present with an explicit feature bit.
  2) Virtqueues are explicitly numbered, and not in the config space.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: adapt launcher to per-cpuness (Glauber de Oliveira Costa, 2008-01-30)
  This patch makes use of pread() and pwrite() in the lguest launcher to communicate the vcpu id to the lguest driver. The id is kept in a thread variable, which means that in the future we'll spawn vcpus as threads. But right now, only the infrastructure is out there.
  Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Reboot support (Balaji Rao, 2008-01-30)
  Reboot implemented.
  (Prevent fd leak, fix style and fix documentation --RR)
  Signed-off-by: Balaji Rao <balajirrao@gmail.com>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: Fix uninitialized members in example launcher (Rusty Russell, 2007-11-18)
  Thanks valgrind!
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* virtio: Force use of power-of-two for descriptor ring sizes (Rusty Russell, 2007-11-11)
  The virtio descriptor rings of size N-1 were nicely set up to be aligned to an N-byte boundary. But as Anthony Liguori points out, the free-running indices used by virtio require that the sizes be a power of 2, otherwise we get problems on wrap (demonstrated with lguest).
  So we replace the clever "2^n-1" scheme with a simple "align to page boundary" scheme: this means that all virtio rings take at least two pages, but it's safer than guessing cache alignment.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
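The wrap problem can be reproduced in a few lines. This is a sketch under the assumption that the index is a free-running 16-bit counter and the slot is taken as index modulo ring size; `slot` and `wrap_is_seamless` are names invented for this illustration.

```c
#include <stdint.h>

/* Ring slot used for a free-running 16-bit index. */
unsigned slot(uint16_t idx, unsigned size)
{
	return idx % size;
}

/* Returns 1 if stepping the index from 65535 to 0 also moves to the
 * next slot in the ring, as every other increment does. */
int wrap_is_seamless(unsigned size)
{
	uint16_t last = 65535;
	uint16_t next = (uint16_t)(last + 1);   /* wraps to 0 */

	return slot(next, size) == (slot(last, size) + 1) % size;
}
```

For a size of 128, index 65535 maps to slot 127 and index 0 to slot 0, so the sequence continues unbroken. For a size of 127, index 65535 maps to slot 3, and the wrap jumps back to slot 0 instead of moving on to slot 4, because 65536 is not a multiple of 127.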
* lguest: Fix lguest virtio-blk backend size computation (Anthony Liguori, 2007-11-11)
  This seems like an obvious typo but it's worked in the past because the virtio blk frontend just ignores the length field on completion.
  Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: documentation update (Rusty Russell, 2007-10-25)
  Went through the documentation doing typo and content fixes. This patch contains only comment and whitespace changes.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* lguest: example launcher header cleanup. (Rusty Russell, 2007-10-25)
  Now the kernel headers are clean for userspace export, we don't need to typedef kernel types before including them. We also don't need pci_ids.h (that was from an earlier virtio draft).
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Use "struct boot_params" in example launcher (Rusty Russell, 2007-10-23)
  Now that the "struct boot_params" is userspace accessible, we don't need magic numbers.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Loading bzImage directly. (Rusty Russell, 2007-10-23)
  Now arch/i386/boot/compressed/head.S understands the hardware_platform field, we can directly execute bzImages. No more horrific unpacking code.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Revert lguest magic and use hook in head.S (Rusty Russell, 2007-10-23)
  Version 2.07 of the boot protocol uses 0x23C for the hardware_subarch field, which for lguest is "1". This allows us to use the standard boot entry point rather than the "GenuineLguest" string hack.
  The standard entry point also clears the BSS and copies the boot parameters and command line for us, saving more code.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Example launcher handle guests not being ready for input (Rusty Russell, 2007-10-23)
  We currently discard console and network input when the guest has no input buffers. This patch changes that, so that we simply stop listening to that fd until the guest refills its input buffers.
  This is particularly important because hvc_console without interrupts does backoff polling and so often loses characters if we discard.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Update example launcher for virtio (Rusty Russell, 2007-10-23)
  Implements virtio-based console, network and block servers.
  The block server uses a thread so it's async, which is an improvement over the old synchronous implementation (but a little more complex).
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Boot with virtual == physical to get closer to native Linux. (Rusty Russell, 2007-10-23)
  1) This allows us to get a lot closer to booting bzImages.
  2) It means we don't have to know page_offset.
  3) The Guest needs to modify the boot pagetables to create the PAGE_OFFSET mapping before jumping to C code.
  4) guest_pa() walks the page tables rather than using page_offset.
  5) We don't use page_offset to figure out whether to emulate: it was always kinda questionable, and won't work for instructions done before remapping (bzImage unpacking in particular).
  6) We still want the kernel address for tlb flushing: have the initial hypercall give us that, too.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Change example launcher to use unsigned long not u32 (Jes Sorensen, 2007-10-23)
  Apply Clue 2x4 to lguest userland<->kernel handling code and the lguest launcher. Pointers are not to be passed in u32's!
  Basic rule of thumb: Anything passing u32's back and forth should be passing unsigned longs to be portable to 64 bit archs.
  For those who have forgotten already, I repeat: NO POINTERS IN u32!
  Signed-off-by: Jes Sorensen <jes@sgi.com>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
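The rule can be illustrated with plain integers. This sketch assumes the Linux convention that unsigned long is as wide as a pointer (true on Linux for 32-bit and 64-bit targets, though not on every platform); the function names are made up for the example.

```c
#include <stdint.h>

/* Squeeze an address through a 32-bit field and widen it again. */
uint64_t through_u32(uint64_t addr)
{
	uint32_t narrow = (uint32_t)addr;   /* upper 32 bits are dropped */

	return narrow;
}

/* Round-trip a pointer through unsigned long instead. */
void *through_ulong(void *p)
{
	unsigned long v = (unsigned long)p;

	return (void *)v;
}
```

An address such as 0x100000010 comes back from through_u32() as 0x10, silently pointing somewhere else, whereas through_ulong() returns the pointer it was given.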
* Introduce guest mem offset, static link example launcher (Rusty Russell, 2007-10-23)
  In order to avoid problematic special linking of the Launcher, we give the Host an offset: this means we can use any memory region in the Launcher as Guest memory rather than insisting on mmap() at 0.
  The result is quite pleasing: a number of casts are replaced with simple additions.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
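A minimal sketch of what "casts replaced with simple additions" can look like, assuming a single base pointer for the Guest memory region. The names `guest_base`, `set_guest_base`, `from_guest_phys` and `to_guest_phys` are used here for illustration and are not guaranteed to match the launcher source.

```c
/* Start of the Launcher memory region that backs Guest memory. */
static char *guest_base;

void set_guest_base(void *base)
{
	guest_base = base;
}

/* Guest-physical address -> pointer usable inside the Launcher. */
void *from_guest_phys(unsigned long addr)
{
	return guest_base + addr;
}

/* Launcher pointer -> Guest-physical address.
 * Assumes p lies inside the region starting at guest_base. */
unsigned long to_guest_phys(const void *p)
{
	return (unsigned long)((const char *)p - guest_base);
}
```

Because the translation is just an offset, the region can live anywhere in the Launcher's address space, for example in an ordinary mmap()ed buffer.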
* Accept elf files that are valid but have sections that can not be mmap'ed for some reason. (Ronald G. Minnich, 2007-10-23)
  Plan9 kernel binaries don't neatly align their ELF sections to our page boundaries.
  Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* Make lguest_launcher.h types userspace-friendly (Rusty Russell, 2007-10-23)
  lguest_launcher.h uses "u32" not "__u32", which sets a bad example. Fix that, and include <linux/types.h>.
  This means we need to use -I on the Launcher build line so types.h is found.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* turn err into errx in lguest call sites (Glauber de Oliveira Costa, 2007-10-23)
  These two call sites should really be errx instead of err, since there is no errno associated with them at the moment they are issued.
  Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  Cc: Glauber de Oliveira Costa <gcosta@redhat.com>
* Make asm-x86/bootparam.h includable from userspace. (Rusty Russell, 2007-10-23)
  To actually write a bootloader (or, say, the lguest launcher) currently requires duplication of these structures. Making them includable from userspace is much nicer.
  We merge the common userspace-required definitions of e820_32/64.h into e820.h for export.
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* i386/x86_64: move headers to include/asm-x86 (Thomas Gleixner, 2007-10-11)
  Move the headers to include/asm-x86 and fixup the header install make rules.
  Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  Signed-off-by: Ingo Molnar <mingo@elte.hu>
* lguest example launcher truncates block device file to 0 length on problems (Chris Malley, 2007-09-26)
  The function should also use ftruncate64() rather than ftruncate() to prevent files over 4GB (not uncommon for a root filesystem) being zeroed.
  Signed-off-by: Chris Malley <mail@chrismalley.co.uk>
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lguest: documentation VII: FIXMEs (Rusty Russell, 2007-07-26)
  Documentation: The FIXMEs
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* lguest: documentation IV: Launcher (Rusty Russell, 2007-07-26)
  Documentation: The Launcher
  Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>