 * Copyright (c) 2012, 2013, Oracle and/or its affiliates. All rights reserved.
 * Copyright (c) 2008-2010, 2013, Intel Corporation
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 * The above copyright notice and this permission notice (including the next
 * paragraph) shall be included in all copies or substantial portions of the
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS

 * Eric Anholt <eric@anholt.net>
 * Zou Nan hai <nanhai.zou@intel.com>
 * Xiang Hai hao <haihao.xiang@intel.com>

 * 965+ support PIPE_CONTROL commands, which provide finer grained control
 * I915_GEM_DOMAIN_RENDER is always invalidated, but is
 * only flushed if MI_NO_WRITE_FLUSH is unset. On 965, it is
 * also flushed at 2d versus 3d pipeline switches.
 * I915_GEM_DOMAIN_SAMPLER is flushed on pre-965 if
 * MI_READ_FLUSH is set, and is always flushed on 965.
 * I915_GEM_DOMAIN_COMMAND may not exist?
 * I915_GEM_DOMAIN_INSTRUCTION, which exists on 965, is
 * invalidated when MI_EXE_FLUSH is set.
 * I915_GEM_DOMAIN_VERTEX, which exists on 965, is
 * invalidated with every MI_FLUSH.
 * On 965, TLBs associated with I915_GEM_DOMAIN_COMMAND
 * and I915_GEM_DOMAIN_CPU are invalidated at PTE write and
 * I915_GEM_DOMAIN_RENDER and I915_GEM_DOMAIN_SAMPLER
 * are flushed at any MI_FLUSH.

 * Emits a PIPE_CONTROL with a non-zero post-sync operation, for
 * implementing two workarounds on gen6. From section 1.4.7.1
 * "PIPE_CONTROL" of the Sandy Bridge PRM volume 2 part 1:
 * [DevSNB-C+{W/A}] Before any depth stall flush (including those
 * produced by non-pipelined state commands), software needs to first
 * send a PIPE_CONTROL with no bits set except Post-Sync Operation !=
 * [Dev-SNB{W/A}]: Before a PIPE_CONTROL with Write Cache Flush Enable
 * =1, a PIPE_CONTROL with any non-zero post-sync-op is required.
 * And the workaround for these two requires this workaround first:
 * [Dev-SNB{W/A}]: Pipe-control with CS-stall bit set must be sent
 * BEFORE the pipe-control with a post-sync op and no write-cache
 * And this last workaround is tricky because of the requirements on
 * that bit. From section 1.4.7.2.3 "Stall" of the Sandy Bridge PRM
 * "1 of the following must also be set:
 * - Render Target Cache Flush Enable ([12] of DW1)
 * - Depth Cache Flush Enable ([0] of DW1)
 * - Stall at Pixel Scoreboard ([1] of DW1)
 * - Depth Stall ([13] of DW1)
 * - Post-Sync Operation ([13] of DW1)
 * - Notify Enable ([8] of DW1)"
 * The cache flushes require the workaround flush that triggered this
 * one, so we can't use it. Depth stall would trigger the same.
 * Post-sync nonzero is what triggered this second workaround, so we
 * can't use that one either. Notify enable is IRQs, which aren't
 * really our business. That leaves only stall at scoreboard.
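The elimination argument in the comment above can be sketched as a pair of flag words: a first PIPE_CONTROL carrying CS stall plus stall at scoreboard, followed by one carrying only a non-zero post-sync operation. This is a minimal illustration, not the driver's code. The names `PC_*`, `struct wa_flags`, `snb_post_sync_nonzero_wa` and `cs_stall_has_companion` are hypothetical, and the bit positions for CS stall (bit 20) and the post-sync qword write (bit 14) are assumptions that the comment does not state.

```c
#include <stdint.h>

/* DW1 bit positions quoted in the comment above. */
#define PC_DEPTH_CACHE_FLUSH   (1u << 0)
#define PC_STALL_AT_SCOREBOARD (1u << 1)
#define PC_NOTIFY              (1u << 8)
#define PC_RT_CACHE_FLUSH      (1u << 12)
#define PC_DEPTH_STALL         (1u << 13)
/* Assumed positions, not given in the comment above. */
#define PC_POST_SYNC_QW_WRITE  (1u << 14)
#define PC_CS_STALL            (1u << 20)

/* Flag words for the two PIPE_CONTROLs of the workaround (hypothetical type). */
struct wa_flags {
	uint32_t first;  /* sent first: CS stall with its required companion */
	uint32_t second; /* sent second: non-zero post-sync op, nothing else */
};

static struct wa_flags snb_post_sync_nonzero_wa(void)
{
	struct wa_flags f;

	/* Stall at scoreboard is the only companion bit left after the
	 * other candidates are ruled out. */
	f.first = PC_CS_STALL | PC_STALL_AT_SCOREBOARD;
	f.second = PC_POST_SYNC_QW_WRITE;
	return f;
}

/* Checks the "1 of the following must also be set" rule for CS stall.
 * Only one post-sync value is modelled here. */
static int cs_stall_has_companion(uint32_t dw1)
{
	const uint32_t companions = PC_RT_CACHE_FLUSH | PC_DEPTH_CACHE_FLUSH |
	    PC_STALL_AT_SCOREBOARD | PC_DEPTH_STALL |
	    PC_POST_SYNC_QW_WRITE | PC_NOTIFY;

	if (!(dw1 & PC_CS_STALL))
		return 1; /* rule only applies when CS stall is set */
	return (dw1 & companions) != 0;
}
```

The second word deliberately carries no cache-flush bits, since those are what the post-sync workaround has to precede.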
 /* Force SNB workarounds for PIPE_CONTROL flushes */

 /* Just flush everything. Experiments have shown that reducing the
 * number of bits based on the write domains has little performance
 * Ensure that any following seqno writes only happen
 * when the render cache is indeed flushed.
 * TLB invalidate requires a post-sync write.
 * Ensure that any following seqno writes only happen when the render
 * Workaround: 4th PIPE_CONTROL command (except the ones with only
 * read-cache invalidate bits set) must have the CS_STALL bit set. We
 * don't try to be clever and just set it unconditionally.

 /* Just flush everything. Experiments have shown that reducing the
 * number of bits based on the write domains has little performance
 * TLB invalidate requires a post-sync write.

 /* Workaround: we must issue a pipe_control with CS-stall bit
 * set before a pipe_control command that has the state cache

 /* Stop the ring if it's running. */

 /* G45 ring initialization fails to reset head to zero */
 "ctl %08x head %08x tail %08x start %08x\n",
 "ctl %08x head %08x tail %08x start %08x\n",
 /* Initialize the ring. This must happen _after_ we've cleared the ring
 * registers with the above sequence (the readback of the HEAD registers
 * also enforces ordering), otherwise the hw might lose the new ring

 /* If the head is still not zero, the ring is dead */
 "ctl %08x head %08x tail %08x start %08x\n",
 /*LINTED E_BAD_PTR_CAST_ALIGN*/

 /* We need to disable the AsyncFlip performance optimisations in order
 * to use MI_WAIT_FOR_EVENT within the CS. It should already be
 * programmed to '1' on all products.
 * WaDisableAsyncFlipPerfMode:snb,ivb,hsw,vlv

 /* Required for the hardware to program scanline values for waiting */

 /* From the Sandybridge PRM, volume 1 part 3, page 24:
 * "If this bit is set, STCunit will have LRA as replacement
 * policy. [...] This bit must be reset. LRA replacement
 * policy is not supported."

 /* This is not explicitly set for GEN6, so read the register.
 * see intel_ring_mi_set_context() for why we care.
 * TODO: consider explicitly setting the bit for GEN5

/* NB: In order to be able to do semaphore MBOX updates for varying number
 * of rings, it's easiest if we round up each individual update to a
 * multiple of 2 (since ring updates must always be a multiple of 2)
 * even though the actual update only requires 3 dwords.

 * gen6_add_request - Update the semaphore mailbox registers
 * @ring - ring that is adding a request
 * @seqno - return seqno stuck into the ring
 * Update the mailbox registers in the *other* rings with the current seqno.
 * This acts like a signal in the canonical semaphore.

 * intel_ring_sync - sync the waiter to the signaller on seqno
 * @waiter - ring that is waiting
 * @signaller - ring which has, or will signal
 * @seqno - seqno which the waiter will block on

 /* Throughout all of the GEM code, seqno passed implies our current
 * seqno is >= the last seqno executed. However for hardware the
 * comparison is strictly greater than.
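One way to reconcile the two conventions in the comment above is to hand the hardware a wait value one lower than the seqno that software considers passed. The sketch below models this under two assumptions that the comment does not state: that the hardware compare is a plain unsigned greater-than, and that seqno is at least 1 so the subtraction cannot underflow. The helper names `hw_wait_satisfied` and `hw_wait_value` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed model of the hardware semaphore compare: the wait completes
 * only when the mailbox value is strictly greater than the wait value. */
static bool hw_wait_satisfied(uint32_t mailbox, uint32_t wait_value)
{
	return mailbox > wait_value;
}

/* Convert a software seqno ("passed" once current >= seqno) into the
 * value to program for the strict hardware compare.
 * Assumes seqno >= 1. */
static uint32_t hw_wait_value(uint32_t seqno)
{
	return seqno - 1;
}
```

With this adjustment a mailbox value equal to the requested seqno releases the waiter, which matches the software meaning of "passed". Programming the seqno unmodified would leave the waiter blocked until the following request.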
 /* If seqno wrap happened, omit the wait with no-ops */

 /* For Ironlake, MI_USER_INTERRUPT was deprecated and apparently
 * incoherent with writes to memory, i.e. completely fubar,
 * so we need to use PIPE_NOTIFY instead.
 * However, we also need to workaround the qword write
 * incoherence by flushing the 6 PIPE_NOTIFY buffers out to
 * memory before requesting an interrupt.

 /* Workaround to force correct ordering between irq and seqno writes on
 * ivb (and maybe also on snb) by reading from a CS register (like
 * ACTHD) before reading the status page. */

 /* The ring status page addresses are no longer next to the rest of
 * the ring registers as of gen7.

 /* Flush the TLB for this page */
 DRM_ERROR(
"%s: wait for SyncFlush to complete for TLB invalidation timed out\n",
 /* It looks like we need to prevent the gt from suspending while waiting
 * for a notify irq, otherwise irqs seem to get lost on at least the

/* Just userspace ABI convention to limit the wa batch bo to a reasonable size */

 /* Blit the batch (which has now all relocs applied) to the stable batch
 * scratch bo area (so that the CS never stumbles over its tlb

 /* Workaround an erratum on the i830 which causes a hang if
 * the TAIL pointer points to within the last 2 cachelines

 /* Disable the ring buffer. The ring must be idle at this point */

 /* Consume this request in case we need more space than
 * is available and so need to prevent a race between
 * updating last_retired_head and direct reads of
 * I915_RING_HEAD. It also provides a nice sanity check.

 /* With GEM the hangcheck timer should kick us out of the loop,
 * leaving it early runs the risk of corrupting GEM state (due
 * to running on almost untested codepaths). But on resume
 * timers don't work yet, so prevent a complete hang in that
 * case by choosing an insanely large timeout. */

 /* We need to add any requests required to flush the objects and ring */

 /* Wait upon the last request to be completed */

 /* Preallocate the olr before touching the ring */

 /* Every tail move must follow the sequence below */

 /* Disable notification that the ring is IDLE. The GT
 * will then assume that it is busy and bring it out of rc6.

 /* Clear the context id. Here be magic! */

 /* Wait for the ring not to be idle, i.e. for it to wake up. */

 /* Now that the ring is fully powered up, update the tail */

 /* Let the ring send IDLE messages to the GT again,
 * and so let it sleep to conserve power when idle.
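The five tail-move steps listed above can be traced with a small software model. This is only an illustration of the ordering, not the driver's register sequence. `struct model_ring` and its fields stand in for hardware state and are hypothetical, as are the poll budget and the choice to bail out without moving the tail when the ring never wakes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software stand-in for the ring's power and tail state. */
struct model_ring {
	bool idle_msg_disabled; /* true while IDLE notifications are masked */
	uint64_t context_id;
	int polls_until_awake;  /* polls the modelled ring needs to wake up */
	uint32_t tail;
};

/* The modelled ring reports idle until IDLE messages have been masked
 * and it has been polled polls_until_awake times. */
static bool model_ring_is_idle(struct model_ring *r)
{
	if (!r->idle_msg_disabled)
		return true;
	if (r->polls_until_awake > 0) {
		r->polls_until_awake--;
		return true;
	}
	return false;
}

/* Follows the order of the comments above. Returns 0 on success, or -1
 * if the ring is still idle after max_polls polls (assumed policy). */
static int model_write_tail(struct model_ring *r, uint32_t value, int max_polls)
{
	/* 1. Stop IDLE notifications so the GT treats the ring as busy. */
	r->idle_msg_disabled = true;

	/* 2. Clear the context id. */
	r->context_id = 0;

	/* 3. Wait for the ring to stop reporting idle. */
	while (model_ring_is_idle(r)) {
		if (max_polls-- == 0) {
			r->idle_msg_disabled = false;
			return -1;
		}
	}

	/* 4. The ring is powered up, so the tail can be moved. */
	r->tail = value;

	/* 5. Allow IDLE messages again so the ring can sleep when idle. */
	r->idle_msg_disabled = false;
	return 0;
}
```

The point of the ordering is that the tail write in step 4 only happens once the ring has been observed awake, and that IDLE notifications are restored afterwards on every path.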
 * Bspec vol 1c.5 - video engine command streamer:
 * "If ENABLED, all TLBs will be invalidated once the flush
 * operation is complete. This bit is only valid when the
 * Post-Sync Operation field is a value of 1h or 3h."

 /* bit0-7 is the length on GEN6+ */

 /* bit0-7 is the length on GEN6+ */

/* Blitter support (SandyBridge+) */

 * Bspec vol 1c.3 - blitter engine command streamer:
 * "If ENABLED, all TLBs will be invalidated once the flush
 * operation is complete. This bit is only valid when the
 * Post-Sync Operation field is a value of 1h or 3h."

 /* Workaround batchbuffer to combat CS tlb bug. */

 /* non-kms not supported on gen6+ */

 /* Note: gem is not supported on gen5/ilk without kms (the corresponding
 * gem_init ioctl returns with -ENODEV). Hence we do not need to set up
 * the special gen5 functions. */

 /* gen6 bsd needs a special wa for tail updates */