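In Linux, process scheduling happens after every interrupt (the timer interrupt and other interrupts), or when a process gives up the CPU (by explicitly calling schedule()). Today I tried to find where the context switch happens in the Linux source code (kernel version 2.6.23). (I think I checked this a few years ago, but I'm no longer sure. I was looking at the sparc arch at the time.) I started from main_timer_handler (in arch/x86_64/kernel/time.c), but couldn't find it there.
I finally found it in ./arch/x86_64/kernel/entry.S: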
ENTRY(common_interrupt)
	XCPT_FRAME
	interrupt do_IRQ
	/* 0(%rsp): oldrsp-ARGOFFSET */
ret_from_intr:
	cli
	TRACE_IRQS_OFF
	decl %gs:pda_irqcount
	leaveq
	CFI_DEF_CFA_REGISTER rsp
	CFI_ADJUST_CFA_OFFSET -8
exit_intr:
	GET_THREAD_INFO(%rcx)
	testl $3,CS-ARGOFFSET(%rsp)
	je retint_kernel
	...(omit)
	GET_THREAD_INFO(%rcx)
	jmp retint_check
#ifdef CONFIG_PREEMPT
	/* Returning to kernel space. Check if we need preemption */
	/* rcx: threadinfo. interrupts off. */
ENTRY(retint_kernel)
	cmpl $0,threadinfo_preempt_count(%rcx)
	jnz retint_restore_args
	bt $TIF_NEED_RESCHED,threadinfo_flags(%rcx)
	jnc retint_restore_args
	bt $9,EFLAGS-ARGOFFSET(%rsp)	/* interrupts off? */
	jnc retint_restore_args
	call preempt_schedule_irq
	jmp exit_intr
#endif
	CFI_ENDPROC
END(common_interrupt)
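As far as I can read the retint_kernel block above, it makes three checks before calling preempt_schedule_irq. Here is a small C model of just that decision, as a sketch of my reading rather than kernel code. The names demo_thread_info, demo_should_preempt and DEMO_EFLAGS_IF are made up for illustration; the only hard fact borrowed from the hardware is that bit 9 of EFLAGS is the interrupt-enable flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 9 of EFLAGS is IF (interrupt enable), the bit tested by "bt $9,EFLAGS-ARGOFFSET(%rsp)". */
#define DEMO_EFLAGS_IF (1u << 9)

/* Hypothetical stand-in for the two thread_info fields that the assembly reads. */
struct demo_thread_info {
	int preempt_count;	/* non-zero: preemption is currently disabled */
	bool need_resched;	/* models the TIF_NEED_RESCHED flag */
};

/*
 * Returns true when the modeled retint_kernel path would reach
 * "call preempt_schedule_irq", false when it would jump to
 * retint_restore_args instead.
 */
bool demo_should_preempt(const struct demo_thread_info *ti, uint64_t saved_eflags)
{
	if (ti->preempt_count != 0)		/* cmpl $0,... ; jnz */
		return false;
	if (!ti->need_resched)			/* bt $TIF_NEED_RESCHED,... ; jnc */
		return false;
	if (!(saved_eflags & DEMO_EFLAGS_IF))	/* bt $9,... ; jnc */
		return false;
	return true;
}
```

So in this model the interrupted kernel code is only preempted when preemption is not disabled, a reschedule has been requested, and the interrupted code was running with interrupts enabled.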
At the end of the ISR there is a call to preempt_schedule_irq! And preempt_schedule_irq is defined in kernel/sched.c as follows (it calls schedule() in the middle):
/*
 * this is the entry point to schedule() from kernel preemption
 * off of irq context.
 * Note, that this is called and return with irqs disabled. This will
 * protect us against recursive calling from irq.
 */
asmlinkage void __sched preempt_schedule_irq(void)
{
	struct thread_info *ti = current_thread_info();
#ifdef CONFIG_PREEMPT_BKL
	struct task_struct *task = current;
	int saved_lock_depth;
#endif
	/* Catch callers which need to be fixed */
	BUG_ON(ti->preempt_count || !irqs_disabled());
need_resched:
	add_preempt_count(PREEMPT_ACTIVE);
	/*
	 * We keep the big kernel semaphore locked, but we
	 * clear ->lock_depth so that schedule() doesnt
	 * auto-release the semaphore:
	 */
#ifdef CONFIG_PREEMPT_BKL
	saved_lock_depth = task->lock_depth;
	task->lock_depth = -1;
#endif
	local_irq_enable();
	schedule();
	local_irq_disable();
#ifdef CONFIG_PREEMPT_BKL
	task->lock_depth = saved_lock_depth;
#endif
	sub_preempt_count(PREEMPT_ACTIVE);
	/* we could miss a preemption opportunity between schedule and now */
	barrier();
	if (unlikely(test_thread_flag(TIF_NEED_RESCHED)))
		goto need_resched;
}
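So I found where scheduling happens, but my question is: where in the source code does the context switch actually happen? For a context switch, the stack, the mm settings, and the registers have to be switched, and the PC (program counter) has to be set for the new task. Where can I find that source code? I followed schedule() -> context_switch() -> switch_to(). Below is the context_switch function, which calls switch_to() (kernel/sched.c):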
/*
 * context_switch - switch to the new MM and the new
 * thread's register state.
 */
static inline void
context_switch(struct rq *rq, struct task_struct *prev,
	       struct task_struct *next)
{
	struct mm_struct *mm, *oldmm;

	prepare_task_switch(rq, prev, next);
	mm = next->mm;
	oldmm = prev->active_mm;
	/*
	 * For paravirt, this is coupled with an exit in switch_to to
	 * combine the page table reload and the switch backend into
	 * one hypercall.
	 */
	arch_enter_lazy_cpu_mode();
	if (unlikely(!mm)) {
		next->active_mm = oldmm;
		atomic_inc(&oldmm->mm_count);
		enter_lazy_tlb(oldmm, next);
	} else
		switch_mm(oldmm, mm, next);
	if (unlikely(!prev->mm)) {
		prev->active_mm = NULL;
		rq->prev_mm = oldmm;
	}
	/*
	 * Since the runqueue lock will be released by the next
	 * task (which is an invalid locking op but in the case
	 * of the scheduler it's an obvious special-case), so we
	 * do an early lockdep release here:
	 */
#ifndef __ARCH_WANT_UNLOCKED_CTXSW
	spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
#endif
	/* Here we just switch the register state and the stack. */
	switch_to(prev, next, prev); // <---- this line
	barrier();
	/*
	 * this_rq must be evaluated again because prev may have moved
	 * CPUs since it called schedule(), thus the 'rq' on its stack
	 * frame will be invalid.
	 */
	finish_task_switch(this_rq(), prev);
}
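'switch_to' is assembly code under include/asm-x86_64/system.h. My question is: does the processor switch to the new task inside the switch_to() function? If so, is the code 'barrier(); finish_task_switch(this_rq(), prev);' run at some other time? By the way, this is in interrupt context, so if switch_to() is simply the end of this ISR, who finishes this interrupt? Or, if finish_task_switch does run, how does the new task get the CPU? I would really appreciate it if someone could explain and clarify this for me.
Almost all of the work for a context switch is done by the normal SYSCALL/SYSRET mechanism. The process pushes its state onto the stack of 'current', the currently running process. Calling do_sched_yield just changes the value of current, so the return simply restores the state of a different task.
Preemption is trickier because it doesn't happen at a normal boundary. The preemption code has to save and restore all of the task state, which is slow. That is why non-RT kernels avoid preempting. The arch-specific switch_to code saves all of the previous task's state and sets up the next task's state so that SYSRET can run the next task correctly. There are no magic jumps or anything in the code; it just sets up the hardware for userspace.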
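To make the "does the code after switch_to() ever run?" part concrete, here is a minimal user-space sketch, not kernel code. It uses the ucontext calls from glibc (getcontext, makecontext, swapcontext) as a rough stand-in for switch_to; the names task_a, task_b, note and run_demo are made up for illustration. The idea it tries to show is that a context suspended inside a switch call resumes at the statement right after that call, but only once some other context switches back to it.

```c
#include <stdio.h>
#include <string.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t ctx_main, ctx_a, ctx_b;
static char stack_a[STACK_SIZE], stack_b[STACK_SIZE];
static char trace[64];	/* records the order in which the steps run */

/* Append one step name to the trace buffer. */
static void note(const char *s)
{
	strncat(trace, s, sizeof(trace) - strlen(trace) - 1);
}

static void task_a(void)
{
	note("A1 ");
	swapcontext(&ctx_a, &ctx_b);	/* plays the role of switch_to(): A is suspended here */
	note("A2 ");			/* runs only after B switches back to A */
}

static void task_b(void)
{
	note("B1 ");
	swapcontext(&ctx_b, &ctx_a);	/* B is suspended here, A resumes after its own swap */
	note("B2 ");			/* runs once A has finished and B is resumed */
}

/* Runs both tasks to completion and returns the recorded order of steps. */
const char *run_demo(void)
{
	trace[0] = '\0';

	getcontext(&ctx_a);
	ctx_a.uc_stack.ss_sp = stack_a;
	ctx_a.uc_stack.ss_size = sizeof(stack_a);
	ctx_a.uc_link = &ctx_b;		/* when task_a returns, resume B */
	makecontext(&ctx_a, task_a, 0);

	getcontext(&ctx_b);
	ctx_b.uc_stack.ss_sp = stack_b;
	ctx_b.uc_stack.ss_size = sizeof(stack_b);
	ctx_b.uc_link = &ctx_main;	/* when task_b returns, resume the caller */
	makecontext(&ctx_b, task_b, 0);

	swapcontext(&ctx_main, &ctx_a);	/* start A; we come back here when B ends */
	return trace;
}
```

Calling run_demo() from a main() and printing the result gives the order A1 B1 A2 B2. Read against context_switch(): after switch_to() the CPU is already running on the next task's stack, and the barrier() and finish_task_switch() that execute at that point are the ones following that task's own earlier switch_to() call. The previous task sees its copy of those lines run only when it is scheduled in again.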