Where is the context switching finally happening in the Linux kernel source?
In Linux, process scheduling occurs after an interrupt returns (the timer interrupt or any other interrupt) or when a process voluntarily relinquishes the CPU (by explicitly calling schedule()). Today I was trying to see where the context switch happens in the Linux source (kernel version 2.6.23).
(I think I checked this several years ago, but I'm not sure now. I was looking at the sparc arch then.)
I started from main_timer_handler (in arch/x86_64/kernel/time.c), but couldn't find it there.
Finally I found it in ./arch/x86_64/kernel/entry.S:
ENTRY(common_interrupt)
	XCPT_FRAME
	interrupt do_IRQ
	/* 0(%rsp): oldrsp-ARGOFFSET */
ret_from_intr:
	cli
	TRACE_IRQS_OFF
	decl %gs:pda_irqcount
	leaveq
	CFI_DEF_CFA_REGISTER rsp
	CFI_ADJUST_CFA_OFFSET -8
exit_intr:
	GET_THREAD_INFO(%rcx)
	testl $3,CS-ARGOFFSET(%rsp)
	je retint_kernel
	...(omitted)
	GET_THREAD_INFO(%rcx)
	jmp retint_check

#ifdef CONFIG_PREEMPT
	/* Returning to kernel space. Check if we need preemption */
	/* rcx: threadinfo. interrupts off. */
ENTRY(retint_kernel)
	cmpl $0,threadinfo_preempt_count(%rcx)
	jnz retint_restore_args
	bt $TIF_NEED_RESCHED,threadinfo_flags(%rcx)
	jnc retint_restore_args
	bt $9,EFLAGS-ARGOFFSET(%rsp)	/* interrupts off? */
	jnc retint_restore_args
	call preempt_schedule_irq
	jmp exit_intr
#endif
	CFI_ENDPROC
END(common_interrupt)
At the end of the ISR path there is a call to preempt_schedule_irq, which is defined in kernel/sched.c as below (it calls schedule() in the middle):
/*
 * this is the entry point to schedule() from kernel preemption
 * off of irq context.
 * Note, that this is called and return with irqs disabled. This will
 * protect us against recursive calling from irq.
 */
asmlinkage void __sched preempt_schedule_irq(void)
{
	struct thread_info *ti = current_thread_info();
#ifdef CONFIG_PREEMPT_BKL
	struct task_struct *task = current;
	int saved_lock_depth;
#endif
	/* Catch callers which need to be fixed */
	BUG_ON(ti->preempt_count || !irqs_disabled());

need_resched:
	add_preempt_count(PREEMPT_ACTIVE);
	/*
	 * We keep the big kernel semaphore locked, but we
	 * clear ->lock_depth so that schedule() doesnt
	 * auto-release the semaphore:
	 */
#ifdef CONFIG_PREEMPT_BKL
	saved_lock_depth = task->lock_depth;
	task->lock_depth = -1;
#endif
	local_irq_enable();
	schedule();
	local_irq_disable();
#ifdef CONFIG_PREEMPT_BKL
	task->lock_depth = saved_lock_depth;
#endif
	sub_preempt_count(PREEMPT_ACTIVE);

	/* we could miss a preemption opportunity between schedule and now */
	barrier();
	if (unlikely(test_thread_flag(TIF_NEED_RESCHED)))
		goto need_resched;
}
So I found where scheduling occurs, but my question is: where in the source code does the actual context switch happen? For a context switch, the stack, mm settings, and registers must be switched, and the PC (program counter) must be set to the new task. Where can I find the source code for that? I followed schedule() --> context_switch() --> switch_to(). Below is the context_switch function, which calls switch_to() (kernel/sched.c):
/*
 * context_switch - switch to the new MM and the new
 * thread's register state.
 */
static inline void
context_switch(struct rq *rq, struct task_struct *prev,
	       struct task_struct *next)
{
	struct mm_struct *mm, *oldmm;

	prepare_task_switch(rq, prev, next);
	mm = next->mm;
	oldmm = prev->active_mm;
	/*
	 * For paravirt, this is coupled with an exit in switch_to to
	 * combine the page table reload and the switch backend into
	 * one hypercall.
	 */
	arch_enter_lazy_cpu_mode();

	if (unlikely(!mm)) {
		next->active_mm = oldmm;
		atomic_inc(&oldmm->mm_count);
		enter_lazy_tlb(oldmm, next);
	} else
		switch_mm(oldmm, mm, next);

	if (unlikely(!prev->mm)) {
		prev->active_mm = NULL;
		rq->prev_mm = oldmm;
	}
	/*
	 * Since the runqueue lock will be released by the next
	 * task (which is an invalid locking op but in the case
	 * of the scheduler it's an obvious special-case), so we
	 * do an early lockdep release here:
	 */
#ifndef __ARCH_WANT_UNLOCKED_CTXSW
	spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
#endif

	/* Here we just switch the register state and the stack. */
	switch_to(prev, next, prev);	/* <---- this line */

	barrier();
	/*
	 * this_rq must be evaluated again because prev may have moved
	 * CPUs since it called schedule(), thus the 'rq' on its stack
	 * frame will be invalid.
	 */
	finish_task_switch(this_rq(), prev);
}
'switch_to' is assembly code under include/asm-x86_64/system.h. My question is: does the processor switch to the new task inside the switch_to() function? Then do 'barrier(); finish_task_switch(this_rq(), prev);' run at some later time? By the way, this was in interrupt context, so if switch_to() is effectively the end of this ISR, who finishes the interrupt? Or, if finish_task_switch does run, how does the new task come to occupy the CPU? I would really appreciate it if someone could explain and clarify this for me.
Almost all of the work for a context switch is done by the normal SYSCALL/SYSRET mechanism. The process pushes its state onto the stack of "current", the currently running process. Calling do_sched_yield just changes the value of current, so the return simply restores the state of a different task.
Preemption is trickier, since it doesn't happen at a normal boundary. The preemption code has to save and restore all of the task state, which is slow. That's why non-RT kernels avoid preemption. The arch-specific switch_to code is what saves all of the prev task's state and sets up the next task's state so that SYSRET will run the next task correctly. There are no magic jumps or anything in the code; it is just setting up the hardware for userspace.