
Where is the context switching finally happening in the Linux kernel source?

In Linux, process scheduling occurs after an interrupt (the timer interrupt or any other interrupt) or when a process relinquishes the CPU by explicitly calling schedule(). Today I was trying to see where the context switch occurs in the Linux source (kernel version 2.6.23).
(I think I checked this several years ago, but I'm not sure now; I was looking at the SPARC arch then.)
I looked for it starting from main_timer_handler (in arch/x86_64/kernel/time.c), but couldn't find it.

Finally I found it in ./arch/x86_64/kernel/entry.S:

    ENTRY(common_interrupt)
        XCPT_FRAME
        interrupt do_IRQ
        /* 0(%rsp): oldrsp-ARGOFFSET */
    ret_from_intr:
        cli
        TRACE_IRQS_OFF
        decl %gs:pda_irqcount
        leaveq
        CFI_DEF_CFA_REGISTER    rsp
        CFI_ADJUST_CFA_OFFSET   -8
    exit_intr:
        GET_THREAD_INFO(%rcx)
        testl $3,CS-ARGOFFSET(%rsp)
        je retint_kernel

    ... (omitted)
        GET_THREAD_INFO(%rcx)
        jmp retint_check

    #ifdef CONFIG_PREEMPT
        /* Returning to kernel space. Check if we need preemption */
        /* rcx:  threadinfo. interrupts off. */
    ENTRY(retint_kernel)
        cmpl $0,threadinfo_preempt_count(%rcx)
        jnz  retint_restore_args
        bt  $TIF_NEED_RESCHED,threadinfo_flags(%rcx)
        jnc  retint_restore_args
        bt   $9,EFLAGS-ARGOFFSET(%rsp)  /* interrupts off? */
        jnc  retint_restore_args
        call preempt_schedule_irq
        jmp exit_intr
    #endif

        CFI_ENDPROC
    END(common_interrupt)

At the end of the ISR there is a call to preempt_schedule_irq, and preempt_schedule_irq is defined in kernel/sched.c as below (it calls schedule() in the middle).

    /*
     * this is the entry point to schedule() from kernel preemption
     * off of irq context.
     * Note, that this is called and return with irqs disabled. This will
     * protect us against recursive calling from irq.
     */
    asmlinkage void __sched preempt_schedule_irq(void)
    {
        struct thread_info *ti = current_thread_info();
    #ifdef CONFIG_PREEMPT_BKL
        struct task_struct *task = current;
        int saved_lock_depth;
    #endif
        /* Catch callers which need to be fixed */
        BUG_ON(ti->preempt_count || !irqs_disabled());

    need_resched:
        add_preempt_count(PREEMPT_ACTIVE);
        /*
         * We keep the big kernel semaphore locked, but we
         * clear ->lock_depth so that schedule() doesnt
         * auto-release the semaphore:
         */
    #ifdef CONFIG_PREEMPT_BKL
        saved_lock_depth = task->lock_depth;
        task->lock_depth = -1;
    #endif
        local_irq_enable();
        schedule();
        local_irq_disable();
    #ifdef CONFIG_PREEMPT_BKL
        task->lock_depth = saved_lock_depth;
    #endif
        sub_preempt_count(PREEMPT_ACTIVE);

        /* we could miss a preemption opportunity between schedule and now */
        barrier();
        if (unlikely(test_thread_flag(TIF_NEED_RESCHED)))
            goto need_resched;
    }

So I found where the scheduling occurs, but my question is: where in the source code does the actual context switch happen? For a context switch, the stack, mm settings, and registers have to be switched, and the PC (program counter) has to be set to the new task. Where can I find the source code for that? I followed schedule() --> context_switch() --> switch_to(). Below is the context_switch() function, which calls switch_to() (kernel/sched.c):

    /*
     * context_switch - switch to the new MM and the new
     * thread's register state.
     */
    static inline void
    context_switch(struct rq *rq, struct task_struct *prev,
               struct task_struct *next)
    {
        struct mm_struct *mm, *oldmm;

        prepare_task_switch(rq, prev, next);
        mm = next->mm;
        oldmm = prev->active_mm;
        /*
         * For paravirt, this is coupled with an exit in switch_to to
         * combine the page table reload and the switch backend into
         * one hypercall.
         */
        arch_enter_lazy_cpu_mode();

        if (unlikely(!mm)) {
            next->active_mm = oldmm;
            atomic_inc(&oldmm->mm_count);
            enter_lazy_tlb(oldmm, next);
        } else
            switch_mm(oldmm, mm, next);

        if (unlikely(!prev->mm)) {
            prev->active_mm = NULL;
            rq->prev_mm = oldmm;
        }
        /*
         * Since the runqueue lock will be released by the next
         * task (which is an invalid locking op but in the case
         * of the scheduler it's an obvious special-case), so we
         * do an early lockdep release here:
         */
    #ifndef __ARCH_WANT_UNLOCKED_CTXSW
        spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
    #endif

        /* Here we just switch the register state and the stack. */
        switch_to(prev, next, prev);   // <---- this line

        barrier();
        /*
         * this_rq must be evaluated again because prev may have moved
         * CPUs since it called schedule(), thus the 'rq' on its stack
         * frame will be invalid.
         */
        finish_task_switch(this_rq(), prev);
    }

The switch_to is assembly code under include/asm-x86_64/system.h. My question is: is the processor switched to the new task inside the switch_to() function? Then, do the lines 'barrier(); finish_task_switch(this_rq(), prev);' run at some other time later? By the way, this was in interrupt context, so if switch_to() is just the end of this ISR, who finishes this interrupt? Or, if finish_task_switch() runs, how is the CPU occupied by the new task? I would really appreciate it if someone could explain and clarify things for me.
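
(To make the "runs later" part of the question concrete, here is a small userspace analogy using ucontext(3), not anything from the kernel; the names are made up for the illustration. The line after swapcontext() only executes once some other context switches back to us, which is exactly the position of the lines after switch_to().)

    /* Userspace analogy only -- ucontext(3), not the kernel mechanism.
     * swapcontext() saves the caller's registers and stack pointer and
     * resumes another context; the code AFTER the call runs only when
     * somebody later switches back to us. */
    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, task_ctx;
    static char task_stack[64 * 1024];

    static void task(void)
    {
        printf("task: running on its own stack\n");
        /* Switch back; main resumes right after its swapcontext(). */
        swapcontext(&task_ctx, &main_ctx);
    }

    int main(void)
    {
        getcontext(&task_ctx);
        task_ctx.uc_stack.ss_sp = task_stack;
        task_ctx.uc_stack.ss_size = sizeof(task_stack);
        task_ctx.uc_link = &main_ctx;
        makecontext(&task_ctx, task, 0);

        swapcontext(&main_ctx, &task_ctx); /* like switch_to(prev, next, prev) */
        /* Analogue of barrier(); finish_task_switch(): this only runs
         * after the other context has switched back to us. */
        printf("main: resumed after the task switched back\n");
        return 0;
    }

When run, the task line prints first and the main line second: main's final printf sits textually right after swapcontext(), just as barrier()/finish_task_switch() sit after switch_to(), but it executes only once control comes back.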

Almost all of the work for a context switch is done by the normal SYSCALL/SYSRET mechanism. The process pushes its state onto the stack of "current", the currently running process. Calling do_sched_yield just changes the value of current, so the return simply restores the state of a different task.
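
The same idea can be modeled in userspace: a toy round-robin "scheduler" where yield() does nothing but change which entry is current and swap contexts. This is only an analogy built on ucontext(3); yield(), task_body(), and the arrays are made-up names for the illustration, not kernel code:

    /* Toy round-robin "scheduler": yield() just changes which entry is
     * current, and the context restore then resumes that task's saved
     * state. */
    #include <stdio.h>
    #include <ucontext.h>

    #define NTASKS 2

    static ucontext_t main_ctx, task_ctx[NTASKS];
    static char stacks[NTASKS][64 * 1024];
    static int current;                   /* index of the running task */

    static void yield(void)
    {
        int prev = current;
        current = (current + 1) % NTASKS; /* "scheduling": pick the next task */
        swapcontext(&task_ctx[prev], &task_ctx[current]); /* the "switch" */
    }

    static void task_body(void)
    {
        for (int i = 0; i < 3; i++) {
            printf("task %d, iteration %d\n", current, i);
            yield();
        }
        /* Falling off the end follows uc_link back to main. */
    }

    int main(void)
    {
        for (int t = 0; t < NTASKS; t++) {
            getcontext(&task_ctx[t]);
            task_ctx[t].uc_stack.ss_sp = stacks[t];
            task_ctx[t].uc_stack.ss_size = sizeof(stacks[t]);
            task_ctx[t].uc_link = &main_ctx;
            makecontext(&task_ctx[t], task_body, 0);
        }
        swapcontext(&main_ctx, &task_ctx[0]); /* run task 0 first */
        return 0;
    }

Running it interleaves the two tasks' iterations on a single thread, because each swapcontext() restores the saved register state, including the stack pointer and program counter, of whichever task was picked as current.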

Preemption gets trickier, since it doesn't happen at a normal boundary. The preemption code has to save and restore all of the task state, which is slow. That's why non-RT kernels avoid doing preemption. The arch-specific switch_to code is what saves all of the prev task's state and sets up the next task's state so that SYSRET will run the next task correctly. There are no magic jumps or anything in the code; it just sets up the hardware for userspace.
