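Before getting into scheduling domain initialization, we have to cover some basics of multi-CPU boot. Since our focus is the scheduling domain, we only discuss what is relevant to scheduling.
The four states possible, present, online and active are recorded in the variables __cpu_possible_mask, __cpu_present_mask, __cpu_online_mask and __cpu_active_mask respectively. The four functions below set these states.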
static inline void
set_cpu_possible(unsigned int cpu, bool possible)
{
    if (possible)
        cpumask_set_cpu(cpu, &__cpu_possible_mask);
    else
        cpumask_clear_cpu(cpu, &__cpu_possible_mask);
}

static inline void
set_cpu_present(unsigned int cpu, bool present)
{
    if (present)
        cpumask_set_cpu(cpu, &__cpu_present_mask);
    else
        cpumask_clear_cpu(cpu, &__cpu_present_mask);
}

static inline void
set_cpu_online(unsigned int cpu, bool online)
{
    if (online)
        cpumask_set_cpu(cpu, &__cpu_online_mask);
    else
        cpumask_clear_cpu(cpu, &__cpu_online_mask);
}

static inline void
set_cpu_active(unsigned int cpu, bool active)
{
    if (active)
        cpumask_set_cpu(cpu, &__cpu_active_mask);
    else
        cpumask_clear_cpu(cpu, &__cpu_active_mask);
}
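To make the idea of "one bit per CPU in a mask" concrete, here is a minimal userspace sketch. It is not the kernel's cpumask implementation: struct toy_cpumask, toy_mask_assign, toy_mask_test and the NR_CPUS value of 64 are all assumptions invented for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NR_CPUS 64 /* assumed upper bound, chosen only for this sketch */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS ((NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Hypothetical stand-in for struct cpumask: one bit per CPU number. */
struct toy_cpumask {
    unsigned long bits[MASK_WORDS];
};

/* Set or clear the bit for `cpu`, mirroring the if/else in set_cpu_*(). */
static void toy_mask_assign(struct toy_cpumask *m, unsigned int cpu, bool on)
{
    assert(cpu < NR_CPUS);
    unsigned long bit = 1UL << (cpu % BITS_PER_LONG);

    if (on)
        m->bits[cpu / BITS_PER_LONG] |= bit;
    else
        m->bits[cpu / BITS_PER_LONG] &= ~bit;
}

/* Return true if the bit for `cpu` is set. */
static bool toy_mask_test(const struct toy_cpumask *m, unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    return (m->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1UL;
}
```

With four such masks, one per state, "CPU 3 is online" is simply a set bit 3 in the online mask.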
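possible marks a CPU that may exist. The first CPU to start is called the boot CPU. This CPU certainly exists, otherwise the system could not boot, so the boot CPU sets its possible state directly during initialization. For the boot CPU, the present, online and active states are of course set directly as well.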
start_kernel->boot_cpu_init
void __init boot_cpu_init(void)
{
    int cpu = smp_processor_id();

    /* Mark the boot cpu "present", "online" etc for SMP and UP case */
    set_cpu_online(cpu, true);
    set_cpu_active(cpu, true);
    set_cpu_present(cpu, true);
    set_cpu_possible(cpu, true);
}
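The other CPUs, however, must be described in the DTS, otherwise the boot CPU would not know they exist. Below is the CPU topology description excerpted from arch/arm64/boot/dts/arm/juno.dts in the kernel source: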
cpus {
    #address-cells = <2>;
    #size-cells = <0>;

    cpu-map {
        cluster0 {
            core0 {
                cpu = <&A57_0>;
            };
            core1 {
                cpu = <&A57_1>;
            };
        };

        cluster1 {
            core0 {
                cpu = <&A53_0>;
            };
            core1 {
                cpu = <&A53_1>;
            };
            core2 {
                cpu = <&A53_2>;
            };
            core3 {
                cpu = <&A53_3>;
            };
        };
    };

    idle-states {
        entry-method = "arm,psci";

        CPU_SLEEP_0: cpu-sleep-0 {
            compatible = "arm,idle-state";
            arm,psci-suspend-param = <0x0010000>;
            local-timer-stop;
            entry-latency-us = <300>;
            exit-latency-us = <1200>;
            min-residency-us = <2000>;
        };

        CLUSTER_SLEEP_0: cluster-sleep-0 {
            compatible = "arm,idle-state";
            arm,psci-suspend-param = <0x1010000>;
            local-timer-stop;
            entry-latency-us = <400>;
            exit-latency-us = <1200>;
            min-residency-us = <2500>;
        };
    };

    A57_0: cpu@0 {
        compatible = "arm,cortex-a57","arm,armv8";
        reg = <0x0 0x0>;
        device_type = "cpu";
        enable-method = "psci";
        next-level-cache = <&A57_L2>;
        clocks = <&scpi_dvfs 0>;
        cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
    };

    A57_1: cpu@1 {
        compatible = "arm,cortex-a57","arm,armv8";
        reg = <0x0 0x1>;
        device_type = "cpu";
        enable-method = "psci";
        next-level-cache = <&A57_L2>;
        clocks = <&scpi_dvfs 0>;
        cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
    };

    A53_0: cpu@100 {
        compatible = "arm,cortex-a53","arm,armv8";
        reg = <0x0 0x100>;
        device_type = "cpu";
        enable-method = "psci";
        next-level-cache = <&A53_L2>;
        clocks = <&scpi_dvfs 1>;
        cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
    };

    A53_1: cpu@101 {
        compatible = "arm,cortex-a53","arm,armv8";
        reg = <0x0 0x101>;
        device_type = "cpu";
        enable-method = "psci";
        next-level-cache = <&A53_L2>;
        clocks = <&scpi_dvfs 1>;
        cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
    };

    A53_2: cpu@102 {
        compatible = "arm,cortex-a53","arm,armv8";
        reg = <0x0 0x102>;
        device_type = "cpu";
        enable-method = "psci";
        next-level-cache = <&A53_L2>;
        clocks = <&scpi_dvfs 1>;
        cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
    };

    A53_3: cpu@103 {
        compatible = "arm,cortex-a53","arm,armv8";
        reg = <0x0 0x103>;
        device_type = "cpu";
        enable-method = "psci";
        next-level-cache = <&A53_L2>;
        clocks = <&scpi_dvfs 1>;
        cpu-idle-states = <&CPU_SLEEP_0 &CLUSTER_SLEEP_0>;
    };

    A57_L2: l2-cache0 {
        compatible = "cache";
    };

    A53_L2: l2-cache1 {
        compatible = "cache";
    };
};
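This SoC has two clusters: Cluster0 integrates two A57 cores and Cluster1 integrates four A53 cores. No NUMA node or hyper-threading is defined here, so we will not discuss NUMA nodes or hyper-threading any further.
Let's first list the code that sets the possible state of the other CPUs: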
setup_arch->smp_init_cpus
void __init smp_init_cpus(void)
{
    int i;

    if (acpi_disabled)
        of_parse_and_init_cpus();
    else
        /*
         * do a walk of MADT to determine how many CPUs
         * we have including disabled CPUs, and get information
         * we need for SMP init
         */
        acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
                              acpi_parse_gic_cpu_interface, 0);

    if (cpu_count > nr_cpu_ids)
        pr_warn("Number of cores (%d) exceeds configured maximum of %d - clipping\n",
                cpu_count, nr_cpu_ids);

    if (!bootcpu_valid) {
        pr_err("missing boot CPU MPIDR, not enabling secondaries\n");
        return;
    }

    /*
     * We need to set the cpu_logical_map entries before enabling
     * the cpus so that cpu processor description entries (DT cpu nodes
     * and ACPI MADT entries) can be retrieved by matching the cpu hwid
     * with entries in cpu_logical_map while initializing the cpus.
     * If the cpu set-up fails, invalidate the cpu_logical_map entry.
     */
    for (i = 1; i < nr_cpu_ids; i++) {
        if (cpu_logical_map(i) != INVALID_HWID) {
            if (smp_cpu_setup(i))
                cpu_logical_map(i) = INVALID_HWID;
        }
    }
}
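- of_parse_and_init_cpus is called to parse the CPU nodes in the DTS. It stores the hardware ID of each CPU it finds into __cpu_logical_map in turn. In other words, __cpu_logical_map is an array indexed by CPU number that records hardware IDs. The reg property of a CPU node in the DTS holds the hardware ID.
- smp_cpu_setup is called to set the possible state. The loop walks all supported CPU numbers; if the hardware ID is not invalid, i.e. it was successfully parsed from the DTS in the previous step, smp_cpu_setup is called to set the possible state.
So the CPUs corresponding to the CPU nodes defined in the DTS are set to possible, meaning they may exist. Whether they really exist is something they have to speak up for themselves; the boot CPU does not get to decide.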
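The two steps above can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the kernel code: every toy_* name and TOY_NR_CPUS are invented, the rule "slot 0 goes to the boot CPU, the rest follow in DT order" is a simplification, and the real code also validates that the boot CPU's ID appears in the DTS. The helper toy_cluster_of assumes the hardware ID is an MPIDR-style value whose bits [15:8] (Aff1) distinguish the clusters, which matches the reg values 0x0/0x1 versus 0x100..0x103 in the Juno excerpt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS 8                   /* assumed maximum CPU count */
#define TOY_INVALID_HWID UINT64_MAX     /* hypothetical "no CPU here" marker */

static uint64_t toy_logical_map[TOY_NR_CPUS]; /* CPU number -> hardware ID */
static bool toy_possible[TOY_NR_CPUS];

/* Step 1: record hardware IDs. Returns how many slots were filled. */
static unsigned int toy_parse_cpus(const uint64_t *dt_hwids, size_t n,
                                   uint64_t boot_hwid)
{
    unsigned int next = 1;
    size_t i;

    assert(dt_hwids != NULL || n == 0);
    for (i = 0; i < TOY_NR_CPUS; i++) {
        toy_logical_map[i] = TOY_INVALID_HWID;
        toy_possible[i] = false;
    }
    toy_logical_map[0] = boot_hwid;
    toy_possible[0] = true;             /* boot CPU was marked earlier */

    for (i = 0; i < n; i++) {
        if (dt_hwids[i] == boot_hwid)
            continue;                   /* already in slot 0 */
        if (next >= TOY_NR_CPUS)
            break;                      /* clip to the configured maximum */
        toy_logical_map[next++] = dt_hwids[i];
    }
    return next;
}

/* Step 2: every slot with a valid hardware ID becomes possible. */
static void toy_mark_possible(void)
{
    unsigned int i;

    for (i = 1; i < TOY_NR_CPUS; i++)
        if (toy_logical_map[i] != TOY_INVALID_HWID)
            toy_possible[i] = true;
}

/* Assumption: Aff1, bits [15:8] of the hardware ID, identifies the cluster. */
static unsigned int toy_cluster_of(uint64_t hwid)
{
    return (unsigned int)((hwid >> 8) & 0xff);
}
```

Feeding it the six reg values from the excerpt with, say, 0x100 as the boot CPU leaves slots 0 to 5 valid and possible, and slots 6 and 7 invalid.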
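possible only says that a CPU may exist because the DTS defines a node for it; at this point the CPU is not yet present. As described in an earlier article (https://mp.csdn.net/mdeditor/84845448#), the boot CPU enters the kernel first while the other CPUs are still spinning idly in U-Boot. The boot CPU gives the other CPUs an address and has them jump into the kernel; only then do the other CPUs count as present.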
kernel_init->kernel_init_freeable->smp_prepare_cpus
void __init smp_prepare_cpus(unsigned int max_cpus)
{
    int err;
    unsigned int cpu;
    unsigned int this_cpu;

    init_cpu_topology();

    this_cpu = smp_processor_id();
    store_cpu_topology(this_cpu);
    numa_store_cpu_info(this_cpu);

    /*
     * If UP is mandated by "nosmp" (which implies "maxcpus=0"), don't set
     * secondary CPUs present.
     */
    if (max_cpus == 0)
        return;

    /*
     * Initialise the present map (which describes the set of CPUs
     * actually populated at the present time) and release the
     * secondaries from the bootloader.
     */
    for_each_possible_cpu(cpu) {
        if (cpu == smp_processor_id())
            continue;

        if (!cpu_ops[cpu])
            continue;

        err = cpu_ops[cpu]->cpu_prepare(cpu);
        if (err)
            continue;

        set_cpu_present(cpu, true);
        numa_store_cpu_info(cpu);
    }
}
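- cpu_ops[cpu]->cpu_prepare has the other CPUs jump into the kernel, but the CPUs that have jumped over are still spinning idly and doing nothing meaningful.
- set_cpu_present is called to set the present state.
present indicates that a CPU has already jumped into the kernel, although it is still spinning in a meaningless loop. Next, the boot CPU has the other CPUs start their necessary initialization, at which point they enter the online state: these CPUs are now online and can do work.
The following path is how the boot CPU starts the other CPUs: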
smp_init->cpu_up->do_cpu_up(cpu, CPUHP_ONLINE)->_cpu_up
bringup_cpu is eventually called to start the other CPUs.
The following path is the boot path of the other CPUs themselves:
secondary_startup->__secondary_switched->secondary_start_kernel->set_cpu_online
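Each CPU sets its own online state as it boots.
When a CPU is active, it has everything ready and can take part in process scheduling.
The following path creates the per-CPU kernel threads. Their main handler is cpuhp_thread_fun; cpuhp_should_run is only the check that decides whether the thread has work to do: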
smp_init->cpuhp_threads_init
Once another CPU has finished booting, it wakes this thread up via the following path:
secondary_startup->__secondary_switched->secondary_start_kernel->cpu_startup_entry->cpuhp_online_idle->__cpuhp_kick_ap_work
The woken thread then proceeds as follows:
cpuhp_thread_fun->cpuhp_ap_online->cpuhp_up_callbacks
cpuhp_up_callbacks eventually calls sched_cpu_activate to set the CPU active.
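The way a CPU reaches active by stepping through an ordered list of callbacks can be sketched roughly like this. It is a simplified model, not the kernel's hotplug code: the four-entry table, the step names and every toy_* identifier are assumptions made for illustration, and unlike the kernel this sketch does not roll back completed steps when a callback fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4 /* assumed CPU count for the sketch */

static bool toy_active[TOY_NR_CPUS];

/* Stand-in for sched_cpu_activate: flips the CPU's active flag. */
static int toy_sched_cpu_activate(unsigned int cpu)
{
    toy_active[cpu] = true;
    return 0;
}

/* A step that has nothing to do on the way up. */
static int toy_noop(unsigned int cpu)
{
    (void)cpu;
    return 0;
}

/* One hotplug step: a name plus an optional startup callback. */
struct toy_step {
    const char *name;
    int (*startup)(unsigned int cpu);
};

/* Hypothetical, heavily shortened state table, ordered from low to high. */
static const struct toy_step toy_steps[] = {
    { "offline",     NULL },
    { "online-idle", toy_noop },
    { "active",      toy_sched_cpu_activate },
    { "online",      NULL },
};
#define TOY_NR_STEPS (sizeof(toy_steps) / sizeof(toy_steps[0]))

/*
 * Walk from state `from` up to `target`, running the startup callback of
 * each state entered. Returns the state actually reached.
 */
static size_t toy_up_callbacks(unsigned int cpu, size_t from, size_t target)
{
    size_t st = from;

    assert(cpu < TOY_NR_CPUS && target < TOY_NR_STEPS);
    while (st < target) {
        st++;
        if (toy_steps[st].startup && toy_steps[st].startup(cpu) != 0) {
            st--; /* sketch only: stop here, no rollback */
            break;
        }
    }
    return st;
}
```

Calling toy_up_callbacks(1, 1, 3) walks CPU 1 from the "online-idle" step to the final step, and the "active" step on the way sets toy_active[1].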
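The CPU boot flow and state machine are as follows:
- The boot CPU starts and sets its own possible, present, online and active states.
- The boot CPU parses the DTS and sets the possible state for every CPU that has a CPU node defined there.
- The boot CPU brings the other CPUs into the kernel. They do no meaningful work yet; at this point the boot CPU sets their present state.
- The boot CPU releases the other CPUs so that they start their initialization. Each of them sets its own online state during boot.
- Once a CPU has finished initializing, it wakes its per-CPU hotplug kernel thread (registered through cpuhp_threads), and that thread sets the CPU's active state.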
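The five steps above can be replayed with a small userspace model. It is a sketch under assumptions: the toy_* names and the six-CPU count (taken from the Juno example) are invented, and the assertion that each state requires the previous one is a property of this model, which is why it sets the boot CPU's states in possible-first order even though boot_cpu_init happens to call the setters in a different order.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 6 /* two A57 plus four A53, as in the Juno example */

enum toy_state {
    TOY_POSSIBLE,
    TOY_PRESENT,
    TOY_ONLINE,
    TOY_ACTIVE,
    TOY_NR_STATES
};

/* One row per state, one flag per CPU. */
static bool toy_mask[TOY_NR_STATES][TOY_NR_CPUS];

/* Enter state `s`; the model insists the previous state was reached. */
static void toy_set(enum toy_state s, unsigned int cpu)
{
    assert(cpu < TOY_NR_CPUS);
    assert(s == TOY_POSSIBLE || toy_mask[s - 1][cpu]);
    toy_mask[s][cpu] = true;
}

static void toy_boot(unsigned int boot_cpu)
{
    unsigned int cpu;
    int s;

    /* Step 1: the boot CPU sets all four of its own states. */
    for (s = TOY_POSSIBLE; s < TOY_NR_STATES; s++)
        toy_set((enum toy_state)s, boot_cpu);

    /* Step 2: CPUs described in the DTS become possible. */
    for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        if (cpu != boot_cpu)
            toy_set(TOY_POSSIBLE, cpu);

    /* Step 3: the boot CPU marks them present. */
    for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        if (cpu != boot_cpu)
            toy_set(TOY_PRESENT, cpu);

    /* Steps 4 and 5: each CPU comes online, then is made active. */
    for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        if (cpu == boot_cpu)
            continue;
        toy_set(TOY_ONLINE, cpu);
        toy_set(TOY_ACTIVE, cpu);
    }
}

/* Count the CPUs that have reached state `s`. */
static unsigned int toy_count(enum toy_state s)
{
    unsigned int cpu, n = 0;

    for (cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        if (toy_mask[s][cpu])
            n++;
    return n;
}
```

After toy_boot(0), toy_count returns 6 for each of the four states.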
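A CPU in the active state is fully ready to take part in process scheduling. Before it does, however, there is one more thing to do: take part in scheduling domain initialization, which we cover in detail in the next article.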