How /proc/iomem Is Built and What It Really Shows

0. What /proc/iomem is showing

/proc/iomem lists the kernel's map of the physical address space: the RAM regions tracked by memblock, plus the register windows claimed by devices. One thing stands out immediately: the addresses listed here are not virtual addresses. They are physical addresses, and they line up with the address space described in the device tree.

For example:

spring:/ # cat /proc/iomem
00208000-00208fff : 208000.qcom,ipcc qcom,ipcc@208000
00400000-00bfffff : 400000.pinctrl pinctrl@400000
01400000-015effff : 1400000.clock-controller clock-controller@1400000
01628000-01629fff : 1628000.qcom,msm-eud eud_base
0162a000-0162afff : 162b000.hsphy eud_enable_reg
0162b000-0162b113 : 162b000.hsphy hsusb_phy_base

...

82a00000-864fffff : System RAM
  85200000-85efffff : reserved
8b41c000-8b7fffff : System RAM
9b800000-bb7fffff : System RAM
  a0010000-a1dcffff : Kernel code
  a1dd0000-a249ffff : reserved
  a24a0000-a331ffff : Kernel data
  a7fff000-a7ffffff : reserved
  af20b000-af27afff : reserved
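
To make the line format concrete, here is a minimal userspace sketch that splits one line of this output into its start address, end address, nesting depth, and name. The names iomem_entry and parse_iomem_line are hypothetical helpers invented for this illustration, and the sketch assumes nested entries are indented by two spaces per level.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type for this sketch; not a kernel or libc structure. */
struct iomem_entry {
    unsigned long long start;   /* first physical address of the range */
    unsigned long long end;     /* last physical address (inclusive) */
    int depth;                  /* nesting level, assuming 2 spaces per level */
    char name[128];             /* resource label, e.g. "System RAM" */
};

/* Parse one "start-end : name" line. Returns 0 on success, -1 on failure. */
static int parse_iomem_line(const char *line, struct iomem_entry *e)
{
    int indent = 0;
    int consumed = 0;

    /* Count leading spaces to recover the nesting depth. */
    while (line[indent] == ' ')
        indent++;

    /* %n records how far the fixed "start-end : " prefix extends. */
    if (sscanf(line + indent, "%llx-%llx : %n",
               &e->start, &e->end, &consumed) != 2 || consumed == 0)
        return -1;

    /* The remainder of the line is the name; strip the trailing newline. */
    snprintf(e->name, sizeof(e->name), "%s", line + indent + consumed);
    e->name[strcspn(e->name, "\r\n")] = '\0';
    e->depth = indent / 2;
    return 0;
}
```

Feeding each line read from /proc/iomem (for example with fgets) through parse_iomem_line would yield one entry per resource, which is enough to rebuild the nesting seen above.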

The interesting question is how the kernel gathers this information and turns it into the contents of /proc/iomem.

1. Building the resource tree: request_standard_resources

After bootmem_init completes, the kernel calls request_standard_resources. This is the key step where the memory-related resource tree is assembled.

Its job can be summarized like this:

  • Attach the regions from memblock.memory to the iomem_resource tree. iomem_resource is the root, and every region hangs beneath it.
  • Use request_resource to insert each region as a child of that root.
  • While walking memblock.memory, check whether kernel_code and kernel_data fall inside a given memory region. If they do, register them beneath that region.

The implementation is:

static void __init request_standard_resources(void)
{
    struct memblock_region *region;
    struct resource *res;
    unsigned long i = 0;
    size_t res_size;

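    // Physical start address of the kernel code segment
    kernel_code.start   = __pa_symbol(_stext);
    // Physical end address of the kernel code segment
    kernel_code.end     = __pa_symbol(__init_begin - 1);
    // Physical start address of the kernel data segment
    kernel_data.start   = __pa_symbol(_sdata);
    // Physical end address of the kernel data segment
    kernel_data.end     = __pa_symbol(_end - 1);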

    // Number of regions in memblock.memory
    num_standard_resources = memblock.memory.cnt;
    res_size = num_standard_resources * sizeof(*standard_resources);
    // Allocate the resource array from memblock
    standard_resources = memblock_alloc(res_size, SMP_CACHE_BYTES);
    if (!standard_resources)
        panic("%s: Failed to allocate %zu bytes\n", __func__, res_size);

    // Walk every region in memblock.memory
    for_each_mem_region(region) {
        res = &standard_resources[i++];
        // Regions flagged NOMAP are reported as "reserved"
        if (memblock_is_nomap(region)) {
            res->name  = "reserved";
            res->flags = IORESOURCE_MEM;
        } else {
            // Every other region is labeled "System RAM"
            res->name  = "System RAM";
            res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
        }
        // Convert the page frame numbers to physical addresses
        res->start = __pfn_to_phys(memblock_region_memory_base_pfn(region));
        res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;

        // Register the region under iomem_resource
        request_resource(&iomem_resource, res);

        // If kernel_code or kernel_data lies inside this region, register it as a child of this resource
        if (kernel_code.start >= res->start &&
            kernel_code.end <= res->end)
            request_resource(res, &kernel_code);
        if (kernel_data.start >= res->start &&
            kernel_data.end <= res->end)
            request_resource(res, &kernel_data);
#ifdef CONFIG_KEXEC_CORE
        /* Userspace will find "Crash kernel" region in /proc/iomem. */
        if (crashk_res.end && crashk_res.start >= res->start &&
            crashk_res.end <= res->end)
            request_resource(res, &crashk_res);
#endif
    }
}

A few things are happening here.

First, the kernel computes the physical address range of its own code and data segments:

  • kernel_code.start to kernel_code.end
  • kernel_data.start to kernel_data.end

Then it allocates an array of standard_resources, one entry per memblock.memory region.

As each memblock region is processed:

  • If memblock_is_nomap(region) is true, that region is marked as reserved.
  • Otherwise, it is marked as System RAM and flagged with IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY.
  • The region boundaries are converted from PFN-based values into physical addresses.
  • The region is inserted into the global iomem_resource tree through request_resource(&iomem_resource, res).
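
The PFN-to-address step can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel's code: it assumes 4 KiB pages (PAGE_SHIFT of 12), and it assumes the memblock helpers round the region base up and the region end down to whole pages. The names pfn_up, pfn_down, pfn_to_phys, and region_to_resource_range are hypothetical stand-ins.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                           /* assumption: 4 KiB pages */
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)

/* Round an address up or down to a page frame number (PFN). */
static uint64_t pfn_up(uint64_t addr)     { return (addr + PAGE_SIZE - 1) >> PAGE_SHIFT; }
static uint64_t pfn_down(uint64_t addr)   { return addr >> PAGE_SHIFT; }

/* Convert a PFN back to the physical address of the page's first byte. */
static uint64_t pfn_to_phys(uint64_t pfn) { return pfn << PAGE_SHIFT; }

struct range {
    uint64_t start;   /* first byte of the resource */
    uint64_t end;     /* last byte of the resource (inclusive) */
};

/* Turn a (base, size) region into the inclusive range a resource stores. */
static struct range region_to_resource_range(uint64_t base, uint64_t size)
{
    struct range r;

    r.start = pfn_to_phys(pfn_up(base));
    /* The end PFN is exclusive, so subtract 1 to get the last valid byte. */
    r.end = pfn_to_phys(pfn_down(base + size)) - 1;
    return r;
}
```

The "- 1" is why every range in /proc/iomem ends in ...fff: resource ranges are inclusive, so a region based at 0x82a00000 with size 0x3b00000 is reported as 82a00000-864fffff.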

After that, the kernel checks whether its code segment and data segment live inside the current RAM resource. If they do, they are registered as child resources under that System RAM node. If CONFIG_KEXEC_CORE is enabled, the crash kernel area is handled the same way.
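
The parent/child registration described above can be modeled in a few lines of userspace C. This is a simplified sketch, not the kernel's implementation: the struct keeps only the fields needed for the tree, locking is omitted, and request_resource_sketch is a hypothetical name. It inserts a resource as a child of a given parent, keeps siblings sorted by address, and refuses ranges that overlap an existing sibling.

```c
#include <stddef.h>

/* Reduced model of a resource node: an inclusive range plus tree links. */
struct resource {
    unsigned long long start, end;
    const char *name;
    struct resource *parent, *sibling, *child;
};

/*
 * Insert `new` as a child of `root`, keeping children sorted by address.
 * Returns 0 on success, -1 if the range is invalid, falls outside `root`,
 * or overlaps an existing child.
 */
static int request_resource_sketch(struct resource *root, struct resource *new)
{
    struct resource **p = &root->child;

    if (new->end < new->start ||
        new->start < root->start || new->end > root->end)
        return -1;

    for (;;) {
        struct resource *tmp = *p;

        /* End of list, or the next sibling starts after us: insert here. */
        if (!tmp || tmp->start > new->end) {
            new->sibling = tmp;
            *p = new;
            new->parent = root;
            return 0;
        }
        p = &tmp->sibling;
        /* This sibling ends before we start: keep scanning. */
        if (tmp->end < new->start)
            continue;
        /* Otherwise the two ranges overlap. */
        return -1;
    }
}
```

With this model, registering a System RAM node under the root and then registering kernel code and kernel data under that node reproduces the two-level nesting that request_standard_resources builds.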

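So this step builds the memory-related part of the resource hierarchy. It registers:

  • System RAM and reserved regions, directly under iomem_resource
  • kernel code and kernel data, nested under the System RAM region that contains them
  • crash kernel, when present

Platform device ranges, such as the pinctrl and clock-controller entries in the sample output, are not added here. They are inserted later, when the devices described in the device tree are registered.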

2. How /proc/iomem gets registered

The /proc/iomem entry itself is registered in kernel/resource.c.

2.1 ioresources_init

static int __init ioresources_init(void)
{
    proc_create_seq_data("ioports", 0, NULL, &resource_op, &ioport_resource);
    proc_create_seq_data("iomem", 0, NULL, &resource_op, &iomem_resource);
    return 0;
}
__initcall(ioresources_init);

The important points here are straightforward:

  1. proc_create_seq_data creates entries in /proc.
       - The name "iomem" becomes /proc/iomem.
       - The last argument, &iomem_resource, tells the kernel which data structure backs this file.

  2. __initcall(ioresources_init) ensures this setup runs during kernel initialization, so /proc/iomem exists once the system finishes booting.

3. Where the data comes from: iomem_resource

The file is backed by the global iomem_resource object:

struct resource iomem_resource = {
    .name  = "PCI mem",
    .start = 0,
    .end   = -1,
    .flags = IORESOURCE_MEM,
};

This structure acts as the root node of the physical memory resource tree.

Its role is to represent the entire physical address space managed by the kernel. All memory-related regions (System RAM, reserved ranges, device register windows, and other physical address resources) are registered beneath this root with interfaces such as request_resource().

That is why /proc/iomem can present a single global view of the system's physical address layout.
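
The `.end = -1` initializer deserves a short illustration. Resource addresses are unsigned, so assigning -1 wraps to the largest representable value, which makes the root cover every possible address. The sketch below assumes 64-bit physical addresses; resource_size_sketch_t, root_range, make_root, and root_covers are hypothetical names used only here.

```c
#include <stdint.h>

/* Assumption: physical addresses are 64 bits wide on this platform. */
typedef uint64_t resource_size_sketch_t;

struct root_range {
    resource_size_sketch_t start;
    resource_size_sketch_t end;     /* inclusive */
};

/* Build a root the same way the initializer does: start 0, end -1. */
static struct root_range make_root(void)
{
    /* -1 converted to an unsigned type becomes all ones (the maximum). */
    struct root_range r = { .start = 0, .end = (resource_size_sketch_t)-1 };
    return r;
}

/* Returns 1 if the inclusive range [s, e] fits inside the root. */
static int root_covers(const struct root_range *r, uint64_t s, uint64_t e)
{
    return s <= e && s >= r->start && e <= r->end;
}
```

Because the root spans everything, any well-formed range passes the bounds check against it, and only conflicts between sibling resources can cause a registration to fail.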

4. How the contents of /proc/iomem are produced

When /proc/iomem is read, the kernel walks the iomem_resource tree and formats each resource into text.

4.1 Traversing the resource tree

The traversal logic is centered around r_start() and r_next():

static void *r_start(struct seq_file *m, loff_t *pos)
{
    struct resource *p = PDE_DATA(file_inode(m->file));
    loff_t l = 0;

    read_lock(&resource_lock);
    for (p = p->child; p && l < *pos; p = next_resource(p))
        l++;
    return p;
}

static void *r_next(struct seq_file *m, void *v, loff_t *pos)
{
    struct resource *p = v;
    (*pos)++;
    return (void *)next_resource(p);
}

What these functions do:

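  • r_start() takes the read lock, then skips forward from the root's first child until it reaches the node at the current read position.
  • r_next() increments the position and advances to the next resource in the tree.

The read lock protects the resource tree while it is being traversed. The matching r_stop() callback releases it when the read finishes.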
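
The traversal order is easier to see in a standalone model. The sketch below is a userspace approximation of the depth-first step that next_resource() performs, under the assumption that it visits a node's first child, then its next sibling, then climbs to the nearest ancestor that has a sibling. The names next_resource_sketch and walk_names are hypothetical, and locking is left out.

```c
#include <stddef.h>

/* Reduced model of a resource node: an inclusive range plus tree links. */
struct resource {
    unsigned long long start, end;
    const char *name;
    struct resource *parent, *sibling, *child;
};

/*
 * Pre-order successor: descend to the first child if there is one,
 * otherwise move to the next sibling, climbing toward the root until
 * some ancestor has a sibling. Returns NULL after the last node.
 */
static struct resource *next_resource_sketch(struct resource *p)
{
    if (p->child)
        return p->child;
    while (!p->sibling && p->parent)
        p = p->parent;
    return p->sibling;
}

/*
 * Collect resource names in visiting order, starting at the root's
 * first child as r_start() does. Returns the number of names stored.
 */
static size_t walk_names(struct resource *root, const char **out, size_t max)
{
    size_t n = 0;
    struct resource *p;

    for (p = root->child; p && n < max; p = next_resource_sketch(p))
        out[n++] = p->name;
    return n;
}
```

This order is what makes children appear directly below their parent in the output: a System RAM node is followed by its kernel code, kernel data, and reserved children before the walk moves on to the next top-level region.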

4.2 Formatting each entry

Once a resource node is selected, r_show() prints it:

static int r_show(struct seq_file *m, void *v)
{
    struct resource *r = v;
    unsigned long long start, end;

    // Simplified: the in-tree r_show() also indents each line by the node's depth in the tree
    start = r->start;
    end = r->end;

    // Print "start-end : name", falling back to "<BAD>" when the resource has no name
    seq_printf(m, "%08llx-%08llx : %s\n", start, end, r->name ? r->name : "<BAD>");
    return 0;
}

seq_printf emits three pieces of information for each resource:

  • start address
  • end address
  • resource name

So each line follows this format:

00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved

That is the direct origin of the textual output seen in /proc/iomem.
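
As a rough userspace illustration of the formatting step, the sketch below renders one node with snprintf using the same "%08llx-%08llx : %s" pattern. It additionally indents by two spaces per nesting level, computed by counting parents up to the root; that indentation rule and the names resource_depth and format_entry are assumptions made for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* Reduced model of a resource node; only the parent link is needed here. */
struct resource {
    unsigned long long start, end;
    const char *name;
    struct resource *parent;
};

/* Depth below the root: 0 for the root's direct children, 1 for theirs. */
static int resource_depth(const struct resource *r, const struct resource *root)
{
    int depth = 0;
    const struct resource *p;

    for (p = r; p->parent && p->parent != root; p = p->parent)
        depth++;
    return depth;
}

/*
 * Format one line into buf. Returns the number of characters written,
 * not counting the terminating NUL (the snprintf return value).
 */
static int format_entry(char *buf, size_t len,
                        const struct resource *r, const struct resource *root)
{
    /* "%*s" with an empty string emits depth * 2 spaces of indentation. */
    return snprintf(buf, len, "%*s%08llx-%08llx : %s\n",
                    resource_depth(r, root) * 2, "",
                    r->start, r->end, r->name ? r->name : "<BAD>");
}
```

Calling format_entry on a kernel code node whose parent is a System RAM node under the root produces a line indented by two spaces, while the System RAM node itself is printed flush left.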

5. Putting it together

The complete flow is:

  • During boot, ioresources_init() creates /proc/iomem.
  • The data behind it comes from the global iomem_resource tree.
  • System RAM, reserved regions, kernel code/data, and device register ranges are inserted into that tree through request_resource() or through resource registration triggered by device tree parsing.
  • When userspace reads /proc/iomem, the kernel walks that tree and prints each resource's physical address range and label.

So /proc/iomem is not a synthetic guess or a virtual-memory view. It is a structured dump of the kernel's physical resource tree. That is exactly why it is so useful when debugging memory layout, reserved regions, or address-space conflicts.