ARM Cortex-Mx Quickstart

4 August 2012

By now, almost everyone’s managed to acquire a bajillion cheap ARM dev boards, and there are always more coming.  As these chips get cheaper and show up in more hacker-friendly packages, they’re going to overtake chips and boards like the AVR and the Arduino; it’s just a matter of time.  Unfortunately, the ARM ecosystem isn’t as simple to work with as the AVR or PIC ecosystem.  With both of these 8-bitters, you’re working in the manufacturer’s walled garden.  ARM manufacturers, on the other hand, are free to glue whatever crazy stuff they want onto the ARM core.  As a result, peripheral sets and their use vary widely, even within the same manufacturer.  Efforts like CMSIS ameliorate this to some degree, but it still isn’t a cake-walk.  Add to that the difficulty of simply getting a toolchain up and running, and anyone with mere Arduino experience is lost (even more experienced 8-bit devs are going to have some tough going).

So, here’s a short guide to getting a toolchain for the STM32 series up and running quickly.  These instructions will be specific to Linux, but should translate to OSX or Windows fairly easily.

First, I downloaded the Linaro Bare-Metal ARM toolchain (http://www.linaro.org/downloads/).  It’s important to get the Bare-Metal toolchain.  Other toolchains generate binaries that aren’t capable of running sans operating system. The bare-metal toolchain is at the bottom of the page.  This package contains the compiler, linker, debugger, and other tools used to turn source code into machine code.  I used the precompiled package from Linaro because it’s a simple off-the-shelf method to get up and running quickly.  Unlike Codesourcery/Sourcery Tools, the Linaro toolchain supports the Cortex M4 with FPU right out of the box.

In the past, I have attempted to compile my own toolchain (https://github.com/esden/summon-arm-toolchain/), but this is not a simple process.

In Linux, I just extracted the tar.gz file to the folder where I wanted it to live, and added that folder’s bin directory to my PATH.
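As a rough sketch, that step looks something like the following.  The archive name and install location are hypothetical placeholders, and the sketch assumes the archive unpacks into a single top-level directory with a bin folder inside it and that the tools use the arm-none-eabi- prefix; check both against what you actually downloaded.

```shell
# Hypothetical locations: change these to match your download and taste.
TOOLCHAIN_ARCHIVE="$HOME/Downloads/arm-bare-metal-toolchain.tar.gz"
TOOLCHAIN_DIR="$HOME/opt/arm-toolchain"

# Unpack only if the archive is really there.  --strip-components=1 drops the
# archive's top-level directory (assumed to exist) so bin/ ends up directly
# under TOOLCHAIN_DIR.
if [ -f "$TOOLCHAIN_ARCHIVE" ]; then
    mkdir -p "$TOOLCHAIN_DIR"
    tar -xzf "$TOOLCHAIN_ARCHIVE" -C "$TOOLCHAIN_DIR" --strip-components=1
fi

# Put the toolchain's bin directory at the front of PATH for this shell.
export PATH="$TOOLCHAIN_DIR/bin:$PATH"

# Check that the compiler can be found (the arm-none-eabi- prefix is assumed).
if command -v arm-none-eabi-gcc >/dev/null 2>&1; then
    arm-none-eabi-gcc --version
else
    echo "arm-none-eabi-gcc is not on PATH yet" >&2
fi
```

To make the change stick across sessions, the export line can go in ~/.bashrc or whatever your shell reads at startup.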

Next comes OpenOCD.  OpenOCD is an open source debugger/flash utility for lots and lots of different chips.  I had to get the latest dev version from the git repo (http://openocd.sourceforge.net/repos/), as the latest stable release, 0.5.0, does not include STLink code.  If a version beyond 0.5.0 is released by the time you’re reading this, download that instead.

Installation was pretty easy.  I installed the dependencies using “sudo apt-get build-dep openocd”, then ran “./bootstrap”, followed by “./configure --enable-maintainer-mode --enable-stlink”, followed by “make” and “make install”.
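Collected into one place, the build steps might look like the sketch below.  The function name build_openocd and its argument are hypothetical, the source tree is assumed to be checked out already, and the sudo in front of “make install” is an assumption for installs into a system-wide prefix.

```shell
# build_openocd SRC_DIR
# Configure, build, and install OpenOCD from an existing source checkout.
# The body runs in a subshell (note the parentheses) so the caller's working
# directory is left alone.
build_openocd() (
    src_dir="$1"    # path to the OpenOCD source tree (hypothetical)

    # Refuse to do anything unless the source directory exists.
    if [ ! -d "$src_dir" ]; then
        echo "no source tree at: $src_dir" >&2
        exit 1
    fi
    cd "$src_dir" || exit 1

    # Each step only runs if the previous one succeeded.
    # build-dep pulls in the Debian/Ubuntu build dependencies.
    sudo apt-get build-dep openocd &&
    ./bootstrap &&
    ./configure --enable-maintainer-mode --enable-stlink &&
    make &&
    sudo make install
)
```

Calling it would look like “build_openocd ~/src/openocd”, where the path is wherever the checkout happens to live.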

Once I had the toolchain and debugger up and running, I downloaded some software to compile.  This GitHub repository (https://github.com/nabilt/STM32F4-Discovery-Firmware) includes the test firmware provided by ST, with a Makefile capable of building it using GCC.  ChibiOS (http://chibios.org/dokuwiki/doku.php) also includes demo files for all the Discovery boards.

The first time I attempted to build the STM32F4 firmware, it didn’t quite work out.  Due to the varied nature of the ecosystem, a separate linker script is needed for each chip.  Furthermore, different compilers need different sections to their linker scripts, so it might not be possible to simply yank one from somewhere else.  The linker script tells the linker where different memory sections map into real, physical memory on the chip.
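One way to see what a linker script actually did is to look at the section headers of the linked ELF file.  The helper below is a hypothetical wrapper around two standard binutils tools, objdump and size, assuming they are installed with the arm-none-eabi- prefix; the ELF path is whatever your build produces.

```shell
# show_layout ELF_FILE
# Print the section headers (with load and run addresses) and a size summary.
show_layout() {
    elf="$1"    # hypothetical path to a linked firmware image

    if [ ! -f "$elf" ]; then
        echo "no such file: $elf" >&2
        return 1
    fi

    # -h lists each section with its VMA (run address) and LMA (load address).
    arm-none-eabi-objdump -h "$elf" &&
    # size summarizes how much text, data, and bss the image uses.
    arm-none-eabi-size "$elf"
}
```

In the objdump output, a .data section that runs from RAM but is stored in flash should show a VMA in the RAM range and an LMA in the flash range.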

Here’s the linker script I eventually got working for my STM32F4-Discovery:

/* Entry Point */
ENTRY(Reset_Handler)

/* Highest address of the user mode stack */
_estack = 0x20020000;    /* end of 128K RAM on AHB bus*/

/* Generate a link error if heap and stack don't fit into RAM */
_Min_Heap_Size = 0;      /* required amount of heap  */
_Min_Stack_Size = 0x400; /* required amount of stack */

/* Specify the memory areas */
MEMORY
{
  FLASH (rx)      : ORIGIN = 0x08000000, LENGTH = 1024K
  RAM (xrw)       : ORIGIN = 0x20000000, LENGTH = 128K
  MEMORY_B1 (rx)  : ORIGIN = 0x60000000, LENGTH = 0K
}

/* Define output sections */
SECTIONS
{
  /* The startup code goes first into FLASH */
  .isr_vector :
  {
    . = ALIGN(4);
    KEEP(*(.isr_vector)) /* Startup code */
    . = ALIGN(4);
  } >FLASH

  /* The program code and other data goes into FLASH */
  .text :
  {
    . = ALIGN(4);
    *(.text)           /* .text sections (code) */
    *(.text*)          /* .text* sections (code) */
    *(.rodata)         /* .rodata sections (constants, strings, etc.) */
    *(.rodata*)        /* .rodata* sections (constants, strings, etc.) */
    *(.glue_7)         /* glue arm to thumb code */
    *(.glue_7t)        /* glue thumb to arm code */
    *(.eh_frame)

    KEEP (*(.init))
    KEEP (*(.fini))

    . = ALIGN(4);
    _etext = .;        /* define global symbols at end of code */
    _exit = .;
  } >FLASH

  .ARM.extab : { *(.ARM.extab* .gnu.linkonce.armextab.*) } >FLASH
  .ARM :
  {
    __exidx_start = .;
    *(.ARM.exidx*)
    __exidx_end = .;
  } >FLASH

  .preinit_array     :
  {
    PROVIDE_HIDDEN (__preinit_array_start = .);
    KEEP (*(.preinit_array*))
    PROVIDE_HIDDEN (__preinit_array_end = .);
  } >FLASH
  .init_array :
  {
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(SORT(.init_array.*)))
    KEEP (*(.init_array*))
    PROVIDE_HIDDEN (__init_array_end = .);
  } >FLASH
  .fini_array :
  {
    PROVIDE_HIDDEN (__fini_array_start = .);
    KEEP (*(.fini_array*))
    KEEP (*(SORT(.fini_array.*)))
    PROVIDE_HIDDEN (__fini_array_end = .);
  } >FLASH

  /* used by the startup to initialize data */
  _sidata = .;

  /* Initialized data sections go into RAM, load LMA copy after code */
  .data : AT ( _sidata )
  {
    . = ALIGN(4);
    _sdata = .;        /* create a global symbol at data start */
    *(.data)           /* .data sections */
    *(.data*)          /* .data* sections */

    . = ALIGN(4);
    _edata = .;        /* define a global symbol at data end */
  } >RAM

  /* Uninitialized data section */
  . = ALIGN(4);
  .bss :
  {
    /* This is used by the startup in order to initialize the .bss section */
    _sbss = .;         /* define a global symbol at bss start */
    __bss_start__ = _sbss;
    *(.bss)
    *(.bss*)
    *(COMMON)

    . = ALIGN(4);
    _ebss = .;         /* define a global symbol at bss end */
    __bss_end__ = _ebss;
  } >RAM

  /* User_heap_stack section, used to check that there is enough RAM left */
  ._user_heap_stack :
  {
    . = ALIGN(4);
    PROVIDE ( end = . );
    PROVIDE ( _end = . );
    . = . + _Min_Heap_Size;
    . = . + _Min_Stack_Size;
    . = ALIGN(4);
  } >RAM

  /* MEMORY_bank1 section, code must be located here explicitly            */
  /* Example: extern int foo(void) __attribute__ ((section (".mb1text"))); */
  .memory_b1_text :
  {
    *(.mb1text)        /* .mb1text sections (code) */
    *(.mb1text*)       /* .mb1text* sections (code)  */
    *(.mb1rodata)      /* read-only data (constants) */
    *(.mb1rodata*)
  } >MEMORY_B1

  /* Remove information from the standard libraries */
  /DISCARD/ :
  {
    libc.a ( * )
    libm.a ( * )
    libgcc.a ( * )
  }

  .ARM.attributes 0 : { *(.ARM.attributes) }
}
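A quick way to sanity-check the addresses in a script like this is to do the arithmetic by hand: the top-of-stack symbol _estack should land exactly at ORIGIN + LENGTH of the RAM region it lives in.  The helper name region_end below is hypothetical and not part of any toolchain; it is plain shell arithmetic.

```shell
# region_end ORIGIN LENGTH_KB
# Print the first address past a memory region, given its start address
# and its length in kilobytes.
region_end() {
    printf '0x%08X\n' $(( $1 + $2 * 1024 ))
}

region_end 0x20000000 128     # prints 0x20020000
region_end 0x08000000 1024    # prints 0x08100000
```

The first result matches the _estack value used above, which sits 128K above the start of RAM at 0x20000000.  The second is the first address past the 1024K FLASH region.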

Makefiles themselves are beyond the scope of this post (and beyond the scope of my brain, in a lot of ways), but taking a peek at the Makefile used in the STM32F4 demonstration firmware from above, there are a boatload of dependencies from outside the project.  These are all CMSIS drivers or required startup code.  This is a big part of why ARM chips are more difficult to use than AVR or PIC chips.  Fortunately, most manufacturers release some boilerplate startup code for their chips.  As long as it’s included in the Makefile, the project should build as expected.

And those are the basics of going from nothing, to compiling demonstration code.  Right now, I’m looking heavily at the ChibiOS platform to develop some projects.  This abstracts a lot of the hardware stuff, although it isn’t nearly as simple as Arduino for getting up and running.  Hopefully with the upcoming release of the Arduino Due, there will be some development in making a really easy-to-use ARM platform.