IAR Information Center for Arm

Important information

IAR Embedded Workbench for Arm 9.30 uses an updated version of arm_mve.h.
- The new version can be included from both C and C++ (the version used in 9.20 could only be included from C).
- The new version does not affect the status of IAR extensions (the version used in 9.20 always enabled IAR extensions).
- The names of the built-in MVE intrinsics have been updated to match the new arm_mve.h. As long as your application only uses the publicly visible names (the ones defined in arm_mve.h) nothing changes, but if you refer directly to the built-in intrinsic functions, you must adapt to the new names.
Limitations of compiler support for Armv8.1-M
There are some limitations to the support for Armv8.1-M in this release, to be addressed in a future release:
- MVE: The Cortex-M vector extension can be utilized by means of intrinsic functions.
  There is currently no support for auto-vectorization. The header file cannot be included from C++ code. Some optimizations of MVE code are missing, in particular when a VCMP instruction is followed by VPST they should be combined into a VPT instruction.
- LOB: Limited matching of low-overhead loops.
  A candidate loop must have a loop counter that is counted down by 1 until it reaches zero. The presence of an outer loop can lead to an inner loop not being matched. Other optimizations (most notably loop unrolling) can lead to a loop not being matched.
- CMSE: The Armv8.1-M register FPCXT_NS is not restored by a non-secure entry function.
- _Float16: Some operations on _Float16 (such as plus and minus) are implemented by converting the input operands to float, using the corresponding instruction for float, and converting the result to _Float16.
- The conditional instructions from Armv8-A are not utilized (CSEL, CSET, CNEG).
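As a sketch of the loop shape that the LOB item above describes, the function below uses a loop counter that is counted down by 1 until it reaches zero. The name sum_countdown is hypothetical, and whether the compiler actually matches this loop as a low-overhead loop still depends on the other conditions listed above (outer loops, loop unrolling).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums len elements using a down-counting loop.
   The counter n is decremented by 1 until it reaches zero, which is the
   loop shape the LOB limitation asks for. */
uint32_t sum_countdown(const uint32_t *data, size_t len)
{
  uint32_t sum = 0;
  for (size_t n = len; n != 0; n--)
  {
    sum += *data++;
  }
  return sum;
}
```

The equivalent up-counting form (i = 0; i < len; i++) computes the same result, but does not have the down-to-zero counter that the text names as a requirement.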
Characteristics for the toolset in AARCH64 mode:
- The toolset, by default, supports generating code and data that is situated in one space that has a maximum size of 4 Gbytes. This is because static data will be accessed with an addressing mode that reaches +/- 4 Gbytes. Mechanisms exist to jump/call between such spaces.
- The produced code, by default, runs in execution state EL1.
Limitations:
- Big-endian code is not supported.
- Position-independent code is not supported.
- V8-A AARCH32 is not fully implemented. The 32-bit implementation is currently based on v7-A.
Compiler MISRA C:1998/C:2004 support is removed in version 9.10
The Compiler MISRA C is still available through the compiler command line but will be removed in a future release.
Refer to IAR C-STAT for full MISRA C support.
Changes in implementation of CMSIS intrinsics in version 8.20
The implementation of the CMSIS intrinsic interface is no longer based on IAR's intrinsics.h. As a consequence, some intrinsics that were previously declared when the CMSIS header was included are no longer declared.

Examples of these intrinsics include __LDREX(), __STREX() and __enable_interrupt().
Changed size of wchar_t in version 8.10 and later
Object files following the ARM ABI have a runtime attribute indicating the size of wchar_t.

In EWARM version 7.80 and earlier, wchar_t was 2 bytes wide and the runtime attribute was set accordingly.

In EWARM version 8.10 and later, wchar_t is 4 bytes wide.
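Code that stores or exchanges wide-character data can make its wchar_t width assumption explicit instead of hard-coding 2 or 4. A minimal sketch, where the helper names wchar_width_is and wchar_count are hypothetical:

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if this toolchain's wchar_t has the
   width that externally produced data (files, buffers, prebuilt objects)
   was written with, 0 otherwise. */
int wchar_width_is(size_t expected_bytes)
{
  return sizeof(wchar_t) == expected_bytes;
}

/* Hypothetical helper: number of wide characters that fit in a buffer,
   computed from sizeof(wchar_t) rather than a literal 2 or 4. */
size_t wchar_count(size_t buffer_bytes)
{
  return buffer_bytes / sizeof(wchar_t);
}
```

Going by the text above, wchar_width_is(4) would be expected to return 1 when building with EWARM 8.10 or later, and wchar_width_is(2) with 7.80 or earlier.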
If you have implemented the time() function, you must rename it to __time32(). For more information, see the Development guide.
A special note on CMSIS integration:

If your application source code includes CMSIS header files explicitly, then you should not select Project>Options...>General Options>Library Configuration>Use CMSIS. Some of the Cortex-M application examples include CMSIS source files explicitly. Do not select the option Use CMSIS in these projects.
Deprecated features
- --interwork
  Future versions of the IAR C/C++ Compiler for ARM will assume --interwork when generating code for the ARMv4T architecture. There will be no option to generate non-interworking code for ARMv4T.

New features

New supported architectures
- The support for Armv8-M has been extended with instructions and intrinsic functions for CDE (Custom Datapath Extension).
- The support for Armv8-A has been extended to cover up to revision 4 (Armv8.4-A). This also enables support for Armv8-R AArch64 (needed for Cortex-R82 support).
- The support for Cortex-R82 also includes a variant without FPU, Cortex-R82.no_fp.
Diagnostic message Ta023 has now been promoted to a soft error: Error[Ta023]: Call to a non __ramfunc function from within a __ramfunc
The main use for __ramfunc is together with a flash driver (or similar), where you need to make sure that no code in flash is called by the driver. For other purposes it is fairly easy to place a function in RAM, using, for example, a named section. If you still want to use __ramfunc for other purposes, you can use --diag_suppress to eliminate the error. (This was also needed before, to eliminate the warning.)
Libraries and library selection have changed
- With the introduction of the new Libc++ library which supports C++17, the C++ library has been split into three parts. For more information, see the section Prebuilt runtime libraries in the IAR C/C++ Development Guide.
- The support for Cortex-R82 and Cortex-R82.no_fp (currently the only supported 64-bit Arm configuration without an FPU) means that v has been added to many library names, to indicate FPU support. When linking for Cortex-R82.no_fp, libraries without v in the name are selected. (This is consistent with how libraries with/without FPU support are named for 32-bit Arm).
- In C++17, some functionality that was deprecated in C++14 is now removed. You can define the preprocessor symbol _LIBCPP_ENABLE_CXX17_REMOVED_FEATURES to enable support for these features when using the Libc++ library.
- The composition and naming conventions of some prebuilt library files for C++ library functions have changed.
The compiler now defines the preprocessor symbols __STDC_IEC_559__ and __STDC_IEC_559_COMPLEX__, in compliance with the IEC 60559 floating-point standard in the C standard. They were mistakenly not defined in earlier versions of the compiler.
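Portable code can test these symbols at compile time instead of assuming IEC 60559 behavior. A minimal sketch, with the hypothetical helper names has_iec559 and has_iec559_complex:

```c
/* Returns 1 if the compiler advertises IEC 60559 conformance for real
   floating-point arithmetic, 0 otherwise. */
int has_iec559(void)
{
#ifdef __STDC_IEC_559__
  return 1;
#else
  return 0;
#endif
}

/* Returns 1 if the compiler advertises IEC 60559-compatible complex
   arithmetic, 0 otherwise. */
int has_iec559_complex(void)
{
#ifdef __STDC_IEC_559_COMPLEX__
  return 1;
#else
  return 0;
#endif
}
```

With compiler versions that did not define the symbols, both functions return 0 even though the floating-point behavior itself may have been the same.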
Additional GNU C language extensions
- The GCC typeof operator is now supported. Example: typeof(expr) var;
- The GCC C extension "Cast to Union Type" is now supported. Example: z = (union foo) x;
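The two one-line examples above can be expanded into a small sketch. The union foo members and the function names twice and roundtrip are hypothetical. The preprocessor guard at the top is only an aid for trying the snippet on GCC in strict ISO mode and is not something the IAR compiler is stated to need.

```c
/* Only relevant when trying this on GCC in strict ISO mode before C23,
   where the keyword is spelled __typeof__ (an assumption about the host
   compiler, not about the IAR compiler). */
#if defined(__GNUC__) && defined(__STRICT_ANSI__) && \
    (!defined(__STDC_VERSION__) || __STDC_VERSION__ < 202311L)
#define typeof __typeof__
#endif

union foo
{
  int i;
  double d;
};

/* typeof: declare a temporary with the same type as the expression x. */
int twice(int x)
{
  typeof(x) tmp = x;   /* tmp has type int */
  return tmp + tmp;
}

/* Cast to union type: x is an int, which is the type of member i, so the
   cast builds a union foo whose member i holds x. */
int roundtrip(int x)
{
  union foo z;
  z = (union foo) x;
  return z.i;
}
```

The cast is only valid when the operand's type matches the type of one of the union's members.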
These compiler options have been added:
- --libc++
  Makes the compiler and linker use the Libc++ library.
- --no_normalize_file_macros
  Disables normalization of paths in the symbols __FILE__ and __BASE_FILE__.
- --warn_about_incomplete_constructors
  Makes the compiler warn about constructors that do not initialize all members.
- --warn_about_missing_field_initializers
  Makes the compiler warn about fields without explicit initializers.
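As an illustration of what a field without an explicit initializer looks like, the sketch below initializes only one member in make_partial and all members in make_full. The type uart_config and both function names are hypothetical. Whether --warn_about_missing_field_initializers reports exactly this C form is an assumption; the option description in the compiler documentation is authoritative.

```c
struct uart_config   /* hypothetical type */
{
  unsigned baud;
  unsigned char parity;
  unsigned char stop_bits;
};

/* parity and stop_bits have no explicit initializer here. The language
   zero-initializes them, which may or may not be what was intended. */
struct uart_config make_partial(void)
{
  struct uart_config c = { .baud = 9600 };
  return c;
}

/* Every field is named explicitly, so nothing is left implicit. */
struct uart_config make_full(void)
{
  struct uart_config c = { .baud = 9600, .parity = 0, .stop_bits = 1 };
  return c;
}
```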
These predefined preprocessor symbols have been added:
- __LIBCPP
  Defined when the Libc++ library is used.
- _LIBCPP_ENABLE_CXX17_REMOVED_FEATURES
  Enables support for features that were deprecated in C++14 and removed in C++17.
- #include_next
  Searches for a file only in the directories on the search path that follow the directory in which the current source file is found.
#pragma once has been added. It prevents a header file from being processed more than once.
The GCC-style attribute transparent_union is no longer supported.
New warning Pa205 "implicit conversion from float to double" for systems with a single-precision only FPU. The warning is an alert that a double-precision library implementation will be used for the operation.
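A common source of such conversions is an unsuffixed floating-point constant, which has type double. The sketch below contrasts the two forms; the function names are hypothetical, and the exact expressions for which the compiler issues Pa205 are not specified here.

```c
#include <stddef.h>

/* 0.1 is a double constant, so x is implicitly converted to double and
   the multiplication is done in double precision. */
float scale_with_double_constant(float x)
{
  return x * 0.1;
}

/* 0.1f is a float constant, so the whole expression stays in single
   precision. */
float scale_with_float_constant(float x)
{
  return x * 0.1f;
}

/* The types of the two products, made visible through sizeof. */
size_t product_size_double_constant(float x) { return sizeof(x * 0.1); }
size_t product_size_float_constant(float x)  { return sizeof(x * 0.1f); }
```

Adding the f suffix (or an explicit cast to float) keeps the operation in single precision, so no double-precision library routine is involved.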
C++ designated initializers and incomplete constructors now comply with the C++20 standard. See the documentation on the compiler options --warn_about_missing_field_initializers and --warn_about_incomplete_constructors for more information.

Known problems

[EWARM-10092] The CMSIS DSP library for the Cortex-M55 MVE processor defines the vector type f16x8_t. f16x8_t is a standard float16x8_t vector, but with an alignment of 2 (float16x8_t has an alignment of 8). The IAR C/C++ Compiler for Arm version 9.30 does not support an alignment of 2 for vectors. Attempting to access a float16_t pointer that is not 4-byte aligned for load from/store to an f16x8_t vector will result in a bus fault error.
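One way to detect the problematic case in EWARM-10092 during testing is to check the pointer alignment before the vector access. This is only a sketch of such a check, not a documented workaround, and the helper name is_aligned is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if p is aligned to `alignment` bytes.
   alignment must be a power of two. */
int is_aligned(const void *p, uintptr_t alignment)
{
  return ((uintptr_t)p & (alignment - 1u)) == 0u;
}
```

A call such as is_aligned(ptr, 4) could be placed in a debug assertion in front of code that loads from or stores to an f16x8_t vector.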
[EWARM-9298] In some cases, when using NEON intrinsics with vector parameters, the compiler might generate an internal error.
[EWARM-7572, TPB-3377] The compiler can be extremely slow when compiling code that contains structs with hundreds of fields.
[EWARM-6667, TPB-3086] The compiler can cluster variables that are initialized by copy and zero-initialized variables with static storage duration. When the total size of the variables initialized by copy is small compared to the total size of the zero-initialized variables, and if compressed initializers are not used, this can create a significant size overhead.
[EWARM-5239, EW25660] Passing a parameter of type va_list to a C++ function, where the caller is defined in one object file and the callee in another, will result in a linker error if one of the two objects is built with EWARM 7.20 (or newer) and the other is built with EWARM 7.10 (or older).
[EWARM-4824, EW24720] MISRA-C:2004 rule 9.1 will not find all used uninitialized local variables.

Program corrections

[EWARM-10021, TPB-3652] On optimization level High and above, the analysis in the compiler can fail to infer all accesses to fields in struct variables. This can make the compiler erroneously generate an internal error.
[EWARM-10009, TPB-3650] The compiler can generate incorrect exception tables in some switch statements. This can result in crashes or worse if the switch statement is exited via an exception. The problem only occurs when there are destructors to be run for some objects local to blocks in the switch statement.
[EWARM-9994, TPB-3646] The compiler can enter an infinite loop for code that uses a character pointer to write to an un-initialized boolean variable.
[EWARM-9985, TPB-3645]
On optimization levels Medium and High, the compiler can generate incorrect code when a function declared __weak accesses module-local variables with static storage duration. The error is not confined to the weak function, and can trigger in any function that accesses one or more of the module-local variables that are accessed by the weak function. A triggering example is shown below.
```
static int a = 0;
int b = 0;
extern int c, d, e;
extern void g(void);
__weak int h(void)
{
  return a;
}
void f(void)
{
  a = c / e;
  b = d * a;
  g();
}
```
[EWARM-9980, TPB-3648] On optimization levels Medium and High, the compiler can generate incorrect code for arrays that are accessed with different element sizes in the same function (typically char and another scalar type). For an array, for example int test[NUM], the compiler can fail to identify that test and (char*)test refer to the same array, and treat them as two separate entities when optimizing, changing the order of loads and stores of one of them without regard to the other one.
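The access pattern described in EWARM-9980 looks roughly like the sketch below, where the same array is written as int, cleared through a char pointer, and read back as int. This is an illustration of the pattern, not a confirmed reproducer. The value of NUM and the function name sum_after_byte_clear are hypothetical.

```c
#include <stddef.h>

#define NUM 4   /* hypothetical array length */

static int test[NUM];

/* Writes the array as int, clears it byte by byte through a char pointer,
   then reads it back as int. test and (char *)test refer to the same
   storage, so the byte stores must be visible to the final int reads. */
int sum_after_byte_clear(void)
{
  for (size_t i = 0; i < NUM; i++)
  {
    test[i] = (int)i + 1;
  }

  char *bytes = (char *)test;
  for (size_t i = 0; i < sizeof test; i++)
  {
    bytes[i] = 0;
  }

  int sum = 0;
  for (size_t i = 0; i < NUM; i++)
  {
    sum += test[i];
  }
  return sum;   /* 0 when loads and stores keep their order */
}
```

A compiler that treats the two views as separate entities could move the int reads ahead of the byte stores, in which case the function would return a nonzero sum.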

[EWARM-9977, TPB-3644]

On optimization level High, the compiler can generate incorrect code for functions that contain nested loops or if statements, where a single field in a struct pointed to by a pointer is updated several times, and where one of the branches in the nested statements changes the value of the pointer. A triggering example is shown below. In this example, the compiler will erroneously insert a read of msg->error after the assignment msg=0.

        typedef struct message_s
        {
          uint32_t error;
          uint32_t id;
          void *   handle;
        } message_t;
        message_t *notify(void * handle, message_t * msg)
        {
          if (msg->handle == handle)
          {
            if (msg->id == 36)
            {
              msg->error = 0;
            }
            else if (!f(msg))
            {
              msg->error = 12;
            }
            else
            {
              msg = 0;
            }
          }
          else
          {
            msg->error = 22;
          }
          return msg;
        }

[EWARM-9974, TPB-3638] The compiler generates incorrect DWARF .debug_aranges sections. There should be internal padding to ensure that the tuples are aligned correctly, but the padding is missing.
.debug_aranges sections can be used by debuggers to accelerate lookup in some cases. The IAR C-SPY Debugger is not affected by this problem.
[EWARM-9869, TPB-3631] On optimization levels Medium and High, the compiler can generate incorrect initialization of C++ variables with static storage duration declared as inline. The problem can occur when the constructor contains only simple assignments and is placed in one individual module.

[EWARM-9849, TPB-3629]

In a template class with an integer template parameter, referring to the class itself using an enumeration constant that is the same as the template parameter can fail.
For example:

        template<int ABC>
        struct A {
          enum { DEF = ABC };
          template<int BAR> static void foo() {}
          static void FFF() {
            using C = A<DEF>;
            C::foo<0>(); // Error here
          }
        };

[EWARM-9784, TPB-3619] On optimization level High, when enabling exceptions, the compiler can generate incorrect code due to an error in the static-to-auto conversion optimization.

This code can trigger the issue:

        // Let problematic_variable be a static variable.
        // Let c1, c2, c3 and c4 be bool.
        void problematic_function()
        {
          if (c1)
          {
            switch (problematic_variable)
            {
            case 0:
              problematic_variable = 0;
              break;
            case 1:
              if (c2)
              {
                problematic_variable = 1;
              }
              else if (c3)
              {
                if (c4)
                {
                  // exceptions will be handled by jump-on-exception
                  function_call();
                }
                else
                {
                  problematic_variable = 2;
                }
              }
              else
              {
                problematic_variable = 3;
              }
              break;
            }
          }
          return;
        }

Static-to-auto conversion can place the frequently used static variable problematic_variable into a temporary variable temp for a limited period of time. A version of the code during the optimization process could be:

        void problematic_function()
        {
          temp = problematic_variable; // added by static-to-auto conversion
          if (c1)
          {
            switch (temp)
            {
            case 0:
              temp = 0;                       // added by static-to-auto conversion
              break;
            case 1:
              if (c2)
              {
                temp = 1;                   // added by static-to-auto conversion
              }
              else if (c3)
              {
                if (c4)
                {
                  problematic_variable = temp;  // added by static-to-auto conversion
                  // exceptions will be handled by jump-on-exception
                  function_call();
                }
                else
                {
                  temp = 2;               // added by static-to-auto conversion
                }
              }
              else
              {
                temp = 3;                  // added by static-to-auto conversion
              }
              break;
            }
          }
          temp = problematic_variable; // added by static-to-auto conversion
          problematic_variable = temp; // added by static-to-auto conversion
          return;
        }

Due to the two instructions added at the end, most of the code in the switch statement is wrongly optimized away.

[EWARM-9780] When compiling C++ code for Cortex-M0 and including the <atomic> header, the compiler issues the message "Error[Pe020]: identifier is undefined" 44 times, instead of simply reporting that C++ atomic is not supported for Cortex-M0.

[EWARM-9774, TPB-3618]

On optimization level High, the compiler can generate incorrect code due to an error in the static-to-auto conversion optimization. The problem can trigger for loops iterating over an array where the current array element is assigned multiple times during one iteration, as in the example below. The effect of the error is that the memory location immediately after the array is read and possibly modified.

        static _Bool x[2] = { 1, 0 };
        void f(void)
        {
          static int y[2];
          for(unsigned int i = 0u; i < 2u; i++)
          {
            switch(y[i])
            {
            case 1:
              if(!x[i])
              {
                y[i] = 2;
              }
              break;
            case 2:
              if(!x[i])
              {
                y[i] = 1;
              }
              break;
            default:
              y[i] = 0;
              break;
            }
          }
        }

[EWARM-9718, TPB-3609] An if statement that compares a floating-point value with a floating-point constant that is subnormal, or very close to subnormal (one or two epsilons above), produces incorrect code.
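The kind of comparison described in EWARM-9718 can be sketched as follows. The threshold value and the function names are hypothetical, and this is an illustration of the pattern rather than a confirmed reproducer.

```c
#include <float.h>

/* 1.0e-40f is greater than zero but below FLT_MIN (about 1.18e-38),
   so it is a subnormal float constant. */
#define TINY_THRESHOLD 1.0e-40f   /* hypothetical threshold */

/* An if statement with a floating-point compare against a subnormal
   constant. */
int is_below_tiny(float x)
{
  if (x < TINY_THRESHOLD)
  {
    return 1;
  }
  return 0;
}

/* Confirms that the chosen constant really is subnormal for float. */
int threshold_is_subnormal(void)
{
  return TINY_THRESHOLD > 0.0f && TINY_THRESHOLD < FLT_MIN;
}
```

With correct code generation, is_below_tiny(0.0f) returns 1 and is_below_tiny(1.0f) returns 0.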
[EWARM-9709] The source file memset_s.c incorrectly refers to memcpy_s in an error message.

[EWARM-9694, TPB-3610]

On optimization level High, the compiler can generate incorrect code for loops with multiple exits that read from the same array, indexed both by the loop counter and the loop counter plus one.

        uint32_t test_func(uint32_t i)
        {
          uint32_t k;
          for (k = i + 1; k < (TEST_BUFF_LEN - 1); k = k + 1) {
            if (test_var_buff[k] == test_var_buff[k+1]) {
              return 2;
            }
          }
          return 3;
        }

[EWARM-9614, TPB-3596]
The compiler can terminate with an internal error when processing the declaration of a multi-dimensional array of a forward-declared enum type that is later defined.
Example:
```
enum E;
extern enum E arr[3][2];
enum E { a, b };
```
[EWARM-9251, TPB-3583] The compiler cannot leverage constant values in auto variable struct members. This causes missed optimization opportunities for inlining and loop unrolling.
[EWARM-9066] In AArch64 mode, for functions declared __task or __noreturn, the compiler incorrectly preserves callee-saved registers (X19-X29 and D8-D15), and in the case of __noreturn, also the link register (LR/X30).
[EWARM-7413] In threaded applications, the linker option --manual_dynamic_initialization now also suppresses automatic initialization of the main thread's thread local variables, to avoid unnecessary dependence on the table driven initialization for applications that want entirely manual initialization.

User guide corrections

None.

Miscellaneous

Available workarounds for device errata:
- CVE-2021-35465
  A VLLDM instruction Security Vulnerability affects Arm Cortex-M33 r0p0 to r1p0, Arm Cortex-M35P r0, Arm Cortex-M55 r0p0 to r1p0, and Arm China STAR-MC1 (STAR SE configuration).
  On these processors, any Armv8-M Secure software that uses FPU or Helium instructions and that calls Non-secure functions might be affected.
  
  In EWARM 9.20.1 and later, the compiler avoids this vulnerability. The workaround is enabled by default, but can be disabled using the command line option --enable_hardware_workaround no-fix-cmse-cve-2021-35465.
- ARM Cortex-M3 errata 463764
  Core might freeze for SLEEPONEXIT single instruction ISR. More information is available on infocenter.arm.com.
  Workaround generated for functions with attribute __irq with iccarm --enable_hardware_workaround=arm463764. Supported from EWARM 5.41.
- ARM Cortex-M3 errata 602117
  LDRD with base in list might result in incorrect base register when interrupted or faulted. From EWARM 5.20.3 the compiler/library avoids the LDRD instruction with the base register in list.
- ARM Cortex-M3 errata 752419
  ARM Cortex-M4 errata 752770
  Interrupted loads to SP can cause erroneous behaviour. From EWARM 6.21 the compiler/library does not generate LDR SP instructions with writeback to Rn. Otherwise we allow the extra reads because the stack resides in RAM where multiple reads are acceptable.
- ARM Cortex-M4 errata 776924
  VDIV or VSQRT instructions might not complete correctly when very short ISRs are used. IAR recommends the second workaround proposed by Arm: "Ensure that every interrupt service routine contains more than 2 instructions in addition to the exception return instruction." The background is that the compiler is unaware of interrupts since the Cortex-M architecture does not distinguish between ordinary functions and interrupt functions.
- ARM Cortex-M7 errata 833872
  Flag setting instructions inside an IT block might cause incorrect execution of subsequent instructions. From EWARM 7.40, the compiler will skip the IT transformation on this particular code pattern.
- ARM Cortex-M3 errata 838469
  ARM Cortex-M4 errata 838869
  Store immediate overlapping exception return operation might vector to incorrect interrupt. Follow the guidelines in the errata and implement the workaround proposed by ARM by using __DSB(void) in applicable cases.
- Functional problem Core.1 in NXP device LPC2478: Incorrect update of the Abort Link register in Thumb state.
  Workaround generated with iccarm --enable_hardware_workaround=NXP_Core.1
- Functional problem in Stellaris devices: Non-word-aligned write to SRAM can cause an incorrect value to be loaded. More information is available on the Stellaris web site at www.ti.com/stellaris.
  Workaround generated with iccarm --enable_hardware_workaround=LM3S_NWA_SRAM_Write
- Functional problem in Freescale Semiconductors MC9328MX1 (i.MX1), masks 0L44N, 1L44N, and 2L44N:
  The LDM instruction will in some cases not load the second register correctly. Workaround generated with iccarm --enable_hardware_workaround=920t-ldm2
  NOTE: The libraries in the current EWARM version are not built with this workaround. Use EWARM 6.50.6 and linker option --enable_hardware_workaround=920t-ldm2 to use libraries built with this hardware workaround.
RTOS Threads and TLS
The inc\c\DLib_Threads.h header file contains support for locks and thread-local storage (TLS) variables. This is useful for implementing thread support. For more information, see the header file.
va_args
The implementation of va_args functions has changed in IAR Embedded Workbench for ARM 7.20.1. It is no longer possible to compile the output of the preprocessor from an earlier version of the compiler. The original source code must be preprocessed again, using IAR Embedded Workbench for ARM 7.20.1.

Release history

release history