IAR Information Center for Arm

Important information
New features
Known problems
Program corrections
User guide corrections
Miscellaneous
Release history

Important information

IAR Embedded Workbench for Arm 9.30 uses an updated version of arm_mve.h.
- The new version can be included from both C and C++ (the version used in 9.20 could only be included from C).
- The new version does not affect the status of IAR extensions (the version used in 9.20 always enabled IAR extensions).
- The names of the built-in MVE intrinsics have been updated to match the new arm_mve.h. As long as your application only uses the publicly visible names (the ones defined in arm_mve.h) nothing changes, but if you refer directly to the built-in intrinsic functions, you must adapt to the new names.
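As a rough illustration of the last point, the sketch below uses only public names from arm_mve.h (vld1q_s32, vaddq_s32, vst1q_s32), so it should not be affected by the renaming of the built-in intrinsics. The function name add4_s32, the HAVE_MVE macro, and the scalar fallback are assumptions added here so that the example also builds for targets without MVE.

```c
#include <stdint.h>

/* __ARM_FEATURE_MVE is the ACLE feature macro; bit 0 indicates integer MVE. */
#if defined(__ARM_FEATURE_MVE) && (__ARM_FEATURE_MVE & 1)
#include <arm_mve.h>
#define HAVE_MVE 1
#else
#define HAVE_MVE 0
#endif

/* Adds four 32-bit integers element-wise: dst[i] = a[i] + b[i] (wrapping). */
void add4_s32(int32_t *dst, const int32_t *a, const int32_t *b)
{
#if HAVE_MVE
  /* Public intrinsic names only; no direct use of built-in names. */
  int32x4_t va = vld1q_s32(a);
  int32x4_t vb = vld1q_s32(b);
  vst1q_s32(dst, vaddq_s32(va, vb));
#else
  /* Scalar fallback (assumption, for targets without MVE). */
  for (int i = 0; i < 4; i++)
  {
    dst[i] = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
  }
#endif
}
```

Code written this way depends only on the header's public interface, which is the case the note above describes as unchanged.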
Limitations of compiler support for Armv8.1-M
There are some limitations to the support for Armv8.1-M in this release, to be addressed in a future release:
- MVE: The Cortex-M vector extension can be utilized by means of intrinsic functions.
  There is currently no support for auto-vectorization. The header file cannot be included from C++ code. Some optimizations of MVE code are missing; in particular, a VCMP instruction followed by VPST is not yet combined into a single VPT instruction.
- LOB: Limited matching of low-overhead loops.
  A candidate loop must have a loop counter that is counted down by 1 until it reaches zero. The presence of an outer loop can lead to an inner loop not being matched. Other optimizations (most notably loop unrolling) can lead to a loop not being matched.
- CMSE: The Armv8.1-M register FPCXT_NS is not restored by a non-secure entry function.
- _Float16: Some operations on _Float16 (such as plus and minus) are implemented by converting the input operands to float, using the corresponding instruction for float, and converting the result to _Float16.
- The conditional instructions from Armv8-A are not utilized (CSEL, CSET, CNEG).
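For the LOB item in the list above, a loop written in the described shape might look like the following sketch. The function name sum_countdown is hypothetical, and whether the compiler actually matches the loop still depends on the optimization settings and on the other limitations listed.

```c
#include <stdint.h>

/* Sums n words. The loop counter starts at n and is counted down by 1
   until it reaches zero, which is the shape described for LOB matching. */
uint32_t sum_countdown(const uint32_t *data, uint32_t n)
{
  uint32_t sum = 0;

  for (uint32_t count = n; count != 0; count--)
  {
    sum += *data++;
  }
  return sum;
}
```

The loop is kept as a single, non-nested loop because the notes state that an outer loop can prevent an inner loop from being matched.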
Characteristics for the toolset in AARCH64 mode:
- The toolset, by default, supports generating code and data that is situated in one space that has a maximum size of 4 Gbytes. This is because static data will be accessed with an addressing mode that reaches +/- 4 Gbytes. Mechanisms exist to jump/call between such spaces.
- The produced code, by default, runs in execution state EL1.
Limitations:
- Big-endian code is not supported.
- Position-independent code is not supported.
- V8-A AARCH32 is not fully implemented. The 32-bit implementation is currently based on v7-A.
Compiler MISRA-C:1998/MISRA-C:2004 support is removed in version 9.10
The Compiler MISRA C is still available through the compiler command line but will be removed in a future release.
Refer to IAR C-STAT for full MISRA C support.
Changes in implementation of CMSIS intrinsics in version 8.20
The implementation of the CMSIS intrinsic interface is no longer based on IAR's intrinsics.h. As a consequence, some intrinsics that were previously declared when the CMSIS header was included are no longer declared.

Examples of these intrinsics include __LDREX(), __STREX() and __enable_interrupt().
Changed size of wchar_t in version 8.10 and later
Object files following the ARM ABI have a runtime attribute indicating the size of wchar_t.

In EWARM version 7.80 and earlier, wchar_t was 2 bytes wide and the runtime attribute was set accordingly.

In EWARM version 8.10 and later, wchar_t is 4 bytes wide.
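Source code that hard-codes a 2-byte wchar_t can break silently across this change. A minimal sketch of one defensive approach is to size wide-character buffers with sizeof(wchar_t) rather than a literal. The function name wide_buffer_bytes is hypothetical.

```c
#include <stddef.h>
#include <wchar.h>

/* Returns the number of bytes needed to hold n_chars wide characters
   plus a terminating L'\0', whatever the width of wchar_t is. */
size_t wide_buffer_bytes(size_t n_chars)
{
  return (n_chars + 1u) * sizeof(wchar_t);
}
```

With a 4-byte wchar_t, wide_buffer_bytes(3) yields 16; with a 2-byte wchar_t it yields 8.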
If you have implemented the time() function, you must rename it to __time32(). For more information, see the Development guide.
A special note on CMSIS integration:

If your application source code includes CMSIS header files explicitly, then you should not select Project>Options...>General Options>Library Configuration>Use CMSIS. Some of the Cortex-M application examples include CMSIS source files explicitly. Do not select the option Use CMSIS in these projects.
Deprecated features
- --interwork
  Future versions of the IAR C/C++ Compiler for ARM will assume --interwork when generating code for the ARMv4T architecture. There will be no option to generate non-interworking code for ARMv4T.

New features

None

Known problems

[EWARM-10547, TPB-3703] When the packed attribute is used, the compiler can in some cases emit:
Warning[Pa039]: use of address of unaligned structure member
even if the structure member is of size 1.
[EWARM-10377] When running, the compiler uses both a compiler and a linker license slot.
[EWARM-10359] When compiling code containing uses of the MVE intrinsics vcaddq_rot90_m_f32, vcaddq_rot90_m_s32, vcaddq_rot90_m_u32, vcaddq_rot270_m_f32, vcaddq_rot270_m_s32, or vcaddq_rot270_m_u32, the compiler can attempt to use the same register as both the first and the third register; such code is CONSTRAINED_UNPREDICTABLE and generates an internal error.
[EWARM-9298] In some cases, when using NEON intrinsics with vector parameters, the compiler might generate an internal error.
Workaround: If an internal error is caused by the use of a NEON intrinsic, it can in some cases help to switch to an optimization level higher than None.

[EWARM-8778, TPB-3522] Constant-evaluation of some constexpr aggregate copy operations can result in error Pe028 ("expression must have a constant value").

Example:

        struct X {
          constexpr X(int i) : i(i) { }
          constexpr X(X const &x) : i(x.i) { }
          int i;
        };
        struct A {
          X m[1];
        };
        constexpr X x{1};
        constexpr A a1{x};
        constexpr A a2 = a1; // an error is issued for the copy assignment

[EWARM-7572, TPB-3377] The compiler can be extremely slow when compiling code that contains structs with hundreds of fields.
Workaround: Avoid structs with many fields.
[EWARM-6667, TPB-3086] The compiler can cluster variables that are initialized by copy and zero-initialized variables with static storage duration. When the total size of the variables initialized by copy is small compared to the total size of the zero-initialized variables, and if compressed initializers are not used, this can create a significant size overhead.
Workaround: Specify use of compressed initializers in the linker configuration file or turn off static clustering on the module.
[EWARM-5239, EW25660] Passing a parameter of type va_list to a C++ function, where the caller is defined in one object file and the callee in another, will result in a linker error if one of the two objects is built with EWARM 7.20 (or newer) and the other is built with EWARM 7.10 (or older).
[EWARM-4824, EW24720] MISRA-C:2004 rule 9.1 will not find all used uninitialized local variables.

Program corrections

[EWARM-10704, TPB-3724] On optimization level High, the compiler can generate incorrect code when a multi-dimensional array inside a nested loop is indexed both by the loop variable in one of the loops, and by an element in a local array that in turn is indexed by the loop variable in another loop. The problem only triggers if this exact element access occurs at least twice in the same inner loop. In the example below, the triggering condition is met by the two accesses to v[i][index[j]]:

        
        #include <stdio.h>
        #include <stdint.h>
        
        uint8_t v[2][2] = {0};
        
        void test(void)
        {
          uint8_t i, j;
          uint8_t index[8] = {0, 1};
          uint8_t status = 0xE2;
        
          for (i = 0; i < 2; i++)
          {
            for (j = 0; j < 8; j++)
            {
              if (status & (1 << j))
              {
                v[i][index[j]] = 8;
              }
              else
              {
                v[i][index[j]] = 9;
              }
            }
          }
        }
        
        int main(void)
        {
          test();
          printf("%d %d\n", v[0][0], v[1][0]);
        }

[EWARM-10693, TPB-3725] For bitfields, the compiler fails to honor requests for downgraded debug information.
[EWARM-10660, TPB-3716, TPB-3722]
The compiler might crash if it encounters a dynamic initialization of a lambda which is sufficiently complex. A lambda is sufficiently complex when, for example, it captures a (sufficiently) non-trivial class by value. A dynamic initialization of a lambda occurs when, for example, it is captured by value by another lambda. For example:
```
        
        struct b {
          ~b(); // sufficiently non-trivial class
        };
        
        void e() {
          auto d = [arg = b{}] {};
          [d] {}; // dynamic initialization of sufficiently complex lambda d
        }
```

[EWARM-10597, TPB-3708]

On optimization level High, the compiler can generate incorrect code when variables with static lifetime are read or written in functions that might call functions outside the translation unit. The observable effect is that changes to the variable are moved across calls to external functions that can call back to the module via the externally visible functions. In this example, the update of DataCount in the loop in DataInit() is moved to a position after the call to UseData(), even though the call might access DataCount by calling any of the functions accessible outside of the translation unit.

        
        #include <stdbool.h>
        #include <string.h>
        #include <stdint.h>
        #include <stddef.h>
        
        
        static const uint32_t Data[] = {1u, 2u, 3u, 4u, 5u, 20u, 6u, 0u /* Terminator */};
        static size_t DataCount = 0u; // will be updated within DataInit
        
        extern void SomeExternalFunction(void);
        extern bool UseData(void);
        
        
        static bool DataIsValid(size_t Index)
        {
            if (Data[Index] != 0u)
            {
                return true;
            }
            else
            {
                return false;
            }
        }
        
        
        bool DataInit(void)
        {
            // Count Data
            for (DataCount = 0u; DataIsValid(DataCount); DataCount++)
            {
                // Scan Data
            }
        
            return UseData();
            // UseData() is an external function that is allowed to access Data by using 
        }
        
        
        static const uint32_t *GetDataPointer(size_t Index)
        {
            if (Index > DataCount)
            {
                Index = DataCount - 1u; // use last Value
            }
        
            return &Data[Index];
        }
        
        
        static size_t GetDataIndex(void)
        {
            return 5;  // this index should resolve to 20u in the defined array
        }
        
        
        static uint32_t GetData(uint32_t ua)
        {
            uint32_t DataIndex = GetDataIndex();
        
            const uint32_t *p_Data = GetDataPointer(DataIndex);
        
            SomeExternalFunction(); // Nothing to do with the data, but required to trigger the bug!
        
            return *p_Data;
        }
        
        
        bool CheckData(uint32_t uAddress)
        {
            // The general purpose of this module is to Access Data at some Address.
            // This demo ignores the address and instead uses a fixed entry within the Data array.
            uint32_t Value = GetData(uAddress);
        
            if (Value > 10u)
            {
                return true;    // this is expected to happen
            }
            else
            {
                return false;   // this actually happens
            }
        }
        
        
        int main()
        {
            DataInit();
        }

[EWARM-10488, TPB-3697] The compiler unnecessarily enforces the C++20 rules for designated initializers in C++, that is, not accepting array initializers, chained initializers, and field initializers that are not in field order.

[EWARM-10482, TPB-3696]

The compiler can terminate with an internal error ("Access violation") in cases involving assignment of a compound literal to a class containing a multi-dimensional character array, where the initializer for the array is a string literal.

Example:

        
        struct R {
          char para2[32][5];
        };
        
        struct S {
          R r;
        };
        
        struct T {
          T();
          S s;
        };
        
        T::T()
        {
          s = {{{""}}};
        }

[EWARM-10372, TPB-3679]

On optimization level High, the compiler can generate incorrect code when fields in static structs are read or written in functions that are visible outside the module. The observable effect is that changes to the field are moved across calls to external functions that can call back to the module via the externally visible functions. In this example, the update value.a = 42 is moved to a position after the call to call_external(), even though call_external() might access value.a by calling read_value().

        
        #include <stdint.h>
        #include <stdio.h>
        
        static struct {
          int a;
          int b;
        } value;
        
        int predicate(int x);
        void call_external(void);
        
        int read_value(void)
        {
          return value.a;
        }
        
        void  do_stuff(void)
        {
          if(predicate(value.a))
          {
            value.a = 42;
            call_external();
          }
          else
          {
            value.a = 43;
          }
        }

[EWARM-10365, TPB-3677]
On optimization level High, the compiler can generate incorrect code for expressions inside loops where the expression is a sum of two linear expressions in the loop variable. In this example, the compiler will miscompile the expression (scale - y) * O1 + y * O2 as it is the sum of (scale-y)*O1 and y*O2.
```
        
        for (int32_t y = 0; y < scale; y++) {
          const int32_t T0 = ((scale - y) * O1 + y * O2) / scale_square;
          O[j] = T0;
          j += O_cols;
        }
```
[EWARM-10308] The compiler does not guarantee correct alignment of the stack pointer when a non-secure state is entered (via a function pointer). If the stack pointer is not correctly aligned, an alignment fault might occur.
[EWARM-10303, TPB-3673] In rare cases, where data is accessed via a pointer and the pointer variable is updated, the compiler might generate code that does not write to, or read from, the pointed-to values correctly.
[EWARM-10285, TPB-3671] Some pragma directives, for example #pragma call_graph_root, can only be used on a definition. Trying to use one of these pragma directives in another situation results in an internal error instead of the desired diagnostic.
[EWARM-10257] The double-precision versions of isnan() and isinf() for 32-bit Arm only check the upper 32 bits of the floating-point value to determine whether it is NaN or Inf.
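To see why the lower 32 bits matter, consider the following sketch, which assumes the IEEE 754 binary64 layout and an 8-byte double. The bit pattern 0x7FF0000000000001 is a NaN, but its upper word is identical to that of +Inf. The helper names are hypothetical and upper_word_only_isinf() is not the library's actual code, only an illustration of a check that ignores the lower word.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Builds a double from its raw 64-bit pattern (assumes an 8-byte double). */
double double_from_bits(uint64_t bits)
{
  double d;
  memcpy(&d, &bits, sizeof d);
  return d;
}

/* Hypothetical check that inspects only the upper 32 bits: reports "infinite"
   when the exponent is all ones and the upper 20 mantissa bits are zero. */
int upper_word_only_isinf(uint64_t bits)
{
  uint32_t hi = (uint32_t)(bits >> 32);
  return (hi & 0x7FFFFFFFu) == 0x7FF00000u;
}

/* Full check: exponent all ones and all 52 mantissa bits zero. */
int full_isinf(uint64_t bits)
{
  return (bits & 0x7FFFFFFFFFFFFFFFull) == 0x7FF0000000000000ull;
}
```

For the pattern 0x7FF0000000000001, the standard isnan() applied to double_from_bits() reports a NaN and full_isinf() returns 0, while upper_word_only_isinf() returns 1.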
[EWARM-10219, TPB-3662, TPB-3666] In some C++ cases involving multi-file compilation, templates, and in-class field initializers, the compiler can terminate with an internal error:
assertion failed at: "trans_copy.c", line 1403 in prepare_for_trans_unit_copy.

User guide corrections

The chrono::steady_clock class is not available in Libc++.
New intrinsic function: __get_return_address() returns the return address of the current function. This is the value that was in the link register (LR) when the current function was called. In this context, an inlined function is never "current", instead the function that it was inlined into is the current function.
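A possible use is lightweight call tracing, as in the sketch below. Several things here are assumptions rather than documented facts: that the intrinsic is declared by intrinsics.h, that its result can be cast to uintptr_t, and the names CALLER_ADDRESS, NO_INLINE, trace_caller, and get_last_caller. The GCC/Clang branch is only a host-side stand-in with a similar purpose, not part of the IAR toolchain.

```c
#include <stdint.h>

#if defined(__ICCARM__)
#include <intrinsics.h>  /* assumption: the intrinsic is declared here */
#define CALLER_ADDRESS() ((uintptr_t)__get_return_address())
#define NO_INLINE _Pragma("inline=never")
#elif defined(__GNUC__)
/* Host stand-in (assumption): builtin that returns the caller's return address. */
#define CALLER_ADDRESS() ((uintptr_t)__builtin_return_address(0))
#define NO_INLINE __attribute__((noinline))
#else
#define CALLER_ADDRESS() ((uintptr_t)0)
#define NO_INLINE
#endif

static uintptr_t last_caller;

/* Records where the current call will return to. Inlining is disabled
   because an inlined function is never the "current" function. */
NO_INLINE void trace_caller(void)
{
  last_caller = CALLER_ADDRESS();
}

/* Returns the most recently recorded return address. */
uintptr_t get_last_caller(void)
{
  return last_caller;
}
```

If trace_caller() were inlined, the recorded address would be the return address of the function it was inlined into, which is why the sketch prevents inlining.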

Miscellaneous

Available workarounds for device errata:
- CVE-2021-35465
  A VLLDM instruction Security Vulnerability affects Arm Cortex-M33 r0p0 to r1p0, Arm Cortex-M35P r0, Arm Cortex-M55 r0p0 to r1p0, and Arm China STAR-MC1 (STAR SE configuration).
  On these processors, any Armv8-M Secure software that uses FPU or Helium instructions and that calls Non-secure functions might be affected.
  
  In EWARM 9.20.1 and later, the compiler avoids this vulnerability. The workaround is enabled by default, but can be disabled using the command line option --enable_hardware_workaround no-fix-cmse-cve-2021-35465.
- ARM Cortex-M3 errata 463764
  Core might freeze for SLEEPONEXIT single instruction ISR. More information is available on infocenter.arm.com.
  Workaround generated for functions with attribute __irq with iccarm --enable_hardware_workaround=arm463764. Supported from EWARM 5.41.
- ARM Cortex-M3 errata 602117
  LDRD with base in list might result in incorrect base register when interrupted or faulted. From EWARM 5.20.3 the compiler/library avoids the LDRD instruction with the base register in list.
- ARM Cortex-M3 errata 752419
  ARM Cortex-M4 errata 752770
  Interrupted loads to SP can cause erroneous behaviour. From EWARM 6.21 the compiler/library does not generate LDR SP instructions with writeback to Rn. Otherwise we allow the extra reads because the stack resides in RAM where multiple reads are acceptable.
- ARM Cortex-M4 errata 776924
  VDIV or VSQRT instructions might not complete correctly when very short ISRs are used. IAR recommends the second workaround proposed by Arm: "Ensure that every interrupt service routine contains more than 2 instructions in addition to the exception return instruction." The background is that the compiler is unaware of interrupts since the Cortex-M architecture does not distinguish between ordinary functions and interrupt functions.
- ARM Cortex-M7 errata 833872
  Flag-setting instructions inside an IT block might cause incorrect execution of subsequent instructions. From EWARM 7.40, the compiler will skip the IT transformation on this particular code pattern.
- ARM Cortex-M3 errata 838469
  ARM Cortex-M4 errata 838869
  Store immediate overlapping exception return operation might vector to incorrect interrupt. Follow the guidelines in the errata and implement the workaround proposed by ARM by using __DSB(void) in applicable cases.
- Functional problem Core.1 in NXP device LPC2478: Incorrect update of the Abort Link register in Thumb state.
  Workaround generated with iccarm --enable_hardware_workaround=NXP_Core.1
- Functional problem in Stellaris devices: Non-word-aligned write to SRAM can cause an incorrect value to be loaded. More information is available on the Stellaris web site at www.ti.com/stellaris.
  Workaround generated with iccarm --enable_hardware_workaround=LM3S_NWA_SRAM_Write
- Functional problem in Freescale Semiconductors MC9328MX1 (i.MX1), masks 0L44N, 1L44N, and 2L44N:
  The LDM instruction will in some cases not load the second register correctly. Workaround generated with iccarm --enable_hardware_workaround=920t-ldm2
  NOTE: The libraries in the current EWARM version are not built with this workaround. Use EWARM 6.50.6 and linker option --enable_hardware_workaround=920t-ldm2 to use libraries built with this hardware workaround.
RTOS Threads and TLS
The inc\c\DLib_Threads.h header file contains support for locks and thread-local storage (TLS) variables. This is useful for implementing thread support. For more information, see the header file.
va_args
The implementation of va_args functions has changed in IAR Embedded Workbench for ARM 7.20.1. It is no longer possible to compile the output of the preprocessor from an earlier version of the compiler. The original source code must be preprocessed again, using IAR Embedded Workbench for ARM 7.20.1.

Release history

release history