007 14s exchange

Video Link: C++ Weekly - Ep 14 Standard Library Gems: next and exchange

Optimization Case

https://gitee.com/harmonyos/OpenArkCompiler/blob/master/src/maple_ir/src/mir_lower.cpp

BlockNode *MIRLower::LowerBlock(BlockNode &block) {
  auto *newBlock = mirModule.CurFuncCodeMemPool()->New<BlockNode>();
  BlockNode *tmp = nullptr;
  if (block.GetFirst() == nullptr) {
    return newBlock;
  }
  StmtNode *nextStmt = block.GetFirst();
  ASSERT(nextStmt != nullptr, "nullptr check");
  do {
    StmtNode *stmt = nextStmt;
    nextStmt = stmt->GetNext();
    // ...
  } while (nextStmt != nullptr);
  return newBlock;
}

Inside the do {} while loop, the two lines

    StmtNode *stmt = nextStmt;
    nextStmt = stmt->GetNext();

can be replaced with

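    // Read the successor through nextStmt: stmt is not yet initialized inside its own initializer.
    StmtNode *stmt = std::exchange(nextStmt, nextStmt->GetNext());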
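
The loop shape can be tried outside the compiler code base. The following is only a minimal sketch: Node and CollectValues are hypothetical names standing in for StmtNode and the loop in LowerBlock, not anything from OpenArkCompiler. It shows the cursor being advanced and the old position being returned in one expression, with the successor read through the cursor itself.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for StmtNode: a singly linked node.
struct Node {
  int value;
  Node *next;
};

// Collects the values of a non-empty list, advancing the cursor with std::exchange.
std::vector<int> CollectValues(Node *first) {
  assert(first != nullptr && "the list must be non-empty, as in LowerBlock");
  std::vector<int> values;
  Node *nextNode = first;
  do {
    // nextNode moves on to its successor; node receives the old position.
    Node *node = std::exchange(nextNode, nextNode->next);
    values.push_back(node->value);
  } while (nextNode != nullptr);
  return values;
}
```

Calling CollectValues on a three-node list visits the nodes in order, exactly like the two-statement version.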

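To be honest, I hesitated after making this replacement: the call is not very intuitive, and a name such as assign might have been easier to accept. Personally, I think the ideal syntax would look like the comments below: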

// a, b = b, a;
std::swap(a, b);
// a, b = b, c;
a = std::exchange(b, c);
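
To make the a, b = b, c pattern concrete, here is a small sketch using the Fibonacci recurrence, where c is a + b. The function name Fib is just an illustrative choice; the only library call is std::exchange from <utility> (C++14 and later).

```cpp
#include <utility>

// Returns the n-th Fibonacci number, with Fib(0) = 0 and Fib(1) = 1.
unsigned long long Fib(unsigned n) {
  unsigned long long a = 0;
  unsigned long long b = 1;
  for (unsigned i = 0; i < n; ++i) {
    // Equivalent to the tuple-style "a, b = b, a + b":
    // a + b is computed first, b takes that sum, and a takes the old b.
    a = std::exchange(b, a + b);
  }
  return a;
}
```

Without std::exchange the loop body needs a named temporary for the old value of b.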

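Whether such tuple-style assignment syntax ever arrives depends on how the C++ language evolves. In the meantime, let's check whether std::exchange brings any performance benefit. A possible implementation of std::exchange is shown below^[1]: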

template<class T, class U = T>
T exchange(T& obj, U&& new_value)
{
    T old_value = std::move(obj);
    obj = std::forward<U>(new_value);
    return old_value;
}

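By contrast, hand-written code that does not use std::exchange usually does not bother with std::move. The following example compares the two approaches: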

#include <cstddef>  // std::size_t
#include <utility>  // std::exchange

int g_next = 0;
int next() __attribute__((noinline));
int next() {
  return g_next++;
}

int g_value;
int exchange_test(std::size_t count) __attribute__((noinline));
int exchange_test(std::size_t count) {
  int a = 0;
  for (std::size_t i = 0; i < count; i += 2) {
    int tmp = a;
    a = next();
    g_value += tmp + a;
  }
  return a;
}

int exchange_test2(std::size_t count) __attribute__((noinline));
int exchange_test2(std::size_t count) {
  int a = 0;
  for (std::size_t i = 0; i < count; i += 2) {
    int tmp = std::exchange(a, next());
    g_value += tmp + a;
  }
  return a;
}

std::size_t cnt;
int main(int argc, char* argv[]) {
  return exchange_test(cnt) + exchange_test2(cnt);
}

The assembly generated by gcc 9.3 with -std=c++17 -O2 (via godbolt) is as follows:

exchange_test(unsigned long):
        test    rdi, rdi
        je      .L6
        xor     ecx, ecx
        xor     eax, eax
.L5:
        mov     esi, eax
        call    next()
        lea     edx, [rsi+1+rax]
        add     rcx, 2
        mov     DWORD PTR g_value[rip], edx
        cmp     rdi, rcx
        ja      .L5
        ret
.L6:
        xor     eax, eax
        ret
exchange_test2(unsigned long):
        xor     eax, eax
        test    rdi, rdi
        je      .L11
        xor     ecx, ecx
.L10:
        mov     esi, eax
        call    next()
        lea     edx, [rsi+1+rax]
        add     rcx, 2
        mov     DWORD PTR g_value[rip], edx
        cmp     rdi, rcx
        ja      .L10
        ret
.L11:
        ret

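The only difference is that the version without std::exchange contains one extra zeroing instruction, xor eax, eax. The loop bodies are identical; the extra instruction sits on the early-return path taken when count is 0 (label .L6). With clang 10, both versions have the same number of instructions.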

Conclusion

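std::exchange is not a particularly clear way to express the intent, it brings no noticeable performance gain, and its use is limited to the few loops that need to keep the previous value, so personally I do not particularly recommend it.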

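If a in the scenario above had a large type that needs an explicit std::move to avoid a copy, std::exchange is a reasonable choice. Likewise, in generic code where the type of a is not known, std::exchange can cut down on boilerplate quite effectively.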
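
The large-type case can be sketched as follows. This is an illustration under assumptions, not a benchmark: std::vector<int> is chosen here as an example of a type that is expensive to copy, and TakeBatch and TakeBatchByHand are hypothetical helper names. Both functions hand the accumulated elements to the caller and leave the source empty; the hand-written one has to spell out the move that std::exchange performs internally.

```cpp
#include <utility>
#include <vector>

// Hand-written version: the explicit std::move is what avoids copying the elements.
std::vector<int> TakeBatchByHand(std::vector<int> &pending) {
  std::vector<int> old = std::move(pending);
  pending.clear();  // a moved-from vector is valid but unspecified, so reset it
  return old;
}

// std::exchange version: the old value is moved out and replaced in one call.
std::vector<int> TakeBatch(std::vector<int> &pending) {
  // U defaults to T, so {} creates an empty vector as the new value.
  return std::exchange(pending, {});
}
```

In both versions the returned vector takes over the original storage instead of copying it element by element.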

References

[1] std::exchange. https://zh.cppreference.com/w/cpp/utility/exchange
