Phases of translation

cppreference.com

The compiler processes a C++ source file as if the following phases take place, in exactly this order.

Phase 1

1) The individual bytes of the source code file are mapped (in an implementation-defined manner) to the characters of the basic source character set. In particular, OS-dependent end-of-line indicators are replaced by newline characters. The basic source character set consists of 96 characters:
a) 5 whitespace characters (space, horizontal tab, vertical tab, form feed, new-line)
b) 10 digit characters, from '0' to '9'
c) 52 letters, from 'a' to 'z' and from 'A' to 'Z'
d) 29 punctuation characters: _ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '
2) Any source file character that cannot be mapped to a character in the basic source character set is replaced by its universal character name (escaped with \u or \U) or by some implementation-defined form that is handled equivalently.
3) Trigraph sequences are replaced by corresponding single-character representations.
(until C++17)
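
As a quick sanity check on the character counts above, the four groups can be written out and added up. This is only an illustrative sketch: the function name basic_source_charset_size is hypothetical and not part of any library.

```cpp
#include <cstddef>
#include <string>

// Adds up the four groups of the basic source character set listed above.
// The function name is illustrative only.
inline std::size_t basic_source_charset_size() {
    const std::string whitespace = " \t\v\f\n";   // 5 whitespace characters
    const std::string digits = "0123456789";      // 10 digits
    const std::string letters =
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ";              // 52 letters
    // 29 punctuation characters; backslash and double quote are escaped.
    const std::string punctuation = "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";
    return whitespace.size() + digits.size() + letters.size() + punctuation.size();
}
```

Calling basic_source_charset_size() should give 5 + 10 + 52 + 29 = 96.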

Phase 2

1) Whenever a backslash appears at the end of a line (immediately followed by the newline character), both the backslash and the newline are deleted, combining two physical source lines into one logical source line. This is a single-pass operation; a line ending in two backslashes followed by an empty line does not combine three lines into one. If a universal character name (\uXXXX) is formed in this phase, the behavior is undefined.
2) If a non-empty source file does not end with a newline character after this step (whether it had no newline originally, or it ended with a backslash), the behavior is undefined (until C++11); a terminating newline character is added (since C++11).
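
The line splicing described in this phase can be sketched with a macro definition and a string literal that each span two physical lines. The names SUM_OF_THREE, kSpliced and spliced_sum are made up for this example.

```cpp
#include <string>

// The backslash-newline pair is deleted in phase 2, so this macro
// definition is one logical line by the time it is tokenized.
#define SUM_OF_THREE(a, b, c) \
    ((a) + (b) + (c))

// Splicing happens before tokenization, so it also works inside a string
// literal. The continuation starts in column 0 because any indentation
// would become part of the string.
const std::string kSpliced = "abc\
def";

inline int spliced_sum() { return SUM_OF_THREE(1, 2, 3); }
```

Here spliced_sum() evaluates 1 + 2 + 3, and kSpliced holds the six characters abcdef with no line break between them.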

Phase 3

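1) The source file is decomposed into comments, sequences of whitespace characters (space, horizontal tab, new-line, vertical tab, and form-feed), and preprocessing tokens, which are the following:
a) header names, such as <iostream> or "myfile.h" (only recognized after #include)
b) identifiers
c) preprocessing numbers
d) character and string literals, including user-defined ones (since C++11)
e) operators and punctuators (including alternative tokens), such as +, <<=, new, <%, ##, or the alternative token and
f) individual non-whitespace characters that do not fit in any other category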
f) individual non-whitespace characters that do not fit in any other category
2) Any transformations performed during phases 1 and 2 between the initial and the final double quote of any raw string literal are reverted.
(since C++11)
3) Each comment is replaced by one space character.

Newlines are kept, and it's unspecified whether non-newline whitespace sequences may be collapsed into single space characters.
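
Two of the rules in this phase are easy to observe from code: a comment behaves like a single space between tokens, and the phase 2 line splice is undone inside a raw string literal. The sketch below assumes a C++11 or later compiler; STRINGIZE, answer, kStringized and kRaw are names invented for the example.

```cpp
#include <string>

#define STRINGIZE(x) #x

// The comment between the tokens is replaced by one space in phase 3,
// so this declares an int named answer.
int/* a comment */answer = 42;

// Stringizing makes the replacement visible: the comment separating
// a and b shows up as a single space.
const std::string kStringized = STRINGIZE(a/* comment */b);

// Inside a raw string literal the phase 2 line splice is reverted,
// so the backslash and the newline are both kept.
const std::string kRaw = R"(a\
b)";
```

With these definitions, kStringized holds the three characters "a b", and kRaw holds four characters: a, a backslash, a newline, and b.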

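Phase 4

1) The preprocessor is executed.
2) Each file introduced with the #include directive goes through phases 1 through 4, recursively.
3) At the end of this phase, all preprocessor directives are removed from the source.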
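
A minimal sketch of what the preprocessor does in this phase: macros are expanded and conditional blocks are selected, so later phases see only ordinary declarations. BUFFER_SIZE, SQUARE, kMode and buffer_area are hypothetical names chosen for illustration.

```cpp
#include <string>

#define BUFFER_SIZE 16
#define SQUARE(x) ((x) * (x))

// Only one branch survives phase 4; the directives themselves are removed.
#if BUFFER_SIZE > 8
const std::string kMode = "large";
#else
const std::string kMode = "small";
#endif

// After macro expansion this reads: return ((16) * (16));
inline int buffer_area() { return SQUARE(BUFFER_SIZE); }
```

Because BUFFER_SIZE is 16, the "large" branch is kept and buffer_area() computes 16 * 16.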

Phase 5

1) All characters in character literals and string literals are converted from the source character set to the execution character set (which may be a multibyte character set such as UTF-8, as long as the 96 characters of the basic source character set listed in phase 1 have single-byte representations).
2) Escape sequences and universal character names in character literals and non-raw string literals are expanded and converted to the execution character set. If the character specified by a universal character name isn't a member of the execution character set, the result is implementation-defined, but is guaranteed not to be a null (wide) character.

Note: the conversion performed at this stage can be controlled by command line options in some implementations: gcc and clang use -finput-charset to specify the encoding of the source character set, -fexec-charset and -fwide-exec-charset to specify the encodings of the execution character set in the string and character literals that don't have an encoding prefix (since C++11).
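
The escape-sequence expansion in this phase can be illustrated as follows. The sketch assumes an ASCII-compatible ordinary execution character set, which is typical but not guaranteed; the function names are invented for the example.

```cpp
#include <cstddef>

// Assumption: the ordinary execution character set is ASCII-compatible.
// Hexadecimal, octal and simple escapes are expanded to single characters.
inline bool escapes_match_ascii() {
    return '\x41' == 'A' && '\101' == 'A' && '\n' == 10;
}

// In a u8 literal the universal character name U+00E9 is encoded as UTF-8,
// which takes two code units; sizeof also counts the terminating null.
inline std::size_t utf8_e_acute_size() { return sizeof(u8"\u00e9"); }
```

Under the stated assumption, escapes_match_ascii() is true, and utf8_e_acute_size() is 3 (two UTF-8 code units plus the null terminator).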

Phase 6

Adjacent string literals are concatenated.
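
A small sketch of the concatenation, which also shows why it matters that escape sequences were already expanded in phase 5. The names kGreeting and concatenated_size are made up for this example.

```cpp
#include <cstddef>
#include <string>

// Two adjacent literals become one string literal in phase 6.
const std::string kGreeting = "Hello, " "world";

// The escape \x12 was already converted in phase 5, so the '3' in the
// second literal is not absorbed into the hexadecimal escape.
// The result holds 0x12, '3', and the terminating null.
inline std::size_t concatenated_size() { return sizeof("\x12" "3"); }
```

kGreeting holds "Hello, world", and concatenated_size() is 3 rather than the size of a single out-of-range escape.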

Phase 7

Compilation takes place: each preprocessing token is converted to a token. The tokens are syntactically and semantically analyzed and translated as a translation unit.

Phase 8

Each translation unit is examined to produce a list of required template instantiations, including the ones requested by explicit instantiations. The definitions of the templates are located, and the required instantiations are performed to produce instantiation units.
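
The two kinds of instantiation requests mentioned above might look like this in a translation unit. The template twice and the helper twice_one_and_half are hypothetical names used only for illustration.

```cpp
// A function template whose instantiations are produced on demand.
template <typename T>
T twice(T value) { return value + value; }

// Explicit instantiation definition: requests twice<int> directly.
template int twice<int>(int);

// Using twice with a double argument requires twice<double>,
// which is instantiated implicitly.
inline double twice_one_and_half() { return twice(1.5); }
```

Both twice<int> and twice<double> end up on the list of required instantiations for this translation unit, the first by explicit request and the second because it is used.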

Phase 9

Translation units, instantiation units, and library components needed to satisfy external references are collected into a program image which contains information needed for execution in its execution environment.

Notes

Some compilers don't implement instantiation units (also known as template repositories or template registries) and simply compile each template instantiation at Phase 7, storing the code in the object file where it is implicitly or explicitly requested, and then the linker collapses these compiled instantiations into one at Phase 9.

References

  • C++11 standard (ISO/IEC 14882:2011):
      • 2.2 Phases of translation [lex.phases]
  • C++98 standard (ISO/IEC 14882:1998):
      • 2.1 Phases of translation [lex.phases]

See also

C documentation for phases of translation