Creating and Enforcing a Code Formatting Standard with clang-format

I’ve worked on many programming teams that nominally have a “programming style guide.” This guide is often written down and stored somewhere that developers rarely look. In almost every case the style guide is ignored, code reviews devolve into style arguments, and a multitude of styles develop inside the source repository.

The MongoDB team has provided us with an excellent formatting principle:

A formatting process which is both manual and insufficient is doomed to be abandoned.

Instead of relying on programmers to follow a set of rules, we should automate the process to make it as simple and impersonal as possible. Luckily, the clang team has created a wonderful tool that we can leverage: clang-format.

By using clang-format, we can create a list of style rules, enable programmers to quickly reformat their code, and create formatting checks that run on our build servers to ensure compliance.

Table of Contents:

  1. Thinking About Style Guidelines
  2. Generating Your Config
  3. Running clang-format
  4. Disabling Formatting on a Piece of Code
  5. Integrating With Your Editor
  6. A Note on Versions
  7. Follow-on Articles
  8. Further Reading
  9. My Style Guide

Thinking About Style Guidelines

Before you begin this adventure, it’s important to have some style guidelines in mind for your team. You can use the handy programming style guide that your team has ignored, or you can review style guides and rules from around the web and decide which rules your team should adopt. A list of style guides that you can refer to is found in the next section.

Style is a contentious topic. Don’t get dragged into unnecessary arguments. Whether spaces or tabs are used is ultimately unimportant, and once a tool is impersonally updating everyone’s code it won’t matter at all. Appoint someone as the ultimate decision maker if there is no clear winner for a style guideline.

Also, be aware that some style rules cannot be checked by the tool. Consider whether such a rule is ultimately important and enforce it in other ways.

Generating Your Config

While you can build your clang-format configuration from scratch, it’s much easier to start with an existing style and make your modifications.

You can see the options enabled for each of the default styles by using this command:

clang-format --style=llvm -dump-config

You can override the style argument to match any of the predefined style sets, such as LLVM, Google, Chromium, Mozilla, WebKit, Microsoft, or GNU.

Once you’ve selected a style as your baseline, dump its configuration to a .clang-format file as a starting point:

clang-format --style=llvm -dump-config > .clang-format

The .clang-format file is where we will keep our custom style definition. When running clang-format in the future, we will specify -style=file so that clang-format knows to use our custom rules.

Now that you have a baseline, review the style options and tweak them for your project. You’ll want to check the formatting style rules against your code to make sure the output works as expected.
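As a lighter-weight alternative to a full dump, a .clang-format file can inherit from a predefined style through the BasedOnStyle option and list only the overrides. The sketch below writes such a file from a shell function. The function name write_baseline_config and the choice of overrides are illustrative assumptions; the option values are borrowed from the style guide at the end of this article.

```shell
# Hypothetical helper: write a minimal .clang-format into the given directory.
# The file inherits everything from the LLVM style and overrides a few options.
write_baseline_config() {
    target_dir="${1:-.}"
    cat > "$target_dir/.clang-format" <<'EOF'
---
Language: Cpp
BasedOnStyle: LLVM
# Number of columns to use for indentation
IndentWidth: 4
# Max width of a line when formatting
ColumnLimit: 100
# Align pointers to the left: int* ptr
PointerAlignment: Left
...
EOF
}
```

Calling write_baseline_config . creates the file in the current directory. A short file like this is easier to review than a full dump, at the cost of inheriting whatever the base style does in the clang-format version you happen to run.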

You can also use an online .clang-format builder for a more interactive experience. Many style options in the interactive builder use live examples to let you compare different settings. Note that some options that are available in the latest clang-format build may not be available in the online builder.

Running clang-format

Running clang-format is relatively simple. Let’s discuss some important options before we get into the details.

Style

The style argument is used to determine the style rules that clang-format will apply. You can use any of the styles described above, or -style=file to tell clang-format that it must use your .clang-format file.

In-place editing

By default, clang-format writes the reformatted source to standard output and leaves the file itself untouched. I prefer to have clang-format update the files directly. This behavior is enabled using the -i option.
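Because clang-format prints its result to standard output unless -i is given, the output can be piped into diff to preview changes before committing to an in-place edit. This is a minimal sketch; the preview_format function name and the CLANG_FORMAT variable are assumptions made for illustration, not part of clang-format.

```shell
# Command used to invoke clang-format; override it to point at a pinned binary.
CLANG_FORMAT="${CLANG_FORMAT:-clang-format}"

# Hypothetical helper: show what clang-format would change in one file
# without modifying it. Exit status 0 means the file is already formatted.
preview_format() {
    "$CLANG_FORMAT" -style=file "$1" | diff -u "$1" -
}
```

For example, preview_format src/main.cpp (a placeholder path) prints a unified diff if the file would be reformatted and prints nothing otherwise.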

Fallback Style

When getting started with clang-format, it’s easy to get into situations where clang-format cannot find your style file. In this situation, it will fall back to the LLVM style. You can choose the exact style to use as a fallback with the -fallback-style=<style> switch. Setting <style> to none tells clang-format to skip formatting if your file can’t be located, so a missing style file never silently rewrites your code in the LLVM style:

-fallback-style=none

Re-formatting files

At the time this article was written, we need to supply a list of files, as clang-format will not run recursively over your source tree. In the next post I will provide some sample wrapper scripts for clang-format.

Here’s a command to get you started, which will find all C and C++ files in the current directory tree. The patterns are quoted so that the shell passes them to find instead of expanding them itself:

find . -iname '*.h' -o -iname '*.c' -o -iname '*.cpp' -o -iname '*.hpp' \
    | xargs clang-format -style=file -i -fallback-style=none

Note the arguments used with clang-format: I have applied in-place editing, indicated that the style rules are in my .clang-format file, and specified that clang-format should leave the files untouched if the style file is not found.
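The same file search can drive a check-only pass, which is the form a build server would typically want. The sketch below assumes clang-format 10 or later, where the --dry-run and --Werror options are available; the check_format function name and the CLANG_FORMAT variable are hypothetical.

```shell
# Command used to invoke clang-format; override it to point at a pinned binary.
CLANG_FORMAT="${CLANG_FORMAT:-clang-format}"

# Hypothetical helper: report formatting violations under a directory without
# modifying any file. Returns non-zero if clang-format reports a violation.
check_format() {
    find "${1:-.}" -type f \
        \( -iname '*.h' -o -iname '*.c' -o -iname '*.cpp' -o -iname '*.hpp' \) \
        -exec "$CLANG_FORMAT" -style=file -fallback-style=none --dry-run --Werror {} +
}
```

Running check_format . from the repository root prints a diagnostic for each location that would be reformatted. Using -exec ... {} + instead of a pipe into xargs keeps file names containing spaces intact and avoids invoking clang-format when no files match.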

Disabling Formatting on a Piece of Code

There are certainly situations where we don’t want clang-format to override the existing formatting. Perhaps formatting rules cannot be created to allow the desired format, or a block is specially formatted for readability reasons.

You can use comments in your code to disable clang-format from modifying a section of code:

// clang-format off
void    unformatted_code  ;
// clang-format on

Block-style comments also work:

/* clang-format off */
void    unformatted_code  ;
/* clang-format on */

Note the space in between the comment start (//) and clang-format. This space is required for the comment to be successfully detected.

Integrating With Your Editor

There are clang-format integrations for vim, emacs, BBEdit, and Visual Studio described in the clang-format documentation. You can also find a Sublime Text Package on Package Control.

A Note on Versions

You’ll need to ensure that everyone uses the same version of clang-format, or eventually you will run into configuration mismatches and output differences. This can be enforced by your dependency system, by including a binary in your repository, or by using scripts to check versions before running clang-format. Ideally, your team will have a formatting process that runs on your continuous integration server. This version should be considered canonical by the team.
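A version check can be scripted by parsing the output of clang-format --version before any formatting runs. This is a rough sketch: the function names, the CLANG_FORMAT and REQUIRED_CLANG_FORMAT_VERSION variables, and the pinned value 12.0.1 are all assumptions for illustration. It also assumes the version output contains the word "version" followed by the version number, which may not hold for every distribution's build.

```shell
# Command used to invoke clang-format; override it to point at a pinned binary.
CLANG_FORMAT="${CLANG_FORMAT:-clang-format}"
# Hypothetical version pinned by the team; match it to your CI server.
REQUIRED_CLANG_FORMAT_VERSION="${REQUIRED_CLANG_FORMAT_VERSION:-12.0.1}"

# Extract the dotted version number from text such as
# "clang-format version 12.0.1" read on standard input.
parse_clang_format_version() {
    sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Return non-zero (with a message on stderr) if the installed version
# differs from the required one.
check_clang_format_version() {
    cf_actual=$("$CLANG_FORMAT" --version | parse_clang_format_version)
    if [ "$cf_actual" != "$REQUIRED_CLANG_FORMAT_VERSION" ]; then
        echo "error: expected clang-format $REQUIRED_CLANG_FORMAT_VERSION, found ${cf_actual:-none}" >&2
        return 1
    fi
}
```

A wrapper script could call check_clang_format_version first and stop on failure, so that a developer with a mismatched install gets an explicit error rather than a confusing formatting diff.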

Follow-on Articles

I shared wrapper scripts that I use with clang-format and our strategy for ensuring formatting compliance on our projects.

More information on automating quality enforcement processes for our software can be found in the Automated Software Quality Enforcement course.

Further Reading

Automated Software Quality Enforcement

The pressure on software teams is increasing – we’re expected to produce increasingly complex systems with smaller teams and on tight deadlines. We need to invest in processes that increase our team’s effectiveness so we can stay afloat. Our course teaches you how you can leverage tooling and automation to increase your team’s effectiveness – and to catch errors as early as possible.

Learn More on the Course Page

My Style Guide

Here are my current style settings. These options are available for clang-format version 12.0.1, with some commented-out values placed in preparation for clang-format 13. I have also commented each configuration option so that I recall exactly what they mean and what my options specify.

---
# Updated for clang-format 12.0.1, some commented values are there
# for when we update to clang-format 13
Language:        Cpp
Standard: Latest #Cpp20
# BasedOnStyle:  LLVM
# The extra indent or outdent of access modifiers (e.g., public)
AccessModifierOffset: -2
# Align parameters on the open bracket
# someLongFunction(argument1,
#                  argument2);
AlignAfterOpenBracket: Align
# Align array column and right justify the columns
#AlignArrayOfStructures: Right
# Do not align equals signs of consecutive assignments
AlignConsecutiveAssignments: None
# Do not align the value of consecutive macros
AlignConsecutiveMacros: None
# Do not align the colons of consecutive bitfields
AlignConsecutiveBitFields: None
# Do not align the variable names of consecutive declarations
AlignConsecutiveDeclarations: None
# Align escaped newlines in macros - as far left as possible
AlignEscapedNewlines: Left
# Horizontally align operands of binary and ternary expressions
# Keeping the operand on the right edge of the upper line
AlignOperands:   Align
# Do not align consecutive comments that follow a line of code
AlignTrailingComments: false
# If a function call or braced initializer list doesn’t fit on a line,
# allow putting all arguments onto the next line, even if BinPackArguments is false.
AllowAllArgumentsOnNextLine: true
# If a constructor definition with a member initializer list doesn’t fit on a
# single line, allow putting all member initializers onto the next line, if
# `ConstructorInitializerAllOnOneLineOrOnePerLine` is true. Note that this parameter
# has no effect if `ConstructorInitializerAllOnOneLineOrOnePerLine` is false.
AllowAllConstructorInitializersOnNextLine: true
# If the function declaration doesn’t fit on a line, allow putting all
# parameters of a function declaration onto the next line even if BinPackParameters is false.
AllowAllParametersOfDeclarationOnNextLine: true
# Short blocks (e.g., empty while loop, or a for loop that just continues) are
# never merged into a single line
AllowShortBlocksOnASingleLine: Never
# Short case labels are not contracted into a single line
AllowShortCaseLabelsOnASingleLine: false
# Short enums are not contracted into a single line
AllowShortEnumsOnASingleLine: false
# Short functions are not contracted into a single line
AllowShortFunctionsOnASingleLine: None
# Short If Statements are not contracted into a single line
AllowShortIfStatementsOnASingleLine: Never
# Short lambdas are not contracted into a single line
AllowShortLambdasOnASingleLine: None
# short loops are not contracted to a single line
AllowShortLoopsOnASingleLine: false
# Do not break after the return type
AlwaysBreakAfterReturnType: None
# do not always break before multiline string literals
AlwaysBreakBeforeMultilineStrings: false
# Always break after a template declaration
AlwaysBreakTemplateDeclarations: Yes
# A vector of strings that should be interpreted as attributes/qualifiers instead of identifiers.
# This can be useful for language extensions or static analyzer annotations
AttributeMacros: ['__capability', '__unused']
# Function call arguments do not always have to have their own line if they don't
# fit on one line
BinPackArguments: true
# Function parameters do not always have to have their own line if they don't
# fit on one line
BinPackParameters: true
# Add one space on each side of the :
BitFieldColonSpacing: Both
# Configure each individual brace in BraceWrapping.
BreakBeforeBraces: Custom
BraceWrapping:
  # Opening brace under case label
  AfterCaseLabel: true
  # Braces are under class declaration
  AfterClass:      true
  # Braces are under control statement
  AfterControlStatement: Always
  # Braces are under enum
  AfterEnum:       true
  # Braces are under function prototype
  AfterFunction:   true
  # Braces are under namespace
  AfterNamespace:  true
  # Braces are under struct keyword
  AfterStruct:     true
  # Braces are under union keyword
  AfterUnion:      true
  # Braces are under extern keyword
  AfterExternBlock: true
  # Braces are under catch keyword
  BeforeCatch:     true
  # else keyword is placed under if close brace
  BeforeElse:      true
  # Do not place a trailing while loop below the close brace
  BeforeWhile: false
  # Do not indent wrapped braces
  IndentBraces:    false
  # Empty function body braces are on multiple lines
  SplitEmptyFunction: true
  # Empty class/struct/union body braces are on multiple lines
  SplitEmptyRecord: true
  # empty namespace body braces are on multiple lines
  SplitEmptyNamespace: true
# For splitting long binary operations, break after the operator
BreakBeforeBinaryOperators: None
# Place concept declaration on a new line
BreakBeforeConceptDeclarations: true
# Break after the ternary operator - ?
BreakBeforeTernaryOperators: false
# Break constructor initializers after the colon and commas
BreakConstructorInitializers: AfterColon
# Break inheritance list after the colon and comma
BreakInheritanceList: AfterColon
# Allow breaking long string literals into multiple lines
BreakStringLiterals: true
# Max Width of a line when formatting
ColumnLimit:     100
# A regular expression that describes comments with special meaning,
# which should not be split into lines or otherwise changed.
CommentPragmas:  '^ IWYU pragma:'
# Each namespace declaration is placed on a new line
CompactNamespaces: false
# Do not require initializers to be on their own lines when breaking
ConstructorInitializerAllOnOneLineOrOnePerLine: false
# The number of characters to use for indentation of constructor initializer
# lists as well as inheritance lists.
ConstructorInitializerIndentWidth: 4
# Indent width for line continuations.
ContinuationIndentWidth: 4
# format braced lists as best suited for C++11 braced lists
Cpp11BracedListStyle: true
# Analyze the formatted file for the most used line ending (\r\n or \n).
# UseCRLF is only used as a fallback if none can be derived.
DeriveLineEnding: true
# Do not read the file to derive pointer alignment requirements. Uses PointerAlignment value.
DerivePointerAlignment: false
# Do not completely disable formatting
DisableFormat:   false
# Remove all empty lines after access modifiers
#EmptyLineAfterAccessModifier: Never
#  Add empty line only when access modifier starts a new logical block.
# Logical block is a group of one or more member fields or functions.
EmptyLineBeforeAccessModifier: LogicalBlock
# add missing namespace end comments for short namespaces and fixes invalid existing ones.
FixNamespaceComments: true
# A vector of macros that should be interpreted as foreach loops instead of as function calls.
ForEachMacros:   [ foreach, Q_FOREACH, BOOST_FOREACH ]
# Sort each #include block separately (blocks of includes are separated by empty lines)
IncludeBlocks:   Preserve
# Regular expressions denoting the different #include categories used for ordering #includes.
IncludeCategories:
  - Regex:           '^"(llvm|llvm-c|clang|clang-c)/'
    Priority:        2
    SortPriority:    0
    CaseSensitive:   false
  - Regex:           '^(<|"(gtest|gmock|isl|json|catch2|cmocka)/)'
    Priority:        3
    SortPriority:    0
    CaseSensitive:   false
  - Regex:           '.*'
    Priority:        1
    SortPriority:    0
    CaseSensitive:   false
# Specify a regular expression of suffixes that are allowed in the file-to-main-include mapping.
# use this regex of allowed suffixes to the header stem.
# A partial match is done, so that: - “” means “arbitrary suffix” - “$” means “no suffix”
IncludeIsMainRegex: '$'
# Specify a regular expression for files being formatted that are allowed to be considered
# “main” in the file-to-main-include mapping.
IncludeIsMainSourceRegex: ''
# access modifiers are indented (or outdented) relative to the record members,
# respecting the AccessModifierOffset
#IndentAccessModifiers: false
# Do not indent case blocks one level from case label
IndentCaseBlocks: false
# Do indent case labels within a switch block
IndentCaseLabels: true
# Use AfterExternBlock's indenting rule
IndentExternBlock: AfterExternBlock
# Goto labels are indented to proper level
IndentGotoLabels: true
# Indents preprocessor directives before the hash.
IndentPPDirectives: BeforeHash
# Indent requires clause in a template
IndentRequires: true
# Number of columns to use for indentation
IndentWidth:     4
# Indent if a function definition or declaration is wrapped after the type.
IndentWrappedFunctionNames: true
# Remove empty lines at the start of a block
KeepEmptyLinesAtTheStartOfBlocks: false
# Align lambda body relative to the start of the lambda signature
#LambdaBodyIndentation: Signature
# A regular expression matching macros that start a block.
MacroBlockBegin: ''
# A regular expression matching macros that end a block.
MacroBlockEnd:   ''
# Maximum number of consecutive empty lines to keep
MaxEmptyLinesToKeep: 1
# Don’t indent namespaces
NamespaceIndentation: None
# A vector of macros which are used to open namespace blocks
#NamespaceMacros: ''
PenaltyBreakAssignment: 2
PenaltyBreakBeforeFirstCallParameter: 19
PenaltyBreakComment: 300
PenaltyBreakFirstLessLess: 120
PenaltyBreakString: 1000
PenaltyBreakTemplateDeclaration: 10
PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 60
# align pointers: int* ptr
PointerAlignment: Left
# align references like pointers
#ReferenceAlignment: Pointer
# Clang-format will attempt to reflow long comments
ReflowComments:  true
# Always have an ending namespace comment
#ShortNamespaceLines: 0
# Do not sort includes (CaseInsensitive, for clang-format 13, sorts alphabetically and ignores case)
SortIncludes: false   #CaseInsensitive
# using declarations will be  alphabetically sorted
SortUsingDeclarations: true
# Do not insert a space after a C-style cast
SpaceAfterCStyleCast: false
# Do not insert a space after a logical not (!)
SpaceAfterLogicalNot: false
# Do not insert a space after the template keyword
SpaceAfterTemplateKeyword: false
# Don't ensure spaces around pointer qualifiers, use PointerAlignment instead
SpaceAroundPointerQualifiers: Default
# Place spaces before assignment operators (=, +=, etc.)
SpaceBeforeAssignmentOperators: true
# Do not place a space before a case statement colon
SpaceBeforeCaseColon: false
# Do not place a space before a C++11 braced list
SpaceBeforeCpp11BracedList: false
# Do place a space between the constructor and the initializer colon
SpaceBeforeCtorInitializerColon: true
# Place a space between the class and the inheritance colon
SpaceBeforeInheritanceColon: true
# Never place a space between an item and following parens
SpaceBeforeParens: Never
# do not place a space before a range based for loop
SpaceBeforeRangeBasedForLoopColon: false
# do not place a space before square brackets []
SpaceBeforeSquareBrackets: false
# do not place a space in an empty block
SpaceInEmptyBlock: false
# Do not place a space in empty parens
SpaceInEmptyParentheses: false
# Spaces between end of the code and the start of a // line comment
SpacesBeforeTrailingComments: 1
# Remove spaces within <> : <int>
SpacesInAngles:  false #Never
# Do not add spaces in C-style cast parens
SpacesInCStyleCastParentheses: false
# Do not add spaces around if/for/while/switch conditions
SpacesInConditionalStatement: false
# Do not insert spaces inside container literals
SpacesInContainerLiterals: false
# Do not insert spaces after ( and before )
SpacesInParentheses: false
# Do not insert spaces after [ and before ]
SpacesInSquareBrackets: false
# Macros which are ignored in front of a statement, as if they were an attribute.
# StatementAttributeLikeMacros:
# A vector of macros that should be interpreted as complete statements.
# StatementMacros: ''
# The number of columns used for tab stops.
TabWidth:        4
# A vector of macros that should be interpreted as type declarations instead of as function calls.
#TypenameMacros: ''
# use \n for line breaks
UseCRLF: false
# Use tabs whenever we need to fill whitespace that spans at least from one tab stop to the next one.
UseTab:          Always
# A vector of macros which are whitespace-sensitive and should not be touched.
WhitespaceSensitiveMacros:
  - STRINGIZE
  - PP_STRINGIZE
  - BOOST_PP_STRINGIZE
  - NS_SWIFT_NAME
  - CF_SWIFT_NAME
...

9 Replies to “Creating and Enforcing a Code Formatting Standard with clang-format”

  1. Using clang-format automatically when building the project (via make) saves a lot of time during review because the coding style is always the same. Thanks for another great post.

  2. Question – what is the default style if you do not provide anything when running clang-format on a C++ file?

    1. I believe I also answered you in email, but I’ll post here as well for others to see:

      Clang-format has a “-style=Google” style that is intended to match the Google C++ style guide, and you can enable the google-related checks in clang-tidy with “-checks=google-*”. Whether there is full coverage of the style guide, I am not sure – I’m not a user of that particular style. However, I think you will manage to get pretty close with that. Some conventions such as naming style may need to be manually enforced during code reviews.

  3. There is an error at line 174.

    Should be IncludeIsMainRegex: ‘$’

    The empty line should be deleted.
