Rat comes with a set of predefined license matchers that cover typical licenses. However, they will not always be sufficient. In such cases, you may configure a custom license matcher.
The simplest way to create a license check is to create an XML file describing the new license and add it to the processing with the additionalLicenseFiles option.
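As a rough sketch of this first approach, the plugin configuration might point at the extra file as follows. This assumes that additionalLicenseFiles accepts a list of file paths in the usual Maven list style. The child element name and the path src/rat/custom-licenses.xml are hypothetical, so check the plugin's parameter documentation before relying on them.

```xml
<plugin>
  <groupId>org.apache.rat</groupId>
  <artifactId>apache-rat-plugin</artifactId>
  <version>0.16.1</version>
  <configuration>
    <!-- Assumption: each child element names one extra license definition file -->
    <additionalLicenseFiles>
      <!-- Hypothetical path to the XML file describing the new license -->
      <additionalLicenseFile>src/rat/custom-licenses.xml</additionalLicenseFile>
    </additionalLicenseFiles>
  </configuration>
</plugin>
```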
The second option is to define the custom license directly in the POM. Unlike in earlier versions (before 0.16), no custom implementations are required to define custom licenses.
There is a file that defines all of the standard licenses: default.xml
Please be aware that custom licenses need to have unique names, otherwise a warning is issued and your custom definitions are ignored in favour of the standard license definitions.
For example, suppose your source files carry a license header like this:

/**
 * Yet Another Software License, 1.0
 *
 * Lots of text, specifying the users rights, and whatever ...
 */
A very easy way to search for such headers would be to scan for the string "Yet Another Software License, 1.0". And here's how you would do that in your POM:
<build>
  <plugins>
    ...
    <plugin>
      <groupId>org.apache.rat</groupId>
      <artifactId>apache-rat-plugin</artifactId>
      <version>0.16.1</version>
      <configuration>
        <licenses>
          <license>
            <family>YASL1</family>
            <notes></notes>
            <text>Yet Another Software License, 1.0</text>
          </license>
        </licenses>
        <families>
          <family>
            <id>YASL1</id>
            <name>Yet Another Software License</name>
          </family>
        </families>
      </configuration>
    </plugin>
    ...
  </plugins>
</build>
This is very similar to the XML format for defining the configuration.
By default, all licenses defined in the POM are considered approved. This is a change from versions before 0.16. If some families are defined in the POM but should not be considered approved, you must provide a list of approved license families.
In the following example, we define YASL1 and BAD1 and then mark BAD1 as unapproved by listing only YASL1 as approved.
<build>
  <plugins>
    ...
    <plugin>
      <groupId>org.apache.rat</groupId>
      <artifactId>apache-rat-plugin</artifactId>
      <version>0.16.1</version>
      <configuration>
        <licenses>
          <license>
            <family>YASL1</family>
            <notes></notes>
            <text>Yet Another Software License, 1.0</text>
          </license>
        </licenses>
        <families>
          <family>
            <id>YASL1</id>
            <name>Yet Another Software License</name>
          </family>
          <family>
            <id>BAD1</id>
            <name>A Bad Software License</name>
          </family>
        </families>
        <approvedLicenses>
          <id>YASL1</id>
        </approvedLicenses>
      </configuration>
    </plugin>
    ...
  </plugins>
</build>
When defining custom licenses, remember the following architecture constraints:
all - A collection of matchers in which all enclosed matchers have to be true for the matcher to report true.
<all>
  <text>This text is required</text>
  <text>as is this text, both have to trigger before 'all' will be true</text>
</all>
any - A collection of matchers that will report true if any enclosed matcher is true.
<any>
  <text>This text will trigger a match all by itself</text>
  <text>So will this text.</text>
</any>
copyright - A matcher that matches copyright text. It uses regular expressions, so it should only be used when looking for copyrights with specific patterns that are not caught by a standard text matcher. This matcher will match "(C)", "copyright", or "©" (the text is not case sensitive). It will also match things like Copyright (c) joe 1995 as well as Copyright (C) 1995 joe and Copyright (C) joe 1995. Copyright has 3 child elements, start, end, and owner:
<copyright>
  <!-- this will match (c) 1995-1996 joe, or (c) joe 1995-1996 -->
  <start>1995</start>
  <end>1996</end>
  <owner>joe</owner>
</copyright>
<copyright>
  <!-- this will match (c) 1995 joe, or (c) joe 1995 -->
  <start>1995</start>
  <owner>joe</owner>
</copyright>
<copyright>
  <!-- this will match (c) nnnn joe, or (c) joe nnnn, where nnnn is a 4 digit year -->
  <owner>joe</owner>
</copyright>
<copyright>
  <!-- this will match (c) nnnn, where nnnn is a 4 digit year -->
</copyright>
not - A matcher that wraps one matcher and negates its value. A not matcher requires that the entire header be read before it can report true or false. This may significantly slow processing.
<not>
  <text>This text must not be present</text>
</not>
regex - A matcher that matches a regex string.
<regex>[Hh]ello\s[Ww]orld</regex>
spdx - A matcher that matches SPDX tags. SPDX tags have the form SPDX-License-Identifier: short-name, where short-name matches the regex pattern "[A-Za-z0-9\.-]+".
spdx takes the short name as an argument.
<spdx>Apache-2.0</spdx>
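To show how a matcher such as spdx might sit inside a POM license definition, here is a minimal sketch that extends the YASL1 example. It assumes that the license element accepts the any matcher in place of a single text element, and the short name YASL-1.0 is a hypothetical identifier invented for illustration, not a registered SPDX name.

```xml
<licenses>
  <license>
    <family>YASL1</family>
    <notes></notes>
    <!-- Assumption: a collection matcher may be used where <text> was used before -->
    <any>
      <!-- matches headers that spell out the license name -->
      <text>Yet Another Software License, 1.0</text>
      <!-- matches 'SPDX-License-Identifier: YASL-1.0' (hypothetical short name) -->
      <spdx>YASL-1.0</spdx>
    </any>
  </license>
</licenses>
```

With this arrangement, either the full license name or the SPDX tag alone would be enough to assign a file to the YASL1 family.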
<all>
  <any>
    <!-- HINT: any of the enclosed matchers will cause a match -->
    <all>
      <!-- must have both 'This text is required' and a copyright statement -->
      <text>This text is required</text>
      <copyright />
    </all>
    <copyright>
      <!-- accept any file that has a copyright by joe -->
      <owner>joe</owner>
    </copyright>
    <!-- accept any file with "Hello World" -->
    <regex>[Hh]ello\s[Ww]orld</regex>
    <!-- accept any file with 'SPDX-License-Identifier: Apache-2.0' -->
    <spdx>Apache-2.0</spdx>
  </any>
  <!-- make sure the text 'This text must not be present' is not present -->
  <not>
    <text>This text must not be present</text>
  </not>
</all>
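A compound matcher like this only takes effect once it is attached to a license. The following sketch shows one way that might look in the POM, reusing the YASL1 family from the earlier examples. It assumes the license element accepts an all matcher as its single top-level matcher, and the owner name and the "Draft" text are hypothetical.

```xml
<licenses>
  <license>
    <family>YASL1</family>
    <notes></notes>
    <!-- Assumption: one top-level matcher per license, here an 'all' collection -->
    <all>
      <!-- the license name has to appear in the header -->
      <text>Yet Another Software License, 1.0</text>
      <!-- and a copyright statement naming joe (hypothetical owner) -->
      <copyright>
        <owner>joe</owner>
      </copyright>
      <!-- and the header must not contain this (hypothetical) marker text -->
      <not>
        <text>Draft, not for release</text>
      </not>
    </all>
  </license>
</licenses>
```

Because the not matcher has to read the entire header before it can report a result, a definition like this trades some processing speed for a stricter check.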