Good practices when defining the scope of a code scan

In this post, we have compiled a few good practices to keep in mind when scanning a code base with CAST Highlight, so that you get the most consistent software analytics possible for your use case (software health, open source detection for license compliance or vulnerability checks, etc.).

Highlight analyzes code at the file level and doesn’t take into account the logical links or dependencies between files, so every file in the scanned folder is treated equally and counted as part of the application. To get accurate and consistent results, especially from a Software Composition standpoint, take a few minutes to prepare your code scan scope using the file/folder exclusion features of the Local Agent.

Third parties

If you want to identify open source or COTS packages, make sure they’re included in the folders you’ll scan (external libraries are generally grouped into a sub-folder named “third-party” or something similar, while the main code is often located under “src/main”).

Test files

Test classes should be excluded unless you specifically want to scan them. Measuring the software resiliency of your test files, for instance, is rarely of much interest.

Generated code

Generated code (e.g. *.d.ts, *.flow.js) should be excluded as well, since it is produced automatically by the system and the development team can’t really manage the software health of this part of the code.

Environment-specific files

For more consistent results, SCM, build and deployment folders (e.g. .git, .svn) shouldn’t be part of the scope.
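
Before configuring exclusions in the Local Agent, it can help to preview which files the rules above would leave out. The following is a minimal sketch in Python, not part of CAST Highlight: the function name, the folder names in TEST_DIRS and the pattern lists are illustrative assumptions that you would adapt to your own repository layout.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional

# Illustrative assumptions, not Highlight settings: adapt to your repository.
SCM_DIRS = {".git", ".svn"}                  # environment-specific folders
TEST_DIRS = {"test", "tests"}                # hypothetical test folder names
GENERATED_PATTERNS = ["*.d.ts", "*.flow.js"]  # generated-code file patterns


def exclusion_reason(relative_path: str) -> Optional[str]:
    """Return why a file would be left out of the scan scope, or None to keep it."""
    path = PurePosixPath(relative_path)
    folders = set(path.parts[:-1])  # every folder on the way to the file
    if folders & SCM_DIRS:
        return "environment-specific"
    if folders & TEST_DIRS:
        return "test"
    if any(fnmatchcase(path.name, pattern) for pattern in GENERATED_PATTERNS):
        return "generated"
    return None


if __name__ == "__main__":
    for candidate in [".git/config", "src/test/java/FooTest.java",
                      "src/main/types.d.ts", "src/main/App.java"]:
        print(candidate, "->", exclusion_reason(candidate) or "keep")
```

Running this over a listing of your source folder gives a quick view of what would be dropped for each reason, which you can then translate into the Local Agent's file/folder exclusions.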

Dependency files

If you want to get insights like CVEs (security vulnerabilities) on frameworks and dependencies whose physical files are not part of the folder you’re scanning, make sure that the dependency files (e.g. pom.xml, build.gradle, package.json, .vcsproj) are included in that folder too.
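
A quick way to check this before a scan is to list the dependency files present under the folder you plan to scan. This is a minimal sketch in Python, again not part of CAST Highlight: the function name is hypothetical, and the set of file names only covers some of the examples mentioned above.

```python
from pathlib import Path
from typing import List

# File names taken from the examples in the text; extend for your build tools.
DEPENDENCY_FILES = {"pom.xml", "build.gradle", "package.json"}


def find_dependency_files(scan_root: str) -> List[str]:
    """Return the sorted relative paths of dependency files found under scan_root."""
    root = Path(scan_root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.name in DEPENDENCY_FILES
    )


if __name__ == "__main__":
    found = find_dependency_files(".")
    if found:
        print("Dependency files in scope:")
        for name in found:
            print("  " + name)
    else:
        print("No dependency files found under the scan folder.")
```

An empty result suggests that the scan folder contains no dependency files of the kinds listed, so you may want to widen the scope or copy them in. Note that the search is recursive, so files inside vendored packages are listed too.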

At the opposite extreme, if you scan your C:\ drive and all the folders and files it contains, Highlight will systematically scan every file that matches one of the 40+ technologies it supports and will try to consolidate the different insights (software health, cloud readiness, open source origin, security vulnerabilities…) from there. The few minutes you spend defining your application scope will be recovered later when consuming our analytics.