Good practices when defining the scope of a code scan
If you want to identify open source or COTS packages, make sure they’re included in the folders you scan (external libraries are generally grouped into a sub-folder named “third-party” or something similar, while the main code is often located under “src/main”). Also include third-party binaries (e.g. JARs, DLLs) if you want to retrieve the related license, vulnerability and obsolescence information in Highlight. For better Software Composition Analysis results, it is strongly recommended to scan the build output in addition to the source code.
Test classes should be excluded unless you specifically want to scan them; measuring the software resiliency of test files, for instance, is rarely of much interest. Test and sample files can also cause OSS components to be misidentified during Software Composition Analysis, since they’re not really part of the application you’re scanning.
Generated code (e.g. *.d.ts, *.flow.js) should be excluded as well: it is produced automatically by tooling, so the development team can’t really manage the software health of this part of the code.
For more consistent results, SCM, build and deployment folders (e.g. .git, .svn, gradle, .circleci, .azure, .vscode) and files (e.g. .yaml, .gitignore, .gitmodules, Makefile, .npmignore, .checkstyle, build.xml, gradlew; this list is not exhaustive) shouldn’t be part of the scope.
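The exclusion rules above can be sketched as a small path filter to run over a file list before a scan. This is only an illustration in Python, not part of Highlight: the function name in_scope, the test/sample folder names, and the choice to match on folder and file names alone are all assumptions to adapt to your own project layout. The generated-code patterns assume TypeScript declaration files (*.d.ts) and Flow files (*.flow.js).

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# SCM, build and deployment folders taken from the examples above (not exhaustive).
EXCLUDED_DIRS = {".git", ".svn", "gradle", ".circleci", ".azure", ".vscode"}

# Assumed names for test and sample folders; adjust to your project.
TEST_DIRS = {"test", "tests", "samples"}

# Generated code plus the SCM/build files listed above (not exhaustive).
EXCLUDED_FILE_PATTERNS = [
    "*.d.ts", "*.flow.js", "*.yaml",
    ".gitignore", ".gitmodules", "Makefile",
    ".npmignore", ".checkstyle", "build.xml", "gradlew",
]


def in_scope(path: str) -> bool:
    """Return True if a relative, '/'-separated path should stay in the scan scope."""
    p = PurePosixPath(path)
    # Reject the file if any parent folder is an excluded or test folder.
    for folder in p.parts[:-1]:
        if folder in EXCLUDED_DIRS or folder in TEST_DIRS:
            return False
    # Reject the file if its name matches an excluded pattern (case-sensitive).
    return not any(fnmatchcase(p.name, pat) for pat in EXCLUDED_FILE_PATTERNS)
```

For example, in_scope("src/main/java/App.java") keeps the file, while in_scope("src/test/java/AppTest.java") and in_scope("src/types/api.d.ts") drop it. Third-party folders and binaries are deliberately not excluded here, so that they remain available for Software Composition Analysis.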
If you want insights such as CVEs (security vulnerabilities) on frameworks and dependencies whose physical files are not in the folder you’re scanning, make sure that the dependency files (e.g. pom.xml, build.gradle, package.json, .vcsproj, etc.) are included too.
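A quick pre-scan check can confirm that dependency files are actually present in the scope. The sketch below is a hypothetical helper in Python, not a Highlight feature: the names manifests_in and list_files are illustrative, and the manifest set only covers the exact file names given as examples above, so extend it for other build systems.

```python
from pathlib import Path, PurePosixPath

# Exact manifest file names from the examples above; extend as needed.
MANIFEST_NAMES = {"pom.xml", "build.gradle", "package.json"}


def manifests_in(paths):
    """Return the sorted subset of relative paths whose file name is a known manifest."""
    return sorted(p for p in paths if PurePosixPath(p).name in MANIFEST_NAMES)


def list_files(root):
    """List every file under root as a '/'-separated path relative to root."""
    root = Path(root)
    return [f.relative_to(root).as_posix() for f in root.rglob("*") if f.is_file()]
```

Calling manifests_in(list_files("path/to/scan/folder")) returns the manifests found. An empty result suggests that dependency information may be missing from the scan.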