What is a Cluster?

A cluster is a directory on disk. A cluster contains modules. If you are writing a small NetBeans-based application, you probably do not need to be too concerned about clusters, although you may encounter the concept if you need to bundle additional files (native executables, for example) with a module. Clusters become important if you are writing an extensible application (or multiple applications) of your own, where you are sharing some common modules between multiple applications.

The NetBeans launcher is passed a list of cluster directories on startup (see $NB_HOME/etc/netbeans.clusters in the IDE; the names in this file are paths relative to the IDE install directory, but they could also be absolute paths on disk). The launcher looks in those cluster directories for the modules (JAR files) it should load. A NetBeans-based application typically consists of, at a minimum, the platform cluster and at least one application-specific cluster which contains modules that implement the business logic of that application.
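To make the relative-versus-absolute rule concrete, here is a minimal sketch of how a list of cluster entries could be resolved against an install directory. It is not the launcher's actual code. The class name ClusterListSketch, the method resolveClusters, the example paths, and the handling of blank lines and # comment lines are all assumptions made for illustration.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ClusterListSketch {

    /**
     * Resolves each cluster entry against the install directory.
     * Path.resolve() returns an absolute argument unchanged, so absolute
     * entries are kept as they are and relative ones land under installDir.
     */
    static List<Path> resolveClusters(Path installDir, List<String> lines) {
        List<Path> clusters = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            // Assumption: blank lines and '#' comment lines carry no cluster entry.
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            clusters.add(installDir.resolve(line));
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Hypothetical install directory and entries, for illustration only.
        Path installDir = Path.of("/opt/netbeans");
        List<String> lines = List.of("# clusters to load", "platform", "ide", "/opt/shared/mycluster");
        resolveClusters(installDir, lines).forEach(System.out::println);
    }
}
```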

Cluster directories are not necessarily all located under the same parent directory. They just happen to be in a typical NetBeans IDE install.

The NetBeans platform expects cluster directories to have a certain minimal structure:

  • There will be a modules/ subdirectory containing module JAR files

  • There will be a config/Modules/ subdirectory containing XML files that describe if/when each module should be enabled

  • There will be an update_tracking/ subdirectory which contains metadata that allows the module system to determine if another version of each module is newer than, older than, or the same as the one in this cluster, using dates and checksums
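As a quick way to check the layout described above, the following sketch reports which of the three expected subdirectories are missing from a directory. The class and method names are hypothetical; the NetBeans module system does its own checks, and this is only an illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ClusterLayoutSketch {

    // The three subdirectories named in the list above.
    static final List<String> EXPECTED = List.of("modules", "config/Modules", "update_tracking");

    /** Returns the expected subdirectories that do not exist under the given directory. */
    static List<String> missingDirs(Path cluster) {
        List<String> missing = new ArrayList<>();
        for (String dir : EXPECTED) {
            if (!Files.isDirectory(cluster.resolve(dir))) {
                missing.add(dir);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Checks the directory given on the command line, or the current directory.
        Path cluster = Path.of(args.length > 0 ? args[0] : ".");
        List<String> missing = missingDirs(cluster);
        System.out.println(missing.isEmpty()
                ? "Has the minimal cluster layout: " + cluster
                : "Missing from " + cluster + ": " + missing);
    }
}
```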

A cluster may contain additional files and folders as needed. For example, it is common for modules which bundle 3rd-party libraries to include those JAR files in modules/ext/. A cluster can contain whatever other files a module needs at runtime - for example, a module that installs a mobile phone emulator would probably include the native emulator executable.

To include additional files in your cluster, simply create a directory release/ underneath your module’s project directory (not the src/ directory for your module, but its parent folder - the one that is your module project). Anything under $PROJECT/release/ will be copied into your cluster by the build process. To find the file at runtime, use InstalledFileLocator, e.g.

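// InstalledFileLocator is in org.openide.modules; locate() returns null if no cluster contains the file
File emulatorBinary = InstalledFileLocator.getDefault().locate(
  "phone/bin/emulator.exe", "com.foo.my.module.code.name", false);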

Suites vs. Clusters

The result of compiling a module suite is typically a cluster. A cluster is something the runtime understands; a suite is a project you develop. For more information see the suite-versus-cluster FAQ.

Why Have Clusters?

Here’s the history of clusters:

  • Originally, NetBeans didn’t have "clusters": there was just $NB_HOME/modules/, a bunch of JAR files, and some XML files saying what was enabled and what was not. You looked up the installation directory using System.getProperty("netbeans.home").

  • Modules are libraries - like any other library or DLL or .so used by applications - and it is normal for multiple applications to use the same copy of some library

  • Sun had a number of NetBeans-based applications. So might anyone creating a NetBeans Platform-based application. The platform is the same for all of them; so are some other parts depending on what modules those applications use.

  • Some operating systems will not allow you to distribute native OS packages that will clutter up a user’s disk with extra copies of files the user already has. The guidelines for Solaris, Debian, Ubuntu and other operating systems all request or require that, if a library already exists on the target machine, you use that library in place rather than install your own copy of it. If we wanted Ubuntu and Debian users to be able to type apt-get install netbeans, we needed to solve this problem for the NetBeans IDE and other NetBeans-based applications.

  • The platform and other parts of NetBeans should therefore be shareable among multiple applications and usable by them at the same time.

  • Therefore, a NetBeans-based application should not assume that all of its parts ("clusters" of modules which interdepend) are underneath the same directory on disk — the platform might be in one directory, while the Java modules are someplace else entirely.

  • Before this, a typical way to find a file underneath a NetBeans install was new File(System.getProperty("netbeans.home")) to get the NB install directory; then you could try to find a file somewhere under that directory.

  • If there is not necessarily an "install directory" at all, then you need something like InstalledFileLocator, which knows about the cluster directories being used in the running application, and can look in all of them. That is much cleaner than you having to write the code to figure out where all of those directories are and look in each one.

In short, while it is typical for all of the parts of an application to be under a common parent directory, that is neither required nor guaranteed.
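The idea of looking in every cluster directory can be sketched in a few lines. This is a simplified illustration of the concept, not the implementation of InstalledFileLocator (which also handles module code names and localized variants). The class name, method name and cluster paths are hypothetical.

```java
import java.io.File;
import java.util.List;

final class ClusterFileSearchSketch {

    /**
     * Returns the first existing file with the given relative path in any of
     * the cluster directories, or null if no cluster contains it.
     */
    static File locate(List<File> clusterDirs, String relativePath) {
        for (File cluster : clusterDirs) {
            File candidate = new File(cluster, relativePath);
            if (candidate.exists()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical cluster locations that do not share a parent directory.
        List<File> clusters = List.of(
                new File("/usr/share/netbeans/platform"),
                new File("/opt/myapp/mycluster"));
        System.out.println(locate(clusters, "phone/bin/emulator.exe"));
    }
}
```

Code that starts from a single install directory has no way to find a file that lives in a cluster installed somewhere else, which is why the lookup has to iterate over the cluster list.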

What Does A Cluster Look Like?

Here is the structure of the (comparatively small - it contains only one module) ergonomics cluster in a NetBeans 6.9 development build.

  • ergonomics/ The cluster directory

    • .lastModified An empty file used as a timestamp so NetBeans can cache information about the cluster for performance, but know if its cache is out-of-date

    • config/ Contains metadata about module state

      • Modules/ Contains files which tell NetBeans some things about the module, mostly relating to if/when it should be enabled

        • org-netbeans-modules-ide-ergonomics.xml Metadata about the Ergonomics module, whose code-name is org.netbeans.modules.ide.ergonomics

    • modules/ Directory that contains the module JAR files (one per module) and any 3rd-party libraries they include

      • org-netbeans-modules-ide-ergonomics.jar This is the actual JAR file of the Ergonomics module’s classes

    • update_tracking/ Contains metadata about the module which is needed by Tools > Plugins

      • org-netbeans-modules-ide-ergonomics.xml Contains installation date, version and CRC checksums of module JAR and enablement data

In a larger cluster, all of the child directories described above would contain one file for each module (i.e. module JAR file, etc.).

Metadata

The metadata in $CLUSTER/config/Modules/$MODULE.xml is fairly simple and straightforward - it enables the NetBeans module-system to determine when a module should be loaded:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE module PUBLIC "-//NetBeans//DTD Module Status 1.0//EN"
                        "http://www.netbeans.org/dtds/module-status-1_0.dtd">
<module name="org.netbeans.modules.ide.ergonomics">
    <param name="autoload">false</param>
    <param name="eager">false</param>
    <param name="enabled">true</param>
    <param name="jar">modules/org-netbeans-modules-ide-ergonomics.jar</param>
    <param name="reloadable">false</param>
</module>
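If you want to inspect such a file yourself, a sketch along the following lines reads the param elements into a map using the standard JDK XML parser. The class and method names are hypothetical, and this is not how the module system itself reads the file. The entity resolver is there only so that the DTD named in the DOCTYPE is not fetched over the network.

```java
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ModuleStatusSketch {

    /** Reads every <param name="..."> element into a name-to-value map. */
    static Map<String, String> readParams(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // Resolve the external DTD to an empty document so parsing needs no network access.
        builder.setEntityResolver((publicId, systemId) -> new InputSource(new StringReader("")));
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        Map<String, String> params = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("param");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element param = (Element) nodes.item(i);
            params.put(param.getAttribute("name"), param.getTextContent().trim());
        }
        return params;
    }

    public static void main(String[] args) throws Exception {
        // Usage: pass the path of a config/Modules/*.xml file as the first argument.
        Map<String, String> params = readParams(Files.readString(Path.of(args[0])));
        params.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```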

Similarly, the metadata in $CLUSTER/update_tracking/$MODULE.xml contains data about the module generated when it is installed:

<?xml version="1.0" encoding="UTF-8"?>
<module codename="org.netbeans.modules.ide.ergonomics">
    <module_version install_time="1266357743218" last="true"
                    origin="installer" specification_version="1.7">
        <file crc="3871934416"
              name="config/Modules/org-netbeans-modules-ide-ergonomics.xml"/>
        <file crc="1925067367"
              name="modules/org-netbeans-modules-ide-ergonomics.jar"/>
    </module_version>
</module>

This data allows the Tools > Plugins updater to determine whether the version of the module on an update server is newer than the copy the user has installed, so that it can decide whether to offer an update. More importantly, since this is done with checksums, the check can be made without sending data about what is on the user’s machine to a remote server, so the user’s privacy is maintained.
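A local checksum comparison of this kind can be sketched as follows. The sketch assumes that the crc attribute holds a standard CRC-32 value written as an unsigned decimal number, as computed by java.util.zip.CRC32. The values shown above exceed the signed 32-bit range, which is consistent with that assumption, but the source does not state the algorithm. The class and method names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;

final class CrcCheckSketch {

    /** Computes the CRC-32 checksum of the given bytes as an unsigned value. */
    static long crcOf(byte[] content) {
        CRC32 crc = new CRC32();
        crc.update(content);
        return crc.getValue();
    }

    /** Compares the computed checksum with a decimal value as recorded in a crc attribute. */
    static boolean matches(byte[] content, String recordedCrc) {
        // Assumption: the recorded value is an unsigned decimal CRC-32.
        return crcOf(content) == Long.parseLong(recordedCrc);
    }

    public static void main(String[] args) throws Exception {
        // Usage: <path to file> <crc value copied from update_tracking>
        byte[] content = Files.readAllBytes(Path.of(args[0]));
        System.out.println(matches(content, args[1]) ? "checksum matches" : "checksum differs");
    }
}
```

Because both values are available locally, the comparison needs nothing from the user's machine to leave it.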

Clusters and Compatibility

A cluster is a compatibility unit and has a version. It is a set of modules that is developed by the same group of people and built and released at one time.

Most of the reasoning that led to the creation of the concept can be found in: Installation Structure