Minimize API visibility


When building libraries, we always need to differentiate between API designed for end users and API designed for our own internal use. We call API public when we intend for it to be used by end users, and internal when it is intended only for our own use. Java, like most object-oriented languages, supports this distinction mostly through the visibility modifiers public, protected, private, and package-private (denoted in Java by not specifying any visibility modifier). This works well in many cases, but it falls down when an API spans multiple packages, because Java's visibility modifiers have no ‘library-internal’ or ‘module-internal’ concept like those found in some other languages. Despite this, our goal as Java API designers is always to keep as much of our internal API as possible from leaking to the end user of our library, where leaking takes one of the two forms discussed below: leaking methods, where a public type carries API that is not intended for end users, and leaking types, where public API returns or accepts an internal type.

One of the core API characteristics I presented earlier on this website was the concept of evolvability. I discuss this further in the design for extensibility best practice, but in this best practice we will dig into how we can limit our API surface area by being more conscientious towards our delineation of public and internal APIs whenever possible. In the course of my API design experience, I have achieved this largely by putting internal types into a separate package that is clearly documented as internal API. When I was working on the JDK itself, packages starting with com.sun.* were typically accepted as internal API. More recently in my role at Microsoft working on the Azure SDKs for Java, I introduced the convention of having an implementation package (or sub-package) for much the same thing. Tooling was configured such that these internal API packages were excluded from JavaDoc, not validated against breaking API change rules, and specific tooling was written that ensured types inside the implementation package did not leak out through our public APIs.

The visibility modifiers in Java are unfortunately very coarse-grained for what we need as API designers, and they fail to properly support all circumstances. This best practice attempts to provide more clarity for those circumstances, to provide a framework for determining how best to present an API that simultaneously provides all required APIs to the end-user whilst not providing any API that is intended for internal use only.

Dealing with Leaking Methods

A method leaks when an intentionally-public type has API on it that is not intended for use by end-users. This is commonly the case when a method (or constructor or field, but for brevity henceforth I will just say ‘method’ to mean all of the above) is made public or protected because it must be accessed by some internal code, commonly to set some property or configuration, before the object is handed to the user. Unfortunately, because this is a publicly-accessible API, the recipient of the type is equally free to call it and perhaps put the object into an inconsistent or illegal state. For example:

public class User {

    public String getName() { ... }
    public void setName(String name) { ... }

    public int getScore() { ... }

    // this should not be visible to the user but due to design choices in our
    // API, it is, and the results could be catastrophic if the user calls it.
    public void resetScore() { ... }
}

There are a variety of ways that this problem can be tackled. Which one should be chosen depends a lot on the specific context of the API, so firstly we will enumerate the options and how they are implemented, and subsequently, we will dive into how we might land on the best choice of these options for a particular scenario.

Accepting Leaking Methods

The first option is to accept that we might just be best served to have leaking methods in our public API. The solutions outlined later in this document might be worse than the disease, as it were. Taking this approach requires little effort, but consideration should be given to how to lessen the probability and impact of the user calling these methods:
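As a minimal sketch of what this might look like, assume a hypothetical `Player` type. The leaking method carries an `impl_` prefix so that it stands out in code completion, its Javadoc says plainly that it is not supported API, and it validates its argument so that a stray call cannot leave the object in an illegal state. The type, the method names, and the prefix convention are illustrative assumptions, not an established standard.

```java
// Hypothetical public model type. In a real library it would be declared
// public in its own file within an exported package.
final class Player {
    private final String name;
    private int score;

    Player(String name) {
        this.name = name;
    }

    public String getName() { return name; }
    public int getScore() { return score; }

    /**
     * Internal API: called by the library before a Player is handed to the user.
     * Not intended for end users and may change or disappear without notice.
     */
    public void impl_setScore(int newScore) {
        // Lessen the impact of an accidental call by rejecting illegal values.
        if (newScore < 0) {
            throw new IllegalArgumentException("score must not be negative");
        }
        this.score = newScore;
    }
}
```

The naming and documentation lower the probability of an accidental call, and the argument check limits the damage if one happens anyway.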

Introduce an Interface

If leaking methods are not acceptable, another choice is to introduce an interface that represents the type, with only the public API visible on it. This works well for returned model types, where the backing implementation of the interface is instantiated by the library and provided to the user, who then interacts with the interface only. It does not work well in situations where the user is required to provide an instance of a model type (because they are unable to instantiate an interface directly), nor does it work well in cases where the model type is round-tripping with a service and is therefore instantiated by both the user and the service.

There are patterns that enable instantiating an interface without leaking the implementation. Most notably, the factory pattern allows the user to be unaware of the implementing class of a given interface, and only be presented with the interface API. In Java, we could collapse the factory pattern down into the interface itself, using a static factory method:

public interface ModelType {
    void setFoo(Foo foo);
    Foo getFoo();

    // static API allows the user to create an
    // instance of the internal type
    static ModelType newInstance() {
        return new ModelTypeImpl();
    }
}

// internal type pushed into an implementation package
public class ModelTypeImpl implements ModelType {
    public void setFoo(Foo foo) { ... }
    public Foo getFoo() { ... }

    // dangerous operation we don't want as part of our public API
    public void impl_formatServer() { ... }
}

By doing this, the user would never need to know of the existence of the ModelTypeImpl class or the dangerous functionality that it encapsulates. All public API would return or accept as a parameter the ModelType type, and users would proceed to instantiate an unknown implementation of ModelType by calling ModelType.newInstance().

In this approach, the internal type would not form part of the public API surface area and would be banished to an appropriate place within the implementation package. This would mean that the interface becomes the only public API and would take the place of the internal type in the package hierarchy.
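To make the caller's view concrete, here is a small self-contained sketch of the same idea. `Settings`, `SettingsImpl`, and `SettingsDemo` are hypothetical names, and the types are left package-private only so the sketch fits in a single file; in a real library the interface would be public and the implementation would sit in the implementation package.

```java
// Public interface: the only type the user ever sees.
interface Settings {
    void setRegion(String region);
    String getRegion();

    // Static factory hides the implementing class from the caller.
    static Settings newInstance() {
        return new SettingsImpl();
    }
}

// Hypothetical internal type; it would live in the implementation package.
final class SettingsImpl implements Settings {
    private String region = "default";

    @Override
    public void setRegion(String region) { this.region = region; }

    @Override
    public String getRegion() { return region; }

    // Internal-only operation that is not declared on the Settings interface.
    public void impl_reset() { this.region = "default"; }
}

final class SettingsDemo {
    static String configure() {
        Settings settings = Settings.newInstance();
        settings.setRegion("west");
        // settings.impl_reset(); // does not compile: not part of Settings
        return settings.getRegion();
    }
}
```

Note that a determined user could still cast the returned object to the implementation class if that class is reachable from their code, so the interface narrows what is visible rather than strictly enforcing it.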

There are challenges to consider with this approach:

Introduce Helper Types

Another option to consider is introducing helper types. I’ve personally not seen this pattern in the wild all that often, but I encountered it a lot when I was working on the JDK at Sun Microsystems / Oracle. Wrapping your head around this pattern, which I will dub the ‘Helper Pattern’ henceforth, is not the simplest of tasks, as there is quite a considerable amount of overhead in achieving the final result. I will firstly outline the steps required to implement this pattern, and then discuss the pros and cons.

The pattern starts by introducing a FooHelper type within your internal API space (i.e. within your implementation package). This class will take a form similar to the following:

// The helper class is named using the pattern <public-type-Name>Helper,
// so in this case we know this is a helper class for a `Foo` type in our public API
// surface
public final class FooHelper {
    // The FooAccessor interface is implemented within the Foo type, so that it can
    // access the non-public API within the Foo type.
    private static FooAccessor fooAccessor;

    // We don't want people to create instances of this class directly
    private FooHelper() { }

    public interface FooAccessor {
        // All APIs on FooAccessor map to API on the Foo type
        // that we want to access but we don't want to have as public API.
        // Each method takes the Foo instance to operate on as its first argument.
        void doNonPublicTask(Foo foo, Bar bar);
        Baz computeExpensiveValue(Foo foo);
    }

    // setAccessor is called by the public type, in this case the `Foo` class will
    // set this
    public static void setAccessor(final FooAccessor newAccessor) {
        if (fooAccessor != null) {
            throw new IllegalStateException();
        }

        fooAccessor = newAccessor;
    }

    public static FooAccessor getAccessor() {
        if (fooAccessor == null) {
            throw new IllegalStateException();
        }

        return fooAccessor;
    }

    // The following two methods are the API we want to access on Foo that we do
    // not want to have as public API. Note that it is static API, and that it takes
    // as its first argument the Foo on which we want to perform the actual
    // operation. There should be a 1:1 mapping between these methods and the API on
    // FooAccessor.
    public static void doNonPublicTask(Foo foo, Bar bar) {
        getAccessor().doNonPublicTask(foo, bar);
    }

    public static Baz computeExpensiveValue(Foo foo) {
        return getAccessor().computeExpensiveValue(foo);
    }
}

There is a lot of code here, and it still might not be overly clear how this helps anything! The next step is to introduce some additional code into our public Foo type, as shown below:

public class Foo {

    static {
        // In the static code block, we are configuring `FooHelper` as to how it can
        // access the private or package-private API inside `Foo`. By doing it here
        // the non-public API can be exposed to `FooHelper`, and from there it can
        // be made available to other types elsewhere.
        FooHelper.setAccessor(new FooHelper.FooAccessor() {
            @Override
            public void doNonPublicTask(Foo foo, Bar bar) {
                foo.doNonPublicTask(bar);
            }

            @Override
            public Baz computeExpensiveValue(Foo foo) {
                return foo.computeExpensiveValue();
            }
        });
    }

    // the actual API can be made private or package-private,
    // and no longer needs to be public API
    void doNonPublicTask(Bar bar) {
        // impl goes here
    }

    Baz computeExpensiveValue() {
        // impl goes here
    }
}

After all of this code, we have now reached a point where we can use it! Anywhere in our code where we have access to the implementation package where the FooHelper type resides, we can now call FooHelper.doNonPublicTask(foo, bar) or FooHelper.computeExpensiveValue(foo).
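For readers who want to see the whole round trip run, the following is a compact sketch of the same pattern with a single non-public operation. `Account` and `AccountHelper` are hypothetical names, and everything is package-private only so that it fits in one file; in a real library `Account` would be public and `AccountHelper`, with public members, would live in the implementation package.

```java
// Hypothetical public type; only getScore() is meant for end users.
final class Account {
    private int score = 42;

    static {
        // Hand the helper a way into Account's private API.
        AccountHelper.setAccessor(new AccountHelper.AccountAccessor() {
            @Override
            public void resetScore(Account account) {
                account.resetScore();
            }
        });
    }

    public int getScore() { return score; }

    // No longer public: reachable only through AccountHelper.
    private void resetScore() { score = 0; }
}

// Hypothetical helper type for Account.
final class AccountHelper {
    private static AccountAccessor accessor;

    private AccountHelper() { }

    interface AccountAccessor {
        void resetScore(Account account);
    }

    static void setAccessor(AccountAccessor newAccessor) {
        if (accessor != null) {
            throw new IllegalStateException("accessor already set");
        }
        accessor = newAccessor;
    }

    static void resetScore(Account account) {
        if (accessor == null) {
            throw new IllegalStateException("accessor not set");
        }
        accessor.resetScore(account);
    }
}
```

Internal code that holds an `Account` can call `AccountHelper.resetScore(account)`. Creating the `Account` instance has already run its static initializer, so the accessor is registered by the time the helper is used.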

The upside of this approach is that this API no longer needs to be public on the Foo type. The downside is all the extra machinery that is required to set up this pattern. Additionally, this pattern is not friendly across module boundaries, assuming that the location of the FooHelper type is not part of the public API scope and is therefore not exported to other modules. Concessions can be made in the module-info.java file to conditionally export the package containing the helper types to other friendly modules.
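In module-info.java terms, this kind of concession is a qualified export. The fragment below is only a sketch, and the module and package names in it are hypothetical:

```java
// module-info.java (module and package names are hypothetical)
module com.example.library {
    // Public API: visible to every module that requires this one.
    exports com.example.library;

    // Helper types: visible only to the named friendly module.
    exports com.example.library.implementation to com.example.library.extras;
}
```

Any module other than the one named after `to` is still unable to see the implementation package.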

Choosing the Right Pattern

As should be obvious from the choices presented above, there are several possible patterns we could use, which raises the question ‘which pattern is right for me?’ There is no ‘one size fits all’ answer to this, but we can at least identify some questions to ask ourselves that might guide us somewhat:

Having said all of this, the final and most critical point to make is: we are not required to choose one pattern for our entire client library! We can, and should, use all three of the approaches outlined above in a single client library as appropriate.

Dealing with Leaking Types

We leak a type when we introduce intentionally-public API that returns an internal type, or accepts an internal type as a parameter. In other words, we have a public API exposing something within our implementation package. The result of this is that users are now presented with API that may not meet our expectations in terms of API design, quality, documentation, and so on. Even worse, because we want to fully support Java modules, users may be unable to use this internal type, as it will not be exported from our module-info file. Finally, it is important to understand that leaking a single type may feel inconsequential, but it can often be the tip of an iceberg: this type leaks other types, which themselves leak more types, until the entire internal API area is unintentionally exposed to the user.

Fortunately, in the Azure SDK for Java build tooling we have checks that help to prevent this. Both APIView and our CheckStyle build step look for types that are within the implementation package being exposed through API on types that are not within the implementation package.
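Those tools are not reproduced here. As a rough illustration of the kind of check involved, the sketch below uses plain reflection instead: it walks the public methods of a type and reports any whose return or parameter types are considered internal. All names are hypothetical, and because the sketch lives in a single file it recognises internal types by an `Impl` suffix; in a real library the predicate would more likely test the package name, for example `c -> c.getPackageName().contains(".implementation")`.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical internal type and a public type that accidentally exposes it.
final class WidgetImpl { }

final class Catalog {
    public String describe(String id) { return "widget " + id; }
    public WidgetImpl find(String id) { return new WidgetImpl(); } // the leak
}

final class LeakCheck {
    // Returns the names of public methods whose return or parameter types
    // satisfy isInternal.
    static List<String> findLeaks(Class<?> publicType, Predicate<Class<?>> isInternal) {
        List<String> leaks = new ArrayList<>();
        for (Method method : publicType.getMethods()) {
            boolean leaking = isInternal.test(method.getReturnType());
            for (Class<?> parameterType : method.getParameterTypes()) {
                leaking |= isInternal.test(parameterType);
            }
            if (leaking) {
                leaks.add(method.getName());
            }
        }
        return leaks;
    }
}
```

Calling `LeakCheck.findLeaks(Catalog.class, c -> c.getSimpleName().endsWith("Impl"))` reports the `find` method. This sketch looks only at raw return and parameter types, so it would miss an internal type used as a generic type argument, in a field, or in a constructor.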

What should we do when we identify leaking types? The answer is obvious: stop the leak! How we go about this is very context-specific, and there are not many options available to us as API developers. Possible general solutions include:

If the leaking internal type is appearing within a method parameter type, one additional consideration is to remove the internal type argument from the public method parameter list, and instead take other parameters that can then be reconstituted within the implementation of the API to create the internal type (and therefore transparently to the user).
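A minimal sketch of that last idea follows, assuming a hypothetical `Uploader` type whose public method previously accepted an internal `RetryPolicyImpl`. All names are illustrative, and the types are package-private only so the sketch fits in one file.

```java
import java.time.Duration;

// Hypothetical internal type; it would live in the implementation package.
final class RetryPolicyImpl {
    final int maxRetries;
    final Duration delay;

    RetryPolicyImpl(int maxRetries, Duration delay) {
        this.maxRetries = maxRetries;
        this.delay = delay;
    }
}

// Hypothetical public type.
final class Uploader {
    private RetryPolicyImpl retryPolicy;

    // Before: public void setRetryPolicy(RetryPolicyImpl policy) leaked the
    // internal type. After: the method takes plain values and rebuilds the
    // internal type itself, out of the user's sight.
    public void setRetryPolicy(int maxRetries, Duration delay) {
        this.retryPolicy = new RetryPolicyImpl(maxRetries, delay);
    }

    // Public view that exposes only a summary, never the internal type.
    public String describeRetryPolicy() {
        return retryPolicy.maxRetries + " retries, " + retryPolicy.delay.toMillis() + " ms apart";
    }
}
```

The user now works only with an `int` and a `Duration`, and the internal type can change shape without affecting the public signature.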

Conclusion

In this best practice we have discussed the ways in which we can fulfill our commitment to being restrained in our API. API design would be much easier if we could just make everything public and accessible by everyone, but that would make for an absolutely terrible user experience. In this best practice we therefore looked for ways that enable us to be careful that we only expose to users what is intended for their use. Anything else that is available on our APIs that is not intended for their use exists to serve us as API designers, and that should be seen as a failing on our part. Whilst it is not always possible to completely eradicate these internal APIs from public scope, we should always treat them as a code smell that needs to clear an extraordinarily high bar to be allowed to remain in our public API scope.