SWI-Prolog -- A review of C++ features used by the API

1.6.8.1 A review of C++ features used by the API

Some slightly obscure features of C++ are used with PlBlob and ContextType, and can easily cause subtle bugs or memory leaks if not used carefully.

When a C++ object is created, its memory is allocated (either on the stack or on the heap using new), and the constructors are called in this order:

the base class's constructor (possibly specified in the initialization list)
the constructors for all the fields (possibly specified by an initial value and/or being in the initialization list)
the object's constructor.

When the object is deleted (either by stack pop or the delete operator), the destructors are called in the reverse order.
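
The ordering described above can be observed with a small sketch. The names Base, Field, Derived and construction_order() are hypothetical and are not part of the SWI-Prolog API; each constructor and destructor simply records that it ran.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical types, used only to record construction/destruction order.
static std::vector<std::string> order_log;

struct Field {
  Field()  { order_log.push_back("Field ctor"); }
  ~Field() { order_log.push_back("Field dtor"); }
};

struct Base {
  Base()  { order_log.push_back("Base ctor"); }
  ~Base() { order_log.push_back("Base dtor"); }
};

struct Derived : Base {
  Field field;
  Derived()  { order_log.push_back("Derived ctor"); }
  ~Derived() { order_log.push_back("Derived dtor"); }
};

// Creates one Derived on the stack, lets it go out of scope,
// and returns the order in which constructors and destructors ran.
std::vector<std::string> construction_order() {
  order_log.clear();
  { Derived d; }
  return order_log;
}
```

Calling construction_order() should yield the base class, field and object constructors in that order, followed by the three destructors in the reverse order.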

There are special forms of the constructor for copying, moving, and assigning. The “copy constructor” has the signature Type(const Type&) and is used when an object is created by copying, for example by initializing it from another object or by passing the object by value in a function call. The “move constructor” has the signature Type(Type&&) and is roughly equivalent to the copy constructor for the new object followed by the destructor for the old object. (Assignment is usually allowed to default but can also be specified.)

Currently, the copy and move constructors are not used, so it is best to explicitly mark them as not existing:

Type(const Type&) = delete;
Type(Type&&) = delete;
Type& operator =(const Type&) = delete;
Type& operator =(Type&&) = delete;
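
As a rough illustration of the effect of these declarations, the following sketch applies them to a hypothetical class Handle (not part of the API) and uses the standard type traits to confirm at compile time that copying and moving are rejected. Passing the object by reference still works, because no copy is made.

```cpp
#include <cassert>
#include <type_traits>

// "Handle" is a hypothetical class, used only for this illustration.
class Handle {
public:
  explicit Handle(int id) : id_(id) { }
  Handle(const Handle&) = delete;
  Handle(Handle&&) = delete;
  Handle& operator =(const Handle&) = delete;
  Handle& operator =(Handle&&) = delete;
  int id() const { return id_; }
private:
  int id_;
};

// These checks are evaluated by the compiler; an accidental copy or
// move elsewhere in the code would be a compile-time error.
static_assert(!std::is_copy_constructible<Handle>::value, "no copy constructor");
static_assert(!std::is_move_constructible<Handle>::value, "no move constructor");
static_assert(!std::is_copy_assignable<Handle>::value, "no copy assignment");
static_assert(!std::is_move_assignable<Handle>::value, "no move assignment");

// Passing by reference does not copy the object, so it is still allowed.
int read_id(const Handle& h) { return h.id(); }
```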

A constructor may throw an exception - good programming style is to not leave a “half constructed” object but to throw an exception instead. Destructors are not allowed to throw exceptions (because the destructor might be invoked while another exception is being handled, and C++ has no mechanism for dealing with a second exception), which complicates the API somewhat.
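
A minimal sketch of a throwing constructor follows. The classes Buffer and Connection are hypothetical. When the constructor body throws, the fields that were already constructed are destroyed, so a std::unique_ptr field frees its object even though the Connection itself never finished construction.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

static int live_buffers = 0;   // number of Buffer objects currently alive

struct Buffer {                // hypothetical resource
  Buffer()  { ++live_buffers; }
  ~Buffer() { --live_buffers; }
};

class Connection {             // hypothetical class with a throwing constructor
public:
  explicit Connection(bool ok) : buf(new Buffer) {
    if ( !ok )
      throw std::runtime_error("Failed to open connection");
  }
private:
  std::unique_ptr<Buffer> buf; // destroyed automatically if the body throws
};

// Returns true if constructing a Connection threw an exception.
bool construction_fails(bool ok) {
  try {
    Connection c(ok);
    return false;
  } catch (const std::runtime_error&) {
    return true;
  }
}
```

After construction_fails(false) returns, live_buffers is back to zero: no “half constructed” object and no leaked Buffer remain.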

More details about constructors and destructors can be found in the FAQs for constructors and destructors.

Many classes or types have a constructor that simply assigns a default value (e.g., 0 for int) and the destructor does nothing. In particular, the destructor for a pointer does nothing, which can lead to memory leaks. To avoid memory leaks, the smart pointer std::unique_ptr can be used, whose destructor deletes its managed object. (The name “unique” is to distinguish this from a “shared” pointer. A shared pointer can share ownership with multiple pointers and the pointed-to object is deleted only when all pointers to the object have been deleted. A unique pointer allows only a single pointer, so the pointed-to object is deleted when the unique pointer is deleted.) Note that std::unique_ptr does not enforce single ownership; it merely makes single ownership easy to manage and it detects most common mistakes, for example by not having a copy constructor or copy assignment operator.
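
The difference between the two kinds of smart pointer can be sketched as follows, using only the standard library. The function names are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Returns the number of owners of a shared object after one copy is made.
long shared_owner_count() {
  std::shared_ptr<int> a = std::make_shared<int>(1);
  std::shared_ptr<int> b = a;          // copying is allowed; a and b both own the int
  return a.use_count();
}

// Returns true if the source unique_ptr is empty after ownership is moved.
bool unique_source_is_empty_after_move() {
  std::unique_ptr<int> a(new int(1));
  // std::unique_ptr<int> b = a;       // would not compile: no copy constructor
  std::unique_ptr<int> b = std::move(a); // ownership moves from a to b
  return a == nullptr && b != nullptr && *b == 1;
}
```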

For example, in the following, the implicit destructor for p does nothing, so there will be a memory leak when an Ex1 object is deleted:

class Ex1 {
public:
  Ex1() : p(new int) { }
  int *p;
};

To avoid a memory leak, the code could be changed to this:

class Ex1 {
public:
  Ex1() : p(new int) { }
  ~Ex1() { delete p; }
  int *p;
};

but it is easier to do the following, where the destructor for std::unique_ptr will free the memory:

class Ex1 {
public:
  Ex1() : p(new int) { }
  std::unique_ptr<int> p;
};
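
The leak and its fix can be made visible by counting live objects. In this sketch, Counted, Leaky and Managed are hypothetical names: Leaky mirrors the first Ex1 (raw pointer, implicit destructor) and Managed mirrors the last one (std::unique_ptr field).

```cpp
#include <cassert>
#include <memory>

static int live_counted = 0;   // number of Counted objects currently alive

struct Counted {               // hypothetical helper that tracks its lifetime
  Counted()  { ++live_counted; }
  ~Counted() { --live_counted; }
};

class Leaky {                  // raw pointer field: nothing deletes *p
public:
  Leaky() : p(new Counted) { }
  Counted *p;
};

class Managed {                // unique_ptr field: its destructor deletes *p
public:
  Managed() : p(new Counted) { }
  std::unique_ptr<Counted> p;
};

// Returns how many Counted objects are alive after a Leaky is destroyed.
int alive_after_leaky() {
  live_counted = 0;
  Counted *leaked = nullptr;
  { Leaky x; leaked = x.p; }
  int alive = live_counted;    // the Counted object outlived its owner
  delete leaked;               // clean up so this sketch itself does not leak
  return alive;
}

// Returns how many Counted objects are alive after a Managed is destroyed.
int alive_after_managed() {
  live_counted = 0;
  { Managed x; }
  return live_counted;
}
```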

The same concept applies to objects that are created in code - if a C++ object is created using new, the programmer must manage when its destructor is called. In the following, if the call to data->validate() fails, there will be a memory leak:

MyData *foo(int some_value) {
  MyData *data = new MyData(...);
  data->some_field = some_value;
  if (! data->validate() )
    throw std::runtime_error("Failed to validate data");
  return data;
}

This could be fixed by adding delete data before throwing the runtime_error, but this doesn't handle the situation of data->validate() itself throwing an exception (which would require a catch and rethrow). Instead, it's easier to use std::unique_ptr, which takes care of every return or exception path:

MyData *foo(int some_value) {
  std::unique_ptr<MyData> data(new MyData(...));
  data->some_field = some_value;
  if (! data->validate() )
    throw std::runtime_error("Failed to validate data");
  return data.release(); // don't delete the new MyData
}
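
A self-contained variant of this function is sketched below so that the three paths can be exercised. The MyData type here is a hypothetical stand-in (the real one is not shown in this section): its validate() returns false for negative values and throws for zero, and it counts live instances.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

static int live_data = 0;      // number of MyData objects currently alive

struct MyData {                // hypothetical stand-in for the MyData above
  int some_field = 0;
  MyData()  { ++live_data; }
  ~MyData() { --live_data; }
  bool validate() const {
    if ( some_field == 0 )     // validate() itself may throw
      throw std::invalid_argument("some_field must not be zero");
    return some_field > 0;
  }
};

MyData *foo(int some_value) {
  std::unique_ptr<MyData> data(new MyData);
  data->some_field = some_value;
  if ( !data->validate() )
    throw std::runtime_error("Failed to validate data");
  return data.release();       // don't delete the new MyData
}

// Calls foo() and returns the number of live MyData objects right after
// the call has returned or thrown.
int live_after_foo(int some_value) {
  live_data = 0;
  try {
    std::unique_ptr<MyData> owned(foo(some_value));
    return live_data;          // the caller now owns one object
  } catch (const std::exception&) {
    return live_data;          // foo() cleaned up before the exception escaped
  }
}
```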

The destructor for std::unique_ptr will delete the data when it goes out of scope (in this case, by return or throw) unless the std::unique_ptr::release() method is called. (The call to std::unique_ptr<MyData>::release() doesn't call the destructor; the destructor can be called explicitly through the deleter returned by std::unique_ptr::get_deleter().)
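
The behaviour of release() and get_deleter() can be sketched as follows. Item, ReleaseResult and release_then_delete() are hypothetical names; only the std::unique_ptr calls are standard.

```cpp
#include <cassert>
#include <memory>

static int live_items = 0;     // number of Item objects currently alive

struct Item {                  // hypothetical helper that tracks its lifetime
  Item()  { ++live_items; }
  ~Item() { --live_items; }
};

struct ReleaseResult {
  bool ptr_empty;              // is the unique_ptr empty after release()?
  int alive_after_release;     // live Items right after release()
  int alive_after_deleter;     // live Items after invoking the deleter
};

ReleaseResult release_then_delete() {
  live_items = 0;
  std::unique_ptr<Item> ptr(new Item);
  Item *raw = ptr.release();   // gives up ownership; the Item is not destroyed
  ReleaseResult r;
  r.ptr_empty = (ptr == nullptr);
  r.alive_after_release = live_items;
  ptr.get_deleter()(raw);      // explicitly runs the deleter on the raw pointer
  r.alive_after_deleter = live_items;
  return r;
}
```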

In the code above, the throw will cause the unique_ptr’s destructor to be called, which will free the data; but the data will not be freed in the return statement because of the unique_ptr::release(). Using this style, a pointer to data on the heap can be managed as easily as data on the stack. The current C++ API for blobs takes advantage of this - in particular, there are two methods for unifying a blob:

PlTerm::unify_blob(const PlBlob* blob) - does no memory management
PlTerm::unify_blob(std::unique_ptr<PlBlob>* blob) - if unification fails or raises an error, the memory is automatically freed; otherwise the memory's ownership is transferred to Prolog, which may garbage collect the blob by calling the blob's destructor. Note that this uses a pointer to the pointer, so that PlTerm::unify_blob() can modify it.

unique_ptr allows specifying the delete function. For example, the following can be used to manage memory created with PL_malloc():

  std::unique_ptr<void, decltype(&PL_free)> ptr(PL_malloc(...), &PL_free);
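
The same pattern can be tried without the SWI-Prolog headers. In this sketch, std::malloc() stands in for PL_malloc() and the hypothetical counting_free() stands in for PL_free(); the counter only exists to show that the deleter runs.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>

static int free_calls = 0;     // how many times the custom deleter ran

// Hypothetical stand-in for a C API's free function such as PL_free().
void counting_free(void *p) {
  ++free_calls;
  std::free(p);
}

// Copies a C string into malloc'ed memory and returns the copy's length.
std::size_t copy_and_measure(const char *text) {
  std::size_t size = std::strlen(text) + 1;
  std::unique_ptr<char, decltype(&counting_free)>
    buf(static_cast<char *>(std::malloc(size)), &counting_free);
  if ( !buf )
    return 0;                  // allocation failed; the deleter is not called
  std::memcpy(buf.get(), text, size);
  return std::strlen(buf.get());
}                              // counting_free() runs here when buf is non-null
```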

or, when memory is allocated within a PL_*() function (in this case, using the Plx_*() wrapper for PL_get_nchars()):

  size_t len;
  char *str = nullptr;
  Plx_get_nchars(t, &len, &str, BUF_MALLOC|CVT_ALL|CVT_WRITEQ|CVT_VARIABLE|REP_UTF8|CVT_EXCEPTION);
  std::unique_ptr<char, decltype(&PL_free)> _str(str, &PL_free);

The current C++ API assumes that the C++ blob is allocated on the heap. If the programmer wishes to use the stack, they can use std::unique_ptr to automatically delete the object if an error is thrown - PlTerm::unify_blob(std::unique_ptr<PlBlob>*) prevents the automatic deletion if unification succeeds.

A unique_ptr needs a bit of care when it is passed as an argument. The unique_ptr::get() method can be used to get the “raw” pointer; delete must not be used on this pointer. Alternatively, the unique_ptr::release() method can be used to transfer ownership without calling the object's destructor.
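
The two ways of passing the managed object can be sketched as follows. Node, peek() and consume() are hypothetical: peek() only borrows the object, while consume() takes ownership and is responsible for deleting it.

```cpp
#include <cassert>
#include <memory>

struct Node {                  // hypothetical payload type
  int value;
  explicit Node(int v) : value(v) { }
};

// Borrows the object; it must not delete it.
int peek(const Node *n) { return n->value; }

// Takes ownership of the object and deletes it before returning.
int consume(Node *n) {
  std::unique_ptr<Node> owned(n);
  return owned->value;
}

// Returns true if get() left ownership in place and release() gave it away.
bool borrow_then_transfer() {
  std::unique_ptr<Node> node(new Node(42));
  int a = peek(node.get());             // node still owns the object
  bool still_owned = (node != nullptr);
  int b = consume(node.release());      // ownership passes to consume()
  return a == 42 && b == 42 && still_owned && node == nullptr;
}
```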

Using unique_ptr::release() is a bit inconvenient, so instead the unique_ptr can be passed as a pointer (or a reference). This does not create a new scope, so the pointer must be assigned to a local variable. For example, the code for unify_blob() is something like:

bool PlTerm::unify_blob(std::unique_ptr<PlBlob>* b) const
{ std::unique_ptr<PlBlob> blob(std::move(*b));
  if ( !unify_blob(blob.get()) )
    return false;
  (void)blob.release();
  return true;
}
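
The ownership hand-off in this function can be imitated without Prolog. The sketch below is a mock, not the SWI-Prolog implementation: MockBlob, mock_unify_raw() and mock_unify_owned() are hypothetical names, and a boolean argument decides whether the mock “unification” succeeds.

```cpp
#include <cassert>
#include <memory>
#include <utility>

static int live_blobs = 0;     // number of MockBlob objects currently alive

struct MockBlob {              // hypothetical stand-in for a PlBlob subclass
  MockBlob()          { ++live_blobs; }
  virtual ~MockBlob() { --live_blobs; }
};

// What the mock "Prolog" side owns after a successful unification.
static const MockBlob *adopted = nullptr;

// Mock of the raw-pointer form: does no memory management.
bool mock_unify_raw(const MockBlob *blob, bool succeed) {
  if ( succeed )
    adopted = blob;
  return succeed;
}

// Mock of the unique_ptr form: frees the blob on failure and hands
// ownership over on success.
bool mock_unify_owned(std::unique_ptr<MockBlob> *b, bool succeed) {
  assert(b != nullptr);
  std::unique_ptr<MockBlob> blob(std::move(*b));  // *b is now empty
  if ( !mock_unify_raw(blob.get(), succeed) )
    return false;              // blob's destructor deletes the object
  (void)blob.release();        // success: the mock "Prolog" side owns it
  return true;
}
```

In both outcomes the caller's unique_ptr ends up empty; the object is deleted only when the mock unification fails.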

The declaration of blob uses the “move constructor” to set the value of a newly scoped variable (std::move(*b) is a cast, so unique_ptr’s move constructor is used). This leaves *b empty without deleting the object, just as b->release() would, so from this point on, *b has the value nullptr.

Alternatively, the local unique_ptr could be set by

std::unique_ptr<PlBlob> blob(b->release());

or by

std::unique_ptr<PlBlob> blob;
blob.swap(*b);
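
The three ways of taking ownership from *b have the same effect, which the following sketch checks with a plain int as the managed object. The function names are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Each function takes ownership from *b into a local unique_ptr, returns
// the stored value, and deletes the int when the local goes out of scope.

int take_by_move(std::unique_ptr<int> *b) {
  std::unique_ptr<int> blob(std::move(*b));
  return *blob;
}

int take_by_release(std::unique_ptr<int> *b) {
  std::unique_ptr<int> blob(b->release());
  return *blob;
}

int take_by_swap(std::unique_ptr<int> *b) {
  std::unique_ptr<int> blob;   // starts out empty
  blob.swap(*b);               // *b receives the empty pointer
  return *blob;
}
```

After any of the three calls, the caller's unique_ptr compares equal to nullptr.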

If the call to PlTerm::unify_blob() fails or throws an exception, the virtual destructor for blob is called. Otherwise, the call to blob.release() prevents the destructor from being called - Prolog now owns the blob object and can call its destructor when the garbage collector reclaims it.