IDs, history tables, timestamps, etc

ID formats

Object instances in the database have unique IDs. All IDs generated internally by the web service, e.g. voucher-ID, user-ID, transaction-ID, JR-ID, etc., are to be treated as opaque strings, not as integers.

The rules for ID syntax are:

  • opaque, meaningless string
  • between 1 and 50 octets long, variable length
  • one word, i.e. no spaces or special characters in the word, only letters and digits
  • case sensitive
  • may start with a digit or a letter
  • equality test possible, i.e. if two ID strings are identical as strings and refer to objects of the same class, then they refer to the same instance
  • guaranteed unique, i.e. if two object instances of the same class exist in the system, they are guaranteed to have distinct ID strings
  • unique in time, i.e. an object ID is never re-used
  • ordering is meaningless, i.e. "greater than" and "less than" comparisons carry no meaning

Note that this means the web service may internally use integers as IDs, and all these rules will still be fulfilled. But when represented in JSON, IDs will always be enclosed in double quotes, even if they "look like" integers.

Examples of IDs:

  • "a0ae48b16f90db8f3c542f44f8103701"
  • "5"
  • "750294852039"

Examples of strings which are not IDs:

  • "a0ae48b16f90db8f3c542f4/f8103701" (there's a special character in it)
  • 5 (a bare number without quotes is not a string)
  • "776,245,664" (commas are not permitted in IDs)
  • "62.6" (the period is not permitted)

History tables

Transaction tables which receive updates will also have history tables which "shadow" the main tables. This will ensure that, taken in an overall database context, all updates are non-destructive, because all before-images of records can be reconstructed by analysing the history tables.

History tables will contain:

  • previous images of updated records
  • deleted records

The following policies will be applied for maintenance of history information:

  • When a record is inserted in a main table, nothing is done to the history table. (This means that there is no history-related overhead of compute power or disk space for tables which are not modified much.)
  • When a record is updated in a main table, its previous image is first inserted into the history table and then the data is updated in the main table
  • When a record is physically deleted (i.e. using the SQL DELETE operation) from a main table, the to-be-deleted record is first copied into the history table
  • When a TRUNCATE operation is performed on a main table, all the rows currently in the table are first inserted into its history table

These are the specifications for implementing the history feature (a sketch in code follows the list):

  • All main tables which are supposed to get the benefit of history tracking will now have corresponding history tables to "shadow" them. If the main table is called X, its shadow table will be called X_hist.
  • The history maintenance feature will be implemented using database triggers.
  • The shadow table will have identical schema to the main table, with two additional fields:

    • hist_when: timestamp specifying when this record was inserted in the shadow table
    • hist_op: will contain "U", "D" or "T", indicating whether this history record was created because of an update, a deletion or a truncate
  • The server-side application code will connect to the database using a specific database username and password meant for business operation. (Let's say that this username is appcode.) This username will not have any rights on the shadow tables other than INSERT rights. A separate, second database user (let's say, apphist) will have SELECT rights on the shadow table, for reports. And only the administrator will be able to modify the contents of shadow tables.
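
As a sketch of how these pieces fit together, the snippet below creates the shadow table, the trigger and the grants for a hypothetical main table called voucher, run once by the administrator through a small Python (psycopg2) script. The table name, its column layout and the connection string are assumptions for illustration; only the X_hist naming, the hist_when/hist_op fields and the appcode/apphist roles come from the specifications above.

    # Sketch: history tracking for a hypothetical main table called "voucher".
    # Run once as the database administrator; psycopg2 and the DSN are assumptions.
    import psycopg2

    DDL = """
    -- Shadow table: same columns as the main table, plus the two history fields.
    CREATE TABLE voucher_hist (
        LIKE voucher,
        hist_when timestamptz NOT NULL DEFAULT now(),
        hist_op   char(1)     NOT NULL CHECK (hist_op IN ('U', 'D', 'T'))
    );

    CREATE OR REPLACE FUNCTION voucher_hist_fn() RETURNS trigger AS $fn$
    BEGIN
        IF TG_OP = 'UPDATE' THEN
            INSERT INTO voucher_hist SELECT OLD.*, now(), 'U';
            RETURN NEW;   -- let the update proceed
        ELSIF TG_OP = 'DELETE' THEN
            INSERT INTO voucher_hist SELECT OLD.*, now(), 'D';
            RETURN OLD;   -- let the delete proceed
        ELSIF TG_OP = 'TRUNCATE' THEN
            INSERT INTO voucher_hist SELECT v.*, now(), 'T' FROM voucher v;
        END IF;
        RETURN NULL;      -- return value is ignored for statement-level triggers
    END;
    $fn$ LANGUAGE plpgsql;

    -- Row-level trigger copies the before-image on UPDATE and DELETE.
    -- (EXECUTE FUNCTION needs PostgreSQL 11+; older versions use EXECUTE PROCEDURE.)
    CREATE TRIGGER voucher_hist_row
        BEFORE UPDATE OR DELETE ON voucher
        FOR EACH ROW EXECUTE FUNCTION voucher_hist_fn();

    -- TRUNCATE triggers are statement-level; BEFORE, while the rows still exist.
    CREATE TRIGGER voucher_hist_trunc
        BEFORE TRUNCATE ON voucher
        FOR EACH STATEMENT EXECUTE FUNCTION voucher_hist_fn();

    -- The application role may only insert; the reporting role may only read.
    GRANT INSERT ON voucher_hist TO appcode;
    GRANT SELECT ON voucher_hist TO apphist;
    """

    with psycopg2.connect("dbname=appdb") as conn:   # DSN is a placeholder
        with conn.cursor() as cur:
            cur.execute(DDL)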

There are ready-made examples on the Net of how to implement this sort of history table with a Postgres database. One well-known example is https://wiki.postgresql.org/wiki/Audit_trigger. Most of these examples actually use a single history table to store before-images from all primary tables. That approach works well too -- you don't need separate shadow tables for each primary table. In particular, a single combined history table is easy to implement in Postgres because of its powerful jsonb column type, which lets us store any arbitrary data structure in one column.

Timestamps

As per the JSON Schema specification for date-time, we will always use the format defined in RFC 3339 for timestamp values, e.g. 1990-12-31T23:59:50Z for a UTC timestamp or 2015-12-31T23:59:50+05:30 for a timestamp with a timezone offset. All timestamp information in web service parameters will be communicated in these formats. If an application is used by clients across various timezones, all timestamps sent to or received from web services must be in UTC, and must be stored on the servers in UTC. Any conversion of those timestamps to local timezones, for instance to make them easier for humans to read, must be done in the client software.
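
For example, in Python (the variable names here are ours, purely for illustration):

    from datetime import datetime, timezone

    # Server side: always generate and store timestamps in UTC, RFC 3339 format.
    now_utc = datetime.now(timezone.utc)
    wire_value = now_utc.isoformat(timespec="seconds").replace("+00:00", "Z")
    print(wire_value)                     # e.g. "2021-05-31T13:21:50Z"

    # Client side: parse the wire value, convert to local time only for display.
    parsed = datetime.fromisoformat(wire_value.replace("Z", "+00:00"))
    local_display = parsed.astimezone()   # the client machine's local timezone
    print(local_display.isoformat(timespec="seconds"))  # e.g. "2021-05-31T18:51:50+05:30"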

Calling web services

When we write code to call web services, we need to take care of a few separate levels of checks. This has nothing to do with the programming language used, or the place from which we're calling. We may be making web service calls from one server-side daemon to another, or from a mobile app to a server. In all cases, these checks need to be done.

  • 1: Call timeout. Your code needs to set an explicit timeout on each request, and needs to check in the response whether the call timed out.
  • 2: Call network error. This happens if the server crashed while it was processing your request, if the network connection from your mobile phone or browser was lost after the request was sent out, or because of a host of other network-related issues. This error needs to be tested for separately. For each programming language and framework, you need to find out how that framework reports network errors, and check for that error. With mobile phones and browsers, it's easy to test this by making the request and bringing down Wi-Fi just before or just after the request goes out.
  • 3: Return status. All web service calls return some sort of status. In our Merce standard designs, we return a status attribute in the JSON data structure from the server, and this status may be either ok or error. The caller will get this only if there was no timeout or network error, and if the server was functioning correctly and gave a meaningful response with a status attribute. Therefore, this error needs to be checked after the first two.
  • 4: Data check. If the previous three checks are all ok and you have a valid response data structure from the server, you need to check whether the data in that response matches what you expect. The need for this is not obvious to beginners, but we often have a perfectly functioning server returning data which does not match what the calling code expects. This may happen due to (i) a bug in the server code, (ii) a transient error in the server system which the server-side code is not detecting correctly, like a database failure, or (iii) a difference between what you think the web service spec is and what it actually is. All three are common. Therefore, your code needs to check that the data has all the attributes you need, and that the values are not garbage (like a zero-length string) or in a different format (like a space-separated list of integers when you were expecting a single integer).

These are the four checks which your calling code needs to perform, in this sequence, every time it calls a web service; a sketch follows. This is the only way to build code which works reliably across network interfaces.
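
Here is a minimal sketch of all four checks in Python, using the requests library; the URL, the voucher_id attribute and the exact JSON shape are assumptions for illustration, not part of any spec:

    import requests

    URL = "https://example.com/api/voucher"    # placeholder endpoint

    def fetch_voucher(voucher_id: str) -> dict:
        # Checks 1 and 2: timeouts and network errors surface as exceptions here.
        try:
            resp = requests.get(URL, params={"id": voucher_id}, timeout=10)
        except requests.Timeout:
            raise RuntimeError("call timed out")
        except requests.ConnectionError as exc:
            raise RuntimeError(f"network error: {exc}")

        # Check 3: the server must return a JSON body carrying a status attribute.
        try:
            body = resp.json()
        except ValueError:
            raise RuntimeError("response body is not valid JSON")
        if body.get("status") != "ok":
            raise RuntimeError(f"server reported an error: {body}")

        # Check 4: the data must contain what we expect, in the form we expect.
        data = body.get("data")
        if not isinstance(data, dict) or not isinstance(data.get("voucher_id"), str) \
                or data["voucher_id"] == "":
            raise RuntimeError(f"unexpected response shape: {body}")
        return data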
