Immersive Analytics, Colour and Geoscience Visualisation
Over the last few months, I’ve been wrapping up my part-time second PhD and looking forward to new challenges. The research has focussed on interactive data visualisation and immersive analytics, principally developing a suite of software tools for the visualisation and analysis of geoscientific data. They’re pretty much ‘discipline agnostic’ and could be adapted for a wide variety of data sources. A few publications describing them in detail are on the way, so here I’ll give a brief, informal outline and update the details once they’re published.
Modelling and visualising planet-scale data is a complex business – a trade-off between massive datasets, available compute resources, resolution, sparsity and a bunch of other factors. There are lots of specialist tools for doing this, but not many that are intuitive or helpful for rapid, immersive visualisation and exploration of data to assist scientific inference. This is an area I’ve looked at in detail: developing tools for the inferential workflow, rather than pretty visualisations as a by-product.
A key element that struck me about visual inference (and this is where my first PhD, in semiotics, came in handy) is that it is quite poorly defined in some of the scientific literature, probably because it borders upon the philosophical: what does ‘visual inference’ mean, and how do we do it? More particularly, do we understand how it connects data to scientific evidence? Can visual inference credibly assist decision-making? These questions underpin aspects of visual analytics, where visualisation systems create affordances and cognitive tools for assisting inference, in a kind of dialogical interplay.
My take on this is that the tools have to evolve and become dialogical: exploration of the design space for HCI can work hand-in-hand with data exploration, acknowledging the co-dependent constraints of the human sensorium and machine representations. Once you’re aware of these, you can start asking interesting questions of data, with some confidence that the representations are reproducible and improvable, and that the inferences they assist treat visualisation as credible evidence. These can then be used to pose further scientific questions, which is the whole point.
A cardinal sin one comes across everywhere in science is the use of the rainbow colour map. Pretty, isn’t it? Pretty useless, as it transpires – and this has been known for a long time. Odd that it still crops up as the default in lots of software interfaces used in scientific visualisation – such is the power of convention. It’s a great example of a non-dialogical process, where convention obscures a whole range of visualisation problems. The picture above is a screenshot of some volumetric visualisation software I started developing in 2016, when I began looking in depth at the use of colour in visualisation. It’s a huge and complex subject – but only one aspect of a complex provenance chain of data wrangling, similar to this:
This diagram summarises some of the workflows I’ve been working on – where data is captured/generated/modelled, processed in a variety of ways, and finally sent to some kind of visualisation system (here an immersive fulldome display, but it could be MR/AR/VR systems or even just a desktop or mobile device). All along this chain there are quality control and data transformation issues that need to be well understood. Some are not terribly well defined, or are simply left to convention – such as the colourisation process at the end. Fortunately there is a growing awareness around this that is becoming more cross-disciplinary, which is a good thing (for instance, where lessons from cinematography post-production inform scientific viz, or where computer science, HCI, game programming and geosciences interact).
A common default in visualisation is the use of the RGB colour space – it’s linear, simple and comprehensible – and easy to represent. When we apply colour gradients or colour maps to data, it is easy to grasp that they can be described as a path through the colour space.
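To make the ‘path’ idea concrete, here’s a minimal Python sketch (my own illustration, not code from the tools described below) of a two-colour gradient as a straight line through the RGB cube:

```python
import numpy as np

def lerp_rgb(c0, c1, t):
    """Linearly interpolate between two RGB colours; t in [0, 1].
    Sweeping t traces a straight-line path through the RGB cube."""
    c0, c1 = np.asarray(c0, float), np.asarray(c1, float)
    return (1.0 - t) * c0 + t * c1

# A five-step gradient from blue to yellow: a straight line through the cube.
for t in np.linspace(0.0, 1.0, 5):
    print(f"{t:.2f} ->", lerp_rgb([0.0, 0.0, 1.0], [1.0, 1.0, 0.0], t))
```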
You might think this means there is a simple, linear relationship between the data in a dataset, how it is mapped into RGB colourspace, and how it displays on your screen. (That’s the lazy assumption behind the rainbow colour map – it’s simply a tour around the perimeter of the RGB cube or its HSL/HSV variants.) But you’d be wrong. It’s much more complicated.
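To make that parenthetical concrete: the rainbow map really is just a hue sweep at full saturation and value. A little sketch using Python’s standard colorsys module (again my illustration, not the apps’ code):

```python
import colorsys

def rainbow(t):
    """The classic rainbow map: sweep hue at full saturation and value,
    i.e. a tour around the perimeter of the RGB cube (blue -> red here).
    Maximal chroma everywhere, but no control of perceived lightness."""
    return colorsys.hsv_to_rgb((1.0 - t) * 2.0 / 3.0, 1.0, 1.0)

for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print(t, tuple(round(c, 2) for c in rainbow(t)))
```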
I wrote a few apps that help visualise colourspaces and paths through them. Here are a few screenshots of early versions:
Whilst these are relatively intuitive, the main problem with them is that they are discontinuous and/or not perceptually uniform. That is, humans do not perceive colour and lightness differences in the way these colourspaces represent them. Human colour perception is very non-linear, so linear colourspaces don’t match the way we see things very well. They’re easy to compute, but they run serious risks of misrepresenting colourised data and thus invalidating inferences based upon them. Using HSL/HSV is no different to using RGB, as it is simply a reprojection into a hexcone or cylindrical coordinate system:
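The non-uniformity is easy to demonstrate numerically. In this sketch (my illustration, using the standard sRGB linearisation and Rec. 709 luminance weights), equally spaced hues in HSV – ‘equal steps’ in that colourspace – turn out to have wildly unequal relative luminance:

```python
import colorsys

def rel_luminance(r, g, b):
    """Relative luminance of an sRGB colour (linearise, then Rec. 709 weights)."""
    def lin(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    return 0.2126 * lin(r) + 0.7152 * lin(g) + 0.0722 * lin(b)

# Six equally spaced hues -- 'equal steps' in HSV, wildly unequal luminance:
for i in range(6):
    rgb = colorsys.hsv_to_rgb(i / 6.0, 1.0, 1.0)
    print(f"hue {i/6:.2f}  luminance {rel_luminance(*rgb):.3f}")
```

Red comes out around 0.21, yellow around 0.93 and blue around 0.07 – hardly a perceptually even ramp.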
So, we need much better control of colour for scientific visualisation – and fortunately there is a range of answers available, somewhere amidst this field of interlinked concepts:
Uniform Colour Spaces (UCS), such as CIELAB, CIELCH and CIELUV, come to the rescue! Each colour model has advantages and disadvantages, but the simplest to use (and reasonably accurate and easily reproducible) is CIELAB. Key aspects of CIELAB include its uniformity (equal distances in colourspace correspond to equal perceived colour differences), the fact that it has a colour difference metric (∆E), and that it models the colour opponency of human vision (red-green, yellow-blue).
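The conversion and the metric are compact enough to sketch. Here’s a minimal Python version using the textbook D65 formulae and the simplest (CIE76) ∆E – my own sketch, and note that refined metrics such as CIEDE2000 exist:

```python
import numpy as np

# sRGB (D65) -> XYZ matrix and the D65 reference white.
M = np.array([[0.4124564, 0.3575761, 0.1804375],
              [0.2126729, 0.7151522, 0.0721750],
              [0.0193339, 0.1191920, 0.9503041]])
WHITE = np.array([0.95047, 1.00000, 1.08883])

def srgb_to_lab(rgb):
    """Convert an sRGB triple (components in 0..1) to CIELAB (D65)."""
    rgb = np.asarray(rgb, float)
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    t = M @ lin / WHITE
    f = np.where(t > (6/29)**3, np.cbrt(t), t / (3*(6/29)**2) + 4/29)
    return np.array([116*f[1] - 16, 500*(f[0] - f[1]), 200*(f[1] - f[2])])

def delta_e(lab1, lab2):
    """CIE76 colour difference: Euclidean distance in CIELAB."""
    return float(np.linalg.norm(np.asarray(lab1) - np.asarray(lab2)))

print(delta_e(srgb_to_lab([1, 0, 0]), srgb_to_lab([0, 1, 0])))  # red vs green
```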
Colourising data can have a dramatic effect upon how it appears to us and what we might infer from it, so we need some tools that will help us explore colourspaces in a controlled way. Here’s one example: an app I’ve developed that can import some geoscience data and apply colour – with access to all the macOS-level ICC profiles:
Well, that’s nice, but it doesn’t really help one control interpolation within these colourspaces – it mainly provides access to the various mappings. We need much better control and better feedback. So I went further.
Firstly, we need to be able to define gradient paths in CIELAB colourspace, using something like my LAB Colour Mixer, which gives clear colour difference metrics:
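The principle is easy to sketch in Python (continuing the code above and reusing srgb_to_lab and delta_e – this is not the Mixer’s actual implementation): interpolate in LAB rather than RGB, so equal parameter steps give equal ∆E steps by construction:

```python
import numpy as np
# Reuses srgb_to_lab and delta_e from the previous sketch.

def lab_gradient(rgb0, rgb1, n):
    """n colour stops linearly spaced in CIELAB between two sRGB endpoints."""
    lab0, lab1 = srgb_to_lab(rgb0), srgb_to_lab(rgb1)
    return [lab0 + t * (lab1 - lab0) for t in np.linspace(0.0, 1.0, n)]

stops = lab_gradient([0.0, 0.0, 1.0], [1.0, 1.0, 0.0], 6)   # blue -> yellow
for a, b in zip(stops, stops[1:]):
    print(f"step deltaE = {delta_e(a, b):.2f}")  # equal by construction
```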
Then, visualise that colour gradient as a path through CIELAB, constrained by the sRGB gamut:
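The gamut constraint matters because not every LAB colour is displayable: a straight line in LAB between two displayable endpoints can leave sRGB entirely. Continuing the sketch above (reusing M, WHITE and stops), a colour is in gamut if its linear sRGB coordinates land inside the unit cube:

```python
import numpy as np
# Continues the sketches above: reuses M, WHITE and the 'stops' list.
M_INV = np.linalg.inv(M)   # XYZ -> linear sRGB

def lab_to_linear_srgb(lab):
    """Invert CIELAB -> linear-light sRGB; values may fall outside 0..1."""
    L, a, b = lab
    fy = (L + 16) / 116
    f = np.array([fy + a / 500, fy, fy - b / 200])
    xyz = np.where(f > 6/29, f**3, 3 * (6/29)**2 * (f - 4/29)) * WHITE
    return M_INV @ xyz

def in_srgb_gamut(lab, eps=1e-6):
    """A LAB colour is displayable iff its linear sRGB lies in the unit cube."""
    lin = lab_to_linear_srgb(lab)
    return bool(np.all(lin >= -eps) and np.all(lin <= 1.0 + eps))

print([in_srgb_gamut(s) for s in stops])   # flag out-of-gamut gradient stops
```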
And then apply it in a controlled way to some data:
Great! Now we’re getting somewhere: perceptually accurate interpolation in a uniform colourspace, with metric feedback, using simple sliders, buttons, the mouse and interactive 3D GUIs. But, as we want to be able to work analytically with gradients, they need more sophisticated mechanisms to target precise ranges in data, to group items, to make others disappear, and so forth. So the underlying OpenGL gradients, whilst appearing simple, are in fact multi-layered and complex:
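As a toy illustration of the layering idea only (emphatically not the actual OpenGL implementation), each layer can claim a data range and contribute an RGBA value, with untargeted values left transparent:

```python
def layered_colourmap(value, layers):
    """Toy multi-layer gradient: each layer claims a data range and returns an
    RGBA stop; values no layer targets become fully transparent, which is how
    items can be grouped, isolated or made to disappear."""
    for lo, hi, rgba in layers:
        if lo <= value <= hi:
            return rgba
    return (0.0, 0.0, 0.0, 0.0)   # untargeted: transparent

# Hypothetical layers picking out the extremes of a normalised field:
layers = [(0.0, 0.3, (0.2, 0.4, 0.9, 1.0)),   # low values: opaque blue
          (0.7, 1.0, (0.9, 0.3, 0.2, 1.0))]   # high values: opaque red
for v in (0.1, 0.5, 0.8):
    print(v, layered_colourmap(v, layers))
```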
The trick is to make this relatively intuitive and to hide the complexity from the user – whilst ensuring, of course, that it is accurate, not just some sort of hand-wavy compositing operation. That’s harder than you might think. Anyway, now that it’s done, we can connect all these different components together:
With a perceptually accurate colour map, we can prepare the colourmapped data for live interactive sharing to its companion application PDT (Planetary Data Tagger) – a 2.5D volumetric compositing app for global geoscience data. Here’s an early screenshot looking at seismic structure under Antarctica:
This can be reprojected in a variety of ways for fulldome (mono or stereoscopic), vizwall (flat or cylindrical), MR/VR/AR and so forth. How is this done? By using tools developed by the VJ community and computer game engines (such as Unity3D and Unreal Engine) – there’s a whole lot of technology that hasn’t been widely used by the scientific visualisation community, simply because it hasn’t formed part of the conventional pipeline and toolset. So it should be adapted and exploited for scientific visualisation.
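For a flavour of what such a reprojection involves, here’s a Python sketch of one common fulldome mapping – an azimuthal-equidistant fisheye with the zenith at the centre, sampling an equirectangular source. In practice this runs per-pixel in a shader, and I’m assuming this particular projection for illustration; the actual tools may well differ:

```python
import numpy as np

def dome_to_equirect(u, v):
    """Map fisheye/fulldome image coordinates u, v in [-1, 1] (zenith at 0, 0)
    to (longitude, latitude) in radians for sampling an equirectangular source.
    Azimuthal-equidistant projection, 180-degree dome."""
    r = np.hypot(u, v)
    if r > 1.0:
        return None                   # outside the dome circle
    lat = np.pi / 2.0 * (1.0 - r)     # zenith (+90 deg) down to horizon (0 deg)
    lon = np.arctan2(v, u)
    return lon, lat

print(dome_to_equirect(0.0, 0.0))    # zenith: (0.0, ~1.5708)
print(dome_to_equirect(1.0, 0.0))    # rim along +x: (0.0, 0.0)
```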
This means we can start looking at perceptually accurate colourised global data, in interactive real-time, in interesting new ways. Here are a few example screenshots of software with the ability to tag and define regions of interest:
Or if you wish to do global tomography:
But why stop there? I’ve long been interested in the planetary sciences, so here are a few views visualising and tagging some simulated data on other planets and moons:
The idea behind this is that one can select regions of interest and define arbitrary tags, which can be written out as simple text-file metadata. This can then be used to subset datasets for further analysis in other routines or programs, or even to train Machine Learning models, since it captures expert observations. Colourmaps can be written out as colour palette tables (.cpt files for GMT) or as JSON (for Cesium.js, GPlates, d3.js and the like). And it can all be done simply and rapidly – which is important. Some of the most onerous steps in scientific visualisation (beyond actual data capture and wrangling) involve huge, monolithic applications that can take forever to learn properly, or require good knowledge of programming languages like Python. My idea here is that it could all be a whole lot simpler, especially for those first steps when you just want to reconnoitre your data and make some informed decisions about what might be interesting and worth further exploration.
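To illustrate how lightweight the .cpt export can be, here’s a sketch of a minimal GMT colour palette table writer (a toy version with hypothetical stop values, not the app’s actual exporter):

```python
def write_cpt(path, stops):
    """Write colour stops [(z, (r, g, b)), ...] (RGB 0..255) as a minimal
    GMT colour palette table: one 'z0 r/g/b z1 r/g/b' segment per pair,
    plus background (B), foreground (F) and NaN (N) fills."""
    with open(path, "w") as f:
        f.write("# generated colour palette\n")
        for (z0, c0), (z1, c1) in zip(stops, stops[1:]):
            f.write(f"{z0:g}\t{c0[0]}/{c0[1]}/{c0[2]}\t"
                    f"{z1:g}\t{c1[0]}/{c1[1]}/{c1[2]}\n")
        f.write(f"B\t{'/'.join(map(str, stops[0][1]))}\n")
        f.write(f"F\t{'/'.join(map(str, stops[-1][1]))}\n")
        f.write("N\t128/128/128\n")

# Hypothetical three-stop diverging map over a normalised 0..1 range:
write_cpt("gradient.cpt",
          [(0.0, (5, 48, 97)), (0.5, (247, 247, 247)), (1.0, (103, 0, 31))])
```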
The software is macOS-only at this stage, and I’ll make it available – as open source as possible – once my magnum opus has passed its last few hurdles.