This blog post continues the little series on Practical AI Trust & Ethics that I started in Practical AI Trust & Ethics thoughts, part I.
In this post I want to focus on the processes and the data origins – they have not received enough attention, mostly because they are down-to-earth and not-so-hypable, being already part of the regular IT processes.
Even though it touches on the “Trust in the process” section of the previous part of this series, this is a grittier, “old-school IT” vision of the concrete steps and consequences.
Beyond any established trust, there will be a need to follow through on a decision made by a sufficiently complex system – one where even the lead programmer or designer may have virtually no chance of determining which exact algorithm or pattern was used.
Sure, if one gets a positive outcome, they mostly won’t mind how it came about; but in the case of a negative outcome, even the strongest trust sometimes won’t hold the line, and transparency will be demanded.
Less complexity and more stability in the process will generally mean better results with the clients and better overall trust in the process.
We should not complicate or overdesign the solution in any way, but I strongly believe that for AI solutions it will be essential, in order to keep the trust, to keep the respective processes as simple and as explainable as possible.
Process Key Focus Points
When we talk about the classical application development process, we are not very concerned about its transparency, because it is mostly presented as a “black box” to the final consumer.
With AI solutions, in modern times, expect a number of questions about the ethics of the development process, not just about the final result.
Besides the ethics I have just mentioned, the questions I expect to be asked (and not limited to, of course) are:
– how many biases were put into the solution?
– how are those biases being addressed?
– where does the original data come from? (This is a bigger topic, and I will focus on it in the last part of this post.)
I think it is entirely illusory to believe that a solution can have no biases; they should be assumed, registered and fought against.
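“Registered” does not have to mean anything fancy – even a small structure that forces every known bias to be recorded alongside its mitigation would already answer the questions above. A minimal sketch (all names here are my own invention, not an established library):

```python
from dataclasses import dataclass, field

@dataclass
class BiasRecord:
    """One known bias: where it comes from and what is done about it."""
    name: str
    source: str            # e.g. "training data", "labelling process"
    mitigation: str = ""   # empty string means: not addressed yet

@dataclass
class BiasRegistry:
    records: list[BiasRecord] = field(default_factory=list)

    def register(self, name: str, source: str, mitigation: str = "") -> None:
        self.records.append(BiasRecord(name, source, mitigation))

    def unaddressed(self) -> list[str]:
        # The honest answer to "how are those biases being addressed?"
        return [r.name for r in self.records if not r.mitigation]

registry = BiasRegistry()
registry.register("age skew", "training data", "reweighting")
registry.register("geographic skew", "data collection")
print(registry.unaddressed())  # → ['geographic skew']
```

The point is not the code itself but the discipline: a bias that is assumed but never written down cannot be fought against, let alone explained to a client.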
Regarding security, the issue will be present pretty much everywhere: in the product, in the process itself, and of course in the results, which must be secure overall – think privacy, but also think about the fact that those results will feed back into the system, so if they are manipulated, it could mean a total loss of control over the system.
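Since manipulated results fed back into the system are such a risk, one basic defence is to make every stored result tamper-evident before it is ever reused for training. A minimal sketch using an HMAC (the key handling here is a placeholder for illustration, not a recommendation):

```python
import hmac
import hashlib

SECRET = b"rotate-me"  # placeholder; a real system would fetch this from a secrets manager

def sign(result: bytes) -> str:
    """Attach an HMAC tag so results fed back for training can be verified later."""
    return hmac.new(SECRET, result, hashlib.sha256).hexdigest()

def verify(result: bytes, signature: str) -> bool:
    """Reject any result whose content no longer matches its tag."""
    return hmac.compare_digest(sign(result), signature)

tag = sign(b'{"score": 0.91}')
print(verify(b'{"score": 0.91}', tag))  # untouched result passes
print(verify(b'{"score": 0.99}', tag))  # manipulated result fails
```

This does not stop an attacker who controls the key, of course, but it cheaply catches the silent-manipulation scenario where poisoned results would otherwise flow back into the model unnoticed.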
Deployment is a process that has become significantly more visible with the rise of Continuous Integration and Continuous Deployment, and making it count will be very important in the age of AI. It starts with the security aspects (making deployments unalterable by anyone unauthorised) and goes right up to ensuring that a deployment won’t affect the results (keeping the version used for each of the processes identifiable and traceable – think transparency of the results), especially regarding performance (as in a bias where some people or groups get preferred performance from the solution).
The three main points I see about deployments are their safety, regularity and efficiency (they might cause downtime or even wrong results, as you surely know).
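Keeping “the version used for each of the processes identifiable and traceable” can be as simple as stamping every single result with a fingerprint of the deployed model artifact. A hypothetical sketch (the artifact bytes and field names are my own assumptions):

```python
import hashlib
import json
from datetime import datetime, timezone

def model_fingerprint(model_bytes: bytes) -> str:
    """Content hash of the deployed model artifact - survives renames and redeploys."""
    return hashlib.sha256(model_bytes).hexdigest()[:16]

def make_result(prediction, fingerprint: str) -> dict:
    """Wrap a prediction so it can always be traced back to a specific deployment."""
    return {
        "prediction": prediction,
        "model_fingerprint": fingerprint,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

# Demo with a fake model artifact:
artifact = b"weights-v1"
fp = model_fingerprint(artifact)
result = make_result("approved", fp)
print(json.dumps(result, indent=2))
```

With this in place, any questioned outcome can be matched to the exact model that produced it – which is precisely the transparency that will be asked for after a negative outcome.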
Trust needs performance, as time is of the essence. There must be some very pretty graphs out there showing how trust and confidence disappear with the time spent waiting.
Performance will be gaining significance again, partially because of the biases – who wants to be the one explaining that the AI solution does not perform differently for different people? Ensuring that the infrastructure is sound and that the performance is as constant as possible will be one of the keys to gaining trust in the results. We all know how superstitious and suspicious some people are :)
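“As constant as possible” can be made measurable: record response times per user segment and flag when one segment is consistently slower than another. A toy sketch (the segments and the alerting threshold are invented for illustration):

```python
from statistics import mean

def latency_gap(latencies_by_segment: dict[str, list[float]]) -> float:
    """Relative gap between the fastest and slowest segment's mean latency."""
    means = {seg: mean(vals) for seg, vals in latencies_by_segment.items()}
    fastest, slowest = min(means.values()), max(means.values())
    return (slowest - fastest) / fastest

observed = {
    "region-a": [0.21, 0.19, 0.20],
    "region-b": [0.32, 0.30, 0.31],  # consistently slower - worth explaining
}
gap = latency_gap(observed)
if gap > 0.25:  # arbitrary threshold; tune per solution
    print(f"performance gap of {gap:.0%} between segments - investigate")
```

The same pattern applies to result quality per segment, not only latency; the important part is that the constancy claim is backed by numbers rather than assurances.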
I think that with the rise of AI we shall finally see Data Shopping become an important topic, especially in big organisations.
For smaller organisations, and for the diversity of the data, there will be businesses selling legal data (yes, there are already some around – but this area should explode in proportion to the success of, and need for, data to kick off AI projects), and beyond this, commercial organisations will want to buy data for training and testing their solutions.
There will be a day when the origins of the data used for training will need to be revealed, and for that purpose even the trust between the parties involved will not be enough.
Ensuring that the data obtained is legal and ethical is essential, because nobody wants to be hit by the need to retrain and re-align their models from the very beginning.
AI products require huge amounts of data for training, and unlike a traditional application, where you can make up a couple of rows and you are good to go, you will need a large amount of training data WHILE developing the solution.
Every dataset that stays long enough on the market (external or internal, for that matter) should declare the biases that it brings and makes the application inherit – such as the diversity or the provenance of the selected & collected data. There may be advantages and disadvantages to using each particular dataset, and maybe we shall live long enough to see exciting strategies of mixing different datasets for different industries to obtain the best possible results.
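A dataset “declaring the biases it brings” could be as simple as shipping a machine-readable datasheet next to the data itself. A hypothetical sketch of such a card (all the fields are my own proposal, not a standard):

```python
import json

# A minimal machine-readable "datasheet" shipped alongside a dataset,
# so a buyer can inspect the known biases before training on it.
datasheet = {
    "name": "customer-interactions-2023",
    "collection_method": "opt-in web forms",
    "known_biases": [
        {"kind": "geographic", "detail": "80% of records from one region"},
        {"kind": "age", "detail": "under-25 users under-represented"},
    ],
    "legal_basis": "consent",  # the "is it legal and ethical?" question
}

def inherited_biases(sheet: dict) -> list[str]:
    """What an application trained on this dataset would inherit."""
    return [b["kind"] for b in sheet["known_biases"]]

print(inherited_biases(datasheet))  # → ['geographic', 'age']
print(json.dumps(datasheet, indent=2))
```

A buyer comparing two datasets could then weigh their declared biases against each other – or deliberately mix them, as speculated above, to compensate for one another.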
The topic of Data Shopping will definitely bring us closer to closing the loop with decisions about paying the data owners (with a big follow-up question of how the owners of the data should be legally defined). Maybe there will even be data-lending, with subsequent payments to the respective people; there is already some interesting and exciting discussion around this extremely difficult topic. The ethics of bought data will be put under the microscope, and a lot of organisations will choose not to deal with this issue for as long as possible.
Within big companies, the rise of AI might bring whole departments responsible for data that is ethical and as unbiased as possible, to be “bought” internally by the different departments creating AI solutions. The quality of the data (TOP THING as always, right?) will define the quality of the AI solutions and the respective part of the trust given to the solution.
Maybe it will mean that we shall finally get real Data Stewards … haha :)
To be honest, I wish that people would consider following the same process for the development of non-AI solutions too, as it would result in better, fairer and higher-quality products.
In this little series I tried to list and explore some of the less popular issues that are not exactly under the microscope of the current wave of popular AI thinkers.
I think I might come back to this topic one day, but for now – that’s all, folks!