Factory+: a connected, smart factory driven by data
16 May 2022
In an article for The Manufacturer, Lindsay Lee, data scientist at the University of Sheffield AMRC, explains how experts like her are working side-by-side with engineers at the AMRC’s Factory 2050 to show manufacturers how to get the most out of big data.
‘So, what data do you want?’
When a data scientist enters a new field, this is often the first question they are asked. The most common response is simply: ‘Well, I don’t know.’
The truth is, data scientists do not just enjoy working with data for data’s sake, but rather relish problem solving – the answers to which are usually found in the data. Without knowing what problem is waiting to be solved, how can we understand what data we need?
Safe and ethical AI
I am new to the field of manufacturing and my first six months at the University of Sheffield Advanced Manufacturing Research Centre (AMRC) have mostly involved understanding how data is captured, collected and stored, and what sort of data-related problems engineers are facing.
Working on the technology readiness level (TRL) scale between academia and industry means that everything the AMRC does has to work in production. For a data scientist, this puts a huge emphasis on the emerging field of safe and ethical artificial intelligence (AI), especially where high-risk decisions are to be made. We have a basic ethical understanding that everything we do has to be fair and unbiased - this means we consider the data that has not been collected as well as the data that has, and recognise the implications of this. We are all aware that if you ‘collect’ the right data you can manipulate the results as you wish, but, as data scientists, we have an ethical obligation to make sure that the results are not simply an artefact of the data collection method.
Data scientists are also trained in a number of analysis techniques and are able to code these up, so that we are not at the mercy of the available software and the limited options this can bring. For me, an interesting part of the safe and ethical movement is the push to move away from ‘black box’ analysis and towards some sensible interpretation of the analysis, so that a right answer has not been found for the wrong reason. Ensuring interpretability of results is important for every step of the process. Why have we collected the data we have? What data can’t we collect? What different analyses have been conducted? How can the results be interpreted? What happens when things go wrong, and how do you identify it?
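To make that idea of an interpretable, hand-coded analysis concrete, here is a minimal sketch in Python. It fits a one-variable linear model by ordinary least squares; unlike a black-box predictor, its two parameters can be read and sanity-checked directly. The data and variable names here are hypothetical illustrations, not AMRC data.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised;
    # the shared factor of n cancels in the ratio below).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical sensor readings: spindle load (%) vs tool wear (microns).
load = [10, 20, 30, 40, 50]
wear = [12, 21, 33, 41, 52]

slope, intercept = fit_line(load, wear)
print(f"wear = {slope:.2f} * load + {intercept:.2f}")
```

The slope has a direct physical reading - the estimated extra wear per percentage point of load - which is exactly the kind of interpretation a black-box model does not offer out of the box, and which lets an engineer spot when a model is getting a right answer for the wrong reason.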
As far as I am aware, there is currently no piece of software used in industry that completely covers this, only people. Those people are us, the data scientists. At the AMRC we are trying to tackle this with the Factory+ project.